The Interaction of the Explicit and the Implicit in Skill Learning:

A Dual-Process Approach

Ron Sun

Rensselaer Polytechnic Institute

Paul Slusarz

University of Missouri—Columbia

Chris Terry

University of Alabama

This article explicates the interaction between implicit and explicit processes in skill learning, in contrast to the tendency of researchers to study each type in isolation. It highlights various effects of the interaction on learning (including synergy effects). The authors argue for an integrated model of skill learning that takes into account both implicit and explicit processes. Moreover, they argue for a bottom-up approach (first learning implicit knowledge and then explicit knowledge) in the integrated model. A variety of qualitative data can be accounted for by the approach. A computational model, CLARION, is then used to simulate a range of quantitative data. The results demonstrate the plausibility of the model, which provides a new perspective on skill learning.

The role of implicit learning in skill acquisition and the distinction between implicit and explicit learning have been widely recognized in recent years (see, e.g., Cleeremans, Destrebecqz, & Boyer, 1998; Proctor & Dutta, 1995; Reber, 1989; Seger, 1994; Stadler & Frensch, 1998). However, although implicit learning has been actively investigated, complex and multifaceted interaction between the implicit and the explicit and the importance of this interaction have not been universally recognized (though with a few notable exceptions even early on, e.g., Mathews et al., 1989).1 Similar oversight is also evident in computational simulation models of implicit learning (with a few exceptions such as Cleeremans, 1993, and Sun, Merrill, & Peterson, 2001).

Likewise, in the development of cognitive architectures (e.g., Anderson, 1983, 1993; Meyer & Kieras, 1997; Newell, 1990), the distinction between procedural and declarative knowledge has been adopted by many (Anderson, 1983, 1993). The distinction maps roughly onto that between explicit and implicit knowledge, because procedural knowledge is generally inaccessible whereas declarative knowledge is generally accessible and thus explicit. However, the focus has been mostly on top-down models (i.e., learning first explicit knowledge and then implicit knowledge); the bottom-up direction (i.e., learning first implicit knowledge and then explicit knowledge or learning both in parallel) has been largely ignored, paralleling the related neglect of the interaction of explicit and implicit processes in the skill acquisition literature. Despite such problems, it has been gaining recognition that it is difficult to find a situation in which only one type of learning is engaged (Mishkin, Malamut, & Bachevalier, 1984; Reber, 1989; Seger, 1994; Sun et al., 2001; Willingham, 1998; but see Lewicki, Czyzewska, & Hoffman, 1987). Our review of existing data (see the Existence of Interaction section) has indicated that although one can manipulate conditions to emphasize one or the other type, in most situations, both types of learning are involved, with varying amounts of contributions from each.

Many issues arise that we need to examine to better understand the interaction between implicit and explicit processes:

How can we capture implicit and explicit processes in computational terms?

How do the two types of knowledge develop alongside each other and influence each other's development (e.g., top down versus bottom up)?

How can bottom-up learning be realized computationally?

How do the two types of knowledge interact during skilled performance, and what is the impact of that interaction on performance?2

In the following section, Existence of Interaction, we present evidence that points to a complex, multifaceted interaction between implicit and explicit processes. On the basis of this information, in the section A Model, we present a theoretical model of skill learning incorporating both types of processes and emphasizing bottom-up learning. In the section Some Details of the Model, we present details of the computational implementation of the model. Then, in the Analysis of Interaction section, we show how the model can account for or predict phenomena in skill learning qualitatively. We perform a detailed match of human and model data in the Simulation of Human Skill Learning Data section. Finally, in the General Discussion, we discuss a number of issues and examine existing models of skill learning. The discussion points to the uniqueness of the model and highlights the fact that it synthesizes a variety of data and provides a coherent, plausible interpretation based on the implicit–explicit interaction.

1 By the explicit, we mean processes involving some form of generalized (or generalizable) knowledge that is consciously accessible.

2 For example, the synergy of the two may result, as described in Sun et al. (2001).

Ron Sun, Cognitive Science Department, Rensselaer Polytechnic Institute; Paul Slusarz, Department of Computer Science, University of Missouri—Columbia; Chris Terry, Department of Computer Science, University of Alabama.

This work was supported in part by Office of Naval Research Grant N00014-95-1-0440 and Army Research Institute Contract DASW01-00-K-0012. Thanks to Helen Gigley, Susan Chipman, Michael Drillings, Paul Gade, and Jonathan Kaplan for their support. Thanks to Ed Merrill, Jeff Shrager, Jack Gelfand, Axel Cleeremans, David Roskos-Ewoldsen, and Robert Mathews for discussions and comments. Todd Peterson developed the initial simulator, and Xi Zhang worked on its enhancement.

Correspondence concerning this article should be addressed to Ron Sun, Cognitive Science Department, Rensselaer Polytechnic Institute, 110 Eighth Street, Carnegie 302A, Troy, NY 12180.

Psychological Review, 2005, Vol. 112, No. 1, 159–192. Copyright 2005 by the American Psychological Association. 0033-295X/05/$12.00 DOI: 10.1037/0033-295X.112.1.159

Existence of Interaction

Mathews et al. (1989) proposed that "subjects draw on two different knowledge sources to guide their behavior in complex cognitive tasks; one source is based on their explicit conceptual representation; the second, independent source of information is derived from memory-based processing, which automatically abstracts patterns of family resemblance through individual experiences" (p. 1098). Likewise, Sun (1994) pointed out that "cognitive processes are carried out in two distinct levels with qualitatively different mechanisms" (p. 44), although "the two sets of knowledge may overlap substantially" (p. 44). Reber (1989) pointed out that nearly all complex skills in the real world (as opposed to laboratory settings) involve a mixture of explicit and implicit processes interacting in complex ways (Mishkin et al., 1984; Willingham, 1998). The same point can also be found in Sun et al. (2001). Even commonly used implicit learning tasks likely involve a combination of explicit and implicit learning processes (Sun et al., 2001; see also the simulations in the Simulation of Human Skill Learning Data section).

Interaction of implicit and explicit knowledge has been demonstrated in the implicit learning literature. A brief review of three common tasks of implicit learning is as follows. The serial reaction time (SRT) tasks (Nissen & Bullemer, 1987) tapped subjects' ability to learn a repeating sequence. On each trial, one of the four lights on a screen was illuminated, and subjects were to press the button corresponding to the illuminated light. The lights were shown in a repeating 10-trial sequence. It was found that there was a significant reduction in response time to repeating sequences, attributed to the learning of the sequence. However, subjects were sometimes unaware that a repeating sequence was involved. Amnesic patients with Korsakoff's syndrome were found to undergo similar learning. Likewise, dynamic control (DC) tasks (Berry & Broadbent, 1988) examined subjects' ability to learn a relation between the input and output variables of a controllable system through interacting with the system dynamically. Subjects were required to control an output variable by manipulating an input variable. Although they often did not recognize the relation between input and output explicitly, subjects reached a high level of performance. Similarly, in artificial grammar learning tasks (Reber, 1989), subjects were presented strings of letters generated in accordance with a finite state grammar. After memorization, subjects showed an ability to distinguish between new strings that conformed to the grammar and those that did not. Although subjects might not be explicitly aware of the underlying grammar (barring some fragmentary knowledge), they performed significantly beyond the chance level. In all, these tasks share the characteristic of heavily involving implicit processes,3 which is also shared by some other tasks (such as some concept learning, automatization, and conditioning tasks).

Various demonstrations of interaction exist in which these tasks were used. For instance, Stanley, Mathews, Buss, and Kotler-Cope (1989) and Berry (1983) found that under some circumstances concurrent verbalization could help to improve subjects' performance in DC tasks. Reber and Allen (1978) similarly showed in artificial grammar learning that verbalization could help. Ahlum-Heath and DiVesta (1986) also found that verbalization led to better performance in learning the Tower of Hanoi task. In the same vein, although no verbalization was used, Willingham, Nissen, and Bullemer (1989) showed that those subjects who demonstrated more awareness of the regularities in the stimuli (i.e., those who had more explicit knowledge) performed better in an SRT task, which likewise seemed to show the helpful effects of explicit processes.

However, as Reber (1976, 1989) pointed out, verbalization and the resulting explicit knowledge might also hamper (implicit) learning under some circumstances. This may happen when too much verbalization induces an overly explicit learning mode in subjects performing a task that is not suitable for learning in an explicit way (e.g., when learning a complex artificial grammar). Similarly, in a minefield navigation task, Sun, Merrill, and Peterson (1998, 2001) reported that too much verbalization was detrimental to performance. Roussel (1999) showed that explicit reflection sometimes hurt performance. Dulaney, Carlson, and Dewey (1984) showed that correct and potentially useful explicit knowledge, when given at an inappropriate time, could hamper learning.

As variously demonstrated by Berry and Broadbent (1984, 1988), Stanley et al. (1989), and Reber, Kassin, Lewis, and Cantor (1980), verbal instructions (given prior to learning) could facilitate or hamper task performance too. One type of instruction was to encourage explicit search for regularities in stimuli that might aid in task performance. For example, Reber et al. (1980) found that depending on ways stimuli were presented, explicit search might help or hamper performance. Owen and Sweller (1985) and Schooler, Ohlsson, and Brooks (1993) found that explicit search hindered learning. Another type of instruction was explicit how-to instructions that specifically informed subjects how the tasks should be performed (including providing detailed information concerning regularities in stimuli). Stanley et al. (1989) and Berry and Broadbent (1988) found that this type of instruction helped to improve performance significantly. See also Boyd and Weistein (2001).

There is evidence that implicit and explicit knowledge may develop independently under some circumstances. Willingham et al. (1989) reported data that were consistent with the parallel development of implicit and explicit knowledge. By using two different measures for assessing the two types of knowledge respectively, they compared the time course of implicit and explicit learning. It was shown that implicit knowledge might be acquired in the absence of explicit knowledge and vice versa. The data ruled out the possibility that one type of knowledge was always preceded by the other type. Rabinowitz and Goldberg (1995) similarly demonstrated parallel development of procedural and declarative knowledge in an alphabetic arithmetic task.

3 Some have disputed the existence of implicit processes, on the basis of the imperfection and incompleteness of tests for explicit knowledge (e.g., Shanks & St. John, 1994). We do not wish to engage in a methodological debate here. We refer readers to Sun et al. (2001) for relevant arguments in terms of the overwhelming evidence for the distinction between implicit and explicit processes. See also the Potential Controversies section.

However, a subject's performance typically improves faster than the explicit knowledge that the subject can verbalize. For instance, in DC tasks, although the performance of subjects quickly rose to a high level, their verbal knowledge improved far more slowly: The subjects could not provide usable verbal knowledge until near the end of their training (Stanley et al., 1989). Bowers, Regehr, Balthazard, and Parker (1990) also showed delayed acquisition of explicit knowledge. When subjects were given patterns to complete, they first showed implicit recognition of proper completion. Their implicit recognition improved over time until eventually an explicit recognition was achieved. This phenomenon was also demonstrated by Reber and Lewis (1977) in artificial grammar learning. Sun et al. (1998, 2001) focused on this type of learning in a minefield navigation task. In all of these cases, because explicit knowledge lags behind but improves with implicit knowledge, explicit knowledge is in a way "extracted" from implicit knowledge of these tasks (as suggested by Seger, 1994; Stanley et al., 1989; Sun, 1997). That is, learning of explicit knowledge occurs through the (delayed) explication of implicit knowledge.

Given the voluminous evidence of complex interaction between implicit and explicit processes, the key questions now are: (a) how do we account for the interaction and (b) how do we explain the demonstrated effects of interaction?

A Model

Below, we discuss a model that incorporates both implicit and explicit processes (Sun, 1997; Sun et al., 2001).

Representation

Let us first consider the representations of a possible model incorporating the distinction between implicit and explicit processes. There is evidence that human action decision making in skilled performance is a largely implicit process (e.g., Ben-Zur, 1998; Reber, 1989; Seger, 1994). We note that the inaccessible nature of implicit knowledge is suitably captured by subsymbolic distributed representations provided by a backpropagation network (Rumelhart, McClelland, & the PDP Research Group, 1986). This is because representational units in a distributed representation (e.g., in the hidden layer of a backpropagation network) are capable of accomplishing tasks but are subsymbolic and generally not individually meaningful (see Rumelhart et al., 1986; Sun, 1994). This characteristic of distributed representation accords well with the inaccessibility of implicit knowledge. (However, it is generally not the case that distributed representations are not accessible at all, but they are definitely less accessible and not as direct and immediate as localist–symbolic representations. Distributed representations may be accessed through indirect, transformational processes.) In contrast, explicit knowledge may be best captured in computational modeling by a symbolic or localist representation (Clark & Karmiloff-Smith, 1993), in which each unit is more easily interpretable and has a clearer conceptual meaning. This characteristic captures the property of explicit knowledge being accessible and manipulable (Smolensky, 1988; Sun, 1994). This radical difference in the representations of the two types of knowledge leads to a two-level model, CLARION (which stands for Connectionist Learning with Adaptive Rule Induction ONline; initially proposed in Sun, 1997, and Sun, 1999), whereby each "level" using one kind of representation captures one corresponding type of process (either implicit or explicit). Sun (1994, 1995, 1999), Dreyfus and Dreyfus (1987), and Smolensky (1988) presented relevant theoretical arguments for such two-level models. At each level of the model, there may be multiple modules (both action-centered modules and non-action-centered modules; Moscovitch & Umilta, 1991; Schacter, 1990). At the bottom level, action-centered knowledge is highly modular: A number of backpropagation networks coexist, each adapted to a specific modality, task, or input stimulus type. This is consistent with the well-known modularity claim (Cosmides & Tooby, 1994; Fodor, 1983; Karmiloff-Smith, 1986) and is also similar to Shallice's (1972) idea of a multitude of "action systems" competing with each other. The non-action-centered modules (at both levels) represent more static, declarative, and generic types of knowledge. The knowledge there includes what is commonly referred to as semantic memory (i.e., general knowledge about the world in a conceptual, symbolic form; Quillian, 1968; Tulving, 1972). We do not get into these modules in this work.

The reason for having both action-centered and non-action-centered modules (at each level) is that, as should be obvious, action-centered knowledge (roughly, procedural knowledge) is not necessarily inaccessible (directly) and non-action-centered knowledge (roughly, declarative knowledge) is not necessarily accessible (directly). (Although it has been argued by some that all procedural knowledge is inaccessible and all declarative knowledge is accessible, such a clean mapping of the two dichotomies is untenable in our view.)

Learning

Only the learning of action-centered knowledge (which is most relevant to skill learning) is dealt with here. The learning of implicit action-centered knowledge at the bottom level can be done in a variety of ways consistent with the nature of distributed representations. In the learning settings in which correct input–output mappings are available, straight backpropagation (a supervised learning algorithm) can be used for each network (Rumelhart et al., 1986). Such supervised learning procedures require the a priori determination of a uniquely correct output for each input. In the learning settings in which there is no input–output mapping externally provided, reinforcement learning can be used (Watkins, 1989), especially Q-learning (Watkins, 1989) implemented with backpropagation networks (see the Some Details of the Model section). Such learning methods are cognitively justified: Shanks (1993) showed that a simple type of skill learning was best captured by associative models (i.e., neural networks), compared with a variety of rule-based models. Cleeremans (1997) argued that implicit learning could not be captured by symbolic models but could be captured by neural networks.


The action-centered explicit knowledge at the top level can also be learned in a variety of ways in accordance with the localist–symbolic representation used. Because of the representational characteristics, one-shot learning based on hypothesis testing (Bruner, Goodnow, & Austin, 1956; Busemeyer & Myung, 1992; Mitchell, 1998; Nosofsky, Palmeri, & McKinley, 1994; Sun et al., 1998, 2001) is needed. With such learning, individuals explore the world and dynamically acquire representations and modify them as needed, reflecting the dynamic, ongoing nature of skill learning (Heidegger, 1927/1962; Sun et al., 1998, 2001; Vygotsky, 1962). In so doing, the implicit knowledge already acquired in the bottom level can be used in learning explicit knowledge (through bottom-up learning; Sun et al., 1998, 2001). The basic idea of bottom-up learning is as follows: If an action chosen (by the bottom level) is successful (i.e., it satisfies a certain criterion), then a rule is extracted and set up at the top level. Then, in subsequent interactions with the world, the rule is refined by considering the outcome of applying the rule: If the outcome is successful, the conditions of the rule may be generalized to make it more universal; if the outcome is not successful, then the conditions of the rule should be made more specific and exclusive of the current case. This is an online version of hypothesis testing processes studied (in different contexts) by, for example, Bruner et al. (1956) and Nosofsky et al. (1994). Other types of learning are also possible, such as hypothesis testing without the help of the bottom level (for capturing independent rule learning, as discussed in the previous section).

Thus, the bottom level develops implicit skills using the Q-learning-backpropagation algorithm (QBP, for short; to be explained later), whereas the top level extracts explicit rules using the rule-extraction-refinement algorithm (RER, for short; to be explained later) and possibly others. The learning difference of the two levels is somewhat analogous to that between the corticostriatal "habit" system and the corticolimbic "memory" system as proposed by Mishkin et al. (1984). See the next section for implementational details.

Here is a list of the basic theoretical hypotheses of the model (see Figure 1):

Representational difference: The two levels use two different types of representations and thus have different degrees of accessibility.

Learning difference: Different learning methods are used for the two levels.

Bottom-up learning: When there is no sufficient a priori knowledge available, learning is bottom up.

Some Details of the Model

As proposed in Sun (1997), a high-level description of the operation of CLARION (see Figure 1) is as follows:

1. Observe the current state x.

2. Compute in the bottom level the "value" of each of the possible actions (a_i) associated with the state x: Q(x, a_1), Q(x, a_2), ..., Q(x, a_n).

3. Find out all the possible actions (b_1, b_2, ..., b_m) at the top level, based on the state x (which goes up from the bottom level) and the existing action rules in place at the top level.

4. Choose an appropriate action a stochastically, based on combining the values of the a_i (at the bottom level) and the b_j (which are sent down from the top level).

5. Perform the action a, and observe the next state y and (possibly) the reinforcement r.

6. Update the bottom level in accordance with the Q-learning-backpropagation algorithm (or QBP; to be detailed later), based on the feedback information.

7. Update the top level using the rule-extraction-refinement algorithm (or RER, for constructing, refining, and deleting rules; to be detailed later).

8. Go back to Step 1.

Figure 1. The CLARION architecture.
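To make the eight-step cycle concrete, the Python sketch below runs one pass through it. The objects bottom_level, top_level, and env, together with their method names, are illustrative placeholders for the QBP network, the rule network, and the task environment; they are not taken from any published CLARION code, and the weighting scheme is deliberately simplified.

```python
import math
import random

def clarion_cycle(state, bottom_level, top_level, env, w1=0.5, w2=0.5, temperature=0.1):
    """One decision cycle of CLARION (Steps 1-8), heavily simplified."""
    q_values = bottom_level.q_values(state)            # Step 2: Q(x, a) for every action a
    rule_actions = top_level.recommended(state)        # Step 3: actions proposed by rules
    combined = {a: w2 * q + w1 * (1.0 if a in rule_actions else 0.0)
                for a, q in q_values.items()}          # Step 4: combine the two levels
    action = _boltzmann(combined, temperature)         # Step 4: stochastic choice
    next_state, reward = env.step(action)              # Step 5: act, observe y and r
    bottom_level.qbp_update(state, action, reward, next_state)   # Step 6: QBP update
    top_level.rer_update(state, action, reward, next_state)      # Step 7: RER update
    return next_state                                  # Step 8: caller loops with the new state

def _boltzmann(values, temperature):
    """Pick an action with probability proportional to exp(value / temperature)."""
    acts = list(values)
    weights = [math.exp(values[a] / temperature) for a in acts]
    return random.choices(acts, weights=weights, k=1)[0]
```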

The following implementational details may be skipped on first reading. The reader may choose to go directly to the example in the An Example section.

The Bottom Level

In the bottom level, a Q-value is an evaluation of the "quality" of an action in a given state: Q(x, a) indicates how desirable action a is in state x. We can choose an action based on Q-values. At each step, given the input x, we first compute the Q-values for all the possible actions (i.e., Q(x, a) for all a's). We then use the Q-values to decide probabilistically on an action to be performed, which is done by a Boltzmann distribution of Q-values:

$$p(a \mid x) = \frac{e^{Q(x,a)/\alpha}}{\sum_i e^{Q(x,a_i)/\alpha}} \qquad (1)$$

Here $\alpha$ controls the degree of randomness (temperature) of the decision-making process.4
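As a minimal illustration of Equation 1, the following snippet draws an action from the Boltzmann distribution over a handful of Q-values; the action names and the temperature setting are arbitrary examples.

```python
import math
import random

def boltzmann_select(q_values, alpha):
    """Pick an action with probability proportional to exp(Q(x, a)/alpha) (Equation 1).
    q_values: dict mapping each candidate action to its Q-value for the current state x.
    alpha: temperature; smaller alpha makes the choice more nearly greedy."""
    actions = list(q_values)
    # Subtracting the maximum before exponentiating improves numerical stability
    # without changing the resulting Boltzmann probabilities.
    m = max(q_values.values())
    weights = [math.exp((q_values[a] - m) / alpha) for a in actions]
    total = sum(weights)
    probs = [w / total for w in weights]
    return random.choices(actions, weights=probs, k=1)[0]

# Example: four possible actions (e.g., the four response keys in an SRT task).
print(boltzmann_select({"key1": 0.2, "key2": 0.9, "key3": 0.1, "key4": 0.3}, alpha=0.1))
```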

A four-layered connectionist network is used for implementing Q-learning (see the bottom half of Figure 2), in which the first three layers form a (either recurrent or feedforward) backpropagation network for computing Q-values and the fourth layer (with only one node) performs the aforementioned Boltzmann decision making. The network is internally subsymbolic and uses implicit representation in accordance with our previous considerations of representational forms. The output of the third layer (i.e., the output layer of the backpropagation network) indicates the Q-value of each action (represented by an individual node), and the node in the fourth layer determines probabilistically the action to be performed based on the Boltzmann distribution.5

To acquire the Q-values, supervised and/or reinforcement learning methods may be applied (as mentioned earlier). Particularly applicable is the Q-learning algorithm (Watkins, 1989; a reinforcement learning algorithm). In the algorithm, Q(x, a) estimates the maximum (discounted) total reinforcement that can be received from the current state x on:

$$\max \left( \sum_{i=0}^{\infty} \gamma^i r_i \right) \qquad (2)$$

where $\gamma$ is a discount factor that favors reinforcement received sooner relative to that received later and $r_i$ is the reinforcement received at step i (which may be none). The updating of Q(x, a) is based on:

$$\Delta Q(x,a) = \alpha \big( r + \gamma e(y) - Q(x,a) \big) \qquad (3)$$

where $\alpha$ is a learning rate, $\gamma$ is the discount factor, y is the new state resulting from action a in state x, and $e(y) = \max_b Q(y,b)$. Thus, the updating is based on the temporal difference in evaluating the current state and the action chosen: In the above formula, Q(x, a) estimates, before action a is performed, the maximum (discounted) total reinforcement to be received if action a is performed, and $r + \gamma e(y)$ estimates the maximum (discounted) total reinforcement to be received after action a is performed; so, their difference (the temporal difference in evaluating an action) enables the learning of Q-values that approximate the maximum (discounted) total reinforcement. Using Q-learning allows sequential behavior to emerge.

Applying Q-learning, the training of the backpropagation network is based on minimizing the following error at each step:

$$err_i = \begin{cases} r + \gamma e(y) - Q(x,a) & \text{if } a_i = a \\ 0 & \text{otherwise} \end{cases} \qquad (4)$$

where i is the index for an output node representing the action $a_i$. On the basis of the above error measures, the backpropagation algorithm is applied to adjust internal weights (which are randomly initialized before training). When a correct input–output mapping is available for a step, the above error measure can be simplified to the desired output minus the actual output: $r - Q(x,a)$. This learning process (using Q-learning plus backpropagation) enables the development of implicit skills potentially solely based on exploring the world on a continuous and ongoing basis (without a priori knowledge).
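The sketch below shows the update defined by Equations 3 and 4 in tabular form. In CLARION the Q-values live in a backpropagation network and the error is backpropagated through its weights; here a plain dictionary stands in for that network, and the learning-rate and discount values are arbitrary illustrations.

```python
def q_update(Q, x, a, r, y, actions, lr=0.1, gamma=0.9):
    """Tabular stand-in for one QBP step (Equations 3-4).
    Q is a dict keyed by (state, action); in the model proper, the error below would be
    backpropagated for the output node of the chosen action instead."""
    e_y = max(Q.get((y, b), 0.0) for b in actions)   # e(y) = max_b Q(y, b)
    err = r + gamma * e_y - Q.get((x, a), 0.0)       # temporal-difference error
    Q[(x, a)] = Q.get((x, a), 0.0) + lr * err        # Delta Q(x, a) = lr * err
    return Q

# Example: one update after receiving reward 1.0 for action "a1" in state "s0".
Q = {}
Q = q_update(Q, "s0", "a1", 1.0, "s1", actions=["a1", "a2"])
print(Q)
```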

The Top Level

In the top level (see Figure 2), explicit knowledge is captured in a propositional rule form. To facilitate correspondence with the bottom level (and to encourage uniformity and integration; Clark & Karmiloff-Smith, 1993), a localist connectionist network is used for implementing these rules (e.g., Sun, 1992), in accordance with our previous considerations. Basically, we translate the structure of a set of rules into that of a network.

4 This method is also known as Luce's choice axiom (Watkins, 1989). It is cognitively well justified and found to match psychological data in a variety of domains.

5 The calculation of Q-values for the current input with respect to all the possible actions is done in a connectionist fashion through parallel spreading activation and is thus highly efficient. Such spreading of activation is assumed to be implicit as, for example, in Hunt and Lansman (1986), Cleeremans and McClelland (1991), and G. Bower (1996).

Figure 2. The implementation of CLARION (the action-oriented modules). The top level contains localist encoding of propositional rules. The bottom level contains backpropagation networks. The information flows are indicated with arrows.


Assume that an input state x is made up of a number of dimensions (e.g., $x_1, x_2, \ldots, x_n$). Each dimension can have a number of possible values (e.g., $v_1, v_2, \ldots, v_m$).6 Rules are in the following form: current-state-specification → action, where the left-hand side is a conjunction of individual elements, each of which refers to a dimension $x_i$ of the input state x, specifying a value or a value range (i.e., $x_i = v_i$ or $x_i \in [v_{i1}, v_{i2}]$), and the right-hand side is an action recommendation a.7 Each element in the left-hand side is represented by an individual node. For each rule, a set of links are established, each of which connects a node representing an element in the left-hand side of a rule to the node representing the conclusion in the right-hand side of the rule. Further implementational details of rule representations are irrelevant to the discussion that follows and thus omitted (Sun & Peterson, 1998a, contains the full details). Among other algorithms that we developed, we devised the rule-extraction-refinement algorithm (RER) for learning rules using information in the bottom level (to capture the bottom-up learning process):

1. Update the rule statistics (to be explained later).

2. Check the current criterion for rule extraction, generalization, and specialization:

2.1. If the result is successful according to the current criterion and there is no rule matching the state and the action taken, then perform extraction of a new rule: state → action. Add the extracted rule to the rule network.

2.2. If the result is unsuccessful according to the current criterion, revise all the matching rules through specialization:

2.2.1. Remove the matching rules from the rule network.

2.2.2. Add the revised (specialized) versions of the rules to the rule network.

2.3. If the result is successful according to the current criterion, then generalize the matching rules through generalization:

2.3.1. Remove the matching rules from the rule network.

2.3.2. Add the generalized rules to the rule network.8

Let us discuss the details of the operations used in the above algorithm and the criteria measuring whether a result is successful (which are used in deciding whether to apply some of these operators). The criteria were mostly determined by an information gain (IG) measure that compares the quality of two candidate rules.
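Before turning to those criteria, here is a rough Python skeleton of one RER cycle (Steps 1 and 2 above). The Rule class, the simplified statistics update, and the extract/generalize/specialize callables are our own illustrative stand-ins for the operations defined in the text, not an actual CLARION implementation.

```python
from dataclasses import dataclass, field

@dataclass
class Rule:
    condition: dict     # per-dimension sets of allowed values, e.g. {"dim1": {1, 2}}
    action: str
    children: list = field(default_factory=list)   # more specific rules covered by this one
    pm: float = 0.0     # positive-match statistic PM (see the IG discussion below)
    nm: float = 0.0     # negative-match statistic NM

    def matches(self, state):
        """state is a dict mapping each dimension to its current value."""
        return all(state[d] in allowed for d, allowed in self.condition.items())

def rer_step(rules, state, action, successful, extract, generalize, specialize):
    """One pass through RER Steps 1-2.  extract(state, action) returns a new Rule;
    generalize(rule) returns a generalized Rule; specialize(rule) returns a list of
    specialized Rules.  Statistics updating is reduced here to counting matches."""
    matching = [r for r in rules if r.action == action and r.matches(state)]
    for r in matching:                              # Step 1 (simplified): update statistics
        if successful:
            r.pm += 1
        else:
            r.nm += 1
    if successful and not matching:                 # Step 2.1: extract state -> action
        rules.append(extract(state, action))
    elif not successful:                            # Step 2.2: specialize matching rules
        for r in matching:
            rules.remove(r)
            rules.extend(specialize(r))
    elif matching:                                  # Step 2.3: generalize matching rules
        for r in matching:
            rules.remove(r)
            rules.append(generalize(r))
    return rules
```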

To calculate the IG measure, we do the following. At each step, we examine the following information: (x, y, r, a), where x is the state before action a is performed, y is the new state after an action a is performed, and r is the reinforcement received after action a. On the basis of this information, we update (in Step 1 of the above algorithm) the positive and negative match counts for each rule condition and each of its minor variations (i.e., the rule condition ± 1 possible value in one of the input dimensions), both of which are denoted as C, with regard to the action a performed: That is, $PM_a(C)$ (i.e., positive match) equals the number of times that an input matches the condition C, action a is performed, and the result is positive; $NM_a(C)$ (i.e., negative match) equals the number of times that an input matches the condition C, action a is performed, and the result is negative. Positivity–negativity may be determined on the basis of domain-specific information such as feedback from environments or information from the bottom level (see, e.g., Sun & Peterson, 1998a, 1998b). Each statistic is updated with the following formula: stat := stat + 1 (where stat stands for PM or NM); at the end of each episode, it is discounted by stat := stat × 0.90.9 On the basis of these statistics, we calculate the IG measure; that is,

$$IG(A, B) = \log_2 \frac{PM_a(A) + 1}{PM_a(A) + NM_a(A) + 2} - \log_2 \frac{PM_a(B) + 1}{PM_a(B) + NM_a(B) + 2} \qquad (5)$$

where A and B are two different conditions that lead to the same action a. The measure compares essentially the percentage of positive matches under different conditions A and B (with the Laplace estimator; Lavrac & Dzeroski, 1994). If A can improve the percentage to a certain degree over B, then A is considered better than B. In the algorithm, if a rule is better compared with the corresponding match-all rule (i.e., the rule with the same action but with the condition that matches all possible input states), then the rule is considered successful.10
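A direct transcription of Equation 5 is shown below; the example counts are invented purely to illustrate a comparison against the match-all rule.

```python
import math

def ig(pm_a, nm_a, pm_b, nm_b):
    """Information gain of condition A over condition B for the same action (Equation 5),
    using the Laplace estimator log2((PM + 1) / (PM + NM + 2)) for each condition."""
    return (math.log2((pm_a + 1) / (pm_a + nm_a + 2))
            - math.log2((pm_b + 1) / (pm_b + nm_b + 2)))

# Example: condition A matched positively 8 times and negatively 2 times, while the
# match-all rule for the same action was positive 10 times and negative 10 times.
print(ig(8, 2, 10, 10))   # > 0, so A is doing better than the match-all rule
```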

We decide on whether to construct a rule on the basis of a simple criterion that is determined by the current step (x, y, r, a):

Extraction: If the current step is positive and if there is no rule that covers this step in the top level, set up a rule C → a, where C specifies the values of all the input dimensions exactly as in x.11

However, the criterion for applying the generalization and specialization operators is based on the aforementioned IG measure. Generalization amounts to adding an additional value to one input dimension in the condition of a rule, so that the rule will have more opportunities of matching input, and specialization amounts to removing one value from one input dimension in the condition of a rule, so that it will have fewer opportunities of matching input. Here are the detailed descriptions of these two operators:

Generalization: If $IG(C, all) > threshold1$ and $\max_{C'} IG(C', C) \geq 0$, where C is the current condition of a matching rule, all refers to the corresponding match-all rule (with regard to the same action specified by the rule), and C' is a modified condition such that C' = C plus one value (i.e., C' has one more value in one of the input dimensions; i.e., if the current rule is successful and a generalized condition is potentially better), then set $C' = \arg\max_{C'} IG(C', C)$ as the new (generalized) condition of the rule. Reset all the rule statistics. Any rule covered by the generalized rule will be placed on its children list.12

Specialization: If $IG(C, all) < threshold2$ and $\max_{C'} IG(C', C) > 0$, where C is the current condition of a matching rule, all refers to the corresponding match-all rule (with regard to the same action specified by the rule), and C' is a modified condition such that C' = C minus one value (i.e., C' has one less value in one of the input dimensions; i.e., if the current rule is unsuccessful, but a specialized condition is better), then set $C' = \arg\max_{C'} IG(C', C)$ as the new (specialized) condition of the rule.13 Reset all the rule statistics. Restore those rules on the children list of the original rule that are not covered by the specialized rule and the other existing rules. If specializing the condition makes it impossible for a rule to match any input state, delete the rule.

6 Each dimension is either ordinal (discrete or continuous) or nominal. In the following discussion, we focus on ordinal values; nominal values can be handled similarly. A binary dimension is a special case of a discrete ordinal dimension.

7 Alternatively, rules can be in the forms of current-state-specification → action new-state or current-state-specification action → new-state.

8 We also merge rules whenever possible: If rule extraction, generalization, or specialization is performed at the current step, check to see if the conditions of any two rules are close enough and thus if the two rules may be combined: If one rule is covered completely by another, put it on the children list of the other. If one rule is covered by another except for one dimension, produce a new rule that covers both.

9 The results are time-weighted statistics, which are useful in nonstationary situations.

10 This is a commonly used method, especially in inductive logic programming, and well justified on the empirical ground. See, for example, Lavrac and Dzeroski (1994).

11 Potentially, we might use attention to focus on fewer input dimensions, although such attention is not part of the current model.

In addition, the density parameter determines the minimum frequency of repetition necessary to keep a rule. For example, if this parameter is set at 1/n, then at least one encounter of an input that matches a rule is necessary every n trials to maintain the rule. Otherwise, the rule will not be maintained (and will thus be deleted).

A variation of the above algorithm is independent rule learning (IRL) without using the bottom level for the initial extraction. In IRL, either completely randomly or in a particular domain-specific order (see examples later in the Simulation of Human Skill Learning Data section), rules of various forms are independently generated (hypothesized and wired up) at the top level. Then, these rules are tested through experience using the IG measure. When appropriate, generalization and specialization can also be performed on the basis of IG.
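For contrast with RER, the following toy sketch mimics IRL-style hypothesis testing: candidate rules are generated independently of the bottom level and then kept or discarded according to an IG-based test. The rule encoding, the random generator, and the threshold are illustrative assumptions only.

```python
import random

def generate_candidate_rules(dimensions, values, actions, n=10):
    """Randomly hypothesize fully specific condition -> action rules (IRL generation step).
    In a simulation the candidates could instead follow a domain-specific order."""
    rules = []
    for _ in range(n):
        condition = {d: random.choice(values[d]) for d in dimensions}
        rules.append((condition, random.choice(actions)))
    return rules

def prune_by_ig(rules, ig_of_rule, threshold=0.0):
    """Keep only candidates whose IG against the match-all rule exceeds a threshold.
    ig_of_rule is assumed to return the IG value accumulated from experience."""
    return [rule for rule in rules if ig_of_rule(rule) > threshold]

candidates = generate_candidate_rules(["dim1", "dim2"],
                                      {"dim1": [1, 2], "dim2": [1, 2, 3]},
                                      ["left", "right"], n=5)
# A random stand-in for the experience-based IG values, just to make the example run.
print(prune_by_ig(candidates, ig_of_rule=lambda rule: random.uniform(-1, 1)))
```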

Given the explicit knowledge in the form of rules, a variety of explicit operations can be performed at the top level. These operations include backward chaining reasoning, forward chaining reasoning, and counterfactual reasoning (these operations will not be needed for the experiments reported in this article; see Sun et al., 2001, for details).

Combining the Two Levels

In the overall algorithm, Step 4 is for making the final decision on which action to take at each step by incorporating outcomes from both levels. Specifically, we combine the corresponding values for an action from the two levels by a weighted sum; that is, if the top level indicates that action a has an activation value $v_a$ (which should be 0 or 1 as rules are binary) and the bottom level indicates that a has an activation value $q_a$ (the Q-value), then the final outcome is $w_1 v_a + w_2 q_a$. Stochastic decision making with Boltzmann distribution (based on the weighted sums) is then performed to select an action out of all the possible actions. ($w_1$ and $w_2$ may be preset or automatically determined through probability matching.)14 Stochastic decision making allows different operational modes: for example, relying only on the bottom level, relying only on the top level, or combining the outcomes from both levels and weighing them differently. Figure 2 shows the implementation of the two levels of the model.
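A minimal sketch of this cross-level combination is given below, assuming binary rule activations from the top level and Q-values from the bottom level; the weights and the temperature are arbitrary example settings.

```python
import math
import random

def combined_choice(rule_activations, q_values, w1=0.5, w2=0.5, alpha=0.1):
    """Weighted combination of the two levels followed by Boltzmann selection.
    rule_activations: dict action -> 0 or 1 (top level);
    q_values: dict action -> Q-value (bottom level);
    w1, w2: cross-level weights (preset here; the text notes they may also be set
    through probability matching)."""
    combined = {a: w1 * rule_activations.get(a, 0.0) + w2 * q_values[a] for a in q_values}
    m = max(combined.values())
    weights = [math.exp((combined[a] - m) / alpha) for a in combined]
    return random.choices(list(combined), weights=weights, k=1)[0]

# Example: the top level recommends "left"; the bottom level slightly prefers "right".
print(combined_choice({"left": 1.0}, {"left": 0.4, "right": 0.6}))
```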

An Example

Let us examine the simulation of a hypothetical SRT task. In an SRT task, a repeating sequence of X marks, each in one of the four possible positions, is presented to subjects (Curran & Keele, 1993). The subjects are instructed to press the key corresponding to the position of each X mark. Subjects learn to predict new positions on the basis of preceding positions through learning the sequential relations embedded in the sequence and thereby speed up their responses.

In the model, learning experience at the bottom level promotes the formation of implicit knowledge. Learning is done through iterative weight updating (the procedure of backpropagation was specified before). The resulting weights specify a function relating preceding positions (input) and the current position (output). For example, the iteratively updated weights may end up specifying the sequence 1 2 3 2 4 2. The sequence is specified in an implicit way (i.e., embedded in two layers of weights used for the sigmoidal computation within a backpropagation network).

The acquired sequential knowledge at the bottom level can lead to the extraction of explicit knowledge at the top level. The initial extraction step creates a rule that corresponds to the current input and the output (as determined by the bottom level). The generalization step adds more possible values (that may match new input) to the condition of the rule so that it may have more chances of matching new input. The specialization step, conversely, adds constraints to the rule (by removing possible matching values in the condition of the rule) to make it more specific and less likely to match new input. The applicability of these steps is determined on the basis of the IG measure described before. For example, the following rule may be initially extracted: 1 2 3 2 → 4. Generalization steps may lead to a simplified rule: * 2 3 2 → 4 (provided that the IG measure calculated allows generalization, where * stands for "don't care"). Further generalization may lead to * 2 * 2 → 4, and so on. Continued revision (generalization and specialization) is likely to happen, as determined by the IG measure (which is in turn determined by the performance of the rule).
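To illustrate the hypothetical rules above, the sketch below encodes the rule 1 2 3 2 → 4 and its two generalizations, with * as the don't-care symbol; the tuple encoding and the helper function are our own illustrative choices.

```python
# Each rule maps a condition over the four preceding positions to the predicted next
# position; "*" is the don't-care value introduced by generalization.
def rule_matches(condition, preceding):
    """condition and preceding are equal-length tuples of positions (1-4) or '*'."""
    return all(c == "*" or c == p for c, p in zip(condition, preceding))

rules = {
    (1, 2, 3, 2): 4,        # initially extracted rule: 1 2 3 2 -> 4
    ("*", 2, 3, 2): 4,      # after one generalization step: * 2 3 2 -> 4
    ("*", 2, "*", 2): 4,    # after a further generalization: * 2 * 2 -> 4
}
preceding = (3, 2, 3, 2)    # the last four positions actually observed
predictions = [act for cond, act in rules.items() if rule_matches(cond, preceding)]
print(predictions)          # only the two generalized rules fire here
```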

Analysis of Interaction

Below, we discuss a number of issues and phenomena observed or hypothesized regarding human skill learning relevant to the implicit–explicit interaction. The validity of our approach, as embodied in CLARION, lies in accounting for, or predicting, many such issues and phenomena.

Dissociation

Can implicit and explicit learning really be separated as two different processes empirically? Human skill learning data indicate that the answer to this question is yes (Cleeremans et al., 1998; Stadler & Frensch, 1998). Berry and Broadbent (1988) demonstrated this point using two DC tasks that differed in the degree to which the pattern of correct responding was salient to subjects.15 Subjects in the two different conditions learned the tasks in different ways: Subjects in the nonsalient condition learned the task implicitly, whereas subjects in the salient condition learned the task more explicitly (as demonstrated by the difference in questionnaire scores). Lee (1995) showed a similar difference, but in relation to task complexity instead of saliency. A. Cohen, Ivry, and Keele (1990) described a similar situation in SRT tasks. When complex hierarchical relations were needed to predict next positions, subjects tended to use implicit learning, whereas explicit learning was more evident when simpler relations were involved. In artificial grammar learning tasks, there were also dissociations between learning simple (pairwise) relations and learning complex hierarchical relations (Mathews et al., 1989; Reber, 1989). Lee (1995) further demonstrated a reverse dissociation between implicit and explicit learning across two conditions: one with explicit instructions and the other without. With instructions, subjects did better on explicit tests than on implicit tests; without instructions, subjects did better on implicit tests than on explicit tests. There were also other corroborative findings to such dissociation results. For example, in learning physics, subjects could show different knowledge in different contexts: They showed incorrect knowledge while making explicit judgments; they nevertheless showed correct knowledge while actually performing a task (Seger, 1994).

Another line of evidence resulted from contrastive manipulations of implicit and explicit processes. First, through explicit search instructions, a more (or completely) explicit mode of learning may be used by subjects. The effect of such a mode change varies, depending on task difficulty (i.e., salience of stimulus materials). In the case of salient relations and/or regularities in stimuli, explicit search may be more successful and thus may improve performance (A. Cohen et al., 1990; Reber et al., 1980). In the case of nonsalient relations, explicit search may fail and thus lead to worsened performance (Berry & Broadbent, 1988; Owen & Sweller, 1985; see more discussion later). This contrast accentuates the differences between the two types of processes: Some tasks are more amenable to explicit processes than others. Second, through verbalization, a more explicit learning mode may likewise be used, as verbalization focuses subjects' attention explicitly on the relations and regularities in stimulus materials and thereby brings them out explicitly. Verbalization usually leads to comparable or improved performance (Gagne & Smith, 1962; Reber & Allen, 1978; Stanley et al., 1989; Sun et al., 1998) but may sometimes lead to worsened performance (Schooler et al., 1993; Sun et al., 2001). Third, through the use of dual-task conditions, a more implicit mode of learning may be attained. This is because, as has been argued in the literature, such manipulation affects explicit processes more than implicit processes (Cleeremans, 1993; Dienes & Berry, 1997; Hayes & Broadbent, 1988; Sun et al., 2001; Szymanski & MacLeod, 1996). Dual-task conditions often lead to worsened performance, which may mainly reflect the reduction of contributions from explicit processes (for alternative interpretations, see Frensch, Wenke, and Ruenger, 1999; Nissen & Bullemer, 1987; Stadler, 1995; more details later). Moreover, under dual-task conditions, the performance difference between subjects with explicit knowledge and those without may disappear (Curran & Keele, 1993). Contrasting these manipulations, we see the role played by explicit processes: Enhancing explicit processes often helps performance, and weakening explicit processes hinders performance. Similar changes may be attained by using other methods (e.g., a wider range of task settings in training instead of a narrow focus on a single task setting), some of which may also serve to illustrate the separation of the two types of processes. There are also methods that involve process dissociation procedures (Destrebecqz & Cleeremans, 2001).

12 The children list of a rule is created to keep aside and make inactive those rules that are more specific (thus fully covered) by the current rule. It is useful because if later on the rule is deleted or specialized, some or all of those rules on its children list may be reactivated if they are no longer covered.

13 Clearly, we should have threshold2 < threshold1 to avoid oscillation.

14 As shown by Willingham et al. (1989), explicit knowledge can influence skilled performance in humans.

Posner, DiGirolamo, and Fernandez-Duque (1997) reviewed evidence from brain imaging research that indicated the possibility of different brain regions being associated with implicit and explicit learning and memory (see also Aizenstein et al., 2000; Poldrack et al., 2001). Such evidence relates biological findings to psychological constructs and lends additional support to the hypothesized separation of implicit and explicit processes. Keele, Ivry, Mayr, Hazeltine, and Heuer (2003) reviewed evidence from brain imaging studies that further delineated the nature of two separate learning processes. Mishkin et al. (1984) studied the contrast between (explicit) one-shot learning and (implicit) repeated habit formation training using brain lesioned primates, which showed physiological separation of these two types of processes. The studies involving brain damaged amnesic patients (such as by Nissen & Bullemer, 1987) indicated also that such patients were able to learn as well as normal subjects in task settings where implicit learning was dominant but not in settings where explicit learning was required, which also lent support for the physiological separation of the two types of processes. (However, the results of Boyd & Weistein, 2001, and Muente, Panning, Piepenbrock, & Muente, 2001, seemed different.)

The above interpretation of human data accords well with the CLARION model. With the model, one can select one type of learning or the other by engaging or disengaging the top level (and its learning mechanisms) or the bottom level (and its learning mechanisms). The question now is how a subject "decides" this on the fly and how the model accomplishes this "decision."

Division of Labor

A general pattern discernible from human data (especially those from the implicit learning literature) is that if to-be-learned relations are simple and the number of input dimensions is small (in other words, if the relations are salient to subjects), explicit learning usually prevails; if more complex relations and a larger number of input dimensions are involved (Halford, Wilson, & Philips, 1998), implicit learning becomes more prominent (Kemler-Nelson, 1984; Mathews et al., 1989; Reber, 1989; Seger, 1994; Sun et al., 2001). This pattern has been demonstrated in artificial grammar learning tasks, SRT tasks, and DC tasks, as reviewed earlier. Seger (1994) further pointed out that implicit learning was biased toward structures with a system of statistical relations (as indicated by the results of Kersten & Billman, 1992, and Stadler, 1992). The upshot of this is that the implicit learning mechanism appears more structurally sophisticated and able to handle more complex situations (Kemler-Nelson, 1984; Lewicki, Hill, & Czyzewska, 1992).

15 In the salient version, the computer responded in accordance with the subjects' immediately preceding response. In the nonsalient version, the computer responded in accordance with the response prior to the immediately preceding response.

This observation can be predicted by CLARION. Although explicitness of knowledge at the top level allows a variety of explicit processing (as described in the A Model section), it does not lend itself easily to the learning of complex structures because of crisp representation and selective hypothesis-testing learning process (see Hayes & Broadbent, 1988, regarding selectivity). However, in the bottom level, the distributed representation in the backpropagation network (that incorporates gradedness and temporal information through Q-learning) handles complex relations (including complex sequential relations) and high dimensionality better. A specific instance of this complexity difference effect is as follows. In correspondence with the psychological findings that implicit learning of sequences is biased toward sequences with a high degree of statistical structure (Stadler, 1992), as has been shown by Elman (1990) and Cleeremans and McClelland (1991), backpropagation networks (either feedforward or recurrent) perform well in capturing complex sequences (e.g., in SRT tasks and DC tasks). However, also in correspondence with available human data, the rule learning mechanism at the top level of CLARION has trouble handling complex stochastic sequences, because of the limitations of hypothesis testing and crisp representation (as we demonstrate below in the Simulation of Human Skill Learning Data section). Therefore, in such circumstances, although both levels are present, the bottom level prevails. This correspondence supports the two-level framework of CLARION. (Other instances of this difference abound, such as in terms of number of sequences, number of input dimensions, or distance of dependencies.)

This explanation implies that it is not necessary to deliberately and a priori decide when to use implicit or explicit learning. When complex relations and high dimensionalities are involved and the top level fails to learn (or is slow to learn), then we can expect a reliance on implicit learning at the bottom level. When the stimulus materials involved are simple, the top level may handle them better and therefore be more readily utilized. This accords well with the fact that in most situations, both types of learning are involved, with varying amounts of contributions from each (Seger, 1994; Sun et al., 1998, 2001).

There have also been other views concerning division of labor, for example, in terms of procedural versus declarative processes (Anderson, 1983, 1993; Anderson & Lebiere, 1998), in terms of nonselective versus selective processing (Hayes & Broadbent, 1988), in terms of algorithms versus instances (Stanley et al., 1989), or in terms of unidimensional versus multidimensional systems (Keele et al., 2003). However, these alternative views are not as generically applicable as the implicit–explicit distinction, although each of them may be supported in specific contexts (see the General Discussion).

Bottom-Up Learning

As reviewed earlier, subjects' ability to verbalize is often independent of their performance on implicit learning (Berry & Broadbent, 1988). Furthermore, performance typically improves earlier than the explicit knowledge that can be verbalized by subjects (Stanley et al., 1989). For instance, in DC tasks, although the performance of subjects quickly rose to a high level, their verbal knowledge improved far more slowly: Subjects could not provide usable verbal knowledge until near the end of their training (e.g., as shown by Stanley et al., 1989). This phenomenon has also been demonstrated by Reber and Lewis (1977) in artificial grammar learning. A more recent study of bottom-up learning was performed by Sun et al. (2001), who used a more complex minefield navigation task. In all of these tasks, it appears easier to acquire implicit skills than explicit knowledge (hence the delay in the development of explicit knowledge). In addition, the delay indicates that explicit learning may be triggered by implicit learning, and the process may be described as delayed explication of implicit knowledge. Explicit knowledge is in a way "extracted" from implicit skills. Explicit learning can thus piggyback on implicit learning. (However, in some tasks, notably in artificial grammar learning, explicit and implicit knowledge appear to be more closely associated; see, e.g., Johnstone & Shanks, 2001.)

In the context of discovery tasks, Bowers et al. (1990) also showed evidence of explication of implicit knowledge. When subjects were given patterns to complete, they showed implicit recognition of what a proper completion might be, even though they did not have explicit recognition of a correct completion. The implicit recognition improved over time and eventually an explicit recognition was achieved. Siegler and Stern (1998) also showed in an arithmetic problem that children's strategy shifts often occurred several trials earlier than their explicit recognition of their strategy changes. Stanley et al. (1989) and Seger (1994) suggested that because explicit knowledge lagged behind but improved along with implicit knowledge, explicit knowledge could be viewed as obtained from implicit knowledge. Cleeremans and McClelland (1991) also pointed out this possibility in analyzing their data.

Several developmental theorists have considered a similar delayed explication process in child development. Karmiloff-Smith (1986) suggested that developmental changes involved "representational redescription": In children, first low-level implicit representations of stimuli were formed; then, when more knowledge was accumulated and stable behavior patterns developed, through a redescription process, more abstract representations were formed that transformed low-level representations and made them more explicit. This redescription process repeated itself a number of times, and a verbal form of representation emerged. Karmiloff-Smith (1986) proposed four representational forms: implicit, primary explicitation, secondary explicitation, and tertiary explicitation. A three-phase process was hypothesized, in which the first phase used only implicit representations and focused on input–output mappings, the second phase focused instead on gaining control over internal representations and resulted in gains in accessibility of representations, and the third phase was characterized by a balance between external information and internal knowledge (thus, a U curve might result). Mandler (1992) proposed a different kind of redescription: From perceptual stimuli, relatively abstract "image-schemas" were extracted that coded several basic types of movements. Then, on top of such image-schemas, concepts were formed using information therein. On the basis of data on perceptual analysis and categorization in infants, she suggested that an infant gradually formed "theories" of how his or her sensorimotor procedures worked and thereby gradually made such processes explicit and accessible. Although the mechanism of explicit representations had always been there, it was only with increasingly detailed perceptual analysis that such representations became detailed enough to allow conscious access. In a similar vein, Keil (1989) viewed conceptual representations as composed of an associative component (with frequency and correlational information; Hasher & Zacks, 1979) and a theory component (with explicit knowledge; Murphy & Medin, 1985). Developmentally, there was a shift from associative to theory-based representations. In the data concerning learning concepts of both natural and nominal kinds under a variety of conditions, simple similarity-based (or prototype) representations seemed to dominate at first, but gradually explicit and focused theories developed and became prominent. Keil (1989) pointed out that it was unlikely that theories developed independently, but rather they developed somehow from associative information already available. These theories and data testify to the ubiquity of the implicit-to-explicit transition.

CLARION captures this kind of bottom-up process. In the model, the bottom level develops implicit skills, on its own, using QBP, whereas the top level extracts explicit rules using RER (see the Some Details of the Model section). Thus, delayed bottom-up learning naturally falls out of the model.

As reviewed earlier, there is also evidence that explicit knowledge may develop independently (with little or no correlation with implicit skills), under some circumstances. Willingham et al. (1989) reported data that were consistent with the parallel development of implicit skills and explicit knowledge. Correspondingly, in CLARION, explicit hypothesis testing can be used for learning rules in the top level, independent of the bottom level (as explained in the A Model section; see also the Simulation of Human Skill Learning Data section), when to-be-learned materials are not too complex.16

Differences in Representation of Resulting Knowledge

Although there have been suggestions that implicit and explicit learning and performance may be based on the same knowledge base, most data support the position that the two have separate, though contentwise overlapping, knowledge bases (see Aizenstein et al., 2000; Faulkner & Foster, 2002; Poldrack et al., 2001; Sun, 1994, 1997). Such a separation was hypothesized in our model (see the A Model section).

Reber(1989)argued that implicit learning functioned by the induction of an underlying tacit representation that mirrored the structure intrinsic to the interaction of subjects with the world. Seger(1994)argued that in implicit learning,subjects developed abstract but instantiated representations.Being instantiated meant that representations were often tied to sensory–motor modalities of stimuli;being abstract meant that knowledge represented could be transferred or generalized to novel stimuli.However,the represen-tation of explicit knowledge is almost universally accepted to be symbolic(Dulaney et al.,1984;Smolensky,1988;Stanley et al., 1989;Sun,1994,1999;Sun et al.,2001).

CLARION predicts these differing characteristics of the two types of representations.At the bottom level,an instantiated rep-resentation is used for the input and the output layer of a back-propagation network(G.Bower,1996).This type of network is also abstract in Seger’s(1994)sense because distributed represen-tations in the network provide a generalization ability(Rumelhart et al.,1986;Sun et al.,1998,2001).The network is also tacit because of the lack of direct interpretability of distributed repre-sentations involved.However,at the top level of CLARION,rules that are more abstract,less tacit,and less tied to specific sensory modalities are learned(because of generalization in rule learning; see the Some Details of the Model section)and represented sym-bolically,which corresponds to subjects’explicit knowledge.

Differences in Accessibility of Resulting Knowledge

There have been disagreements concerning what experimentally constitutes accessibility. It is also difficult to distinguish between explicit knowledge that is consciously used when a task is being performed and explicit knowledge that is retroactively attributed to task performance (e.g., when verbal reports are given; Nisbett & Wilson, 1977). Despite such difficulties, it is generally agreed that at least some part of skilled performance is not conscious under normal circumstances (by whatever experimental measures and whatever operational definitions). Reber (1989) pointed out that "although it is misleading to argue that implicitly acquired knowledge is completely unconscious, it is not misleading to argue that implicitly acquired epistemic contents of mind are always richer and more sophisticated than what can be explicated" (p. 229).

Consistent with the idea of bottom-up learning,Reber(1989) further pointed out,“Knowledge acquired from implicit learning procedures is knowledge that,in some raw fashion,is always ahead of the capability of its possessor to explicate it”(p.229).For example,Mathews et al.(1989)asked their subjects in a dynamic control task to periodically explicate the rules that they used, and the information was then given to yoked subjects who were then tested on the same task(see also Roussel,1999;Stanley et al.,1989).Over time,the original subjects improved their performance as well as their explicit knowledge(as evidenced by the improvement of performance of the yoked subjects). However,yoked subjects never caught up with original sub-jects,thus suggesting both the hypothesis of delayed explication (as discussed before)and that of relative inexplicability of implicit skills.

Such results are explained mechanistically by CLARION,in which delayed explication is the result of bottom-up learning and relative inexplicability of implicit skills is the result of distributed representations in backpropagation networks,which are always richer and more complex than crisp explicit representations used in the top level.

Differences in Flexibility, Generalizability, and Robustness

It has been argued that implicit learning often results in knowledge that is less flexible in some respects. As mentioned above, implicit learning often results in knowledge that is more tied to specifics of learning environments (Dienes & Berry, 1997; Seger, 1994), less reflective (e.g., lacking metaknowledge; Chan, 1992; Dienes & Perner, 1999), and often less adaptable to changing situations (Hayes & Broadbent, 1988). (There are different views on this issue; see, e.g., Willingham et al., 1989, and Willingham, 1998.) On the basis of psycholinguistic data, Karmiloff-Smith (1986) observed that with the growth of explicit representations, more flexibility was shown by subject children.

16 Such hypothesis testing, as mentioned before, is similar to models proposed by Bruner et al. (1956), Haygood and Bourne (1965), and more recently Nosofsky et al. (1994). In artificial intelligence research, many symbolic models for learning rules using hypothesis testing were proposed, such as in Michalski (1983), which may also be adapted for the top level of CLARION.

Consistent with the above view,CLARION predicts that explicit knowledge often entails a higher degree of flexibility in certain respects:With symbolic–localist representations at the top level of CLARION,a variety of explicit manipulations can be performed that are not available to the bottom level.For example,backward and forward chaining reasoning,counterfactual reasoning,explicit hypothesis testing,and so on can be used individually or in combination.These capacities lead to heightened flexibility of the top level in these respects.Thus,CLARION explains the claimed difference in flexibility between the two types of processes in those respects.

As observed in many experiments,following implicit learning, subjects were often able to handle novel stimuli or,in other words, to generalize.In artificial grammar learning,Reber(1967,1976) found good transfer to strings involving different letters but based on the same grammar.Berry and Broadbent(1988)showed that subjects trained on one dynamic control task could transfer to another that had a similar cover story and an identical underlying relation.17As shown by Vokey and Brooks(1992)and others,both similarity between a novel stimulus and learned stimuli and more abstract commonalities between them(such as grammars and common category assignments)were used in generalization. (There were of course also demonstrations of failure of generali-zation after implicit learning.)

The bottom level of CLARION,which contains backpropaga-tion networks,has the ability to capture generalization exhibited in skill learning.Generalization has been amply demonstrated in backpropagation networks in various contexts:Elman(1990)re-ported good generalization by recurrent backpropagation networks in grammar learning;Pollack(1991)found generalization of such networks to arbitrarily long sequences;Cleeremans and McClel-land(1991)and Dienes(1992)modeled data of artificial grammar learning with such networks.As in human learning,generalization in networks is based in part on similarity of old and new sequences but also in part on certain structures exhibited by the sequences. Explicit processes in the top level of CLARION can also gener-alize,albeit following a different style via explicit hypothesis testing.This alternative style of generalization has been investi-gated to some extent by hypothesis-testing psychology,for exam-ple,in studies by Bruner et al.(1956),Haygood and Bourne (1965),and Dominowski(1972).

It has also been argued that implicit processes are often more robust than explicit processes(e.g.,Reber,1989)in the face of internal disorder and malfunctioning.For example,Hasher and Zacks(1979)found that encoding of frequency information(an implicit process)was correctly performed by clinically depressed patients,even though they could not perform explicit tasks.War-rington and Weiskrantz(1982)found that amnesic patients were more successful in performing implicit rather than explicit mem-ory tasks.Implicit processes are also often more robust in the face of dual-task distractions,as shown by Curran and Keele(1993), although they can be affected as well.(In general,this robustness is limited;there have been demonstrations of impaired implicit processes in the literature.)

This view of differing degrees of robustness can be predicted within the dual-representation framework of CLARION:Whereas the top level uses crisp symbolic–localist representations and is thus more vulnerable to malfunctioning,the bottom level uses distributed representations that are more resilient,as demonstrated in connectionist modeling(e.g.,Plaut&Shallice,1994;Rumelhart et al.,1986).However,this difference may not be absolute,and the bottom level may be impaired under many circumstances as well (Plaut&Shallice,1994).

Knowledge Interaction

Interactions of various sorts between implicit and explicit pro-cesses exist.With regard to the use of explicit knowledge to affect implicit processes(top-down information),the existing literature suggests that explicit knowledge may help subjects to learn when it directs subjects to focus on relevant features or when it heightens subjects’sensitivity to relevant information(see,e.g.,Reber et al., 1980,and Howard&Ballas,1980).Explicit knowledge may also help subjects to deal with high-order relations(Premack,1988). However,as Reber(1976,1989)pointed out,explicit knowledge may also hamper implicit learning,especially(a)when explicit prior instructions induce an explicit learning mode in a task that is not suitable for explicit learning(e.g.,Schooler et al.,1993)or(b) when explicit knowledge conflicts with the implicit learning that subjects are undergoing and the implicit representations used by the subjects(Dulaney et al.,1984;Roussel,1999).Owen and Sweller’s(1985)findings that learning might be hampered when certain explicit processes(such as means–ends analysis)were encouraged also supported this idea.(However,Jimenez&Men-dez,2001,argued that implicit learning was unaffected by explicit knowledge.DeShon&Alexander,1996,argued that explicit goal setting could not affect implicit learning.)

We know less about how implicit knowledge is used to affect explicit processes(bottom-up information).We posit that implicit knowledge is used in explicit learning and explicit performance (e.g.,certain verbalization),and this reverse influence,given proper circumstances,can be strong.As indicated in the Division of Labor section,implicit processes often handle more complex relations(Berry&Broadbent,1988;Lewicki et al.,1992)and can thus help explicit processes by providing them with relevant in-formation.Implicit processes are also better at keeping track of statistical information and may use them to aid explicit processes in useful ways(Gluck&Bower,1988;Hasher&Zacks,1979; Lewicki et al.,1987).Bottom-up information becomes pronounced in certain brain damaged patients who are capable of certain tasks only implicitly(e.g.,blind sight or amnesic patients;Schacter, 1990).

However, during explicit tasks (e.g., verbal reasoning), subjects might ignore information from implicit processes when it contradicts their explicit knowledge or explicit mental models (Seger, 1994). For example, Murphy and Medin (1985) demonstrated that concept formation was not merely a feature similarity based (implicit) process. Prior theory played an important part and could overwrite feature similarity based decisions. Rips (1989) demonstrated a similar point. Wisniewski and Medin (1994) further delineated the possible ways in which such interactions might happen. Stanley et al. (1989) interpreted the difficulty their subjects had with verbalizing their knowledge used in task performance as the interference of subjects' explicit, prior domain knowledge (mental models) on the explication of their implicit knowledge.18

17 However, Berry and Broadbent (1988) also showed that if the cover story was completely different, there was no transfer exhibited by subjects.

As explained in the A Model section,the interaction of explicit and implicit processes is embodied in CLARION with the two-level dual-representation framework and the interlevel interaction mechanism,which allows a proper mixture of the top and the bottom level that captures the“superimposition”of the effects of the two levels.Conflicts between the two types of knowledge occur either when the development of explicit knowledge lags behind that of implicit knowledge(see the Bottom-Up Learning section)or when the top level acquires explicit knowledge inde-pendent of the bottom level(when independent hypothesis testing learning methods are used;see The Top Level section).When conflicts occur,the interlevel interaction mechanism of CLARION may ignore the bottom level(in explicit reasoning)or the top level (in implicit skilled performance),in ways consistent with the above interpretation of human data.
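To illustrate one way such a mixture could work in code, the sketch below superimposes the two levels' outputs using the combination weights w1 and w2 (given later in the simulation section) and then selects an outcome stochastically via a Boltzmann distribution. The combination rule and the function names here are an illustrative assumption, not the definitive specification of the interlevel mechanism (which is given in the A Model section); in particular, this sketch does not capture cases in which one level is ignored outright.

```python
import numpy as np

def combine_levels(top_level_values, bottom_level_values, w1=0.2, w2=0.8):
    """Hypothetical interlevel mixing: a weighted superimposition of the top
    level's rule-based recommendations and the bottom level's Q values.
    w1 and w2 are the combination weights of the two levels (see Table 1)."""
    return w1 * np.asarray(top_level_values) + w2 * np.asarray(bottom_level_values)

def choose_action(combined_values, temperature=0.01, rng=np.random.default_rng(0)):
    """Stochastic decision making over the combined values (Boltzmann distribution)."""
    scaled = np.asarray(combined_values) / temperature
    probs = np.exp(scaled - scaled.max())
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

# Example: three candidate actions recommended differently by the two levels.
top = [1.0, 0.0, 0.0]      # a top-level rule fires for action 0
bottom = [0.2, 0.7, 0.1]   # the bottom level's Q values favor action 1
print(choose_action(combine_levels(top, bottom)))
```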

Synergy

Why are there two separate(although interacting)levels? There need to be reasons other than mere redundancy(e.g.,for the sake of fault tolerance).The discussion in the Division of Labor section concludes that there is a division of labor be-tween explicit and implicit processes.We further hypothesize that there may be a synergy between the two types of processes (partially on the basis of dissociation between the two types of processes as discussed in the Dissociation section).Such a synergy may show up,under right circumstances,by speeding up learning,improving learned performance,and facilitating transfer of learned skills.

As mentioned earlier,there is indeed some evidence in support of this hypothesis.In terms of speeding up learning,Willingham et al.(1989)found that those subjects who acquired more explicit knowledge in SRT tasks appeared to learn faster.Stanley et al. (1989)reported that in a DC task,subjects’learning improved if they were asked to generate verbal instructions for other subjects during learning.That is,a subject is able to speed up his or her own learning through an explication process that generates explicit knowledge.Sun et al.(2001)showed a similar effect of verbal-ization in a minefield navigation task,and Reber and Allen(1978) showed a similar effect in artificial grammar learning.Mathews et al.(1989)showed that a better performance could be attained if a proper mix of implicit and explicit learning was used(in their case, through devising an experimental condition in which first implicit learning and later explicit learning was encouraged).19

In addition,in terms of learned performance,Stanley et al. (1989)found that subjects who verbalized while performing SRT tasks were able to attain a higher level of performance than those who did not verbalize,because the requirement that they verbal-ized their knowledge prompted the formation and utilization of explicit knowledge.Squire and Frambach(1990)reported that initially,amnesic and normal subjects performed comparably in a DC task and equally lacked explicit knowledge.However,with more training,normal subjects achieved better performance than amnesic subjects and also better scores on explicit knowledge measures,which pointed to the conjecture that it was because normal subjects were able to learn better explicit knowledge that they achieved better performance.Consistent with this view,Estes (1986)suggested that implicit learning alone could not lead to optimal levels of performance.Even in high-level skill acquisition, similar effects were observed.Gick and Holyoak(1980)found that good problem solvers could better state rules that described their actions in problem solving.A.Bower and King(1967)showed that verbalization improved performance in classification rule learn-ing.Gagne and Smith(1962)showed the same effect of ver-balization in learning the Tower of Hanoi task.This phenom-enon may be related,to some extent,to the self-explanation effect reported in the cognitive skill acquisition literature(Chi, Bassok,Lewis,Reimann,&Glaser,1989):Subjects who ex-plained examples in physics textbooks more completely did better in solving new problems.In all these cases,it could be the explication process and the use of explicit knowledge that helped the performance.

In terms of facilitating transfer of skills,Willingham et al. (1989)showed some suggestive evidence that explicit knowledge facilitates transfer of skilled performance.They reported that(a) subjects who acquired explicit knowledge in a training task tended to have faster response times in a transfer task;(b)these subjects were also more likely to acquire explicit knowledge in the transfer task;and(c)these subjects who acquired explicit knowledge responded more slowly when the transfer task was unrelated to the training task,suggesting that the explicit knowledge of the previ-ous task might have interfered with the performance of the transfer task.Sun et al.(2001)showed similar effects.In high-level do-mains,Ahlum-Heath and DiVesta(1986)found that the subjects who were required to verbalize while solving Tower of Hanoi problems performed better on a transfer task after training than did the subjects who were not required to verbalize.Berry(1983) similarly showed in Watson’s selection task that verbalization during learning improved transfer performance.Nokes and Ohls-son(2001)showed related results as well.

Note that synergy effects are dependent on experimental set-tings.They are not universal.For example,Roussel(1999)showed that under some circumstances,explicit reflection did not help to improve performance and might in fact hurt performance.Even so, it should be recognized that explicit processes play important cognitive functions.Cleeremans and McClelland(1991)and Gib-son,Fichman,and Plaut(1997)both pointed to the need to include explicit processes in modeling typically implicit learning tasks. Explicit processes also serve additional functions,such as facili-tating verbal communication and acting as gatekeepers(e.g.,en-abling conscious veto;Libet,1985).

It has been demonstrated (Sun et al., 1998, 2001) that CLARION can produce similar synergy effects as described above (as well as other effects in different settings), through comparing various training conditions, such as comparing the verbalization condition and the nonverbalization condition (whereby the verbalization condition encourages explicit learning) or comparing the dual-task condition and the single-task condition (whereby the dual-task condition discourages explicit learning; see justifications later). Although the idea of synergy has been mentioned before (e.g., Mathews et al., 1989), CLARION demonstrated in computational terms the very process by which synergy was created (Sun et al., 2001; Sun & Peterson, 1998a), which had a lot to do with differing characteristics of the two levels (see the Differences in Representation of Resulting Knowledge; Differences in Accessibility of Resulting Knowledge; and Differences in Flexibility, Generalizability, and Robustness sections). For the details of the synergy effects, see the next section (and also previous work by Sun et al., 1998, 2001; Sun & Peterson, 1998a, 1998b).

18 As discussed by Nisbett and Wilson (1977), such mental models are composed of cultural rules, causal schemata of a particular culture or an individual, causal hypotheses generated on the fly, and so on.

19 This body of evidence, considered in its entirety, cannot be explained away by factors such as attention and effort (e.g., induced by verbalization).

In summary,we have discussed the following issues:(a)the separation of two types of processes(in the Dissociation and Division of Labor sections)and how CLARION captures it,(b)the conjectured differing characteristics of the two types of processes (in the Differences in Representation of Resulting Knowledge, Differences in Accessibility of Resulting Knowledge;and Differ-ences in Flexibility,Generalizability,and Robustness sections)and how they might correspond with CLARION,(c)the possible ways of interaction between the two types of processes(in the Bottom-Up Learning and Knowledge Interaction sections)and how CLARION accommodates them,and finally,(d)the possible consequences of the interaction(in the Synergy section)and how CLARION also generates them.

Note that in the discussions above,in most cases,our view is but one possible interpretation,and our model corresponds to that particular view.We would caution against oversimplifica-tions of these broad issues and against any mistaken impression that these issues have been settled and our model accounts fully for them.

Simulation of Human Skill Learning Data

To substantiate the afore-discussed points,we conducted simu-lations using CLARION.Two kinds of tasks were chosen to be simulated:SRT tasks and DC tasks.These tasks were chosen because(a)they were representative of the tasks used in implicit learning research and(b)they showed most clearly the interaction between implicit and explicit processes.20We have also simulated a variety of other tasks,reported elsewhere,which are summarized briefly later(see Slusarz&Sun,2001;Sun et al.,2001;Sun& Zhang,2004).

In the simulations of these tasks,the same QBP and RER algorithms were used,at the bottom and the top level,respectively (although some details varied from task to task as appropriate). Although alternative encodings and algorithms(such as recurrent backpropagation in the bottom level)were also possible,we felt that our choices were reasonable ones,consistent with existing theories and data.

We focus on capturing the interaction of the two levels in the human data,whereby the respective contributions of the two levels are discernible through experimental manipulations of learning settings that place differential emphases on the two levels.We show how these data can be captured using an interactive and bottom-up perspective.To capture manipulations,we do the fol-lowing.(a)The explicit(how-to)instructions condition is modeled using explicit encoding of the given knowledge at the top level (prior to training).(b)The verbalization condition(in which sub-jects are asked to explain their thinking while or between perform-ing the task)is captured in simulation through changes in param-eter values that encourage more top-level activities,consistent with the existing understanding of the effect of verbalization(i.e., subjects become more explicit;Stanley et al.,1989;Sun et al., 1998).(c)The explicit search condition(in which subjects are told to perform an explicit search for regularities in stimuli)is captured through changes in parameter values that encourage more reliance on increased top-level rule learning activities,in correspondence with what we normally observe in subjects under the kind of instruction.(d)The dual-task condition is captured by changes in parameter values that reduce top-level activities,because we have reasons to believe that when distracted by a secondary task,top-level mechanisms will be less active in relation to the primary task (i.e.,this condition affects explicit processes more than implicit processes;Cleeremans,1993;Dienes&Berry,1997;Hayes& Broadbent,1988;Sun et al.,2001).21(e)Finally,given same conditions,subjects may differ in terms of the explicitness of their resulting knowledge after training(some are more aware of their knowledge,whereas others are less aware;Curran&Keele,1993; Willingham et al.,1989).Individual differences in degree of awareness are captured by different parameter settings:More aware subjects have parameter settings that allow more rule learn-ing to occur.(f)Many of these afore-enumerated manipulations lead to the synergy effects between implicit and explicit processes. By modeling these manipulations,we capture the synergy effects as well.Note that for modeling each of these manipulations, usually only one(or two)parameter values are changed(more details later).
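As a rough sketch of how these manipulations might be operationalized in simulation code, the mapping below adjusts one or two parameters per condition, as stated in the text. All parameter names and magnitudes here are invented placeholders for illustration; the actual settings used are reported with each simulation below.

```python
# Hypothetical mapping from experimental manipulations to parameter changes.
# Names and magnitudes are illustrative only; see the text and tables for the
# settings actually used in each simulation.
BASELINE = {
    "top_level_rule_learning": 1.0,   # relative amount of top-level (RER) activity
    "preencoded_rules": False,        # explicit how-to instructions given a priori
}

CONDITIONS = {
    "explicit_instructions": {"preencoded_rules": True},
    "verbalization":         {"top_level_rule_learning": 1.5},  # more explicit activity
    "explicit_search":       {"top_level_rule_learning": 2.0},  # more rule learning
    "dual_task":             {"top_level_rule_learning": 0.5},  # less explicit activity
    "more_aware_subjects":   {"top_level_rule_learning": 1.5},  # individual differences
}

def configure(condition):
    """Return a parameter set for a condition (only one or two values changed)."""
    params = dict(BASELINE)
    params.update(CONDITIONS.get(condition, {}))
    return params

print(configure("dual_task"))
```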

Many parameters in the model were set uniformly as follows (see Table 1): Network weights were randomly initialized between -0.01 and 0.01; the Q-value discount rate was set at 0.95; the temperature (randomness parameter) for stochastic decision making was set at 0.01; the combination weights of the two levels were set at w1 = 0.2 and w2 = 0.8. The density parameter was set at 1/50. In this work, these are not free parameters because they were set in an a priori manner (on the basis of our experience in our previous work) and not varied to match the human data. (These parameter values are not described again when discussing individual simulations later, except when deviations from the above settings are encountered.)

20 In addition, these tasks are different from each other, so that different aspects of the model may be tested.

21 Note that there have been alternative interpretations of dual-task conditions, for example, in terms of mixed modality sequences or altered organizations (Keele et al., 2003; Stadler, 1995). There have also been arguments in favor of the view that dual-task conditions dampen implicit learning or expression of implicit knowledge as well (Frensch et al., 1999; Nissen & Bullemer, 1987). However, despite such differences, it seems reasonable to assume that dual-task conditions negatively affect explicit learning more than implicit learning, judging from the literature (Hayes & Broadbent, 1988; Curran & Keele, 1993; Cleeremans, 1993; Stadler, 1995; Szymanski & MacLeod, 1996; Dienes & Berry, 1997; Cleeremans et al., 1998; Sun et al., 2001); hence, our conjectured change of learning parameters at the top level is a reasonable first approximation.


Other parameters are domain specific because they are likely adjusted in relation to domain characteristics.For example,the numbers of input,output,and hidden units were set in a domain-specific way.The same goes for the rule learning thresholds,the backpropagation learning rate,and the momentum.22(These pa-rameters are described later with respect to individual tasks.)Most of these parameters were not varied to match different conditions of the same task.

A note about parameters and model fitting is in order here.At first glance,the model has too many parameters.However,on closer examination,the number of parameters is not significantly higher than it is in usual models such as backpropagation net-works.In addition to parameters of backpropagation networks(as in the bottom level of the model),at the top level,there are only three significant parameters concerning rule extraction and revi-sion.That is to say,our model is roughly comparable to back-propagation networks.Moreover,although values of all parame-ters affect performance,most of them were not changed throughout the simulations.Thus they are not free parameters. Even most of the domain-specific parameters were not varied to capture different experimental conditions in a task.Thus,they should not be viewed as free parameters either.They do not contribute to the degree of freedom that we have to match the change of performance of human subjects across different condi-tions(see McClelland,McNaughton,&O’Reilly,1995,for a similar point).The change of performance across different condi-tions in a task is accounted for by changes of one or two param-eters.Finally,it should be noted that our goal is to show the effect of the interaction of the two types of learning(in terms of general trends).Thus,capturing finer details of learning curves and other characteristics is not our goal here.23Many other researchers have taken a similar approach,for example,McClelland et al.(1995).

Simulating SRT Tasks

In this type of task,we aim to capture(a)the degree of awareness effect,(b)the salience difference effect,(c)the explicit (how-to)instructions effect,and(d)the dual-task effect.We use the data reported in Lewicki et al.(1987)and Curran and Keele (1993).

For this type of task, we used a simplified QBP in which temporal credit assignment was not used. This was because subjects predicted one position at a time, with immediate feedback, and thus there was no need for backward temporal credit assignment. Q(x, a) computes the likelihood of the next position a, given the information concerning the current and past positions x. The actual probability of choosing a as the current prediction (of the next position) is determined on the basis of the Boltzmann distribution (described earlier). The error signal used in the simplified QBP is as follows:

ΔQ(x, a) = α(r + γ max_b Q(y, b) - Q(x, a)) = α(r - Q(x, a)),

where x is the input, a is one of the outputs (predictions), r = 1 if a is the correct prediction, r = 0 if a is not the correct prediction, and γ max_b Q(y, b) is set to zero.24
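For concreteness, the following is a minimal sketch of the simplified QBP just described: a small backpropagation network whose chosen output unit is updated with the error signal above, with predictions selected via the Boltzmann distribution. The class structure and the sigmoid units are illustrative assumptions; the dimensions and learning parameters shown are those reported below for the Lewicki et al. (1987) simulation.

```python
import numpy as np

rng = np.random.default_rng(0)

class SimplifiedQBP:
    """Sketch of the simplified QBP: backpropagate the error alpha * (r - Q(x, a))
    through output unit a only (the output used as the current prediction)."""

    def __init__(self, n_in, n_hidden, n_out, lr=0.5, momentum=0.2):
        self.W1 = rng.uniform(-0.01, 0.01, (n_in, n_hidden))
        self.W2 = rng.uniform(-0.01, 0.01, (n_hidden, n_out))
        self.lr, self.momentum = lr, momentum
        self.dW1, self.dW2 = np.zeros_like(self.W1), np.zeros_like(self.W2)

    def forward(self, x):
        self.x = x
        self.h = 1.0 / (1.0 + np.exp(-(x @ self.W1)))        # hidden activations
        self.q = 1.0 / (1.0 + np.exp(-(self.h @ self.W2)))   # Q values, one per position
        return self.q

    def predict(self, x, temperature=0.01):
        """Choose a prediction stochastically via a Boltzmann distribution over Q."""
        q = self.forward(x) / temperature
        p = np.exp(q - q.max())
        p /= p.sum()
        return rng.choice(len(p), p=p)

    def update(self, a, r):
        """Apply the error (r - Q(x, a)); gamma * max_b Q(y, b) is set to zero."""
        err = np.zeros_like(self.q)
        err[a] = r - self.q[a]
        delta_out = err * self.q * (1 - self.q)               # sigmoid derivative
        delta_hid = (delta_out @ self.W2.T) * self.h * (1 - self.h)
        self.dW2 = self.lr * np.outer(self.h, delta_out) + self.momentum * self.dW2
        self.dW1 = self.lr * np.outer(self.x, delta_hid) + self.momentum * self.dW1
        self.W2 += self.dW2
        self.W1 += self.dW1

# Example with the dimensions used for the Lewicki et al. simulation (24-18-24):
net = SimplifiedQBP(24, 18, 24)
x = np.zeros(24)
x[[0, 5, 10, 15, 16, 21]] = 1.0           # six elements, one of four values each
a = net.predict(x)
net.update(a, r=1.0 if a == 3 else 0.0)   # r = 1 only for the correct location
```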

Simulating Lewicki et al.(1987)

The task.The task was based on matrix scanning:Subjects were to scan a matrix,determine the quadrant of a target digit(the digit6),and respond by pressing the key corresponding to that quadrant.Each block consisted of six identification trials followed by one matrix scanning trial.In identification trials,the target appeared in one of the quadrants,and the subject was to press the corresponding key.In matrix scanning trials,the target was em-bedded among36digits in a matrix,but the subject’s task was the same.See Figure3.In each block of seven trials,the actual location of the target in the seventh(matrix scanning)trial was determined by the sequence of the six preceding identification trials(out of which four were relevant).Twenty-four rules were used to determine the location of the target in the seventh trial. Each of these rules mapped the target quadrants in the six identi-fication trials to the target location in the seventh trials in each block.Twenty-four(out of a total of36)locations were possible for the target to appear.The major dependent variable was the reaction time in the seventh trial in each block.

The whole experiment consisted of48segments,each of which consisted of96blocks(so there were a total of4,608blocks). During the first42segments,the aforementioned rules were used to determine target locations.However,at the43rd segment,a switch occurred that reversed the outcomes of the rules:The upper left was replaced by the lower right,the lower left was replaced by upper right,and so on.The purpose was to separate unspecific learning(e.g.,motor learning)from prediction learning(i.e.,learn-ing to predict the target location in the seventh trial).
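A minimal sketch of the resulting schedule is shown below. The 24 rules themselves are not reproduced in the text, so a made-up placeholder rule is used here, and target locations are simplified to quadrants; only the block counts, the quadrant reversal, and the switch point are taken from the text.

```python
import random

SEGMENTS, BLOCKS_PER_SEGMENT = 48, 96      # 48 x 96 = 4,608 blocks in total
QUADRANTS = [0, 1, 2, 3]                   # 0 = upper left, 1 = upper right, 2 = lower left, 3 = lower right

def placeholder_rule(six_quadrants):
    """Stand-in for one of the 24 rules mapping the six identification-trial
    quadrants (four of which were relevant) to the seventh-trial target."""
    return sum(six_quadrants[:4]) % 4       # invented for illustration only

REVERSAL = {0: 3, 3: 0, 2: 1, 1: 2}         # upper left <-> lower right, lower left <-> upper right

def run_schedule():
    for segment in range(1, SEGMENTS + 1):
        for _ in range(BLOCKS_PER_SEGMENT):
            six = [random.choice(QUADRANTS) for _ in range(6)]  # identification trials
            target = placeholder_rule(six)
            if segment >= 43:               # the switch at the 43rd segment
                target = REVERSAL[target]
            yield six, target               # seventh (matrix scanning) trial target

print(sum(1 for _ in run_schedule()))       # 4608 blocks
```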

The data.The reaction time data of3subjects were obtained by Lewicki et al.(1987;see Figure4).Each curve showed a steady decrease of reaction times up until the switch point.At that point, there was a significant increase of reaction times.After that,the curve gradually lowered again.

The model setup. In this experiment, the simplified QBP was used. The input contained (a sequence of) six elements, with each element having four possible values (for four different quadrants).25 The output contained the prediction of the seventh element. Thus, 24 input units (representing six elements, with four values each), 24 output units (one for each possible location of the seventh element), and 18 hidden units were used. The best learning rate was 0.5, with a momentum term of 0.2. The model was trained by presenting the stimulus materials in the same way as in the human experiment of Lewicki et al. (1987), without any further embellishments or repetitions of the materials.

Table 1
Non-Domain-Specific Parameters Used in CLARION

Parameter                 Value
Initial weights           Between -0.01 and 0.01
Q-value discount          0.95
Temperature               0.01
Density                   1/50
Weights of two levels     0.2 and 0.8 (w1, w2)

22 The last two parameters are partially determined by other domain-specific parameters and thus also set in a domain-specific way.

23 Another reason why finer details are not captured here is that we do not have full human data of these tasks. Thus, a detailed statistical analysis of matching between model and human data is impossible, which makes finer matching uninteresting.

24 This simplified version is in fact similar to straight backpropagation (except that we only update one of the outputs at each step—the output that is used as the current prediction). Note that we also tried the full-fledged QBP instead of the simplified version described above. We obtained comparable results from this alternative approach.

Because in this experiment there were a total of 6^4 possible sequences with each consisting of seven elements, the setting was too complex for subjects to discern the sequence structures explicitly, as demonstrated through various explicit tests by Lewicki et al. (1987) (although some argued otherwise; see, e.g., Perruchet & Gallego, 1993). Computationally, no rule could be extracted in the model because the large number of sequences entailed the lack of sufficient repetitions of any sequence, which prevented the model from coming up with any rule. The density parameter was set to be 1/50; that is, at least one repetition (of a sequence) was necessary every 50 blocks to maintain a rule. In this task, there were 4,608 presentations of sequences and there were 6^4 = 1,296 possible sequences; on average, therefore, a given sequence recurred only about once every 1,296 blocks (roughly 3.6 times in the whole experiment). Thus, on average, the repetition rate of any sequence was significantly below 1/50. Therefore, almost no rule could be maintained. Our simulation thus involved the bottom level of the model (with QBP).26 See Figure 5 for a summary of parameter values used.

The match. The model was able to produce an error rate curve going downward, resembling Lewicki et al.'s (1987) reaction time curves. See Figure 5 (which was averaged over 10 runs to ensure representativeness). The model reached 100% accuracy before the switch.

The question is how we should translate error rates into reaction times. One way of translation is through a linear transformation from error rate to reaction time (as argued for by, e.g., Mehwort, Braun, & Heathcote, 1992, and as often used in existing work); that is, RT_i = a·e_i + b, where RT_i is the reaction time (in ms), e_i is the error rate, and a and b are two free parameters. One interpretation of the linear transformation is that it specifies the time needed by a subject to search for and then respond to a target item and the time needed by a subject to respond to a target item without searching (through correctly predicting its location); that is,

RT_i = a·e_i + b = b·(1 - e_i) + (a + b)·e_i,

where b is interpreted as the time needed to respond to an item without searching (because 1 - e_i is the probability of successfully predicting the location of a target item) and a + b is interpreted as the time needed to respond to an item by first searching for it and then responding to it. So, instead of relying on an additional power function (as in Ling & Marinov, 1994), this method relies only on error rates to account for human reaction times.
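A minimal sketch of this transformation is given below, using the Subject 1 values of a and b from Table 2; the error-rate values are hypothetical stand-ins (the actual fits were obtained with Microsoft Excel Solver, as noted in Footnote 28).

```python
import numpy as np

def rt_linear(error_rate, a, b):
    """RT_i = a * e_i + b, equivalently b * (1 - e_i) + (a + b) * e_i:
    b is the no-search response time, a + b the search-and-respond time."""
    return a * error_rate + b

# Subject 1 parameters from Table 2 (a = 320, b = 430, in milliseconds).
errors = np.array([0.9, 0.5, 0.1, 0.0])   # hypothetical per-segment error rates
print(rt_linear(errors, a=320, b=430))    # 718, 590, 462, 430 ms
```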

Another way of generating reaction time from prediction accuracy is through adding a power function (as used in previous simulations of this task, such as in Ling & Marinov, 1994):

RT_i = t1·(1 - e_i) + t2·e_i + B·i^(-α),

where t1 is the time needed to respond when there is no search (using correct predictions), t2 is the time needed to respond when search is necessary, B is the initial motor response time, and α is the rate at which the motor response time decreases. The third term is meant to capture unspecific practice effects (mostly resulting from motor learning). In other words, in this formula, we separate the motor response time from the search time and the prediction time (as represented by t1 and t2, respectively). This formula takes into account the independent nature of motor learning, separate from the learning of target prediction. However, it involves two more free parameters.27
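A corresponding sketch with the practice term included follows; the power-law form B·i^(-α) is a reconstruction of the practice term and should be read as an assumption, with t1, t2, B, and α taken from Table 2 for Subject 1 and a hypothetical error-rate curve used as input.

```python
import numpy as np

def rt_power(error_rate, segment, t1, t2, B, alpha):
    """RT_i = t1 * (1 - e_i) + t2 * e_i + B * i**(-alpha): prediction time,
    search time, and a decreasing (power-law) motor-response component."""
    return t1 * (1 - error_rate) + t2 * error_rate + B * segment ** (-alpha)

segments = np.arange(1, 49)                        # 48 segments
errors = np.clip(1.0 - segments / 42.0, 0.0, 1.0)  # hypothetical error-rate curve
print(rt_power(errors, segments, t1=150, t2=350, B=700, alpha=0.33)[:3])
```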

Using the linear transformation, we generated three sets of data from the error rate curve earlier,28 one for matching each human subject in Lewicki et al. (1987), using different a and b values for each subject.29 As shown in Figure 4, the model outcomes fit the human data well up to the point of switching (Segment 41). When the switch to a random sequence happened, the model's reaction times worsened considerably, whereas the subjects' reaction times suffered relatively slightly.30

Therefore, we added the power function, as done in previous simulations of this task (e.g., Ling & Marinov, 1994). The effect of adding the power function was that we reduced the contribution from the model prediction (i.e., the error rate e_i) while we took into consideration the contribution from the power function. In this way, we obtained a much better match after the switch while maintaining a good match before the switch. This comparison suggested that the amount of benefit human subjects got from their predictions (i.e., by lowering e_i) was, although significant, relatively small. Significant benefit was also gained through the improvement of motor response as represented by the power function. This is the same conclusion reached in previous simulations of this task (e.g., Ling & Marinov, 1994). See Figure 6.

Note that the match between our model and the human data was excellent as measured by the sum-squared errors. Compared with Ling and Marinov's (1994) model, CLARION (with the power function) did better for 2 of the 3 subjects, using the same parameter values for transformation as Ling and Marinov did. See Table 2 for a comparison. Note that in the table, we used exactly the same parameter settings (for a, b, B, and α) as did Ling and Marinov. These parameters could be further optimized, which would lead to a slightly better fit but the difference would not be significant.

Discussion. The main feature of the model that the successful match with the human data could be attributed to was the backpropagation learning at the bottom level of the model (and the associated parameters). The top level was insignificant in this task setting.

Figure 3. The displays for the identification trials (left) and the matrix scanning trials (right) in the experiments of Lewicki et al. (1987). Adapted from "Unconscious Acquisition of Complex Procedural Knowledge," by P. Lewicki, M. Czyzewska, and H. Hoffman, 1987, Journal of Experimental Psychology: Learning, Memory, and Cognition, 13, p. 525. Copyright 1987 by the American Psychological Association.

Figure 4. CLARION's match of Lewicki et al.'s (1987) data with linear transformation. See Table 2 for parameter values. The graphs of the subjects' data are adapted from "Unconscious Acquisition of Complex Procedural Knowledge," by P. Lewicki, M. Czyzewska, and H. Hoffman, 1987, Journal of Experimental Psychology: Learning, Memory, and Cognition, 13, p. 527. Copyright 1987 by the American Psychological Association.

Figure 5. Model parameters used in simulating data from Lewicki et al. (1987; top) and model prediction errors (bottom). RER = rule-extraction-refinement algorithm; QBP = Q-learning-backpropagation algorithm.

25 A sequence of six elements was assumed to be within the capacity of the short-term working memory.

26 However, we did try simulations with the top level included. We found no significant difference in terms of match with human data.

27 Note that if we set B = 0, we have t1 = b and t2 = a + b, and thus this equation becomes the same as the previous one.

28 When curve fitting, we used Microsoft Excel Solver to find the best parameter values (e.g., a and b in a·x + b) such that the difference between the model data and the subjects' data was minimized. Microsoft Excel Solver uses the generalized reduced gradient nonlinear optimization algorithm (developed by Leon Lasdon, University of Texas at Austin, and Allan Waren, Cleveland State University).

29 The error rate curve reported earlier was the best curve and happened to match all 3 subjects approximately equally well after the transformation (with different parameters for each subject). Note that Ling and Marinov (1994) also used a single error rate curve to match different subjects with different parameters for transformation.

30 We tried many different parameters but discovered that the size of the jump tended to vary little (unless the match as a whole was bad). It is clear, from experiments with different parameter settings, that if the model learns the sequences perfectly before the switch (as is the case with our model), the model data inevitably have huge jumps. However, the more of the sequences it does not learn, the flatter the curve and the smaller the jump. Although this may model Subjects 1 and 3 satisfactorily, Subject 2 has a large drop in reaction time early on, which is best matched by having the model increase its accuracy in a rapid manner.

Simulating Curran and Keele (1993)

The task.The task of Curran and Keele (1993)consisted of presentation of a repeating sequence of X marks,each in one of four possible positions.The subjects were instructed to press the key corresponding to the position of each X mark.Reaction times of the subjects were recorded.The experiment was divided into three phases (in succession):dual-task practice,single-task learn-ing,and dual-task transfer.In the first phase,which consisted of two blocks,the positions of X marks were purely random to allow subjects to get used to the task setting.Each block consisted of 120trials.The second phase was when learning occurred:There were five sequence blocks (Blocks 3,4,5,6,and 8),in which the positions followed a sequential pattern of length 6(e.g.,1,2,3,2,4,3),and one random block (Block 7).The third phase tested transfer to a dual-task condition (with a secondary tone counting task):Three random blocks (Blocks 9,10,and 12)and a single sequence block (Block 11)were presented.A total of 57subjects were tested.
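A minimal sketch of this block structure is given below; the block numbers, the example sequence, and the trial counts are taken from the text, while the random-position generator is an illustrative stand-in.

```python
import random

SEQUENCE = [1, 2, 3, 2, 4, 3]                 # the text's example sequential pattern of length 6
TRIALS_PER_BLOCK = 120

def block_type(block):
    if block in (3, 4, 5, 6, 8, 11):          # sequence blocks (block 11 is under dual task)
        return "sequence"
    return "random"                            # blocks 1-2, 7, 9, 10, 12

def trials(block):
    if block_type(block) == "sequence":
        return SEQUENCE * (TRIALS_PER_BLOCK // len(SEQUENCE))  # repeating X-mark positions
    return [random.randint(1, 4) for _ in range(TRIALS_PER_BLOCK)]

dual_task_blocks = {1, 2, 9, 10, 11, 12}       # Phases 1 and 3 involve the tone counting task
print(block_type(7), len(trials(3)))           # 'random' 120
```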

The data. Three groups of subjects were identified in the analysis of data: "less aware," "more aware," and "intentional." Approximately one third of the subjects were found in each group. The intentional subjects were given explicit instructions about the exact sequence used before learning started. The more aware subjects were those who after the experiment correctly specified at least four out of six positions in the sequence used (which demonstrated their explicit knowledge), and the rest were less aware subjects. Analysis of variance (ANOVA; Intentional vs. More Aware vs. Less Aware × Sequential vs. Random) and other analyses were carried out. The analysis showed that although the intentional group performed the best, the more aware group performed close to the intentional group, and both groups performed significantly better than the less aware group. There was a significant interaction between group and block (p < .01), demonstrating the effect of explicit knowledge. However, during the transfer to the dual-task condition, all three groups performed poorly. There was notably no significant difference between the three groups (random vs. sequence did not interact significantly with group; p = .67). See Figure 7 for the reaction time data for the three groups. The finding of primary interest here is that the difference in explicit knowledge led to the difference in performance in Phase 2 and the performance difference dissipated under the dual-task condition in Phase 3 (which was known to suppress mostly explicit knowledge).

The model setup. The simplified QBP was used at the bottom level (as in the simulation of Lewicki et al., 1987). We used a network with 7 × 6 input units and 5 output units for handling both the SRT primary task and a secondary tone counting task. The seven groups of input units represented a moving window of seven steps preceding the current step (with seven being roughly the size of the working memory).31 In each group, there were 6 input units. The first four were used to encode input for the SRT task, one for each possible position of lights. The other two input units were used for the tone counting task (low tone vs. high tone). The output consisted of four units for predicting the next light position and also a unit for indicating whether a tone was detected. See Table 3 for a summary of parameter values used.

Table 2
Parameters (With and Without Power Functions) for Matching Lewicki et al.'s (1987) Reaction Time Data and Goodness of Fit for the CLARION Model (With and Without Power Functions) and Ling and Marinov's (1994) Model

         Parameters         Parameters                      Goodness of fit (SSE)
         (without power     (with power functions)(a)       CLARION          CLARION        Ling &
         functions)                                         (without         (with power    Marinov
Subject  a      b           t1    t2    B      α            power function)  function)      (1994)
1        320    430         150   350   700    0.33         0.30             0.07           0.14
2        860    620         150   350   1,600  0.33         1.85             0.25           0.43
3        170    530         100   210   800    0.19         0.14             0.05           0.04

Note. SSE = sum-squared errors.
(a) As in Ling and Marinov (1994).

Figure 6. CLARION's and Ling and Marinov's (1994) matches of Lewicki et al.'s (1987) data with power functions. See Table 2 for parameter values. The graphs of the subjects' data are adapted from "Unconscious Acquisition of Complex Procedural Knowledge," by P. Lewicki, M. Czyzewska, and H. Hoffman, 1987, Journal of Experimental Psychology: Learning, Memory, and Cognition, 13, p. 527. Copyright 1987 by the American Psychological Association. The graphs of Ling and Marinov's simulations are adapted from "A Symbolic Model of the Nonconscious Acquisition of Information," by C. X. Ling and M. Marinov, 1994, Cognitive Science, 18, p. 600. Copyright 1994 by the Cognitive Science Society. Adapted with permission.
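As a concrete illustration of the input encoding described in the model setup above, the sketch below lays out the 7 × 6 input units; the helper function, its name, and the handling of missing entries are illustrative assumptions.

```python
import numpy as np

def encode_window(positions, tones):
    """Encode a moving window of the last seven steps as 7 x 6 = 42 input units:
    four units per step for the light position, two for the tone (low vs. high).
    positions: light positions 1-4 (or None); tones: 'low', 'high', or None."""
    units = np.zeros(7 * 6)
    for step, (pos, tone) in enumerate(zip(positions[-7:], tones[-7:])):
        base = step * 6
        if pos is not None:
            units[base + pos - 1] = 1.0     # SRT task: one unit per possible position
        if tone == "low":
            units[base + 4] = 1.0           # tone counting task
        elif tone == "high":
            units[base + 5] = 1.0
    return units

x = encode_window([1, 2, 3, 2, 4, 3, 1],
                  ["low", None, "high", None, "low", None, "high"])
print(x.shape)   # (42,); the 5 output units predict the next position and tone detection
```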

Rule learning was used at the top level. The criterion for rule extraction was based on "reward" received after each button press, which was determined by whether the prediction made was correct: If it was correct, a reward of 1 was given; otherwise, 0 was given. If 1 was received, a rule might be extracted; otherwise, no rule could be extracted (i.e., the threshold was 0.999). The positivity criterion used in calculating IG measures was set to be the same. The rule generalization threshold was set at 3.0. The rule specialization threshold was set at 1.0.

The difference between less aware and more aware subjects was captured by the difference in rule learning thresholds. To simulate

31 We tried increasing the size of the working memory. The results remained essentially the same.
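To make the rule extraction criterion above concrete, here is a minimal sketch. The information-gain-based refinement that uses the generalization and specialization thresholds is defined in the Some Details of the Model section and is only named, not implemented, here; the function and variable names are illustrative.

```python
EXTRACTION_THRESHOLD = 0.999          # extract a rule only when the reward is 1
RULE_GENERALIZATION_THRESHOLD = 3.0   # used by RER when deciding to generalize a rule
RULE_SPECIALIZATION_THRESHOLD = 1.0   # used by RER when deciding to specialize a rule

def reward_for(prediction, actual):
    """Reward after each button press: 1 for a correct prediction, 0 otherwise."""
    return 1.0 if prediction == actual else 0.0

def maybe_extract_rule(window, prediction, actual, rule_set,
                       threshold=EXTRACTION_THRESHOLD):
    """Extract a condition -> action rule at the top level only on success.
    Differences between more aware and less aware subjects can be captured by
    passing different rule learning thresholds (as stated in the text)."""
    if reward_for(prediction, actual) > threshold:
        rule_set.add((tuple(window), prediction))

rules = set()
maybe_extract_rule(window=[1, 2, 3, 2, 4, 3, 1], prediction=2, actual=2, rule_set=rules)
print(len(rules))   # 1
```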

Figure 7. Top: The reaction time data (in milliseconds) from Curran and Keele (1993). Bottom: The reaction time data (in milliseconds) from CLARION's simulation. R = random; S = sequence. The top panel is adapted from "Attentional and Nonattentional Forms of Sequence Learning," by T. Curran and S. W. Keele, 1993, Journal of Experimental Psychology: Learning, Memory, and Cognition, 19, p. 192. Copyright 1993 by the American Psychological Association.


尊重的素材

尊重的素材(为人处世) 思路 人与人之间只有互相尊重才能友好相处 要让别人尊重自己,首先自己得尊重自己 尊重能减少人与人之间的摩擦 尊重需要理解和宽容 尊重也应坚持原则 尊重能促进社会成员之间的沟通 尊重别人的劳动成果 尊重能巩固友谊 尊重会使合作更愉快 和谐的社会需要彼此间的尊重 名言 施与人,但不要使对方有受施的感觉。帮助人,但给予对方最高的尊重。这是助人的艺术,也是仁爱的情操。—刘墉 卑己而尊人是不好的,尊己而卑人也是不好的。———徐特立 知道他自己尊严的人,他就完全不能尊重别人的尊严。———席勒 真正伟大的人是不压制人也不受人压制的。———纪伯伦 草木是靠着上天的雨露滋长的,但是它们也敢仰望穹苍。———莎士比亚 尊重别人,才能让人尊敬。———笛卡尔 谁自尊,谁就会得到尊重。———巴尔扎克 人应尊敬他自己,并应自视能配得上最高尚的东西。———黑格尔 对人不尊敬,首先就是对自己的不尊敬。———惠特曼

每当人们不尊重我们时,我们总被深深激怒。然而在内心深处,没有一个人十分尊重自己。———马克·吐温 忍辱偷生的人,绝不会受人尊重。———高乃依 敬人者,人恒敬之。———《孟子》 人必自敬,然后人敬之;人必自侮,然后人侮之。———扬雄 不知自爱反是自害。———郑善夫 仁者必敬人。———《荀子》 君子贵人而贱己,先人而后己。———《礼记》 尊严是人类灵魂中不可糟蹋的东西。———古斯曼 对一个人的尊重要达到他所希望的程度,那是困难的。———沃夫格纳 经典素材 1元和200元 (尊重劳动成果) 香港大富豪李嘉诚在下车时不慎将一元钱掉入车下,随即屈身去拾,旁边一服务生看到了,上前帮他拾起了一元钱。李嘉诚收起一元钱后,给了服务生200元酬金。 这里面其实包含了钱以外的价值观念。李嘉诚虽然巨富,但生活俭朴,从不挥霍浪费。他深知亿万资产,都是一元一元挣来的。钱币在他眼中已抽象为一种劳动,而劳动已成为他最重要的生存方式,他的所有财富,都是靠每天20小时以上的劳动堆积起来的。200元酬金,实际上是对劳动的尊重和报答,是不能用金钱衡量的。 富兰克林借书解怨 (尊重别人赢得朋友)

那一刻我感受到了幸福_初中作文

那一刻我感受到了幸福 本文是关于初中作文的那一刻我感受到了幸福,感谢您的阅读! 每个人民的心中都有一粒幸福的种子,当它拥有了雨水的滋润和阳光的沐浴,它就会绽放出最美丽的姿态。那一刻,我们都能够闻到幸福的芬芳,我们都能够感受到幸福的存在。 在寒假期间,我偶然在授索电视频道,发现(百家讲坛)栏目中大学教授正在解密幸福,顿然引起我的好奇心,我放下了手中的遥控器,静静地坐在电视前,注视着频道上的每一个字,甚至用笔急速记在了笔记本上。我还记得,那位大学教授讲到了一个故事:一位母亲被公司升职到外国工作,这位母亲虽然十分高兴,但却又十分无奈,因为她的儿子马上要面临中考了,她不能撇下儿子迎接中考的挑战,于是她决定拒绝这了份高薪的工作,当有人问她为什么放弃这么好的机会时,她却毫无遗憾地说,纵然我能给予儿子最贵的礼物,优异的生活环境,但我却无当给予他关键时刻的那份呵护与关爱,或许以后的一切会证明我的选择是正确的。听完这样一段故事,我心中有种说不出的感觉,刹那间,我仿拂感觉那身边正在包饺子的妈妈,屋里正在睡觉的爸爸,桌前正在看小说的妹妹给我带来了一种温馨,幸福感觉。正如教授所说的那种解密幸福。就要选择一个明确的目标,确定自已追求的是什么,或许那时我还不能完全诠释幸福。 当幸福悄悄向我走来时,我已慢慢明白,懂得珍惜了。 那一天的那一刻对我来说太重要了,原本以为出差在外的父母早已忘了我的生日,只有妹妹整日算着日子。我在耳边唠叨个不停,没想到当日我失落地回到家中时,以为心中并不在乎生日,可是眼前的一切,让我心中涌现的喜悦,脸上露出的微笑证明我是在乎的。

爸爸唱的英文生日快乐歌虽然不是很动听,但爸爸对我的那份爱我听得很清楚,妈妈为我做的长寿面,我细细的品尝,吃出了爱的味道。妹妹急忙让我许下三个愿望,嘴里不停的唠叨:我知道你的三个愿望是什么?我问:为什么呀!我们是一家人,心连心呀!她高兴的说。 那一刻我才真正解开幸福的密码,感受到了真正的幸福,以前我无法理解幸福,即使身边有够多的幸福也不懂得欣赏,不懂得珍惜,只想拥有更好更贵的,其实幸福比物质更珍贵。 那一刻的幸福就是爱的升华,许多时候能让我们感悟幸福不是名利,物质。而是在血管里涌动着的,漫过心底的爱。 也许每一个人生的那一刻,就是我们幸运的降临在一个温馨的家庭中,而不是降临在孤独的角落里。 家的感觉就是幸福的感觉,幸福一直都存在于我们的身边!

C++中explicit关键字的作用

C++中explicit关键字的作用 explicit用来防止由构造函数定义的隐式转换。 要明白它的作用,首先要了解隐式转换:可以用单个实参来调用的构造函数定义了从形参类型到该类类型的一个隐式转换。 例如: class things { public: things(const std::string&name = ""): m_name(name),height(0),weight(10){} int CompareTo(const things & other); std::string m_name; int height; int weight; }; 复制代码 这里things的构造函数可以只用一个实参完成初始化。所以可以进行一个隐式转换,像下面这样: things a; ................//在这里被初始化并使用。 std::string nm = "book_1"; //由于可以隐式转换,所以可以下面这样使用 int result = https://www.doczj.com/doc/5d4805331.html,pareTo(nm); 复制代码 这段程序使用一个string类型对象作为实参传给things的CompareTo函数。这个函数本来是需要一个tings对象作为实参。现在编译器使用string nm来构造并初始化一个things对象,新生成的临时的things对象被传递给CompareTo函数,并在离开这段函数后被析构。 这种行为的正确与否取决于业务需要。假如你只是想测试一下a的重量与10的大小之比,这么做也许是方便的。但是假如在CompareTo函数中还涉及到了要除以初始化为0的height 属性,那么这么做可能就是错误的。需要在构造tings之后更改height属性不为0。所以要限制这种隐式类型转换。 那么这时候就可以通过将构造函数声明为explicit,来防止隐式类型转换。

关于我的幸福作文八篇汇总

关于我的幸福作文八篇汇总 幸福在每个人的心中都不一样。在饥饿者的心中,幸福就是一碗香喷喷的米饭;在果农的心中,幸福就是望着果实慢慢成熟;在旅行者的心中,幸福就是游遍世界上的好山好水。而在我的心中,幸福就是每天快快乐乐,无忧无虑;幸福就是朋友之间互相帮助,互相关心;幸福就是在我生病时,母亲彻夜细心的照顾我。 幸福在世间上的每个角落都可以发现,只是需要你用心去感受而已。 记得有一次,我早上出门走得太匆忙了,忘记带昨天晚上准备好的钢笔。老师说了:“今天有写字课,必须要用钢笔写字,不能用水笔。”我只好到学校向同学借了。当我来到学校向我同桌借时,他却说:“我已经借别人了,你向别人借吧!”我又向后面的同学借,可他们总是找各种借口说:“我只带了一枝。”问了三四个人,都没有借到,而且还碰了一鼻子灰。正当我急的像热锅上的蚂蚁团团转时,她递给了我一枝钢笔,微笑的对我说:“拿去用吧!”我顿时感到自己是多么幸福!在我最困难的时候,当别人都不愿意帮助我的时候,她向我伸出了援手。 幸福也是无时无刻都在身旁。 当我生病的时候,高烧持续不退时,是妈妈在旁边细心

的照顾我,喂我吃药,甚至一夜寸步不离的守在我的床边,直到我苏醒。当我看见妈妈的眼睛布满血丝时,我的眼眶在不知不觉地湿润了。这时我便明白我有一个最疼爱我的妈妈,我是幸福的! 幸福就是如此简单!不过,我们还是要珍惜眼前的幸福,还要给别人带来幸福,留心观察幸福。不要等幸福悄悄溜走了才发现,那就真的是后悔莫及了! 这就是我拥有的幸福,你呢? 悠扬的琴声从房间里飘出来,原来这是我在弹钢琴。优美的旋律加上我很强的音乐表现力让一旁姥爷听得如醉如痴。姥爷说我是幸福的,读了《建设幸福中国》我更加体会到了这一点。 儿时的姥爷很喜欢读书,但当时家里穷,据姥爷讲那时上学可不像现在。有点三天打鱼两天晒网,等地里农活忙了太姥爷就说:“别去念书了,干地里的活吧。”干活时都是牛马拉车,也没机器,效率特别低。还要给牲口拔草,喂草,拾柴火,看书都是抽空看。等农闲时才能背书包去学校,衣服更是老大穿了,打补丁老二再接着穿,只有盼到过年时才有能换上件粗布的新衣服。写字都是用石板,用一次擦一次,那时还没有电灯,爱学习的姥爷在昏暗的煤油灯下经常被灯火不是烧了眉毛就是燎了头发。没有电灯更没有电视,没有电视更没有见过钢琴,只知道钢琴是贵族家用的。

java关键字

Java 关键字速查表 访问控制: private 私有的 protected 受保护的 public 公共的 类、方法和变量修饰符abstract 声明抽象 class 类 extends 扩允,继承 final 终极,不可改变的implements实现 interface 接口 native 本地 new 新,创建 static 静态 strictfp 严格,精准synchronized 线程,同步transient 短暂 volatile 易失 程序控制语句 break 跳出循环 continue 继续 return 返回 do 运行 while 循环 if 如果 else 反之 for 循环 instanceof 实例 switch 开关 case 返回开关里的结果 default 默认 错误处理 catch 处理异常 finally 有没有异常都执行 throw 抛出一个异常对象throws 声明一个异常可能被抛出try 捕获异常 包相关 import 引入 package 包

基本类型 boolean 布尔型 byte 字节型 char 字符型 double 双精度, float 浮点 int 整型 long 长整型 short 短整型 null 空 true 真 false 假 变量引用 super 父类,超类 this 本类 void 无返回值 java关键字 关键字是电脑语言里事先定义的,有特别意义的标识符,有时又叫保留字。Java的关键字对java的编译器有特殊的意义,他们用来表示一种数据类型,或者表示程序的结构等,关键字不能用作变量名、方法名、类名、包名。一个Java语言中的关键字,用在类的声明中来指明一个类是不能被实例化的,但是可以被其它类继承。一个抽象类可以使用抽象方法,抽象方法不需要实现,但是需要在子类中被实现break 一个Java的关键字,用来改变程序执行流程,立刻从当前语句的下一句开始执行从。如果后面跟有一个标签,则从标签对应的地方开始执行case Java语言的关键字,用来定义一组分支选择,如果某个值和switch中给出的值一样,就会从该分支开始执行。catch :Java的一个关键字,用来声明当try语句块中发生运行时错误或非运行时异常时运行的一个块。char :Java语言的一个关键字,用来定义一个字符类型 abstract boolean break byte case catch char class continue default do double else extends final finally float for if implements import instanceof int interface long native new package private protected public return short static super switch synchronized this throw throws transient try void volatile while 详细介绍

认识 C++ 中的 explicit 关键字

认识C++ 中的explicit 关键字 (Danny Kalev发表于2004-12-28 11:01:04) 带单一参数的构造函数在缺省情况下隐含一个转换操作符,请看下面的代码: class C { int i; //... public: C(int i);//constructor and implicit conversion operator //as well }; void f() { C c(0); c = 5; //将5 隐式转换为C 对象,然后赋值 } 编译器重新编辑上述例子代码,如下: ////////////////////////////////////////////////////////////////////////////////////////// //"c=5;" 被编译器转换成下面这个样子: ///////////////////////////////////////////////////////////////////////////////////////// C temp(5);// 实例化一个临时对象, c = temp; // 用= 赋值 temp.C::~C(); // temp 的析构函数被激活 在很多情况下,这个转换是有意的,并且是正当的。但有时我们不希望进行这种自动的转换,例如: class String { int size; char *p; //.. public: String (int sz); //这里不希望进行隐式转换操作 }; void f () { String s(10); // 下面是一个程序员的编码;发生一个意想不到的转换:

小学生作文《感悟幸福》范文五篇汇总

小学生作文《感悟幸福》范文五篇 小草说,幸福就是大地增添一份绿意;阳光说,幸福就是撒向人间的温暖;甘露说,幸福就是滋润每一个生命。下面是为大家带来的有关幸福650字优秀范文,希望大家喜欢。 感悟幸福650字1 生活就像一部壮丽的交响曲,它是由一篇一篇的乐章组成的,有喜、有怒、有哀、有乐。每一个人都有自己丰富多彩的生活,我也有自己的生活。我原本以为,吃可口的牛排,打电脑游戏,和朋友开心玩乐就是幸福。可是,我错了,幸福并不仅仅如此。 记得有一次,我放学回到家里放下书包就拿起一包饼干来吃。吃着吃着,突然我觉得牙齿痛了起来,而且越来越痛,痛得我连饼干也咬不动了。我放下饼干,连忙去拿了一面镜子来看。原来这又是那一颗虫牙在“作怪”。“哎哟哟,哎哟哟,痛死我了……”我不停地说着。渐渐地,那牙疼得越来越厉害,疼得我坐立不安,直打滚。后来在妈妈的陪伴下去了医院,治好了那颗虫牙。跨出医院大门时,我觉得心情出奇的好,天空格外的蓝,路边的樟树特别的绿,看什么都顺眼,才猛然一悟,幸福是简单而平凡的,身体健康就是一种幸福! 这学期我发现我的英语退步了,我决定要把这门功课学好,于是,我每天回

家做完作业后,都抽出半小时时间复习英语,在课上也听得特别认真,一遇到不懂的题目主动请教老师。经过一段时间的努力,终于,在上次考试的时候,我考了97分。妈妈表扬了我,我心里美滋滋的。我明白了经过自己的努力享受到成功的喜悦,这也是一种幸福。 …… 每个人都无一例外的渴望幸福。不同的人有不同的感受,其实,幸福就是那种能在平凡中寻找欢乐、能在困境中找到自信的一种心境。同学们,幸福其实很简单,就在我们的身边,触手可及。用心去认真地品味吧,它一直未曾离开我们身边! 感悟幸福650字2 有的人认为幸福就是腰缠万贯,有的人认为幸福就是找到意中人,“采菊东篱下,悠然见南山”是陶渊明对邪恶幸福,“从明天起做一个幸福人,喂马、劈柴、周游世界。从明天起,关心蔬菜和粮食,我有一所房子,面朝大海,春暖花开。”这是海子的幸福。一千种人就有一千种对幸福的理解。 我对幸福的理解就是幸福使简单而平凡的,是无处不在的! 我的牙疼得奇怪而顽强不是这颗牙疼就是那颗牙疼;不是吃冷的疼就是吃热

关于以幸福为话题的作文800字记叙文5篇

关于以幸福为话题的作文800字记叙文5篇 ----WORD文档,下载后可编辑修改---- 下面是作者为各位家长学生收集整理的作文(日记、观后感等)范本,欢迎借鉴参考阅读,您的努力学习和创新是为了更美好的未来,欢迎下载! 以幸福为话题的作文800字记叙文1: 那是我生病后的第三天,妈妈从早上五点就起来为我准备早点。她蹑手蹑脚地走着“针步”,下楼煮早点,“啪”的一声,妈妈打开了煤气。在拿肉丝,打鸡蛋的她全然不知我正躲在楼梯口“监视”着她的一举一动。不一会儿,蛋炒好了。 她开始切肉丝,一不小心,妈妈的手指切破皮了,鲜血正一滴一滴地流下来,为了不影响我的睡眠,她把手指放在嘴里吸了一下,坚持把剩下的肉丝切完。 此时的我,心中犹如打翻了五味瓶,眼里的泪像断了线的珍珠般掉了下来,我再也忍不住了,一个劲地冲到妈妈面前,她赶紧把手背了过去,生怕让我知道了什么。 她吃惊地问我:“妈妈太吵了,吵到你了?”“不,不,没有”她见我这么早起来就让我再回去补个觉。我关心地问:“妈,你的手没事吧?”她吱唔着说:“没事,擦破点皮,不碍事!”我仔细地帮她清洗了伤口,贴了一片创可贴。 吃饭时,妈妈一直地往我碗里夹肉,“孩子,病刚好,多吃点!”可是我见她始终都没吃一块肉。我也夹了两块放在她的碗里。“儿子懂事了,你自己快点吃吧!补身体要紧!”我冲她点点头笑了笑,“嗯。” 这就是幸福,一份简简单单的幸福!我祈祷这幸福能伴我成长。 以幸福为话题的作文800字记叙文2: 在我眼中,成长就是记录我们长大过程中一点一滴的小事情的,而幸福就在这点点滴滴中。 在我的成长记忆中,永不磨灭的是2017年11月的一天。妈妈要去云南,妈妈早上四点半要到指定地点集合,这么早,妈妈要两三点就起来,可是最近我咳嗽比较严重,所以天天给我煮萝卜汤喝。 “叮铃铃,叮铃铃”闹钟叫了起来,把我从睡梦中吵醒,一醒来,去找妈妈,

阅读教学中“关键词”的作用

阅读教学中“关键词”的作用 【摘要】阅读教学在语文教学中占有较大比重,然而,如何有效地指导学生阅读却是我们每个语文老师都迫切需要解决却又无法彻底解决的一个问题。面对这一矛盾,笔者主张抓住一篇文章中的关键词,如题目、文眼、过渡段、小序等“点”,运用“抓关键词”法来指导学生实现轻松、高效地阅读目标。 【关键词】关键词教学;阅读教学;选取;实效 阅读教学向来是语文教师最为重视却又最不易取得实效的板块,我们往往通过五花八门的技巧来提高学生的阅读能力,殊不知,看似热闹的训练中,学生真正吸收的东西少之又少,如何在语文阅读教学中找到有力的“抓手”,让我们的阅读教学真正切实有效,笔者在教学实践中经过不断地摸索、尝试,认为可以使用“关键词教学”这“一发”来拎起解读文本的“全身”。 1 关键词教学大义 教育部2011年最新颁布的《初中语文新课程标准》提出:欣赏文学作品,能有自己的情感体验,初步领悟作品的内涵,从中获得对自然、社会、人生的有益启示。对作品的思想感情倾向,能联系文化背景作出自己的评价;对作品中感人的情境和形象,能说出自己的体验;品味作品中富于表现力的语言。而在我们的语文课堂上,往往有的只是分段落、理大意、记主旨等传统的步骤。学生的阅读兴趣越来越小,阅读效率越来越低,他们对于文本的了解也只是限于教师传授的“死水”而已,很难在文章中得到有益的启示,形成自己的独特体验。于是,语文教学中就形成了学生怕阅读、不阅读的局面。如何在枯燥机械地阅读模式中另辟一条蹊径,无论是对于学生还是教师而言,都显得尤为重要。那么,何谓关键词教学呢?在这里,笔者先简单列举执教的几篇课例供大家参考。在人教版九年级上册的《孤独之旅》中,抓住“孤独”一词,就可以准确地把握全文的脉络,这时候,整个教学就可以划分成清晰的三大步:感受杜小康的孤独——理解孤独的真正含义——学会战胜孤独,当然,看似简单的三个步骤下其实还需要教师精心地备课。然而这样的关键词一旦找出,在课堂上,学生便不再像无头的苍蝇,乱撞、乱飞。例如人教版八年级上册的《湖心亭看雪》,属于文言文。拿到这样的文章,学生有时候连读的兴趣都没有,更不要说读懂、读透了。其实,只要我们换一种教法,引导学生抓住文中的关键字“痴”,就不难体会作者的写作意图了。此外,在古典文学名著《智取生辰纲》一文中,我带领学生着重去品味一个“智”字,将吴用的“智”和杨志的“智”作一个比较,不仅让学生在轻松的氛围中解读了文本,更培养了他们客观评价历史人物的意识。纵观以上几个课例,我们不难发现,无论是所谓的长文短教,还是短文长教,都绕不开文章中的一些“文眼”,准确地找到这个文眼,并利用现有的知识去挖掘它,拓展它,就成了我们语文教师需要去做的功课。那么,如何有效地选取这个阅读教学的关键词,就成了我们语文教师要去思考和斟酌的内容。 2 语文关键词的选取

尊重议论文

谈如何尊重人尊重他人,我们赢得友谊;尊重他人,我们收获真诚;尊重他人,我们自己也 获得尊重;相互尊重,我们的社会才会更加和谐. ——题记 尊重是对他人的肯定,是对对方的友好与宽容。它是友谊的润滑剂,它是和谐的调节器, 它是我们须臾不可脱离的清新空气。“主席敬酒,岂敢岂敢?”“尊老敬贤,应该应该!”共和 国领袖对自己老师虚怀若谷,这是尊重;面对许光平女士,共和国总理大方的叫了一 声“婶婶”,这种和蔼可亲也是尊重。 尊重不仅会让人心情愉悦呼吸平顺,还可以改变陌生或尖锐的关系,廉颇和蔺相如便是 如此。将相和故事千古流芳:廉颇对蔺相如不满,处处使难,但蔺相如心怀大局,对廉颇相 当的尊重,最后也赢得了廉颇的真诚心,两人结为好友,共辅赵王,令强秦拿赵国一点办法 也没有。蔺相如与廉颇的互相尊重,令得将相和的故事千百年令无数后人膜拜。 现在,给大家举几个例子。在美国,一个颇有名望的富商在散步 时,遇到一个瘦弱的摆地摊卖旧书的年轻人,他缩着身子在寒风中啃着发霉的面包。富 商怜悯地将8美元塞到年轻人手中,头也不回地走了。没走多远,富商忽又返回,从地摊上 捡了两本旧书,并说:“对不起,我忘了取书。其实,您和我一样也是商人!”两年后,富商 应邀参加一个慈善募捐会时,一位年轻书商紧握着他的手,感激地说:“我一直以为我这一生 只有摆摊乞讨的命运,直到你亲口对我说,我和你一样都是商人,这才使我树立了自尊和自 信,从而创造了今天的业绩??”不难想像,没有那一 句尊重鼓励的话,这位富商当初即使给年轻人再多钱,年轻人也断不会出现人生的巨变, 这就是尊重的力量啊 可见尊重的量是多吗大。大家是不是觉得一个故事不精彩,不够明确尊重的力量,那再 来看下一个故事吧! 一家国际知名的大企业,在中国进行招聘,招聘的职位是该公司在中国的首席代表。经 过了异常激烈的竞争后,有五名年轻人,从几千名应聘者中脱颖而出。最后的胜出者,将是 这五个人中的一位。最后的考试是一场面试,考官们都 作文话题素材之为人处世篇:尊重 思路 人与人之间只有互相尊重才能友好相处 要让别人尊重自己,首先自己得尊重自己 尊重能减少人与人之间的摩擦 尊重需要理解和宽容 尊重也应坚持原则 尊重能促进社会成员之间的沟通 尊重别人的劳动成果 尊重能巩固友谊 尊重会使合作更愉快 和谐的社会需要彼此间的尊重 名言 施与人,但不要使对方有受施的感觉。帮助人,但给予对方最高的尊重。这是助人的艺 术,也是仁爱的情操。———刘墉 卑己而尊人是不好的,尊己而卑人也是不好的。———徐特立 知道他自己尊严的人,他就完全不能尊重别人的尊严。———席勒 真正伟大的人是不压制人也不受人压制的。———纪伯伦 草木是靠着上天的雨露滋长的,但是它们也敢仰望穹苍。———莎士比亚
