Discrete Optimization

Scheduling flow shops using differential evolution algorithm

Godfrey Onwubolu *, Donald Davendra
Department of Engineering, The University of the South Pacific, P.O. Box 1168, Suva, Fiji
Received 17 January 2002; accepted 5 August 2004; available online 21 November 2004
European Journal of Operational Research 171 (2006) 674-692. doi:10.1016/j.ejor.2004.08.043
* Corresponding author. E-mail address: onwubolu_g@usp.ac.fj (G. Onwubolu).

Abstract

This paper describes a novel optimization method based on a differential evolution (exploration) algorithm and its applications to solving non-linear programming problems containing integer and discrete variables. The techniques for handling discrete variables are described, as well as the techniques needed to handle boundary constraints. In particular, the application of the differential evolution algorithm to the minimization of makespan, flowtime and tardiness in a flow shop manufacturing system is given in order to illustrate the capabilities and the practical use of the method. Experiments were carried out to compare results from the differential evolution algorithm and the genetic algorithm, which has a reputation for being very powerful. The results obtained have proven satisfactory in solution quality when compared with the genetic algorithm. The novel method requires few control variables, is relatively easy to implement and use, effective, and efficient, which makes it an attractive and widely applicable approach for solving practical engineering problems. Future directions in terms of research and applications are given.
© 2004 Elsevier B.V. All rights reserved.

Keywords: Scheduling; Flow shops; Differential evolution algorithm; Optimization

1. Introduction

In general, when discussing non-linear programming, the variables of the objective function are usually assumed to be continuous. However, in practical real-life engineering applications it is common for the problem variables under consideration to be discrete or integer valued. Real-life, practical engineering optimization problems are commonly integer or discrete because the available values are limited to a set of commercially available standard sizes. For example, the number of automated guided vehicles, the number of unit loads, and the number of storage units in a warehouse operation are integer variables, while the size of a pallet, the size of a billet for a machining operation, etc., are often limited to a set of commercially available standard sizes. Another class of interesting optimization problem is finding the best order or sequence in which jobs have to be machined. None of these engineering problems has a continuous objective function; rather, each has either an integer or a discrete objective function. In this paper we deal with the scheduling of jobs in a flow shop manufacturing system.

The flow shop scheduling problem is a production planning problem in which n jobs have to be processed in the same sequence on m machines. The assumptions are that there are no machine breakdowns and that all jobs are pre-emptive. This is commonly the case in many manufacturing systems where jobs are transferred from machine to machine by some kind of automated material handling system. For large problem instances, typical of practical manufacturing settings, most researchers have focused on developing heuristic procedures that yield near-optimal solutions within a reasonable computation time. Most of these heuristic procedures focus on the development of permutation schedules and use makespan as a performance measure. Some of the well-known scheduling heuristics reported in the literature include Palmer (1965), Campbell et al. (1970), Gupta (1971), Dannenbring (1977), Hundal and Rajagopal (1988) and Ho and Chang (1991). Cheng and Gupta (1989) and Baker and Scudder (1990) presented comprehensive surveys of research work done in flow shop scheduling.

In recent years, a growing body of literature suggests the use of heuristic search procedures for combinatorial optimization problems. Several search procedures that have been identified as having great potential to address practical optimization problems include simulated annealing (Kirkpatrick et al., 1983), genetic algorithms (Goldberg, 1989), tabu search (Glover, 1989, 1990), and ant colony optimization (Dorigo, 1992). Consequently, over the past few years, several researchers have demonstrated the applicability of these methods to combinatorial optimization problems such as flow shop scheduling (see, for example, Widmer and Hertz, 1989; Ogbu and Smith, 1990; Taillard, 1990; Chen et al., 1995; Onwubolu, 2000). More recently, a novel optimization method based on the differential evolution (exploration) algorithm (Storn and Price, 1995) has been developed, which originally focused on solving non-linear programming problems containing continuous variables. Since Storn and Price (1995) invented the differential evolution algorithm, the challenge has been to employ the algorithm in problem areas other than those the inventors originally focused on. Although application of DE to combinatorial optimization problems encountered in engineering is scarce, researchers have used DE to design complex digital filters (Storn, 1999), and to design mechanical elements such as gear trains, pressure vessels and springs (Lampinen and Zelinka, 1999).

This paper presents a new approach based on the differential evolution algorithm for solving the problem of scheduling n jobs on m machines when all jobs are available for processing and the objective is to minimize the makespan. Other objective functions considered in the present work include mean flowtime and total tardiness.

2. Problem formulation

A flow shop scheduling problem is one in which all jobs must visit the machines or work centers in the same sequence.
Processing of a job must be completed on the current machine before processing of that job is started on the succeeding machine. This means that initially all jobs are available and that each machine is restricted to processing only one job at any particular time. Since the first machine in the facility arrangement is the first to be visited by each job, the other machines are initially idle and the other jobs are queued. Although queuing of jobs is prohibited in just-in-time (JIT) manufacturing environments, flow shop manufacturing continues to find applications in electronics manufacturing and space shuttle processing, and has attracted much research work (Onwubolu, 2002). The flow shop can be formulated generally as the sequencing of n jobs on m machines under the precedence condition, with typical objective functions being the minimization of average flowtime, of the time required to complete all jobs (makespan), of maximum tardiness, and of the number of tardy jobs. If the number of jobs is relatively small, the problem can be solved without any generic optimizing algorithm: every possibility can be checked and the results compared sequentially to capture the optimum value. But, more often, the number of jobs to be processed is large, which leads to a search space of order O(n!). Consequently, some kind of algorithm is essential in this type of problem to avoid combinatorial explosion.

The standard three-field notation (Lawler et al., 1995) for representing a scheduling problem is a | b | F(C), where a describes the machine environment, b describes the deviations from standard scheduling assumptions, and F(C) describes the objective C being optimized. In the work reported in this paper, we solve the n/m/F | F(C_max) problem. Other problems solved include F(C) = F(\sum C_i) and F(C) = F(\sum T_j). Here a = n/m/F describes the multiple-machine flow shop problem, b = null, and F(C) = F(C_max), F(\sum C_i) and F(\sum T_j) for makespan, mean flowtime, and total tardiness, respectively. Stating these problem descriptions more elaborately, the minimization of completion time (makespan) for a flow shop schedule is equivalent to minimizing the objective function I:

$I = \sum_{j=1}^{n} C_{m,j},$   (1)

$\text{s.t. } C_{i,j} = \max(C_{i-1,j},\, C_{i,j-1}) + P_{i,j},$   (2)

where C_{m,j} is the completion time of job j on the last machine, C_{1,1} = k (any given value), $C_{1,j} = \sum_{k=1}^{j} P_{1,k}$, $C_{i,1} = \sum_{k=1}^{i} P_{k,1}$, i is the machine number, j is the job position in the sequence, and P_{i,j} is the processing time of job j on machine i. For a given sequence, the mean flowtime is $\mathrm{MFT} = \frac{1}{n}\sum_{i=1}^{m}\sum_{j=1}^{n} c_{ij}$, while the condition for tardiness is c_{m,j} > d_j. The constraint of Eq. (2) applies to these two problem descriptions.
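The recursion of Eq. (2) maps directly onto a short dynamic-programming routine. The following Python sketch is ours, not the paper's: the processing-time matrix `ptimes`, the optional due-date list `due`, and the zero boundary values are illustrative assumptions (the paper itself fixes C_{1,1} = k), and flowtime is taken to equal completion time, i.e., all jobs released at time zero.

```python
def schedule_costs(ptimes, seq, due=None):
    """Completion times per Eq. (2): C[i][j] = max(C[i-1][j], C[i][j-1]) + P[i][j].
    ptimes[i][j] = processing time of job j on machine i; seq = job permutation."""
    m, n = len(ptimes), len(seq)
    C = [[0.0] * n for _ in range(m)]
    for i in range(m):
        for pos, job in enumerate(seq):
            up = C[i - 1][pos] if i > 0 else 0.0       # C_{i-1,j}
            left = C[i][pos - 1] if pos > 0 else 0.0   # C_{i,j-1}
            C[i][pos] = max(up, left) + ptimes[i][job]
    makespan = C[m - 1][n - 1]
    mean_flowtime = sum(C[m - 1]) / n
    total_tardiness = (sum(max(0.0, C[m - 1][pos] - due[job])
                           for pos, job in enumerate(seq)) if due else None)
    return makespan, mean_flowtime, total_tardiness

# Example: 3 machines, 4 jobs (indexed 0..3), one candidate sequence.
P = [[4, 3, 2, 5],
     [2, 4, 3, 1],
     [3, 2, 4, 2]]
print(schedule_costs(P, [1, 0, 3, 2]))   # -> (21, 14.25, None)
```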
3. Differential evolution

The differential evolution (exploration) [DE] algorithm introduced by Storn and Price (1995) is a novel parallel direct search method which utilizes NP parameter vectors as a population for each generation G. DE belongs to the class of floating-point encoded evolutionary optimization algorithms. Currently, there are several variants of DE. The particular variant used throughout this investigation is the DE/rand/1/bin scheme. This scheme is discussed here; more detailed descriptions are provided in Storn and Price (1995). Since the DE algorithm was originally designed to work with continuous variables, the optimization of continuous problems is discussed first. Handling discrete variables is explained later.

Generally, the function to be optimized, I, is of the form I(X): R^D -> R. The optimization target is to minimize the value of this objective function I(X),

$\min(I(X)),$   (3)

by optimizing the values of its parameters X = {x_1, x_2, ..., x_D}, X in R^D, where X denotes a vector composed of D objective function parameters. Usually, the parameters of the objective function are also subject to lower and upper boundary constraints, x^(L) and x^(U), respectively:

$x_j^{(L)} \le x_j \le x_j^{(U)} \quad \forall j \in [1, D].$   (4)

3.1. Initialization

As with all evolutionary optimization algorithms, DE works with a population of solutions, not with a single solution for the optimization problem. Population P of generation G contains NP solution vectors called individuals of the population, and each vector represents a potential solution for the optimization problem:

$P^{(G)} = X_i^{(G)} = x_{j,i}^{(G)}, \quad i = 1, \dots, NP;\; j = 1, \dots, D;\; G = 1, \dots, G_{\max}.$   (5)

In order to establish a starting point for optimum seeking, the population must be initialized. Often there is no more knowledge available about the location of a global optimum than the boundaries of the problem variables. In this case, a natural way to initialize the population P^(0) (initial population) is to seed it with random values within the given boundary constraints:

$P^{(0)} = x_{j,i}^{(0)} = x_j^{(L)} + \text{rand}_j[0,1] \times (x_j^{(U)} - x_j^{(L)}) \quad \forall i \in [1, NP],\; \forall j \in [1, D],$   (6)

where rand_j[0,1] represents a uniformly distributed random value in the range [0, 1].

3.2. Mutation

The self-referential population recombination scheme of DE differs from those of other evolutionary algorithms. From the first generation onward, the population of the subsequent generation P^(G+1) is obtained on the basis of the current population P^(G). First a temporary or trial population of candidate vectors for the subsequent generation, P'^(G+1) = V^(G+1) = v_{j,i}^(G+1), is generated as follows:

$v_{j,i}^{(G+1)} = \begin{cases} x_{j,r_3}^{(G)} + F \times (x_{j,r_1}^{(G)} - x_{j,r_2}^{(G)}), & \text{if } \text{rand}_j[0,1] < CR \ \vee\ j = k, \\ x_{j,i}^{(G)}, & \text{otherwise}, \end{cases}$   (7)

where i in [1, NP]; j in [1, D]; r_1, r_2, r_3 in [1, NP] are randomly selected, except that r_1 != r_2 != r_3 != i; k = (int(rand_i[0,1] x D) + 1); and CR in [0, 1], F in (0, 1].

The three randomly chosen indexes r_1, r_2, and r_3 refer to three randomly chosen vectors of the population. They are mutually different from each other and also different from the running index i. New random values for r_1, r_2, and r_3 are assigned for each value of index i (for each vector). A new value for the random number rand[0,1] is assigned for each value of index j (for each vector parameter).

3.3. Crossover

The index k refers to a randomly chosen vector parameter; it ensures that at least one parameter of each individual trial vector V^(G+1) differs from its counterpart in the previous generation X^(G). A new random integer value is assigned to k for each value of the index i (prior to construction of each trial vector). F and CR are DE control parameters. Both values, as well as the third control parameter NP (population size), remain constant during the search process. F is a real-valued factor in the range [0.0, 1.0] that controls the amplification of differential variations. CR is a real-valued crossover factor in the range [0.0, 1.0] that controls the probability that a trial vector parameter will be selected from the randomly chosen, mutated vector v_{j,i}^(G+1) instead of from the current vector x_{j,i}^(G). Generally, both F and CR affect the convergence rate and robustness of the search process. Their optimal values depend both on objective function characteristics and on the population size NP. Usually, suitable values for F, CR and NP can be found by experimentation after a few tests using different values. Practical advice on how to select the control parameters NP, F and CR can be found in Storn and Price (1995, 1997).
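Pulling Eqs. (6) and (7) together, one generation of DE/rand/1/bin can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation; it also folds in the one-to-one greedy selection of Eq. (8), described in Section 3.4 below, so the loop is self-contained, and the F, CR, NP values are arbitrary illustrative choices.

```python
import random

def de_rand_1_bin(pop, cost, F=0.8, CR=0.9):
    """One DE/rand/1/bin generation: mutation (Eq. 7), binomial crossover,
    and one-to-one greedy selection (Eq. 8)."""
    NP, D = len(pop), len(pop[0])
    nxt = []
    for i in range(NP):
        # r1, r2, r3: mutually distinct and different from the running index i.
        r1, r2, r3 = random.sample([r for r in range(NP) if r != i], 3)
        k = random.randrange(D)  # forces at least one mutated parameter
        trial = [pop[r3][j] + F * (pop[r1][j] - pop[r2][j])
                 if (random.random() < CR or j == k) else pop[i][j]
                 for j in range(D)]
        # Selection: keep the trial only if it does not worsen the cost.
        nxt.append(trial if cost(trial) <= cost(pop[i]) else pop[i])
    return nxt

# Example: minimize the sphere function from a random initial population (Eq. 6).
lo, hi, NP, D = -5.0, 5.0, 20, 4
pop = [[random.uniform(lo, hi) for _ in range(D)] for _ in range(NP)]
sphere = lambda x: sum(v * v for v in x)
for _ in range(200):
    pop = de_rand_1_bin(pop, sphere)
print(min(sphere(x) for x in pop))   # close to 0 after a few hundred generations
```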
3.4. Selection

The selection scheme of DE also differs from those of other evolutionary algorithms. On the basis of the current population P^(G) and the temporary population P'^(G+1), the population of the next generation P^(G+1) is created as follows:

$X_i^{(G+1)} = \begin{cases} V_i^{(G+1)}, & \text{if } I(V_i^{(G+1)}) \le I(X_i^{(G)}), \\ X_i^{(G)}, & \text{otherwise}. \end{cases}$   (8)

Thus, each individual of the temporary or trial population is compared with its counterpart in the current population. The one with the lower value of the cost function I(X) to be minimized propagates to the population of the next generation. As a result, all the individuals of the next generation are as good as or better than their counterparts in the current generation. The interesting point concerning the DE selection scheme is that a trial vector is compared only to one individual vector, not to all the individual vectors in the current population.

3.5. Boundary constraints

It is important to notice that the recombination operation of DE is able to extend the search outside of the initialized range of the search space (Eqs. (6) and (7)). Sometimes this is a beneficial property in problems with no boundary constraints, because it makes it possible to find an optimum located outside of the initialized range. However, in boundary-constrained problems, it is essential to ensure that parameter values lie inside their allowed ranges after recombination. A simple way to guarantee this is to replace parameter values that violate boundary constraints with random values generated within the feasible range:

$u_{j,i}^{(G+1)} = \begin{cases} x_j^{(L)} + \text{rand}_j[0,1] \times (x_j^{(U)} - x_j^{(L)}), & \text{if } u_{j,i}^{(G+1)} < x_j^{(L)} \ \vee\ u_{j,i}^{(G+1)} > x_j^{(U)}, \\ u_{j,i}^{(G+1)}, & \text{otherwise}, \end{cases}$   (9)

where i in [1, NP]; j in [1, D]. This is the method used in this work. Another simple but less efficient method is to regenerate the boundary-constraint-violating values according to Eq. (7) as many times as necessary to satisfy the boundary constraints. Yet another simple method, which allows bounds to be approached asymptotically while minimizing the amount of disruption from resetting out-of-bound values (Price, 1999), is

$u_{j,i}^{(G+1)} = \begin{cases} (x_{j,i}^{(G)} + x_j^{(L)})/2, & \text{if } u_{j,i}^{(G+1)} < x_j^{(L)}, \\ (x_{j,i}^{(G)} + x_j^{(U)})/2, & \text{if } u_{j,i}^{(G+1)} > x_j^{(U)}, \\ u_{j,i}^{(G+1)}, & \text{otherwise}. \end{cases}$   (10)
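Both repair rules are straightforward to implement. The sketch below is ours (function and variable names are invented for illustration): `repair_random` implements the random reinitialization of Eq. (9), and `repair_midpoint` the bound-approaching rule of Eq. (10).

```python
import random

def repair_random(u, lo, hi):
    # Eq. (9): re-draw any out-of-range parameter uniformly within its bounds.
    return [random.uniform(lo[j], hi[j])
            if not (lo[j] <= u[j] <= hi[j]) else u[j]
            for j in range(len(u))]

def repair_midpoint(u, x_current, lo, hi):
    # Eq. (10): move halfway from the current vector toward the violated bound.
    out = []
    for j, v in enumerate(u):
        if v < lo[j]:
            out.append((x_current[j] + lo[j]) / 2.0)
        elif v > hi[j]:
            out.append((x_current[j] + hi[j]) / 2.0)
        else:
            out.append(v)
    return out
```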
3.6. Conventional technique for integer and discrete optimization by DE

Several approaches have been used to deal with discrete-variable optimization. Most of them round off the variable to the nearest available value before evaluating each trial vector. To keep the population robust, successful trial vectors must enter the population with all of the precision with which they were generated (Storn and Price, 1997).

In its canonical form, the differential evolution algorithm is only capable of handling continuous variables. Extending it for the optimization of integer variables, however, is rather easy. Lampinen and Zelinka (1999) discuss how to modify DE for mixed-variable optimization, suggesting that only a couple of simple modifications are required. First, integer values should be used to evaluate the objective function, even though DE itself may still work internally with continuous floating-point values. Thus,

$I(y_i), \quad i \in [1, D],$   (11)

where y_i = x_i for continuous variables and y_i = INT(x_i) for integer variables, with x_i in X. INT() is a function that converts a real value to an integer value by truncation. Truncation is performed here only for the purpose of cost-function evaluation; truncated values are not assigned elsewhere. Thus, DE works with a population of continuous variables regardless of the corresponding object variable type. This is essential for maintaining the diversity of the population and the robustness of the algorithm.

Second, in the case of integer variables, instead of Eq. (6) the population should be initialized as follows:

$P^{(0)} = x_{j,i}^{(0)} = x_j^{(L)} + \text{rand}_j[0,1] \times (x_j^{(U)} - x_j^{(L)} + 1) \quad \forall i \in [1, NP],\; \forall j \in [1, D].$   (12)

Additionally, instead of Eq. (9), boundary constraint handling for integer variables should be performed as follows:

$u_{j,i}^{(G+1)} = \begin{cases} x_j^{(L)} + \text{rand}_j[0,1] \times (x_j^{(U)} - x_j^{(L)} + 1), & \text{if } \mathrm{INT}(u_{j,i}^{(G+1)}) < x_j^{(L)} \ \vee\ \mathrm{INT}(u_{j,i}^{(G+1)}) > x_j^{(U)}, \\ u_{j,i}^{(G+1)}, & \text{otherwise}, \end{cases}$   (13)

where i in [1, NP]; j in [1, D].

They also discuss how discrete values can be handled in a straightforward manner. Suppose that the subset of discrete variables, X^(d), contains l elements that can be assigned to variable x:

$X^{(d)} = x_i^{(d)}, \quad i \in [1, l],$   (14)

where $x_i^{(d)} < x_{i+1}^{(d)}$. Instead of the discrete value x_i itself, we may assign its index i to x. Now the discrete variable can be handled as an integer variable that is boundary-constrained to the range 1, ..., l. To evaluate the objective function, the discrete value x_i is used instead of its index i. In other words, instead of optimizing the value of the discrete variable directly, we optimize the value of its index i. Only during evaluation is the indicated discrete value used. Once the discrete problem has been converted into an integer one, the previously described methods for handling integer variables can be applied (Eqs. (11)-(13)).
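The index trick of Eqs. (11)-(14) can be sketched as follows. The catalogue values and the objective are invented for illustration, and the sketch uses 0-based indices with clamping rather than the paper's 1, ..., l range.

```python
import random

catalogue = [0.5, 1.0, 2.5, 4.0, 6.3]   # assumed discrete set X^(d), sorted

def to_index(x_float, l):
    # INT(): truncate DE's internal float, then clamp to the valid index range.
    return min(max(int(x_float), 0), l - 1)

def evaluate(x_float, objective):
    # The objective sees the discrete value; DE keeps evolving the float index.
    return objective(catalogue[to_index(x_float, len(catalogue))])

# Example objective: squared distance from a target size of 3.0.
obj = lambda v: (v - 3.0) ** 2
x = random.uniform(0, len(catalogue))   # integer-style initialization, cf. Eq. (12)
print(evaluate(x, obj))
```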
3.7. Forward transformation and backward transformation technique

The problem formulation has already been discussed in Section 2. Solving the flow shop scheduling problem, and indeed most combinatorial optimization problems, requires discrete variables and an ordered sequence rather than relative position indexing. To achieve this, we developed two strategies known as the forward and backward transformation techniques. In this paper, we present a forward transformation method for transforming integer variables into continuous variables for the internal representation of vector values, since in its canonical form the DE algorithm is only capable of handling continuous variables. We also present a backward transformation method for transforming a population of continuous variables obtained after mutation back into integer variables for evaluating the objective function (Onwubolu, 2001). Both forward and backward transformations are utilized in implementing the DE algorithm used in the present study for the flow shop scheduling problem.

Fig. 1 shows how to deal with this inherent representational problem in DE. Level 0 deals with integer numbers (which are used in discrete problems); at this level, initialization and final solutions are catered for. In the problem domains of scheduling, the TSP, etc., we are interested not only in computing the objective function cost but also in the proper order of jobs or cities, respectively. Level 1 of Fig. 1 deals with floating-point numbers, which are suited to DE; at this level, the DE operators (mutation, crossover, and selection) take place. Transforming the integers at level 0 into floating-point numbers at level 1 for DE's operators requires a specific kind of coding. This type of coding is widely used in mathematics and computing science. For the basics of transforming an integer number into its real-number equivalent, interested readers may refer to Michalewicz (1994), and to Onwubolu and Kumalo (2001) for its application to optimizing machining operations using genetic algorithms.

3.7.1. Forward transformation (from integer to real number)

In integer-variable optimization, a set of integer numbers is normally generated randomly as an initial solution. Let this set of integer numbers be represented as

$z'_i \in z'.$   (15)

Let the real-number (floating-point) equivalent of this integer number be z_i. The length of the real number depends on the required precision, which in our case is two places after the decimal point. The domain of the variable z_i has length equal to 5; the precision requirement implies the range [0...4]. Although 0 is included in this range, it is not a feasible solution, so the range {0, 1, 2, 3, 4} is taken, which gives a range of 5. Assigning each feasible solution two decimal places gives 5 x 100 = 500. Accordingly, the equivalent continuous variable for z'_i is bounded by

$100 = 10^2 < 5 \times 10^2 \le 10^3 = 1000.$   (16)

The mapping from an integer number to a real number z_i for the given range is now straightforward:

$z_i = -1 + z'_i \times \frac{5}{10^3 - 1}.$   (17)

Eq. (17) results in most conversion values being negative; this does not create any accuracy problem. After some studies by Onwubolu (2001), the scaling factor f = 100 was found to be adequate for converting virtually all integer numbers into their equivalent positive real numbers. Applying this scaling factor of f = 100 gives

$z_i = -1 + z'_i \times \frac{5f}{10^3 - 1} = -1 + z'_i \times \frac{500}{10^3 - 1}.$   (18)

Eq. (18) is used to transform any integer variable into an equivalent continuous variable, which is then used for the DE internal representation of the population of vectors. Without this transformation, it is not possible to make useful moves towards the global optimum in the solution space using the mutation mechanism of DE, which works better on continuous variables. For example, in a five-job scheduling problem, suppose the sequence is given as {2, 4, 3, 1, 5}. This sequence is not used directly in the DE internal representation. Rather, applying Eq. (18), the sequence is transformed into continuous form. The floating-point equivalent of the first entry of the given sequence, z'_i = 2, is z_i = -1 + 2 x 500/(10^3 - 1) = 0.001001. The other values are obtained similarly, and the sequence is therefore represented internally in the DE scheme as {0.001001, 1.002, 0.501502, -0.499499, 1.5025}.
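Eq. (18) is a one-liner in code. The short sketch below (ours, for illustration) reproduces the five-job example from the text:

```python
def forward(z_int, f=100):
    # Eq. (18): z = -1 + z' * 5f / (10^3 - 1), with scaling factor f = 100.
    return -1.0 + z_int * 5.0 * f / (10 ** 3 - 1)

seq = [2, 4, 3, 1, 5]
print([round(forward(z), 6) for z in seq])
# -> [0.001001, 1.002002, 0.501502, -0.499499, 1.502503]
```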
3.7.2. Backward transformation (from real number to integer)

Integer variables are used to evaluate the objective function. The DE self-referential population mutation scheme is quite unique. After the mutation of each vector, the trial vector is evaluated for its objective function in order to decide whether or not to retain it. This means that the objective function values of the current vectors in the population also need to be evaluated. These vector variables are continuous (from the forward transformation scheme) and have to be transformed into their integer-number equivalents. The backward transformation technique is used for converting floating-point numbers to their integer-number equivalents. The scheme is given as follows:

$z'_i = (1 + z_i) \times \frac{10^3 - 1}{500}.$   (19)

In this form the backward transformation function is not able to properly discriminate between variables. To ensure that each number is discrete and unique, some modifications are required:

$a = \mathrm{int}(z'_i + 0.5),$   (20)
$b = a - z'_i,$   (21)
$z^*_i = \begin{cases} a - 1, & \text{if } b > 0.5, \\ a, & \text{if } b < 0.5. \end{cases}$   (22)

Eq. (22) gives z*_i, the transformed value used for computing the objective function. It should be mentioned that the conversion scheme of Eq. (19), which transforms real numbers after DE operations into integer numbers, is not sufficient to avoid duplication; hence the steps highlighted in Eqs. (20)-(22) are important. In our studies, these modifications ensure that after the mutation, crossover and selection operations, the floating-point numbers converted to their integer equivalents in the set of jobs for a new scheduling solution, or the set of cities for a new TSP solution, etc., are not duplicated. As an example, consider a trial vector z_i = {-0.33, 0.67, -0.17, 1.5, 0.84} obtained after mutation. The integer values corresponding to the trial vector values are obtained using Eq. (22) as follows:

z'_1 = (1 - 0.33) x (10^3 - 1)/500 = 1.33866;  a_1 = int(1.33866 + 0.5) = 2;  b_1 = 2 - 1.33866 = 0.66134 > 0.5;  z*_1 = 2 - 1 = 1;
z'_2 = (1 + 0.67) x (10^3 - 1)/500 = 3.3367;   a_2 = int(3.3367 + 0.5) = 4;   b_2 = 4 - 3.3367 = 0.6633 > 0.5;    z*_2 = 4 - 1 = 3;
z'_3 = (1 - 0.17) x (10^3 - 1)/500 = 1.65834;  a_3 = int(1.65834 + 0.5) = 2;  b_3 = 2 - 1.65834 = 0.34166 < 0.5;  z*_3 = 2;
z'_4 = (1 + 1.50) x (10^3 - 1)/500 = 4.995;    a_4 = int(4.995 + 0.5) = 5;    b_4 = 5 - 4.995 = 0.005 < 0.5;      z*_4 = 5;
z'_5 = (1 + 0.84) x (10^3 - 1)/500 = 3.6763;   a_5 = int(3.6763 + 0.5) = 4;   b_5 = 4 - 3.6763 = 0.3237 < 0.5;    z*_5 = 4.

This can be represented schematically as shown in Fig. 2. The set of integer values is z*_i = {1, 3, 2, 5, 4}. This set is used to obtain the objective function values.

As in GA, after the mutation, crossover, and boundary-checking operations, the trial vector obtained from the backward transformation is checked repeatedly until a feasible solution is found. Hence, it is not necessary to worry about the ordered sequence, which is crucially important in the type of combinatorial optimization problems we are concerned with. Feasible solutions constitute about 10-15% of the total trial vectors.
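The worked example can be reproduced with the sketch below (ours). One caveat: the published arithmetic only comes out as printed if int() in Eq. (20) is read as nearest-integer rounding rather than truncation (e.g., a_1 = int(1.83866) = 2), so the sketch uses round(); this reading is our assumption.

```python
def backward(z_float):
    z = (1.0 + z_float) * (10 ** 3 - 1) / 500.0  # Eq. (19)
    a = round(z + 0.5)                           # Eq. (20), int() read as rounding
    b = a - z                                    # Eq. (21)
    return a - 1 if b > 0.5 else a               # Eq. (22)

trial = [-0.33, 0.67, -0.17, 1.5, 0.84]
print([backward(z) for z in trial])  # -> [1, 3, 2, 5, 4]
```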
3.8. DE strategies

Price and Storn (2001) have suggested ten different working strategies of DE and some guidelines for applying these strategies to any given problem. Different strategies can be adopted in the DE algorithm depending upon the type of problem to which it is applied. Table 1 shows the ten working strategies proposed by Price and Storn (2001). The general convention used in Table 1 is DE/x/y/z, where DE stands for differential evolution algorithm, x is a string denoting the vector to be perturbed, y is the number of difference vectors considered for perturbation of x, and z is the type of crossover being used (exp: exponential; bin: binomial). Thus, the working algorithm outlined by Storn and Price (1997) is the seventh strategy of DE, that is, DE/rand/1/bin. The perturbation can be applied either to the best vector of the previous generation or to any randomly chosen vector. Similarly, either one or two vector differences can be used for perturbation. For perturbation with a single vector difference, out of three distinct randomly chosen vectors, the weighted vector differential of two of the vectors is added to the third one. Similarly, for perturbation with two vector differences, the weighted differentials of two pairs of randomly chosen vectors are added to a fifth.
arXiv:0806.1256v1 [physics.soc-ph] 7 Jun 2008

Understanding individual human mobility patterns

Marta C. González,1,2 César A. Hidalgo,1 and Albert-László Barabási1,2,3
1 Center for Complex Network Research and Department of Physics and Computer Science, University of Notre Dame, Notre Dame, IN 46556.
2 Center for Complex Network Research and Department of Physics, Biology and Computer Science, Northeastern University, Boston, MA 02115.
3 Center for Cancer Systems Biology, Dana Farber Cancer Institute, Boston, MA 02115.
(Dated: June 7, 2008)

Despite their importance for urban planning [1], traffic forecasting [2], and the spread of biological [3, 4, 5] and mobile viruses [6], our understanding of the basic laws governing human motion remains limited owing to the lack of tools to monitor the time-resolved location of individuals. Here we study the trajectory of 100,000 anonymized mobile phone users whose position is tracked for a six-month period. We find that, in contrast with the random trajectories predicted by the prevailing Lévy flight and random walk models [7], human trajectories show a high degree of temporal and spatial regularity, each individual being characterized by a time-independent characteristic length scale and a significant probability to return to a few highly frequented locations. After correcting for differences in travel distances and the inherent anisotropy of each trajectory, the individual travel patterns collapse into a single spatial probability distribution, indicating that, despite the diversity of their travel history, humans follow simple reproducible patterns. This inherent similarity in travel patterns could impact all phenomena driven by human mobility, from epidemic prevention to emergency response, urban planning and agent-based modeling.

Given the many unknown factors that influence a population's mobility patterns, ranging from means of transportation to job- and family-imposed restrictions and priorities, human trajectories are often approximated with various random walk or diffusion models [7, 8]. Indeed, early measurements on albatrosses, bumblebees, deer and monkeys [9, 10] and more recent ones on marine predators [11] suggested that animal trajectories are approximated by a Lévy flight [12, 13], a random walk whose step size ∆r follows a power-law distribution P(∆r) ~ ∆r^-(1+β) with β < 2. While the Lévy statistics for some animals require further study [14], Brockmann et al. [7] generalized this finding to humans, documenting that the distribution of distances between consecutive sightings of nearly half a million bank notes is fat-tailed. Given that money is carried by individuals, bank note dispersal is a proxy for human movement, suggesting that human trajectories are best modeled as a continuous-time random walk with fat-tailed displacement and waiting-time distributions [7]. A particle following a Lévy flight has a significant probability to travel very long distances in a single step [12, 13], which appears to be consistent with human travel patterns: most of the time we travel only over short distances, between home and work, while occasionally we take longer trips.

Each consecutive sighting of a bank note reflects the composite motion of two or more individuals, who owned the bill between the two reported sightings. Thus it is not clear whether the observed distribution reflects the motion of individual users or some hitherto unknown convolution between population-based heterogeneities and individual human trajectories. Contrary to bank notes, mobile phones are carried by the same individual during his/her daily routine, offering the best proxy to capture individual human trajectories [15, 16, 17, 18, 19].

We used two data sets to explore the mobility patterns of individuals. The first (D1) consists of the mobility patterns recorded over a six-month period for 100,000 individuals selected randomly from a sample of over 6 million anonymized mobile phone users. Each time a user initiates or receives a call or SMS, the location of the tower routing the communication is recorded, allowing us to reconstruct the user's time-resolved trajectory (Figs. 1a and b). The time between consecutive calls follows a bursty pattern [20] (see Fig. S1 in the SM), indicating that while most consecutive calls are placed soon after a previous call, occasionally there are long periods without any call activity. To make sure that the obtained results are not affected by the irregular call pattern, we also study a data set (D2) that captures the locations of 206 mobile phone users, recorded every two hours for an entire week. In both datasets the spatial resolution is determined by the local density of the more than 10^4 mobile towers, registering movement only when the user moves between areas serviced by different towers. The average service area of each tower is approximately 3 km², and over 30% of the towers cover an area of 1 km² or less.

To explore the statistical properties of the population's mobility patterns, we measured the distance between a user's positions at consecutive calls, capturing 16,264,308 displacements for the D1 and 10,407 displacements for the D2 dataset. We find that the distribution of displacements over all users is well approximated by a truncated power law

P(∆r) = (∆r + ∆r_0)^-β exp(-∆r/κ),   (1)

with β = 1.75 ± 0.15, ∆r_0 = 1.5 km and cutoff values κ|D1 = 400 km and κ|D2 = 80 km (Fig. 1c; see the SM for statistical validation). Note that the observed scaling exponent is not far from β_B = 1.59, observed in Ref. [7] for bank note dispersal, suggesting that the two distributions may capture the same fundamental mechanism driving human mobility patterns.

Equation (1) suggests that human motion follows a truncated Lévy flight [7]. Yet the observed shape of P(∆r) could be explained by three distinct hypotheses: A. Each individual follows a Lévy trajectory with jump size distribution given by (1). B. The observed distribution captures a population-based heterogeneity, corresponding to inherent differences between individuals. C. A population-based heterogeneity coexists with individual Lévy trajectories, hence (1) represents a convolution of hypotheses A and B.

To distinguish between hypotheses A, B and C we calculated the radius of gyration for each user (see Methods), interpreted as the typical distance traveled by user a when observed up to time t (Fig. 1b). Next, we determined the radius of gyration distribution P(r_g) by calculating r_g for all users in samples D1 and D2, finding that it also can be approximated with a truncated power law

P(r_g) = (r_g + r_g^0)^-β_r exp(-r_g/κ),   (2)

with r_g^0 = 5.8 km, β_r = 1.65 ± 0.15 and κ = 350 km (Fig. 1d; see SM for statistical validation). Lévy flights are characterized by a high degree of intrinsic heterogeneity, raising the possibility that (2) could emerge from an ensemble of identical agents, each following a Lévy trajectory.
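For concreteness, the truncated power law of Eq. (1) can be evaluated with the quoted D1 parameters as below. This sketch is ours: the normalization constant is omitted (the paper does not state it), so only likelihood ratios are meaningful here.

```python
import math

# D1 parameters quoted in the text (distances in km).
beta, dr0, kappa = 1.75, 1.5, 400.0

def p_unnorm(dr):
    # Unnormalized truncated power law of Eq. (1).
    return (dr + dr0) ** (-beta) * math.exp(-dr / kappa)

# Relative likelihoods of 1 km, 10 km and 100 km displacements:
for dr in (1.0, 10.0, 100.0):
    print(dr, p_unnorm(dr) / p_unnorm(1.0))
```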
Therefore, we determined P(r_g) for an ensemble of agents following a Random Walk (RW), Lévy Flight (LF) or Truncated Lévy Flight (TLF) (Fig. 1d) [8, 12, 13]. We find that an ensemble of Lévy agents displays a significant degree of heterogeneity in r_g, yet this is not sufficient to explain the truncated power-law distribution P(r_g) exhibited by the mobile phone users. Taken together, Figs. 1c and d suggest that the difference in the range of typical mobility patterns of individuals (r_g) has a strong impact on the truncated Lévy behavior seen in (1), ruling out hypothesis A.

If individual trajectories are described by an LF or TLF, then the radius of gyration should increase in time as r_g(t) ~ t^(3/(2+β)) [21, 22], while for an RW r_g(t) ~ t^(1/2). That is, the longer we observe a user, the higher the chance that she/he will travel to areas not visited before. To check the validity of these predictions we measured the time dependence of the radius of gyration for users whose gyration radius would be considered small (r_g(T) ≤ 3 km), medium (20 < r_g(T) ≤ 30 km) or large (r_g(T) > 100 km) at the end of our observation period (T = 6 months). The results indicate that the time dependence of the average radius of gyration of mobile phone users is better approximated by a logarithmic increase, not only a manifestly slower dependence than the one predicted by a power law, but one that may appear similar to a saturation process (Fig. 2a and Fig. S4).

In Fig. 2b, we have chosen users with similar asymptotic r_g(T) after T = 6 months, and measured the jump size distribution P(∆r | r_g) for each group. As the inset of Fig. 2b shows, users with small r_g travel mostly over small distances, whereas those with large r_g tend to display a combination of many small and a few larger jump sizes. Once we rescale the distributions with r_g (main panel), we find that the data collapse into a single curve, suggesting that a single jump size distribution characterizes all users, independent of their r_g. This indicates that P(∆r | r_g) ~ r_g^-α F(∆r/r_g), where α ≈ 1.2 ± 0.1 and F(x) is an r_g-independent function with asymptotic behavior F(x < 1) ~ x^-α, rapidly decreasing for x >> 1. Therefore the travel patterns of individual users may be approximated by a Lévy flight up to a distance characterized by r_g. Most important, however, is the fact that individual trajectories are bounded beyond r_g; thus large displacements, which are the source of the distinct and anomalous nature of Lévy flights, are statistically absent. To understand the relationship between the different exponents, we note that the measured probability distributions are related by P(∆r) = ∫_0^∞ P(∆r | r_g) P(r_g) dr_g, which suggests (see SM) that up to leading order we have β = β_r + α − 1, consistent, within error bars, with the measured exponents. This indicates that the observed jump size distribution P(∆r) is in fact a convolution between the statistics of individual trajectories P(∆r | r_g) and the population heterogeneity P(r_g), consistent with hypothesis C.

To uncover the mechanism stabilizing r_g, we measured the return probability for each individual, F_pt(t) [22], defined as the probability that a user returns to the position where he/she was first observed after t hours (Fig. 2c). For a two-dimensional random walk, F_pt(t) should follow ~ 1/(t ln²(t)) [22]. In contrast, we find that the return probability is characterized by several peaks at 24 h, 48 h, and 72 h, capturing a strong tendency of humans to return to locations they visited before, and describing the recurrence and temporal periodicity inherent to human mobility [23, 24].

To explore whether individuals return to the same location over and over, we ranked each location based on the number of times an individual was recorded in its vicinity, such that a location with L = 3 represents the third most visited location for the selected individual. We find that the probability of finding a user at a location with a given rank L is well approximated by P(L) ~ 1/L, independent of the number of locations visited by the user (Fig. 2d). Therefore people devote most of their time to a few locations, while spending their remaining time in 5 to 50 places, visited with diminished regularity. The observed logarithmic saturation of r_g(t) is therefore rooted in the high degree of regularity in daily travel patterns, captured by the high return probabilities (Fig. 2b) to a few highly frequented locations (Fig. 2d).

An important quantity for modeling human mobility patterns is the probability Φ_a(x, y) of finding an individual a at a given position (x, y). As is evident from Fig. 1b, individuals live and travel in different regions, yet each user can be assigned to a well-defined area, defined by home and workplace, where she or he can be found most of the time. We can compare the trajectories of different users by diagonalizing each trajectory's inertia tensor, providing the probability of finding a user at a given position (see Fig. 3a) in the user's intrinsic reference frame (see SM for the details). A striking feature of Φ(x, y) is its prominent spatial anisotropy in this intrinsic reference frame (note the different scales in Fig. 3a), and we find that the larger an individual's r_g, the more pronounced this anisotropy. To quantify this effect we defined the anisotropy ratio S ≡ σ_y/σ_x, where σ_x and σ_y represent the standard deviations of the trajectory measured in the user's intrinsic reference frame (see SM). We find that S decreases monotonically with r_g (Fig. 3c), being well approximated by S ~ r_g^-η for η ≈ 0.12. Given the small value of the scaling exponent, other functional forms may offer an equally good fit; thus mechanistic models are required to identify whether this represents a true scaling law or only a reasonable approximation to the data.
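The anisotropy ratio can be estimated from a recorded track by diagonalizing the 2x2 covariance (inertia) tensor of the positions. The sketch below is ours and assumes equally weighted position records; the paper's exact weighting is described in its SM, which is not reproduced here.

```python
import math

def anisotropy_ratio(points):
    """S = sigma_y / sigma_x in the trajectory's intrinsic frame, obtained by
    diagonalizing the 2x2 covariance tensor of (x, y) records (equal weights)."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    sxx = sum((x - cx) ** 2 for x, _ in points) / n
    syy = sum((y - cy) ** 2 for _, y in points) / n
    sxy = sum((x - cx) * (y - cy) for x, y in points) / n
    # Eigenvalues of [[sxx, sxy], [sxy, syy]] are the principal variances.
    mean = (sxx + syy) / 2.0
    diff = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    lam_max, lam_min = mean + diff, mean - diff
    return math.sqrt(lam_min / lam_max)   # sigma_y / sigma_x, always <= 1

# Example: a commuter seen mostly at home (0, 0) and work (10, 1), km units.
track = [(0, 0)] * 6 + [(10, 1)] * 3 + [(3, 4)]
print(round(anisotropy_ratio(track), 3))
```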
To compare the trajectories of different users, we remove the individual anisotropies, rescaling each user's trajectory with its respective σ_x and σ_y. The rescaled Φ̃(x/σ_x, y/σ_y) distribution (Fig. 3b) is similar for groups of users with considerably different r_g, i.e., after the anisotropy and the r_g dependence are removed, all individuals appear to follow the same universal Φ̃(x̃, ỹ) probability distribution. This is particularly evident in Fig. 3d, where we show the cross-section Φ̃(x/σ_x, 0) for the three groups of users, finding that apart from the noise in the data the curves are indistinguishable.

Taken together, our results suggest that the Lévy statistics observed in bank note measurements capture a convolution of the population heterogeneity (2) and the motion of individual users. Individuals display significant regularity, as they return to a few highly frequented locations, like home or work. This regularity does not apply to the bank notes: a bill always follows the trajectory of its current owner; i.e., dollar bills diffuse, but humans do not.

The fact that individual trajectories are characterized by the same r_g-independent two-dimensional probability distribution Φ̃(x/σ_x, y/σ_y) suggests that key statistical characteristics of individual trajectories are largely indistinguishable after rescaling. Therefore, our results establish the basic ingredients of realistic agent-based models, requiring us to place users in numbers proportional to the population density of a given region and to assign each user an r_g taken from the observed P(r_g) distribution. Using the predicted anisotropic rescaling, combined with the density function Φ̃(x, y), whose shape is provided as Table 1 in the SM, we can obtain the likelihood of finding a user at any location. Given the known correlations between spatial proximity and social links, our results could help quantify the role of space in network development and evolution [25, 26, 27, 28, 29] and improve our understanding of diffusion processes [8, 30].

We thank D. Brockmann, T. Geisel, J. Park, S. Redner, Z. Toroczkai and P. Wang for discussions and comments on the manuscript. This work was supported by the James S. McDonnell Foundation 21st Century Initiative in Studying Complex Systems, the National Science Foundation within the DDDAS (CNS-0540348), ITR (DMR-0426737) and IIS-0513650 programs, and the U.S. Office of Naval Research Award N00014-07-C. Data analysis was performed on the Notre Dame Biocomplexity Cluster supported in part by NSF MRI Grant No. DBI-0420980. C.A. Hidalgo acknowledges support from the Kellogg Institute at Notre Dame.

Supplementary Information is linked to the online version of the paper at /nature.

Author Information: Correspondence and requests for materials should be addressed to A.-L.B. (e-mail: alb@)

[1] Horner, M.W. & O'Kelly, M.E. Embedding economies of scale concepts for hub network design. Journal of Transportation Geography 9, 255-265 (2001).
[2] Kitamura, R., Chen, C., Pendyala, R.M. & Narayanan, R. Micro-simulation of daily activity-travel patterns for travel demand forecasting. Transportation 27, 25-51 (2000).
[3] Colizza, V., Barrat, A., Barthélémy, M., Valleron, A.-J. & Vespignani, A. Modeling the worldwide spread of pandemic influenza: baseline case and containment interventions. PLoS Medicine 4, 095-0110 (2007).
[4] Eubank, S., Guclu, H., Kumar, V.S.A., Marathe, M.V., Srinivasan, A., Toroczkai, Z. & Wang, N. Controlling epidemics in realistic urban social networks. Nature 429, 180 (2004).
[5] Hufnagel, L., Brockmann, D. & Geisel, T. Forecast and control of epidemics in a globalized world. Proceedings of the National Academy of Sciences of the United States of America 101, 15124-15129 (2004).
[6] Kleinberg, J. The wireless epidemic. Nature 449, 287-288 (2007).
[7] Brockmann, D., Hufnagel, L. & Geisel, T. The scaling laws of human travel. Nature 439, 462-465 (2006).
[8] Havlin, S. & ben-Avraham, D. Diffusion in disordered media. Advances in Physics 51, 187-292 (2002).
[9] Viswanathan, G.M., Afanasyev, V., Buldyrev, S.V., Murphy, E.J., Prince, P.A. & Stanley, H.E. Lévy flight search patterns of wandering albatrosses. Nature 381, 413-415 (1996).
[10] Ramos-Fernandez, G., Mateos, J.L., Miramontes, O., Cocho, G., Larralde, H. & Ayala-Orozco, B. Lévy walk patterns in the foraging movements of spider monkeys (Ateles geoffroyi). Behavioral Ecology and Sociobiology 55, 223-230 (2004).
[11] Sims, D.W. et al. Scaling laws of marine predator search behaviour. Nature 451, 1098-1102 (2008).
[12] Klafter, J., Shlesinger, M.F. & Zumofen, G. Beyond Brownian motion. Physics Today 49, 33-39 (1996).
[13] Mantegna, R.N. & Stanley, H.E. Stochastic process with ultraslow convergence to a Gaussian: the truncated Lévy flight. Physical Review Letters 73, 2946-2949 (1994).
[14] Edwards, A.M., Phillips, R.A., Watkins, N.W., Freeman, M.P., Murphy, E.J., Afanasyev, V., Buldyrev, S.V., da Luz, M.G.E., Raposo, E.P., Stanley, H.E. & Viswanathan, G.M. Revisiting Lévy flight search patterns of wandering albatrosses, bumblebees and deer. Nature 449, 1044-1049 (2007).
[15] Sohn, T., Varshavsky, A., LaMarca, A., Chen, M.Y., Choudhury, T., Smith, I., Consolvo, S., Hightower, J., Griswold, W.G. & de Lara, E. Lecture Notes in Computer Science: Proc. 8th International Conference UbiComp 2006 (Springer, Berlin, 2006).
[16] Onnela, J.-P., Saramäki, J., Hyvönen, J., Szabó, G., Lazer, D., Kaski, K., Kertész, J. & Barabási, A.-L. Structure and tie strengths in mobile communication networks. Proceedings of the National Academy of Sciences of the United States of America 104, 7332-7336 (2007).
[17] González, M.C. & Barabási, A.-L. Complex networks: from data to models. Nature Physics 3, 224-225 (2007).
[18] Palla, G., Barabási, A.-L. & Vicsek, T. Quantifying social group evolution. Nature 446, 664-667 (2007).
[19] Hidalgo, C.A. & Rodriguez-Sickert, C. The dynamics of a mobile phone network. Physica A 387, 3017-3024.
[20] Barabási, A.-L. The origin of bursts and heavy tails in human dynamics. Nature 435, 207-211 (2005).
[21] Hughes, B.D. Random Walks and Random Environments (Oxford University Press, USA, 1995).
[22] Redner, S. A Guide to First-Passage Processes (Cambridge University Press, UK, 2001).
[23] Schlich, R. & Axhausen, K.W. Habitual travel behaviour: evidence from a six-week travel diary. Transportation 30, 13-36 (2003).
[24] Eagle, N. & Pentland, A. Eigenbehaviors: identifying structure in routine. Submitted to Behavioral Ecology and Sociobiology (2007).
[25] Yook, S.-H., Jeong, H. & Barabási, A.-L. Modeling the Internet's large-scale topology. Proceedings of the National Academy of Sciences 99, 13382-13386 (2002).
[26] Caldarelli, G. Scale-Free Networks: Complex Webs in Nature and Technology (Oxford University Press, USA, 2007).
[27] Dorogovtsev, S.N. & Mendes, J.F.F. Evolution of Networks: From Biological Nets to the Internet and WWW (Oxford University Press, USA, 2003).
[28] Song, C.M., Havlin, S. & Makse, H.A. Self-similarity of complex networks. Nature 433, 392-395 (2005).
[29] González, M.C., Lind, P.G. & Herrmann, H.J. A system of mobile agents to model social networks. Physical Review Letters 96, 088702 (2006).
[30] Cecconi, F., Marsili, M., Banavar, J.R. & Maritan, A. Diffusion, peer pressure, and tailed distributions. Physical Review Letters 89, 088102 (2002).
FIG. 1: Basic human mobility patterns. a, Week-long trajectories of 40 mobile phone users indicate that most individuals travel only over short distances, but a few regularly move over hundreds of kilometers. b, The detailed trajectory of a single user. The different phone towers are shown as green dots, and the Voronoi lattice in grey marks the approximate reception area of each tower. The dataset studied by us records only the identity of the closest tower to a mobile user; thus we cannot identify the position of a user within a Voronoi cell. The trajectory of the user shown in b is constructed from 186 two-hourly reports, during which the user visited a total of 12 different locations (tower vicinities). Among these, the user was found on 96 and 67 occasions at the two most preferred locations; the frequency of visits for each location is shown as a vertical bar. The circle represents the radius of gyration centered on the trajectory's center of mass. c, Probability density function P(∆r) of travel distances obtained for the two studied datasets D1 and D2. The solid line indicates a truncated power law whose parameters are provided in the text (see Eq. (1)). d, The distribution P(r_g) of the radius of gyration measured for the users, where r_g(T) was measured after T = 6 months of observation. The solid line represents a similar truncated power-law fit (see Eq. (2)). The dotted, dashed and dot-dashed curves show P(r_g) obtained from the standard null models (RW, LF and TLF), where for the TLF we used the same step size distribution as the one measured for the mobile phone users.

FIG. 2: The bounded nature of human trajectories. a, Radius of gyration ⟨r_g(t)⟩ vs. time for mobile phone users separated into three groups according to their final r_g(T), where T = 6 months. The black curves correspond to the analytical predictions for the random walk models, increasing in time as ⟨r_g(t)⟩|LF,TLF ~ t^(3/(2+β)) (solid) and ⟨r_g(t)⟩|RW ~ t^0.5 (dotted). The dashed curves correspond to a logarithmic fit of the form A + B ln(t), where A and B depend on r_g. b, Probability density function of individual travel distances P(∆r | r_g) for users with r_g = 4, 10, 40, 100 and 200 km. As the inset shows, each group displays a quite different P(∆r | r_g) distribution. After rescaling the distance and the distribution with r_g (main panel), the different curves collapse. The solid line (power law) is shown as a guide to the eye. c, Return probability distribution F_pt(t). The prominent peaks capture the tendency of humans to regularly return to the locations they visited before, in contrast with the smooth asymptotic behavior ~ 1/(t ln²(t)) (solid line) predicted for random walks. d, A Zipf plot showing the frequency of visits to different locations. The symbols correspond to users that have been observed to visit n_L = 5, 10, 30, and 50 different locations. Denoting by L the rank of a location, listed in order of visit frequency, the data are well approximated by R(L) ~ L^-1. The inset is the same plot in linear scale, illustrating that 40% of the time individuals are found at their first two preferred locations.

FIG. 3: The shape of human trajectories. a, The probability density function Φ(x, y) of finding a mobile phone user at a location (x, y) in the user's intrinsic reference frame (see SM for details). The three plots, from left to right, were generated for 10,000 users with r_g ≤ 3, 20 < r_g ≤ 30 and r_g > 100 km. The trajectories become more anisotropic as r_g increases. b, After scaling each position with σ_x and σ_y, the resulting Φ̃(x/σ_x, y/σ_y) has approximately the same shape for each group. c, The change in the shape of Φ(x, y) can be quantified by calculating the isotropy ratio S ≡ σ_y/σ_x as a function of r_g, which decreases as S ~ r_g^-0.12 (solid line). Error bars represent the standard error. d, Φ̃(x/σ_x, 0), representing the x-axis cross-section of the rescaled distribution Φ̃(x/σ_x, y/σ_y) shown in b.
Problem A: The Keep-Right-Except-To-Pass Rule

In countries where automobiles drive on the right side of the road (for example, the United States and China), multi-lane freeways often follow this rule: drivers must drive in the right-most lane unless they are passing, in which case they first move to a lane to the left, pass, and then return to their former lane.

Build and analyze a mathematical model of this rule's performance in light and heavy traffic.

You may wish to examine trade-offs between traffic flow and safety, the role of speed limits that are set too high or too low, and/or other factors that may arise in this problem statement.

Is this rule effective in promoting better traffic flow? If not, suggest and analyze alternatives (including possibly no rule of this kind at all) that might promote greater traffic flow, safety, or other factors you deem important.

In countries where driving on the left is the norm, discuss whether your solution can be carried over with a simple change of orientation, or whether additional requirements are needed.

Lastly, the rule above relies upon human judgment. If vehicle traffic obeying the same rule were completely under the control of an intelligent system, either as part of the road network or embedded in the design of the vehicles themselves, to what extent would this change the results of your earlier analysis?

Problem B: College Coaching Legends

Sports Illustrated, a magazine for sports enthusiasts, is looking for the "best all-time college coach" of the past century.

Build a mathematical model to choose the best college coach in each of the following sports: ice hockey or field hockey, football, baseball or softball, basketball, and soccer.

Does the time horizon matter in your analysis, i.e., does coaching in 1913 differ from coaching in 2013? Clearly articulate your metrics for assessment, and discuss how your model can be applied across both genders and all possible sports.

Present your model's top five coaches in each of three different sports.

In addition to the traditional MCM format, prepare a 1-2 page article for Sports Illustrated that explains your results and includes a non-technical explanation of your mathematical model that sports fans will understand.

ICM: Using Network Models to Measure Influence

One technique for measuring the influence of academic research is to build and measure networks of citations and co-authorships.

Co-authoring a manuscript usually implies a strong connection between researchers.

One of the most famous academic co-authors of the 20th century is the mathematician Paul Erdős, who had more than 500 co-authors and published over 1400 papers.

Perhaps ironically, Paul Erdős was also one of the founders of the interdisciplinary science of networks, notably through his 1959 paper "On Random Graphs" with Alfréd Rényi. Erdős's role as a collaborator was so significant in the field of mathematics that mathematicians often measure their closeness to Erdős by analyzing his surprisingly large co-author network (see the website /enp/).
From Data Mining to Knowledge Discovery in Databases
Usama Fayyad, Gregory Piatetsky-Shapiro, and Padhraic Smyth
Copyright © 1996, American Association for Artificial Intelligence. All rights reserved. 0738-4602-1996 / $2.00

Data mining and knowledge discovery in databases have been attracting a significant amount of research, industry, and media attention of late. What is all the excitement about? This article provides an overview of this emerging field, clarifying how data mining and knowledge discovery in databases are related both to each other and to related fields, such as machine learning, statistics, and databases. The article mentions particular real-world applications, specific data-mining techniques, challenges involved in real-world applications of knowledge discovery, and current and future research directions in the field.

Across a wide variety of fields, data are being collected and accumulated at a dramatic pace. There is an urgent need for a new generation of computational theories and tools to assist humans in extracting useful information (knowledge) from the rapidly growing volumes of digital data. These theories and tools are the subject of the emerging field of knowledge discovery in databases (KDD).

At an abstract level, the KDD field is concerned with the development of methods and techniques for making sense of data. The basic problem addressed by the KDD process is one of mapping low-level data (which are typically too voluminous to understand and digest easily) into other forms that might be more compact (for example, a short report), more abstract (for example, a descriptive approximation or model of the process that generated the data), or more useful (for example, a predictive model for estimating the value of future cases). At the core of the process is the application of specific data-mining methods for pattern discovery and extraction.

This article begins by discussing the historical context of KDD and data mining and their intersection with other related fields. A brief summary of recent KDD real-world applications is provided. Definitions of KDD and data mining are provided, and the general multistep KDD process is outlined. This multistep process has the application of data-mining algorithms as one particular step in the process. The data-mining step is discussed in more detail in the context of specific data-mining algorithms and their application. Real-world practical application issues are also outlined. Finally, the article enumerates challenges for future research and development and in particular discusses potential opportunities for AI technology in KDD systems.

Why Do We Need KDD?

The traditional method of turning data into knowledge relies on manual analysis and interpretation. For example, in the health-care industry, it is common for specialists to periodically analyze current trends and changes in health-care data, say, on a quarterly basis. The specialists then provide a report detailing the analysis to the sponsoring health-care organization; this report becomes the basis for future decision making and planning for health-care management. In a totally different type of application, planetary geologists sift through remotely sensed images of planets and asteroids, carefully locating and cataloging such geologic objects of interest as impact craters. Be it science, marketing, finance, health care, retail, or any other field, the classical approach to data analysis relies fundamentally on one or more analysts becoming intimately familiar with the data and serving as an interface between the data and the users and products.

For these (and many other) applications, this form of manual probing of a data set is slow, expensive, and highly subjective. In fact, as data volumes grow dramatically, this type of manual data analysis is becoming completely impractical in many domains. Databases are increasing in size in two ways: (1) the number N of records or objects in the database and (2) the number d of fields or attributes to an object. Databases containing on the order of N = 10^9 objects are becoming increasingly common, for example, in the astronomical sciences. Similarly, the number of fields d can easily be on the order of 10^2 or even 10^3, for example, in medical diagnostic applications. Who could be expected to digest millions of records, each having tens or hundreds of fields? We believe that this job is certainly not one for humans; hence, analysis work needs to be automated, at least partially.

The need to scale up human analysis capabilities to handling the large number of bytes that we can collect is both economic and scientific. Businesses use data to gain competitive advantage, increase efficiency, and provide more valuable services to customers. Data we capture about our environment are the basic evidence we use to build theories and models of the universe we live in. Because computers have enabled humans to gather more data than we can digest, it is only natural to turn to computational techniques to help us unearth meaningful patterns and structures from the massive volumes of data. Hence, KDD is an attempt to address a problem that the digital information era made a fact of life for all of us: data overload.

Data Mining and Knowledge Discovery in the Real World

A large degree of the current interest in KDD is the result of the media interest surrounding successful KDD applications, for example, the focus articles within the last two years in Business Week, Newsweek, Byte, PC Week, and other large-circulation periodicals. Unfortunately, it is not always easy to separate fact from media hype. Nonetheless, several well-documented examples of successful systems can rightly be referred to as KDD applications and have been deployed in operational use on large-scale real-world problems in science and in business.

In science, one of the primary application areas is astronomy. Here, a notable success was achieved by SKICAT, a system used by astronomers to perform image analysis, classification, and cataloging of sky objects from sky-survey images (Fayyad, Djorgovski, and Weir 1996). In its first application, the system was used to process the 3 terabytes (10^12 bytes) of image data resulting from the Second Palomar Observatory Sky Survey, where it is estimated that on the order of 10^9 sky objects are detectable. SKICAT can outperform humans and traditional computational techniques in classifying faint sky objects. See Fayyad, Haussler, and Stolorz (1996) for a survey of scientific applications.

In business, main KDD application areas include marketing, finance (especially investment), fraud detection, manufacturing, telecommunications, and Internet agents.

Marketing: In marketing, the primary application is database marketing systems, which analyze customer databases to identify different customer groups and forecast their behavior. Business Week (Berry 1994) estimated that over half of all retailers are using or planning to use database marketing, and those who do use it have good results; for example, American Express reports a 10- to 15-percent increase in credit-card use. Another notable marketing application is market-basket analysis (Agrawal et al. 1996) systems, which find patterns such as, "If customer bought X, he/she is also likely to buy Y and Z." Such patterns are valuable to retailers (a small sketch at the end of this section makes the idea concrete).

Investment: Numerous companies use data mining for investment, but most do not describe their systems. One exception is LBS Capital Management. Its system uses expert systems, neural nets, and genetic algorithms to manage portfolios totaling $600 million; since its start in 1993, the system has outperformed the broad stock market (Hall, Mani, and Barr 1996).

Fraud detection: HNC Falcon and Nestor PRISM systems are used for monitoring credit-card fraud, watching over millions of accounts. The FAIS system (Senator et al. 1995), from the U.S. Treasury Financial Crimes Enforcement Network, is used to identify financial transactions that might indicate money-laundering activity.

Manufacturing: The CASSIOPEE troubleshooting system, developed as part of a joint venture between General Electric and SNECMA, was applied by three major European airlines to diagnose and predict problems for the Boeing 737. To derive families of faults, clustering methods are used. CASSIOPEE received the European first prize for innovative applications (Manago and Auriol 1996).

Telecommunications: The telecommunications alarm-sequence analyzer (TASA) was built in cooperation with a manufacturer of telecommunications equipment and three telephone networks (Mannila, Toivonen, and Verkamo 1995). The system uses a novel framework for locating frequently occurring alarm episodes from the alarm stream and presenting them as rules. Large sets of discovered rules can be explored with flexible information-retrieval tools supporting interactivity and iteration. In this way, TASA offers pruning, grouping, and ordering tools to refine the results of a basic brute-force search for rules.

Data cleaning: The MERGE-PURGE system was applied to the identification of duplicate welfare claims (Hernandez and Stolfo 1995). It was used successfully on data from the Welfare Department of the State of Washington.

In other areas, a well-publicized system is IBM's ADVANCED SCOUT, a specialized data-mining system that helps National Basketball Association (NBA) coaches organize and interpret data from NBA games (U.S. News 1995). ADVANCED SCOUT was used by several of the NBA teams in 1996, including the Seattle Supersonics, which reached the NBA finals.
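The market-basket patterns described above can be made concrete with a small sketch. The following Python fragment is our illustration, not part of the original article: the transactions, item names, and thresholds are all invented, and real systems such as the one described by Agrawal et al. (1996) use far more scalable search (for example, Apriori-style candidate pruning) than this exhaustive count.

```python
# Illustrative sketch (not from the article): mining simple
# "if X then Y" market-basket rules by exhaustive counting.
from itertools import combinations

# Hypothetical transactions; each set is one customer's basket.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer"},
    {"milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers"},
]

def support(itemset):
    """Fraction of transactions containing every item in itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

# Enumerate single-antecedent rules X -> Y and keep the strong ones.
items = set().union(*transactions)
for x, y in combinations(sorted(items), 2):
    for ante, cons in ((x, y), (y, x)):
        s = support({ante, cons})
        conf = s / support({ante})  # confidence = P(cons | ante)
        if s >= 0.4 and conf >= 0.7:
            print(f"{ante} -> {cons}: support={s:.2f}, confidence={conf:.2f}")
```

With the toy baskets above, the loop surfaces rules such as "beer -> diapers" with full confidence; a retailer-scale system differs mainly in how it prunes the exponential space of candidate itemsets, not in these two counting measures.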
Finally, a novel and increasingly important type of discovery is one based on the use of intelligent agents to navigate through an information-rich environment. Although the idea of active triggers has long been analyzed in the database field, really successful applications of this idea appeared only with the advent of the Internet. These systems ask the user to specify a profile of interest and search for related information among a wide variety of public-domain and proprietary sources. For example, FIREFLY is a personal music-recommendation agent: It asks a user his/her opinion of several music pieces and then suggests other music that the user might like (<http://www.ffl/>). CRAYON (/>) allows users to create their own free newspaper (supported by ads); NEWSHOUND (<http://www./hound/>) from the San Jose Mercury News and FARCAST (</>) automatically search information from a wide variety of sources, including newspapers and wire services, and e-mail relevant documents directly to the user.

These are just a few of the numerous such systems that use KDD techniques to automatically produce useful information from large masses of raw data. See Piatetsky-Shapiro et al. (1996) for an overview of issues in developing industrial KDD applications.

Data Mining and KDD

Historically, the notion of finding useful patterns in data has been given a variety of names, including data mining, knowledge extraction, information discovery, information harvesting, data archaeology, and data pattern processing. The term data mining has mostly been used by statisticians, data analysts, and the management information systems (MIS) communities. It has also gained popularity in the database field. The phrase knowledge discovery in databases was coined at the first KDD workshop in 1989 (Piatetsky-Shapiro 1991) to emphasize that knowledge is the end product of a data-driven discovery. It has been popularized in the AI and machine-learning fields.

In our view, KDD refers to the overall process of discovering useful knowledge from data, and data mining refers to a particular step in this process. Data mining is the application of specific algorithms for extracting patterns from data. The distinction between the KDD process and the data-mining step (within the process) is a central point of this article. The additional steps in the KDD process, such as data preparation, data selection, data cleaning, incorporation of appropriate prior knowledge, and proper interpretation of the results of mining, are essential to ensure that useful knowledge is derived from the data. Blind application of data-mining methods (rightly criticized as data dredging in the statistical literature) can be a dangerous activity, easily leading to the discovery of meaningless and invalid patterns.

The Interdisciplinary Nature of KDD

KDD has evolved, and continues to evolve, from the intersection of research fields such as machine learning, pattern recognition, databases, statistics, AI, knowledge acquisition for expert systems, data visualization, and high-performance computing. The unifying goal is extracting high-level knowledge from low-level data in the context of large data sets.

The data-mining component of KDD currently relies heavily on known techniques from machine learning, pattern recognition, and statistics to find patterns from data in the data-mining step of the KDD process. A natural question is, How is KDD different from pattern recognition or machine learning (and related fields)? The answer is that these fields provide some of the data-mining methods that are used in the data-mining step of the KDD process. KDD focuses on the overall process of knowledge discovery from data, including how the data are stored and accessed, how algorithms can be scaled to massive data sets and still run efficiently, how results can be interpreted and visualized, and how the overall man-machine interaction can usefully be modeled and supported. The KDD process can be viewed as a multidisciplinary activity that encompasses techniques beyond the scope of any one particular discipline such as machine learning. In this context, there are clear opportunities for other fields of AI (besides machine learning) to contribute to KDD. KDD places a special emphasis on finding understandable patterns that can be interpreted as useful or interesting knowledge. Thus, for example, neural networks, although a powerful modeling tool, are relatively difficult to understand compared to decision trees. KDD also emphasizes scaling and robustness properties of modeling algorithms for large noisy data sets.

Related AI research fields include machine discovery, which targets the discovery of empirical laws from observation and experimentation (Shrager and Langley 1990) (see Kloesgen and Zytkow [1996] for a glossary of terms common to KDD and machine discovery), and causal modeling for the inference of causal models from data (Spirtes, Glymour, and Scheines 1993).

Statistics in particular has much in common with KDD (see Elder and Pregibon [1996] and Glymour et al. [1996] for a more detailed discussion of this synergy). Knowledge discovery from data is fundamentally a statistical endeavor. Statistics provides a language and framework for quantifying the uncertainty that results when one tries to infer general patterns from a particular sample of an overall population. As mentioned earlier, the term data mining has had negative connotations in statistics since the 1960s when computer-based data analysis techniques were first introduced. The concern arose because if one searches long enough in any data set (even randomly generated data), one can find patterns that appear to be statistically significant but, in fact, are not (the small simulation below illustrates the effect). Clearly, this issue is of fundamental importance to KDD. Substantial progress has been made in recent years in understanding such issues in statistics. Much of this work is of direct relevance to KDD. Thus, data mining is a legitimate activity as long as one understands how to do it correctly; data mining carried out poorly (without regard to the statistical aspects of the problem) is to be avoided. KDD can also be viewed as encompassing a broader view of modeling than statistics. KDD aims to provide tools to automate (to the degree possible) the entire process of data analysis and the statistician's "art" of hypothesis selection.
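The danger of data dredging just discussed is easy to demonstrate. The simulation below is our own sketch, not the authors': it fills a small "database" with pure noise, tests every pair of fields for association with a crude z-style score, and counts how many pairs clear the naive 5-percent significance bar by chance alone.

```python
# Illustrative simulation (not from the article): searching long enough
# in random data yields "significant" patterns by chance alone.
import random

random.seed(1)
n_records, n_fields = 200, 50
# Purely random binary "database": no real structure by construction.
data = [[random.randint(0, 1) for _ in range(n_fields)]
        for _ in range(n_records)]

# Test every pair of fields for association and count how often the
# deviation from independence looks "significant" at the 5% level.
spurious, tests = 0, 0
for i in range(n_fields):
    for j in range(i + 1, n_fields):
        both = sum(r[i] and r[j] for r in data) / n_records
        expected = (sum(r[i] for r in data) / n_records) * \
                   (sum(r[j] for r in data) / n_records)
        # Crude z-like score for the joint proportion under independence.
        se = (expected * (1 - expected) / n_records) ** 0.5
        tests += 1
        if se > 0 and abs(both - expected) / se > 1.96:
            spurious += 1

print(f"{spurious} of {tests} field pairs look 'significant' "
      f"even though the data are pure noise")
```

With 1,225 pairwise tests, roughly five percent of them can be expected to cross the 1.96 threshold despite there being nothing to find, which is exactly the multiple-comparisons pitfall the statistical literature warns about.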
A driving force behind KDD is the database field (the second D in KDD). Indeed, the problem of effective data manipulation when data cannot fit in the main memory is of fundamental importance to KDD. Database techniques for gaining efficient data access, grouping and ordering operations when accessing data, and optimizing queries constitute the basics for scaling algorithms to larger data sets. Most data-mining algorithms from statistics, pattern recognition, and machine learning assume data are in the main memory and pay no attention to how the algorithm breaks down if only limited views of the data are possible.

A related field evolving from databases is data warehousing, which refers to the popular business trend of collecting and cleaning transactional data to make them available for online analysis and decision support. Data warehousing helps set the stage for KDD in two important ways: (1) data cleaning and (2) data access.

Data cleaning: As organizations are forced to think about a unified logical view of the wide variety of data and databases they possess, they have to address the issues of mapping data to a single naming convention, uniformly representing and handling missing data, and handling noise and errors when possible.

Data access: Uniform and well-defined methods must be created for accessing the data and providing access paths to data that were historically difficult to get to (for example, stored offline).

Once organizations and individuals have solved the problem of how to store and access their data, the natural next step is the question, What else do we do with all the data? This is where opportunities for KDD naturally arise.

A popular approach for analysis of data warehouses is called online analytical processing (OLAP), named for a set of principles proposed by Codd (1993). OLAP tools focus on providing multidimensional data analysis, which is superior to SQL in computing summaries and breakdowns along many dimensions. OLAP tools are targeted toward simplifying and supporting interactive data analysis, but the goal of KDD tools is to automate as much of the process as possible. Thus, KDD is a step beyond what is currently supported by most standard database systems.

Basic Definitions

KDD is the nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data (Fayyad, Piatetsky-Shapiro, and Smyth 1996). Here, data are a set of facts (for example, cases in a database), and pattern is an expression in some language describing a subset of the data or a model applicable to the subset. Hence, in our usage here, extracting a pattern also designates fitting a model to data; finding structure from data; or, in general, making any high-level description of a set of data. The term process implies that KDD comprises many steps, which involve data preparation, search for patterns, knowledge evaluation, and refinement, all repeated in multiple iterations. By nontrivial, we mean that some search or inference is involved; that is, it is not a straightforward computation of predefined quantities like computing the average value of a set of numbers.

The discovered patterns should be valid on new data with some degree of certainty. We also want patterns to be novel (at least to the system and preferably to the user) and potentially useful, that is, lead to some benefit to the user or task. Finally, the patterns should be understandable, if not immediately then after some postprocessing.

The previous discussion implies that we can define quantitative measures for evaluating extracted patterns. In many cases, it is possible to define measures of certainty (for example, estimated prediction accuracy on new data) or utility (for example, gain, perhaps in dollars saved because of better predictions or speedup in response time of a system). Notions such as novelty and understandability are much more subjective. In certain contexts, understandability can be estimated by simplicity (for example, the number of bits to describe a pattern). An important notion, called interestingness (for example, see Silberschatz and Tuzhilin [1995] and Piatetsky-Shapiro and Matheus [1994]), is usually taken as an overall measure of pattern value, combining validity, novelty, usefulness, and simplicity. Interestingness functions can be defined explicitly or can be manifested implicitly through an ordering placed by the KDD system on the discovered patterns or models.

Given these notions, we can consider a pattern to be knowledge if it exceeds some interestingness threshold, which is by no means an attempt to define knowledge in the philosophical or even the popular view. As a matter of fact, knowledge in this definition is purely user oriented and domain specific and is determined by whatever functions and thresholds the user chooses.

Data mining is a step in the KDD process that consists of applying data analysis and discovery algorithms that, under acceptable computational efficiency limitations, produce a particular enumeration of patterns (or models) over the data. Note that the space of patterns is often infinite, and the enumeration of patterns involves some form of search in this space. Practical computational constraints place severe limits on the subspace that can be explored by a data-mining algorithm.

The KDD process involves using the database along with any required selection, preprocessing, subsampling, and transformations of it; applying data-mining methods (algorithms) to enumerate patterns from it; and evaluating the products of data mining to identify the subset of the enumerated patterns deemed knowledge. The data-mining component of the KDD process is concerned with the algorithmic means by which patterns are extracted and enumerated from data. The overall KDD process (figure 1) includes the evaluation and possible interpretation of the mined patterns to determine which patterns can be considered new knowledge. The KDD process also includes all the additional steps described in the next section.

The notion of an overall user-driven process is not unique to KDD: analogous proposals have been put forward both in statistics (Hand 1994) and in machine learning (Brodley and Smyth 1996).

The KDD Process

The KDD process is interactive and iterative, involving numerous steps with many decisions made by the user. Brachman and Anand (1996) give a practical view of the KDD process, emphasizing the interactive nature of the process. Here, we broadly outline some of its basic steps (a minimal pipeline sketch follows this outline):

First is developing an understanding of the application domain and the relevant prior knowledge and identifying the goal of the KDD process from the customer's viewpoint.

Second is creating a target data set: selecting a data set, or focusing on a subset of variables or data samples, on which discovery is to be performed.

Third is data cleaning and preprocessing. Basic operations include removing noise if appropriate, collecting the necessary information to model or account for noise, deciding on strategies for handling missing data fields, and accounting for time-sequence information and known changes.

Fourth is data reduction and projection: finding useful features to represent the data depending on the goal of the task. With dimensionality reduction or transformation methods, the effective number of variables under consideration can be reduced, or invariant representations for the data can be found.

Fifth is matching the goals of the KDD process (step 1) to a particular data-mining method. For example, summarization, classification, regression, clustering, and so on, are described later as well as in Fayyad, Piatetsky-Shapiro, and Smyth (1996).

Sixth is exploratory analysis and model and hypothesis selection: choosing the data-mining algorithm(s) and selecting method(s) to be used for searching for data patterns. This process includes deciding which models and parameters might be appropriate (for example, models of categorical data are different than models of vectors over the reals) and matching a particular data-mining method with the overall criteria of the KDD process (for example, the end user might be more interested in understanding the model than its predictive capabilities).

Seventh is data mining: searching for patterns of interest in a particular representational form or a set of such representations, including classification rules or trees, regression, and clustering. The user can significantly aid the data-mining method by correctly performing the preceding steps.

Eighth is interpreting mined patterns, possibly returning to any of steps 1 through 7 for further iteration. This step can also involve visualization of the extracted patterns and models or visualization of the data given the extracted models.

Ninth is acting on the discovered knowledge: using the knowledge directly, incorporating the knowledge into another system for further action, or simply documenting it and reporting it to interested parties. This process also includes checking for and resolving potential conflicts with previously believed (or extracted) knowledge.

The KDD process can involve significant iteration and can contain loops between any two steps. The basic flow of steps (although not the potential multitude of iterations and loops) is illustrated in figure 1.

[Figure 1. An Overview of the Steps That Compose the KDD Process.]

Most previous work on KDD has focused on step 7, the data mining. However, the other steps are as important (and probably more so) for the successful application of KDD in practice.
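As a rough sketch of how the nine steps might be strung together in software — our illustration only; the article prescribes no code, and every function name below is a hypothetical placeholder — a single pass of the process could look like this:

```python
# Hypothetical skeleton (not from the article) of the multistep KDD
# process; each function is a placeholder for one stage.

def create_target_data(source, variables):
    """Step 2: focus on a subset of variables or samples."""
    return [{v: r[v] for v in variables} for r in source]

def clean(records):
    """Step 3: drop records with missing fields (one simple strategy)."""
    return [r for r in records if all(v is not None for v in r.values())]

def transform(records):
    """Step 4: project to useful features; identity here."""
    return records

def mine(records):
    """Step 7: search for patterns; here, just per-field means."""
    keys = records[0].keys()
    return {k: sum(r[k] for r in records) / len(records) for k in keys}

def interpret(patterns, threshold=0.5):
    """Step 8: keep patterns deemed interesting by a simple threshold."""
    return {k: v for k, v in patterns.items() if v >= threshold}

# The process is iterative in practice; a single pass is shown.
source = [{"income": 1.0, "debt": 0.2}, {"income": 0.4, "debt": None},
          {"income": 0.8, "debt": 0.9}]
knowledge = interpret(mine(transform(clean(
    create_target_data(source, ["income", "debt"])))))
print(knowledge)
```

The point of the skeleton is structural: the mining call is one stage among several, and the loops back from interpretation to earlier stages — elided here — are where most of the real work of a KDD application happens.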
Having defined the basic notions and introduced the KDD process, we now focus on the data-mining component, which has, by far, received the most attention in the literature.

The Data-Mining Step of the KDD Process

The data-mining component of the KDD process often involves repeated iterative application of particular data-mining methods. This section presents an overview of the primary goals of data mining, a description of the methods used to address these goals, and a brief description of the data-mining algorithms that incorporate these methods.

The knowledge discovery goals are defined by the intended use of the system. We can distinguish two types of goals: (1) verification and (2) discovery. With verification, the system is limited to verifying the user's hypothesis. With discovery, the system autonomously finds new patterns. We further subdivide the discovery goal into prediction, where the system finds patterns for predicting the future behavior of some entities, and description, where the system finds patterns for presentation to a user in a human-understandable form. In this article, we are primarily concerned with discovery-oriented data mining.

Data mining involves fitting models to, or determining patterns from, observed data. The fitted models play the role of inferred knowledge: Whether the models reflect useful or interesting knowledge is part of the overall, interactive KDD process where subjective human judgment is typically required. Two primary mathematical formalisms are used in model fitting: (1) statistical and (2) logical. The statistical approach allows for nondeterministic effects in the model, whereas a logical model is purely deterministic. We focus primarily on the statistical approach to data mining, which tends to be the most widely used basis for practical data-mining applications given the typical presence of uncertainty in real-world data-generating processes.

Most data-mining methods are based on tried and tested techniques from machine learning, pattern recognition, and statistics: classification, clustering, regression, and so on. The array of different algorithms under each of these headings can often be bewildering to both the novice and the experienced data analyst. It should be emphasized that of the many data-mining methods advertised in the literature, there are really only a few fundamental techniques. The actual underlying model representation being used by a particular method typically comes from a composition of a small number of well-known options: polynomials, splines, kernel and basis functions, threshold-Boolean functions, and so on. Thus, algorithms tend to differ primarily in the goodness-of-fit criterion used to evaluate model fit or in the search method used to find a good fit.

In our brief overview of data-mining methods, we try in particular to convey the notion that most (if not all) methods can be viewed as extensions or hybrids of a few basic techniques and principles. We first discuss the primary methods of data mining and then show that the data-mining methods can be viewed as consisting of three primary algorithmic components: (1) model representation, (2) model evaluation, and (3) search. In the discussion of KDD and data-mining methods, we use a simple example to make some of the notions more concrete. Figure 2 shows a simple two-dimensional artificial data set consisting of 23 cases. Each point on the graph represents a person who has been given a loan by a particular bank at some time in the past. The horizontal axis represents the income of the person; the vertical axis represents the total personal debt of the person (mortgage, car payments, and so on). The data have been classified into two classes: (1) the x's represent persons who have defaulted on their loans and (2) the o's represent persons whose loans are in good status with the bank. Thus, this simple artificial data set could represent a historical data set that can contain useful knowledge from the point of view of the bank making the loans. Note that in actual KDD applications, there are typically many more dimensions (as many as several hundreds) and many more data points (many thousands or even millions).

[Figure 2. A Simple Data Set with Two Classes Used for Illustrative Purposes.]
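To connect the loan example to an actual data-mining step, the sketch below — our own, with invented records rather than the article's 23 cases — learns the kind of single-threshold split on income that a decision-tree method might return as a classification rule.

```python
# Illustrative sketch (invented data): learning a single-threshold
# classifier on income, the kind of split a decision tree would make.

# (income, debt, defaulted?) -- hypothetical loan records.
cases = [(15, 9, True), (20, 7, True), (25, 8, True), (30, 9, True),
         (35, 3, False), (45, 4, False), (55, 6, False), (60, 2, False)]

def accuracy(threshold):
    """Classify 'defaulted' when income < threshold; score on cases."""
    return sum((inc < threshold) == bad
               for inc, _, bad in cases) / len(cases)

# Exhaustive search over candidate thresholds (midpoints between incomes).
incomes = sorted(inc for inc, _, _ in cases)
candidates = [(a + b) / 2 for a, b in zip(incomes, incomes[1:])]
best = max(candidates, key=accuracy)
print(f"decision rule: default if income < {best} "
      f"(training accuracy {accuracy(best):.0%})")
```

In the vocabulary introduced above, the model representation is a threshold-Boolean function on one variable, the model evaluation criterion is training accuracy, and the search is exhaustive over candidate split points; richer methods vary these three components rather than the overall recipe.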
Letter

Effect of alloying elements on the microstructure and mechanical properties of nanostructured ferritic steels produced by spark plasma sintering

Somayeh Pasebani, Indrajit Charit *
Department of Chemical and Materials Engineering, University of Idaho, Moscow, ID 83844, USA
* Corresponding author. Tel.: +1 208 885 5964; fax: +1 208 885 7462. E-mail address: icharit@ (I. Charit).
doi:10.1016/j.jallcom.2014.01.243; 0925-8388/© 2014 Elsevier B.V. All rights reserved.

Article history: Received 23 November 2013; Received in revised form 23 January 2014; Accepted 29 January 2014; Available online 15 February 2014.

Keywords: Nanostructures; Mechanical alloying; Powder metallurgy; Transmission electron microscopy; High temperature alloy

Abstract: Several Fe–14Cr based alloys with varying compositions were processed using a combined route of mechanical alloying and spark plasma sintering. Microstructural characteristics of the consolidated alloys were examined via transmission electron microscopy and atom probe tomography, and mechanical properties were evaluated using microhardness testing. Lanthanum oxide (0.5 wt.%) was added to Fe–14Cr, leading to improvement in microstructural stability and mechanical properties, mainly due to a high number density of La–Cr–O-enriched nanoclusters. The combined addition of La, Ti (1 wt.%) and Mo (0.3 wt.%) to the Fe–14Cr base composition further enhanced the microstructural stability and mechanical properties. Nanoclusters enriched in Cr–Ti–La–O with a number density of 1.4 × 10^24 m^−3 were found in this alloy, which exhibited a bimodal grain size distribution. After adding Y2O3 (0.3 wt.%) along with Ti and Mo to the Fe–14Cr matrix, a high number density (1.5 × 10^24 m^−3) of Cr–Ti–Y–O-enriched NCs was also detected. The formation mechanism of these nanoclusters can be explained through the concentrations and diffusion rates of the initial oxide species formed during the milling process and initial stages of sintering, as well as the thermodynamic nucleation barrier and their enthalpy of formation. © 2014 Elsevier B.V. All rights reserved.

1. Introduction

Nanostructured ferritic steels (NFSs), a subcategory of oxide dispersion strengthened (ODS) steels, have outstanding high temperature strength, creep strength [1,2] and excellent radiation damage resistance [3]. These enhanced properties of NFSs have been attributed to the high number density of Y–Ti–O-enriched nanoclusters (NCs) with diameter of 1–2 nm [4]. The Y–Ti–O-enriched NCs have been found to be stable under irradiation and effective in trapping helium [5]. These NCs are formed due to the mechanical alloying (MA) of Fe–Cr–Ti powder with Y2O3 during high energy ball milling, followed by a hot consolidation route such as hot isostatic pressing (HIP) or hot extrusion [6–8]. Alinger et al. [4] investigated the effect of alloying elements on the formation mechanism of NCs in NFSs processed by HIP and reported that both Ti and high milling energy were necessary for the formation of NCs. Miller and Parish [9] suggested that the excellent creep properties in yttria-bearing NFSs result from the pinning of the grain boundaries by a combined effect of solute segregation and precipitation.

Although HIP and hot extrusion are commonly used to consolidate the NFSs, anisotropic properties and processing costs are considered challenging issues. Recently, spark plasma sintering (SPS) has been utilized to sinter the powder at a higher heating rate, lower temperature and shorter dwell time. This can be done by applying a uniaxial pressure and direct current pulses simultaneously to a powder sample contained in a graphite die [10]. Except for a few studies on consolidation of simple systems such as Fe–9Cr–0.3/0.6Y2O3 [11] and Fe–14Cr–0.3Y2O3 [10], the SPS process has not been
extensively utilized to consolidate the NFSs with complex compositions. Recently, the role of Ti and Y2O3 in the processing of Fe–16Cr–3Al–1Ti–0.5Y2O3 (wt.%) via MA and SPS was investigated by Allahar et al. [12]. A bimodal grain size distribution in conjunction with Y–Ti–O-enriched NCs was obtained [12,13]. In this study, Fe–14Cr (wt.%) was designed as the base or matrix alloy, and then Ti, La2O3 and Mo were sequentially added to the ferritic matrix and ball milled. This approach allowed us to study the effect of individual and combined additions of solutes on the formation of NCs along with other microstructural evolutions. Furthermore, SPS instead of other traditional consolidation methods was used to consolidate the NFS powder. The mixture of Fe–Cr–Ti–Mo powder with Y2O3 was also processed and characterized in a similar manner for comparison with the rest of the developed alloys.

2. Experimental

The chemical compositions of all the developed alloys, along with their identifying names in this study, are given in Table 1. High energy ball milling was performed in a SPEX 8000M shaker mill for 10 h under an Ar atmosphere, with steel balls 8 mm in diameter as the milling media and a ball to powder ratio (BPR) of 10:1. A Dr. Sinter Lab SPS-515S was used to consolidate the as-milled powder at different temperatures (850, 950 and 1050 °C) for 7 min using the pulse pattern 12–2 ms, a heating rate of 100 °C/min and a pressure of 80 MPa. The SPSed samples were in the form of disks 8 mm in height and 12 mm in diameter.

The density of the sintered specimens was measured by Archimedes' method. Vickers microhardness tests were performed using a Leco LM100 microhardness tester operated at a load of 1000 g-f (9.8 N). A Fischione Model 110 Twin-Jet Electropolisher containing a mixture of CH3OH–HNO3 (80:20 by vol.%) as the electrolyte and operated at about −40 °C was used to prepare specimens for transmission electron microscopy (TEM). A FEI Tecnai TF30-FEG STEM operating at 300 kV was used. The energy dispersive spectroscopy (EDS) system attached to the STEM was used to roughly examine the chemical composition of the particles. A Quanta 3D FEG instrument with a Ga-ion source focused ion beam (FIB) was used to prepare specimens for atom probe tomography (APT) studies on the 14L, 14LMT and 14YMT samples. The APT analysis was carried out using a CAMECA LEAP 4000X HR instrument operating in the voltage mode at 50–60 K and 20% of the standing voltage pulse fraction. The atom maps were reconstructed using CAMECA IVAS 3.6 software, and the maximum separation algorithm was used to estimate the size and chemical composition of the NCs. This was applied to APT datasets each containing 20–30 million ions per specimen. The lower evaporation field of the nanoparticles and trajectory aberrations caused the Fe content of the nanoclusters to be overestimated. Although the contribution of Fe atoms from the matrix was examined here, the matrix correction was not addressed in this study.

3. Results and discussion

The TEM bright-field micrographs for the various alloys SPSed at 950 °C for 7 min are illustrated in Fig. 1a–d. The 14Cr alloy shown in Fig. 1a revealed a complex microstructure with submicron subgrain-like structures, a relatively high density of dislocations and a low number density of oxide nanoparticles.
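The letter does not spell out the Archimedes calculation behind the relative densities reported later; for reference, the standard relation is given below (our addition, not text from the letter), with m_air and m_fl the specimen weights measured in air and immersed in an auxiliary fluid of density ρ_fl:

```latex
% Standard Archimedes relation (added for reference; not from the letter):
\rho_{\mathrm{specimen}} = \frac{m_{\mathrm{air}}}{m_{\mathrm{air}} - m_{\mathrm{fl}}}\,\rho_{\mathrm{fl}},
\qquad
\text{relative density (\%)} = 100\,\frac{\rho_{\mathrm{specimen}}}{\rho_{\mathrm{th}}}
```

where ρ_th is the theoretical full density of the alloy.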
The nanoparticles were larger (25–65 nm) than in the other SPSed alloys and were found to have chemical compositions close to Cr2O3 and FeCr2O4, as analyzed by energy dispersive spectroscopy.

The microstructure of the consolidated 14L alloy is shown in Fig. 1b. The microstructure consisted of more ultrafine grains (<1 μm but >100 nm), a few nanograins with sharp boundaries and a higher number of nanoparticles, mainly in the grain interiors. The number density of nanoparticles was higher than that of the 14Cr alloy shown in Fig. 1a but lower than those of 14LMT (Fig. 1c) and 14YMT (Fig. 1d). In the 14L alloy, nanoparticles 2–11 nm in diameter were found inside the grains (hard to observe at the magnification given in Fig. 1b; micrographs taken at higher magnifications were used for this purpose), whereas nanoparticles 50–80 nm in diameter were located at the grain boundary regions. The particles on the boundaries are likely to be mainly Cr2O3 and LaCrO3, but the chemical analysis of the smallest particles could not be done precisely due to the significant influence of the ferritic matrix. Fig. 1c shows the microstructure of the SPSed 14LMT alloy, consisting of both ultrafine grains (as defined previously) and nanograins (≤100 nm). The nanoparticles present in the microstructure were complex oxides of Fe, Cr and Ti. The nanoparticles with faceted morphology and smaller than 10 nm in diameter were enriched in La and Ti. No evidence of stoichiometric La2TiO5 or La2Ti2O7 particles was observed based on the EDS and diffraction data. A similar type of microstructure was revealed in the SPSed 14YMT alloy, as shown in Fig. 1d.

The particle size distribution histograms of the 14Cr, 14L, 14LMT and 14YMT alloys are plotted in Fig. 2a–d, respectively. Approximately 1000 particles were sampled from each alloy to develop the histograms. The average particle size decreased in the order 14Cr, 14L, 14LMT, 14YMT. The highest fraction of the particle sizes shown in the histograms of 14Cr, 14L, 14LMT and 14YMT was found to be associated with 25 ± 5 nm (18 ± 2.5%), 10 ± 5 nm (28 ± 3%), 5 ± 1 nm (40 ± 6%) and 5 ± 1 nm (46 ± 5%) in diameter, respectively. The number density of nanoparticles smaller than 5 ± 1 nm was higher in the 14YMT than in the 14LMT alloy.

The 3-D APT maps for the 14L alloy revealed a number density (≈3 × 10^22 m^−3) of Cr–La–O-enriched NCs. The average Guinier radius of these NCs was 1.9 ± 0.6 nm. The average composition of the NCs in 14L was estimated, using the maximum separation algorithm, to be Fe–17.87 ± 3.4Cr–32.61 ± 3.2O–8.21 ± 1.1La (at.%). A higher number density (≈1.4 × 10^24 m^−3) of smaller NCs, with an average Guinier radius of 1.43 ± 0.20 nm, was observed in the APT maps for the 14LMT alloy, as shown in Fig. 3a. The NCs were Cr–Ti–La–O-enriched, with an average composition of Fe–10.9 ± 2.8Cr–30.9 ± 3.1O–17.3 ± 2.5Ti–8.2 ± 2.2La (at.%). According to the LEAP measurements, the chemical composition of the NCs differed considerably from stoichiometric oxides. A large amount of Fe and Cr was detected inside the NCs, and the La/Ti and La/O ratios were not consistent with La2TiO5 or La2Ti2O7 as expected based on thermodynamic calculations; rather, the ratios were sub-stoichiometric. The 3-D APT maps for the 14YMT alloy were similar to those of the 14LMT alloy, as shown in Fig. 3b. The NCs, with an average radius of 1.24 ± 0.2 nm and a number density of 1.5 × 10^24 m^−3, were Cr–Ti–Y–O-enriched. The chemical composition of the NCs was estimated to be close to Fe–8.52 ± 3.1Cr–37.39 ± 4.5O–24.52 ± 3.1Ti–10.95 ± 3.1Y (at.%). The matrix-corrected compositions are currently being analyzed and will be reported in a full-length publication in the near future.

The relative density of the various alloys sintered at 850–1050 °C is shown
in Fig. 4a. Generally, a higher density was obtained in the specimens sintered at higher temperatures. At 850 and 950 °C, the density of the unmilled 14Cr specimen (97.2% and 97.5%) was higher than that of the milled/SPSed 14Cr (92.8% and 95.5%) because the unmilled powder particles were less hard (due to the absence of strain hardening) and plastically deformed to a higher degree than the milled powder, leading to a higher density. Adding 0.5 and 0.7 wt.% of La2O3 and 0.3 wt.% Y2O3 to the 14Cr matrix significantly decreased the density of the specimen, especially at 850 and 950 °C; however, adding Ti to 14L and 14Y improved the density to some extent.

Table 1. The alloy compositions and processing conditions (milled for 10 h and SPSed at 850–1050 °C for 7 min). Elements in wt.%.

Alloy ID        Cr   Ti   La2O3   Y2O3   Mo    Fe
14Cr-unmilled   14   0    0       0      0     Bal.
14Cr            14   0    0       0      0     Bal.
14T             14   1    0       0      0     Bal.
14L             14   0    0.5     0      0     Bal.
14Y             14   0    0       0.3    0     Bal.
14LM            14   0    0.5     0      0.3   Bal.
14LT            14   1    0.5     0      0     Bal.
14LMT(0.3)      14   1    0.3     0      0.3   Bal.
14LMT           14   1    0.5     0      0.3   Bal.
14LMT(0.7)      14   1    0.7     0      0.3   Bal.
14YMT           14   1    0       0.3    0.3   Bal.

[Fig. 2. Particle size frequency histograms for (a) 14Cr, (b) 14L, (c) 14LMT and (d) 14YMT alloys.]

The microhardness data of the various alloys processed at different temperatures are shown in Fig. 4b. In general, microhardness increased with increasing SPS temperature up to 950 °C and then decreased. Both Y and La increased the hardness due to the dispersion hardening effect. The hardness increased at the higher content of La due to the greater effect of dispersion hardening. Adding Ti separately to the 14Cr matrix improved the hardness due to
dispersion hardening, but only at the lower temperature (850 °C). The coarsening of Ti-enriched particles above 850 °C plausibly decreased the hardness. However, at 950 °C, higher hardness (457 HV) was achieved by a combined addition of La and Ti to the 14Cr matrix to produce 14LT. Further addition of Mo to 14LT improved the hardness through solid solution strengthening in 14LMT (495 HV).

[Fig. 1. TEM bright-field micrographs for various alloys: (a) 14Cr, (b) 14L, (c) 14LMT and (d) 14YMT.]

High dislocation density and no well-defined grain boundaries were characteristics of the 14Cr alloy, as shown in Fig. 1a. The presence of a low number density of larger oxide particles (FeCr2O4 and Cr2O3) at the boundaries could not create an effective pinning effect during sintering. As a result, some of these particles became confined within the grain interiors. The coarse grains had the capacity to produce and store a high density of dislocations that subsequently resulted in a strain hardening effect. The hardening mechanism in the 14Cr alloy can thus be attributed to greater dislocation activity and the resulting strain hardening effect. Grain boundary or precipitation hardening cannot be the dominant mechanism because of the larger particles, greater inter-particle spacing and weakened Zener drag effect at the sintering temperature. Such strain hardening capability in nanocrystalline Fe consolidated via SPS has been reported by other researchers, too [14,15]. Interestingly, the high hardness in the Fe–14Cr alloy consolidated via SPS at 1100 °C for 4 min by Auger et al. [10] was attributed to the formation of martensitic laths caused by a higher carbon content diffusing from the die, possible Cr segregation and rapid cooling during SPS. It is noteworthy that no martensite lath was observed in the consolidated 14Cr alloy in the present study.

[Fig. 3. Three-dimensional atom maps showing NCs for (a) 14LMT, 91 × 34 × 30 nm^3, and (b) 14YMT, 93 × 30 × 30 nm^3.]
[Fig. 4. (a) The relative density and (b) microhardness values for the different SPSed alloys processed at different SPS temperatures for a dwell time of 7 min.]

The level of solutes in the bcc matrix could be much greater than the equilibrium level, associated with the large number of vacancies created during milling. Our recent study [16] has shown that high energy ball milling has a complex role in initiating nucleation of La–Ti–O-enriched NCs in 14LMT alloy powder, with a mean radius of ≈1 nm, a number density of 3.7 × 10^24 m^−3 and a composition of Fe–12.11Cr–9.07Ti–4.05La (at.%). The initiation of NCs during ball milling of NFSs has also been investigated by other researchers [8,17,18]. According to Williams et al. [8], due to the low equilibrium solubility of O in the matrix, the precipitation of nanoparticles is driven by an oxidation reaction, subsequently resulting in a reduction of the free energy. As the SPS proceeds, the number density of NCs would decrease and larger grain boundary oxides would form, with the grain structure developing simultaneously during the sintering process [8]. Formation of the larger grain boundary oxides shown in Fig. 1a could have been preceded by segregation of O and Cr to grain boundaries, leading to a decrease in the level of the solutes in the ferritic matrix. The initial oxides forming in a chromium-rich matrix can be Cr2O3, as suggested by Williams et al. [8]. However, the formation of LaCrO3 in the 14L alloy (shown in Fig. 1b) was associated with a higher reduction in the free energy, according to the enthalpies of formation of various oxides given in Table 2. The presence of nanoparticles caused grain boundary pinning and subsequently stabilized the nanocrystalline grains. The high density of defects (dislocations and vacancies) in a supersaturated solid solution, such as the 14LMT and 14YMT alloys, could dramatically increase the driving force for accelerated subgrain formation during the initial
stage of sintering. At the initial stage, the vacancies created during the milling are annihilated [8,17]. Meanwhile, the temperature is not high enough to produce a significant number of thermal vacancies; subsequently, any nucleation of new NCs will be prevented. As the SPS proceeds with no nucleation of new NCs, the high concentrations of extra solutes in the matrix are thermodynamically and kinetically required to precipitate out to form larger oxide particles. The larger solute-enriched oxide particles can form more favorably on the grain boundaries due to the higher boundary diffusivity. On the other hand, it should be considered that there is dynamic plastic deformation occurring within the powder particles during SPS. The interaction of larger particles and dislocations introduced by dynamic hot deformation can explain the coarsening in some grains, because larger particles could not effectively pin the dislocations, and grain boundary migration could be facilitated along orientations where the Zener drag mechanism is less efficient [19]. Once the extra solutes present in the matrix have precipitated out, the microstructure will remain very stable because of grain boundary pinning by triple junctions of the grain boundaries themselves [20], along with the high density of NCs and other ultrafine oxide particles [8]. Further coarsening of the grains will be prevented even for longer dwell times at 950 °C. Therefore, a bimodal grain size distribution emerged. The hardening of the 14LMT and 14YMT alloys was attributed to a combined effect of solid solution strengthening, Hall-Petch strengthening and precipitation hardening.

Based on the APT studies of the as-milled powder [16] and the formation mechanism of the oxide particles suggested by Williams et al. [8], it could be speculated that in the 14LMT and 14YMT alloys, Cr–O species form first and then absorb Ti and La/Y. This is associated with a change in the interfacial energy of the Cr–O species, even though Cr–O is not thermodynamically the most favorable oxide. It has been established that the driving force for the oxide precipitates to form is the low solubility limit of oxygen in the ferritic matrix. The change in free energy due to the oxidation reaction and nucleation of oxide nanoparticles is the leading mechanism [8]. The majority of the oxygen required to generate the oxide nanoparticles may be provided by the surface oxide formed during the milling process. Furthermore, higher concentrations of Cr led to greater nucleation of Cr–O by influencing the kinetics of oxide formation.
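The Zener drag argument above can be put in quantitative terms. The classical pinning relation below is a textbook expression in the spirit of the grain boundary stability analyses cited here [19,20]; it is our addition, not an equation from the letter, and it shows why, at a fixed particle volume fraction f, coarser particles of radius r exert less pinning pressure on a boundary of energy γ:

```latex
% Classical Zener pinning pressure (textbook relation, added here):
P_z = \frac{3 f \gamma}{2 r}
```

Doubling the particle radius at constant volume fraction halves the pinning pressure, which is consistent with the observation that grains neighboring the coarser boundary oxides coarsened while regions dense in fine NCs remained nanocrystalline, producing the bimodal grain size distribution.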
Concentrations and diffusivities of the oxide species, along with the energy barrier for nucleation, will control the nucleation of oxide nanoparticles. After the Cr–O formed during sintering, the Ti–O and Y/La-enriched clusters could form. The sub-stoichiometric NCs in the 14LMT and 14YMT alloys were not due to an insufficient level of O in the matrix [8]. Formation of stoichiometric Y2Ti2O7 and Y2TiO5 requires very high temperatures [8], which were outside the scope of this study.

Table 2. The standard enthalpies of formation of various oxide compounds at 25 °C [8,21,22].

Element   Composition   −ΔH_f (kJ mol^−1 (oxide))
Cr        Cr2O3         1131
          CrO2          583
Fe        Fe3O4         1118
          Fe2O3         822
Ti        TiO           543
          TiO2          944
          Ti2O3         1522
          Ti3O5         2475
Y         Y2O3          1907
          YCrO3         1493
          Y2Ti2O7       3874
La        La2O3         1794
          La2Ti2O7      3855
          LaCrO3        1536

4. Conclusion

The SPSed Fe–14Cr alloy was found to have a higher hardness at room temperature due to the strain hardening effect. The stability of its microstructure at high temperatures was improved by the addition of La, forming the Cr–La–O-enriched NCs. Adding La and Ti to the Fe–14Cr matrix further improved the mechanical behavior and microstructural stability, due to the high number density of Cr–Ti–La–O-enriched NCs in the 14LMT alloy. The demonstrated potential of La for developing new NFSs is promising, but further investigations of their thermal and irradiation stability will still be required.

Acknowledgement

This work was supported partly by the Laboratory Directed Research and Development Program of Idaho National Laboratory (INL), and by the Advanced Test Reactor National Scientific User Facility (ATR NSUF), Contract DE-AC07-05ID14517. The authors gratefully acknowledge the assistance of the staff members at the Microscopy and Characterization Suite (MaCS) facility at the Center for Advanced Energy Studies (CAES).

References

[1] M.J. Alinger, G.R. Odette, G.E. Lucas, J. Nucl. Mater. 307–311 (2002) 484.
[2] R.L. Klueh, J.P. Shingledecker, R.W. Swindeman, D.T. Hoelzer, J. Nucl. Mater. 341 (2005) 103.
[3] M.J. Alinger, G.R. Odette, D.T. Hoelzer, J. Nucl. Mater. 329–333 (2004) 382.
[4] M.J. Alinger, G.R. Odette, D.T. Hoelzer, Acta Mater. 57 (2009) 392.
[5] G.R. Odette, M.L. Alinger, B.D. Wirth, Annu. Rev. Mater. Res. 38 (2008) 471.
[6] S. Ukai, T. Okuda, M. Fujiwara, T. Kobayashi, S. Mizuta, H. Nakashima, J. Nucl. Sci. Technol. 39 (2002) 872.
[7] S. Ukai, M. Fujiwara, J. Nucl. Mater. 307–311 (2002) 749.
[8] C.A. Williams, P. Unifantowicz, N. Baluc, G.D. Smith, E.A. Marquis, Acta Mater. 61 (2013) 2219.
[9] M.K. Miller, C.M. Parish, Mater. Sci. Technol. 27 (2011) 729.
[10] M.A. Auger, V. De Castro, T. Leguey, A. Muñoz, R. Pareja, J. Nucl. Mater. 436 (2013) 68.
[11] C. Heintze, M. Hernández-Mayoral, A. Ulbricht, F. Bergner, A. Shariq, T. Weissgärber, H. Frielinghaus, J. Nucl. Mater. 428 (2012) 139.
[12] K.N. Allahar, J. Burns, B. Jaques, Y.Q. Wu, I. Charit, J.I. Cole, D.P. Butt, J. Nucl. Mater. 443 (2013) 256.
[13] Y.Q. Wu, K.N. Allahar, J. Burns, B. Jaques, I. Charit, D.P. Butt, J.I. Cole, Cryst. Res. Technol. (2013) 1, doi:10.1002/crat.201300173.
[14] K. Oh-Ishi, H.W. Zhang, T. Ohkubo, K. Hono, Mater. Sci. Eng. A 456 (2007) 20.
[15] B. Srinivasarao, K. Ohishi, T. Ohkubo, K. Hono, Acta Mater. 57 (2009) 3277.
[16] S. Pasebani, I. Charit, Y.Q. Wu, D.P. Butt, J.I. Cole, Acta Mater. 61 (2013) 5605.
[17] M.L. Brocq, F. Legendre, M.H. Mathon, A. Mascaro, S. Poissonnet, B. Radiguet, P. Pareige, M. Loyer, O. Leseigneur, Acta Mater. 60 (2012) 7150.
[18] M. Brocq, B. Radiguet, S. Poissonnet, F. Cuvilly, P. Pareige, F. Legendre, J. Nucl. Mater. 409 (2011) 80.
[19] H.K.D.H. Bhadeshia, Mater. Sci. Eng. A 223 (1997) 64.
[20] H.K.D.H. Bhadeshia, Mater. Sci. Technol. 16 (2000) 1404.
[21] W. Gale, T. Totemeier, Smithells Metals Reference Book, Amsterdam, Holland, 2004.
[22] T.J. Kallarackel, S. Gupta, P. Singh, J. Am. Ceram. Soc. (2013) 1, doi:10.1111/jace.12435.
Research on Path Dependence and Path Creation Mechanisms in Organizational Structure Change: The Case of Lenovo Group
Li Haidong; Lin Zhiyang

Abstract: Because of its strong tendency toward historical determinism, classical path dependence theory cannot explain significant technical and institutional change or the generation of new paths. These issues have prompted researchers to switch perspective and pay more attention to path creation and path breaking. Strategic action has the attribute of path dependence, and according to the contention that strategy determines structure and structure follows strategy, this article argues that path dependence is inherently embedded in the organizational structure system. From the perspective of the evolution of organizational structure models, the mechanisms of path dependence formation and path creation are examined, and the dual impact that path dependence and path creation in organizational structure change exert on organizational operation is discussed. Finally, the article takes Lenovo Group, a firm in China's modern IT industry, as an example to illustrate path dependence and path creation in the evolution of Lenovo Group's organizational structure model.

Journal: Chinese Journal of Management (管理学报)
Year (volume), issue: 2012, 9(8); 12 pages (pp. 1135–1146)
Keywords: path dependence; organizational structure change; path creation; self-reinforcing mechanism; Lenovo Group
Authors: Li Haidong (School of Business Administration, Jingdezhen Ceramic Institute); Lin Zhiyang (School of Management, Xiamen University)
Original language: Chinese; classification: C93

As an open system, an organization continuously exchanges material, energy, and information with its external environment.