ESTIMATION OF SLUDGE SEDIMENTATION PARAMETERS FROM SINGLE BATCH SETTLING CURVES

ALEXIS VANDERHASSELT (1) and PETER A. VANROLLEGHEM (2)*

(1) Laboratory of Microbial Ecology, University Gent, Coupure Links 653, B-9000 Gent, Belgium and (2) BIOMATH Department, University Gent, Coupure Links 653, B-9000 Gent, Belgium

Water Research Vol. 34, No. 2, pp. 395-406, 2000. © 1999 Elsevier Science Ltd. All rights reserved. Printed in Great Britain. PII: S0043-1354(99)00158-X

*Author to whom all correspondence should be addressed. Tel.: +32-9-264-5932; fax: +32-9-223-4941; e-mail: peter.vanrolleghem@rug.ac.be

(First received 1 December 1998; accepted in revised form 1 April 1999)

Abstract: In this paper a comparison is made between two means of obtaining the parameters for the settling velocity models that are at the core of the solid flux theory: (i) the traditional approach using zone settling velocity (V_ZS) data obtained from a dilution experiment and (ii) a new direct parameter estimation method relying on a single batch settling curve (SBSC). For four distinct sludges, settling curves were recorded at different sludge concentrations (X) and the Vesilind parameters were calculated in the traditional way. The value of the resulting model was evaluated by cross-validating it on its ability to describe complete SBSCs. Provided the dynamics of the sludge blanket descent were moderate, a settler model incorporating these traditional Vesilind parameters could reasonably match the experimental batch settling curves. However, when the dynamics of the sludge blanket descent were fast, the Vesilind model failed. It was tested whether other settling velocity models could give better fits to the SBSC. Here, the Cho model turned out to be the most effective one. When cross-validating the Cho model on the dilution experiment V_ZS data, however, it was found to perform worse than the Vesilind model in describing the relationship between V_ZS and X. The fact that the Vesilind model is superior to the Cho model in describing such relationships, while the Cho model is better in describing complete settling curves, clearly points out that current settling models are still very empirical in nature. An important finding was that with all settling velocity models tested, practical identifiability problems appeared, indicating the need for better experimental designs. Finally, the flux curves associated with the SBSC-estimated Cho parameters were compared with the traditional Vesilind flux curves. Although both types of flux curves generally showed similar trends, the reliability of SBSC-based flux curve predictions is at present insufficient to warrant replacement of the traditional estimation of settling characteristics. © 1999 Elsevier Science Ltd. All rights reserved

Key words: activated sludge, batch settling curves, flux curves, parameter estimation, sedimentation, settling models

NOMENCLATURE

A: surface of the clarifier (m²)
k': Cho parameter (kg/m² h)
n: Vesilind parameter (m³/kg)
n': Cho parameter (m³/kg)
Q_u: underflow flow rate (m³/h)
r_h: Takacs settling parameter associated with the hindered settling component of the settling velocity equation (m³/kg)
r_p: Takacs settling parameter associated with the low-concentration, slowly settling component of the suspension (m³/kg)
SBSC: single batch settling curve
T_st: start-up time (h)
V_0: Vesilind maximum settling velocity (m/h)
V_0': Takacs maximum theoretical settling velocity (m/h)
V_ZS: stirred zone settling velocity (m/h)
X: sludge concentration (kg/m³)
X_min: minimum attainable sludge concentration (kg/m³)
X_t: threshold sludge concentration for activation of the minimal flux restriction (kg/m³)

INTRODUCTION

The design and operation of secondary clarifiers is commonly based on the solid flux theory (Ong, 1992; Daigger, 1995). The basic data required for the application of this theory can be obtained from multiple batch tests by which the stirred zone settling velocities (V_ZS) over a range of sludge concentrations (X) are measured (dilution experiment).
The relationship between X and V_ZS is most often characterised by the Vesilind equation [Eq. (1)], which contains two parameters: V_0 and n. The key underlying assumption is that the settling velocity depends only on the local sludge concentration. The Vesilind parameters can then be used to construct the flux curve, which is the key input of the solid flux theory. Obtaining the Vesilind parameters by a dilution experiment is unfortunately time consuming and labour intensive, and can result in scattered data (Ekama et al., 1997). An alternative approach consists of linking sludge volume indices such as SVI, DSVI and SSVI to V_0 and n through empirical functions (Daigger, 1995; Ozinsky and Ekama, 1995). This linking is done on the basis of extensive historical data sets on sludge volume indices and Vesilind parameters. These relationships have to be used with care because the V_ZS is influenced by factors not captured by the sludge concentration and sludge volume indices (Ozinsky and Ekama, 1995). A more structural analysis of the different SVI indices allowed Bye and Dold (1998) to seriously question the validity of any correlation between Vesilind parameters and sludge volume indices. This leaves operators and researchers at present with the choice between a labour-intensive approach and one that does not hold unconditionally.

Cacossa and Vaccari (1994) and Vanrolleghem et al. (1996) successfully fitted one-dimensional settling models to single batch settling curves (SBSC). Hence, batch settling curve based parameter estimation appears to have the potential of being a good means of efficiently obtaining information on sludge settling characteristics for use in the solid flux theory. However, each of the above studies was restricted to one specific kind of sludge, and in neither of the mentioned papers were fluxes calculated with the obtained parameter estimates compared against fluxes obtained with the traditional approach using dilution experiment V_ZS values.

Consequently, the work that resulted in this paper focused on the ability of SBSC-based characterisation of settling velocity models to serve the solid flux theory, as an alternative to the method based on dilution experiment V_ZS data. Attention was particularly paid to the practical identifiability of the parameters from such SBSC data sets, and to the comparison between the resulting flux curves and the flux curves obtained from traditional interpretation of dilution experiment V_ZS data sets.

METHODS

Experimental layout

The recently developed Settlometer (Vanrolleghem et al., 1996; commercially available from Applitek nv, Deinze, Belgium) automatically records complete batch sludge settling curves and quantifies the initial settling velocity (V_ZS) and the stirred sludge volume (SSV). Sludge settles in batch mode in a settling column that has a height of 70 cm and a diameter of 14 cm, and is equipped with a stirrer (0.3 rpm).

Four different kinds of sludge (Table 1) were brought to the laboratory, where they were aerated overnight before use in an experiment. In this way sludge with stable sedimentation characteristics was obtained (Vanderhasselt and Verstraete, 1999). The storage and settling experiments were performed at room temperature. Each sludge was thickened for 1 h and subsequently pumped into the Settlometer, where it was allowed to settle for 35 min. Next, the sludge was rehomogenised by briefly mixing with air. To set up an experiment at a lower sludge concentration, part of the sludge in the device was subsequently replaced by supernatant and a new settling curve was recorded. The amount of sludge to be removed at each dilution step was calculated in such a way that the obtained sludge concentrations were more or less equidistant. This procedure was repeated until six settling curves were recorded. Further, care was taken to ensure that at the highest sludge concentration applied a substantial descent of the sludge blanket still occurred, while at the lowest concentration a sharp sludge blanket/water interface could still be detected. The suspended solids concentration was determined by centrifugation (NBN 366.01, 1956). From the dilution experiment described above, the Vesilind parameters were obtained by linear least-squares regression of log(V_ZS) against the sludge concentration. Parameters obtained with this approach will be referred to as the traditional Vesilind parameters.
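The log-linear form of the Vesilind equation makes this regression a two-line computation. The sketch below is a minimal illustration in Python, with made-up (X, V_ZS) pairs standing in for dilution-experiment measurements; none of the numbers are from the paper.

```python
import numpy as np

# Hypothetical dilution-experiment data (illustrative, not from the paper):
X = np.array([2.2, 4.0, 6.1, 8.3, 10.5, 12.8])    # sludge concentration, kg/m3
V_zs = np.array([7.9, 5.4, 3.4, 2.2, 1.4, 0.85])  # stirred zone settling velocity, m/h

# Vesilind: V_ZS = V0 * exp(-n X)  =>  ln(V_ZS) = ln(V0) - n X,
# so V0 and n follow from a linear least-squares fit of ln(V_ZS) on X.
slope, intercept = np.polyfit(X, np.log(V_zs), 1)
V0 = np.exp(intercept)   # m/h
n = -slope               # m3/kg
print(f"V0 = {V0:.1f} m/h, n = {n:.3f} m3/kg")
```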
Modelling layout

In the past, quite a number of settling velocity models have been presented in the literature (Vesilind, 1968; Dick and Young, 1972; Takacs et al., 1991; Otterpohl and Freund, 1992; Cho et al., 1993; Watts et al., 1996). On lab settler data, Grijspeerdt et al. (1995) identified the Takacs et al. (1991) model [Eq. (2)] as the best. Watts et al. (1996) stated that for the prediction of sludge blanket dynamics the Takacs model [Eq. (2)] can be simplified to the Vesilind model [Eq. (1)]. The latter model is clearly the most frequently used (Ong, 1992; Ozinsky and Ekama, 1995; Bye and Dold, 1998). Cho et al. (1993) also advised that the number of parameters in a model should be limited to two in order to be simple enough for application in practice. In view of these reflections, the Vesilind model was identified as the most interesting one to start with. Later in the study the two-parameter Cho settling velocity model [Eq. (3)] was applied as well.

V_S = V_0 exp(-nX)    (1)

V_S = V_0' [exp(-r_h (X - X_min)) - exp(-r_p (X - X_min))]    (2)

V_S = k' exp(-n'X) / X    (3)

Table 1. "Traditional" Vesilind parameters for the different sludges

Sludge  Concentration range (kg/m³)  V_ZS range (m/h)  V_0 (m/h)  n (m³/kg)  Correlation coefficient
B       15.6-3.0                     0.5-3.6           10.3       0.19       0.9693
D       4.1-0.5                      0.4-4.7           5.5        0.64       0.9712
O       14-2.2                       0.8-9.1           12.2       0.21       0.9677
P       6.6-1.0                      0.7-3.2           6.9        0.36       0.9854

Batch settling experiments were thought to be an interesting information source for parameter estimation, as the data are the result of only the physical properties of the measuring device and the settling properties of the sludge. The settling device can be assumed to have constant properties. There are no bulk flows in the batch-wise operated device and, consequently, no associated momentum that influences the movement of the sludge blanket. Additionally, model calibration problems such as the location of the influent layer (Watts et al., 1996; Ekama et al., 1997) or the determination of short-circuit flows from the feed layer to the sludge recycle (Dupont and Dahl, 1995) are avoided.
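Written out as code, the three settling velocity models of Eqs. (1)-(3) are straightforward. The sketch below (Python; an assumption rather than anything published with the paper) also anticipates two implementation details discussed later in the text: clipping the Takacs expression to non-negative values and cutting the Cho model off below a small concentration to avoid its singularity at X = 0.

```python
import numpy as np

def v_vesilind(X, V0, n):
    """Eq. (1): V_S = V0 * exp(-n X)."""
    return V0 * np.exp(-n * np.asarray(X, dtype=float))

def v_takacs(X, V0p, r_h, r_p, X_min=0.0):
    """Eq. (2): V_S = V0' * (exp(-r_h (X - X_min)) - exp(-r_p (X - X_min))),
    clipped to non-negative values (a common implementation choice)."""
    Xs = np.maximum(np.asarray(X, dtype=float) - X_min, 0.0)
    return np.maximum(V0p * (np.exp(-r_h * Xs) - np.exp(-r_p * Xs)), 0.0)

def v_cho(X, k_prime, n_prime, X_low=0.1):
    """Eq. (3): V_S = k' * exp(-n' X) / X, set to zero below a small
    threshold (0.1 kg/m3 in the paper) to avoid the singularity at X -> 0."""
    X = np.asarray(X, dtype=float)
    v = np.zeros_like(X)
    m = X >= X_low
    v[m] = k_prime * np.exp(-n_prime * X[m]) / X[m]
    return v
```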
The settling velocity models were incorporated in a settler model discretised into 49 horizontal layers of equal volume. This discretisation can be seen as a finite-difference approximation of the underlying partial differential equation and was introduced by Stenstrom (1975). The large number of layers was chosen to resolve the detailed behaviour of the settling process, as recommended by Jeppsson and Diehl (1996). Further, this number of layers is required to obtain a resolution that is similar to the accuracy of the measured settling curve.

The layer structure was combined with the minimal flux restriction, which limits the gravitational flux from a certain layer to the layer below to the minimum of the gravitational fluxes of both layers (Stenstrom, 1975). Vitasovic (1989) activated this restriction only when the sludge concentration reached a certain minimum concentration (X_t), i.e. 3 kg/m³. Ekama et al. (1997) identify this restriction as being necessary for numerical stability of the simulation. However, there is also a physical basis for this concept: a layer cannot unconditionally discharge sludge to a layer in which there is already a substantial amount of sludge. To avoid that X_t would influence the result without being actually estimated, it was decided to activate the minimal flux restriction throughout the whole concentration region (X_t = 0 kg/m³) for all sludges considered. This does not pose a problem for the description of batch settling data, because the sludge concentrations that determine the height of the sludge blanket are relatively high and the sludge concentration increases continuously from top to bottom. Further, for Vesilind- and Cho-based settler models these sludge concentrations are located in the descending part of the flux curve, so the minimal flux restriction will always be active regardless of whether X_t is equal to 0 or 3 kg/m³.

The height of the sludge blanket was defined as the location of the uppermost layer with a sludge concentration higher than a certain threshold. Vitasovic (1986) and Vanrolleghem et al. (1996) used 3 kg/m³ as the threshold. From a numerical simulation with different sludge blanket thresholds it was found that the definition of this threshold is not critical. Evidently, this statement only holds as long as the threshold sludge concentration is lower than the sludge concentration at the start of the experiment. For practical reasons this sludge blanket threshold was therefore set to 1 kg/m³, being lower than the initial sludge concentration of all modelled experiments.

When a settling curve is recorded (Fig. 1), one observes that some start-up phenomena take place before the descent of the blanket starts. To incorporate these phenomena in the batch settling model as well, the settling velocity model was slightly changed. This was done by multiplying the settling velocity model with an on-off term. This term is 0 when the simulation time (t) is smaller than the start-up time (T_st) and 1 when it is larger than T_st. In this way the start-up phenomena are incorporated in the model without affecting the settling part itself. Moreover, this method has the advantage that the settling velocity model is activated as soon as the blanket starts to descend.
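A minimal version of such a layered batch settler could look as follows. This is a sketch under stated assumptions, not the authors' implementation: explicit Euler time stepping, an illustrative time step and horizon, layer 0 at the top, the minimal flux restriction applied with X_t = 0, and the on-off start-up term T_st; it reuses the v_vesilind function sketched above.

```python
import numpy as np

def simulate_sbsc(X0, v_model, T_st, H=0.70, n_layers=49,
                  t_end=0.6, dt=1e-4, blanket_thresh=1.0):
    """Finite-difference batch settler (Stenstrom-type layer model).

    X0       initial sludge concentration (kg/m3), uniform over the column
    v_model  function mapping concentrations to settling velocities (m/h)
    T_st     start-up time (h) for the on-off term
    Returns arrays of times (h) and sludge blanket heights (m).
    """
    dz = H / n_layers                        # layer height, m
    X = np.full(n_layers, X0, dtype=float)   # layer 0 = top of the column
    times, blanket = [], []
    for t in np.arange(0.0, t_end, dt):
        # on-off start-up term: no settling before T_st
        v = v_model(X) if t >= T_st else np.zeros_like(X)
        J = X * v                            # gravitational flux, kg/(m2 h)
        # minimal flux restriction with X_t = 0: the flux from layer i
        # to layer i+1 is the smaller of the two layers' fluxes
        J_down = np.minimum(J[:-1], J[1:])
        dX = np.zeros_like(X)
        dX[:-1] -= J_down / dz               # mass leaving each layer
        dX[1:] += J_down / dz                # mass entering the layer below
        X = X + dt * dX
        # blanket = uppermost layer above the threshold concentration
        above = np.nonzero(X > blanket_thresh)[0]
        times.append(t)
        blanket.append(H - above[0] * dz if above.size else 0.0)
    return np.array(times), np.array(blanket)
```

A call such as simulate_sbsc(11.0, lambda X: v_vesilind(X, 12.2, 0.21), T_st=0.067) would then mimic one settling curve for an O-sludge-like suspension (parameter values illustrative, taken from Table 1).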
RESULTS

Before starting the parameter estimation, the clarifier model was evaluated from a theoretical point of view. Analysis of the concentration profiles showed that the evolution of the sludge blanket in the zone settling phase is determined only by the initial sludge concentration. Only when the settling curve starts to bend (i.e. when the settling velocity decreases) do higher concentrations have a significant impact on the evolution of the sludge blanket. Hence, only from that time instant onwards do the data contain information on more than one sludge concentration. Therefore, it is postulated to be an essential experimental condition that the obtained settling curve shows a sufficient amount of bending before reliable parameter estimation can be performed on a SBSC. This statement can be supported from a mathematical point of view: only one additional parameter is needed to describe a straight line once one point (the initial height at the start of the sedimentation phase) of the curve is known.

Fig. 1. Settling curves from the dilution experiment with the B sludge.

Settling curves at different sludge concentrations were recorded for each sludge. As an example, the resulting settling curves for the B sludge are depicted in Fig. 1. For the different sludges, the traditional Vesilind parameters resulting from such dilution experiments are summarised in Table 1. Looking at the parameter values, one can conclude that sludges with good settling properties (B, O) as well as poorly settling sludges (D, P) were included in the experimental plan.

Cross-validation 1: traditional Vesilind parameters to single batch settling curve

For each experimental data set, the batch settler model was solved using the traditional Vesilind parameters. For T_st, a value was used that gave a minimal sum of squared errors (SSE) between the simulation result and the data. Provided that the dynamics in the sludge blanket were rather slow, reasonable agreement (and, thus, cross-validation) could be obtained between the data and the simulation results, e.g. Fig. 2(A). Still, when the zone settling phase is ending and the settling curve starts to bend [in Fig. 2(A) between t = 0.2 and t = 0.5 h], a small discrepancy appears between the data and the simulation result. For settling curves with faster dynamics (higher V_ZS), e.g. Fig. 3(A), it becomes obvious that the discrepancy grows in this transition zone. Still, although the model (with the Vesilind settling velocity model and parameters obtained from a dilution experiment) is not able to predict a solids profile that allows a smooth transition between the zone settling phase and the compaction phase, it is able to accurately describe the zone settling phase and the final height of the blanket. Similar trends could be observed with the other sludges. It can be concluded that the model adequately describes slow dynamic batch settling curves when it is calibrated with parameters from a dilution experiment.

Fig. 2. Data and model output from the Vesilind based model for the O sludge at 11.0 kg/m³. (A) Traditional calibration on a set of V_ZS. (B) Optimisation result: V_0 = 13.1 m/h, n = 0.219 m³/kg, T_st = 6.70E-02 h.

Fig. 3. Data and model output for the B sludge at 5.0 kg/m³. (A) For the Vesilind based model with traditional calibration on a set of V_ZS. (B) Optimisation result of the Cho based model: k' = 29.6 kg/m² h, n' = 0.12 m³/kg, T_st = 1.10E-02 h.

Parameter estimation from a slow dynamic single batch settling curve

For estimation from a SBSC, a direction set algorithm (Brent, 1973) was used to find the parameters (V_0, n and T_st) that would result in a minimal SSE between the observed data and the simulation results.
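The paper does not reproduce its estimation code; purely as an illustration, a direction-set search in the spirit of Brent (1973) is available in SciPy as the Powell method. The sketch reuses simulate_sbsc and v_vesilind from above and, lacking the real measurements, generates a synthetic "observed" curve from the Fig. 2(B) parameter values.

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic "observed" curve for illustration only, generated with the
# Fig. 2(B) parameter values (O sludge, X0 = 11.0 kg/m3).
t_obs, h_obs = simulate_sbsc(11.0, lambda X: v_vesilind(X, 13.1, 0.219),
                             T_st=0.067)

def sse(theta, t_obs, h_obs, X0):
    """Sum of squared errors between observed and simulated blanket heights."""
    V0, n, T_st = theta
    if V0 <= 0.0 or n <= 0.0 or T_st < 0.0:
        return 1e12                    # crude penalty to keep the search feasible
    t_sim, h_sim = simulate_sbsc(X0, lambda X: v_vesilind(X, V0, n), T_st)
    return float(np.sum((h_obs - np.interp(t_obs, t_sim, h_sim)) ** 2))

res = minimize(sse, x0=[8.0, 0.30, 0.02], args=(t_obs, h_obs, 11.0),
               method="Powell")        # Powell's direction-set algorithm
V0_hat, n_hat, Tst_hat = res.x
```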
For the data of Fig. 2(A) this resulted in the fit depicted in Fig. 2(B). As can be seen, for this slow dynamic SBSC the obtained fit is rather good. The fit in Fig. 2(B) is better than the fit of Fig. 2(A). This is as expected, because the simulated curve in Fig. 2(A) is not the result of a parameter estimation exercise but the result of a cross-validation of parameters obtained from the dilution experiment. Further, the estimated Vesilind parameters (V_0 = 13.1; n = 0.22) can be considered rather close to the ones obtained from the traditional determination (V_0 = 10.3; n = 0.19).

Although the above is important for the quality assurance of lab experiments, it is the flux curves in which practitioners are interested. To evaluate this, the flux curves for both parameter sets (traditional vs SBSC estimation) were plotted against each other (Fig. 4). As can be seen from this figure, both curves are rather close to each other, although in the upper concentration region differences run up to 17%. Fig. 4 can also be considered as a cross-validation from the SBSC approach to the dilution approach. In the present case the cross-validation result is reasonable. However, in other cases (data not shown) the results were poorer.

Fig. 4. Flux curves for the traditional and the SBSC (X = 11.0 kg/m³) estimated Vesilind parameters for the O sludge.

Evaluation of practical identifiability

When estimating parameters it is important to check whether the available data are informative enough to give unique values to the model's parameters. When the information content of a data set is too low, it is possible that different parameter sets can be found giving an equal fit. This practical identifiability phenomenon surfaces when different estimates are found when the parameter search algorithm is started from different initial parameter guesses. To assess possible practical identification problems in this application, the sum of squared errors was calculated for a large number of parameter sets in a parameter subspace. In the resulting Fig. 5 one observes no sharp minima but rather broad valleys. This clearly points towards identification problems, i.e. it is not straightforward to find the lowest sum of squared errors, leading to the best parameter set. This identifiability problem was further confirmed by tests in which the search algorithm was started from different initial parameter sets, resulting in different optimisation results.

Fig. 5. Simulation error as a function of different parameter combinations for the O sludge at 11.0 kg/m³. Top: n = 0.219 m³/kg. Bottom: T_st = 0.067 h.

As the SBSC approach was not successful in finding unique parameter values, it was evaluated whether a multiple response fitting method as used by De heyder et al. (1997) could provide a solution. In this approach, parameters are estimated using different settling curves obtained with different concentrations of one kind of sludge. The identification is then based on the fact that each settling curve contains different information due to the different initial sludge concentration, and on the fact that for all curves the same Vesilind parameters must hold. Note, however, that T_st is dependent on V_ZS (Vanderhasselt et al., 1999) and hence is different for each curve. Therefore, an optimal T_st value was determined for each settling curve. Subsequently, for each settling curve the sum of squared errors was calculated in the V_0/n parameter space. The resulting error surfaces were quite similar in shape to the one at the bottom of Fig. 5. Combining these surfaces did not result in a graph with a sharp minimum. Consequently, multiple response fitting was not successful in solving the identifiability problem.
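The brute-force mapping behind a figure like Fig. 5 is easy to reproduce in outline: evaluate the SSE objective on a grid in the V_0/n plane at a fixed T_st and inspect the surface for a sharp minimum. The grid ranges below are illustrative guesses; the sketch reuses sse, t_obs and h_obs from the fitting sketch above.

```python
import numpy as np

# SSE surface on a V0/n grid at fixed T_st (illustrative ranges); broad,
# flat valleys instead of one sharp minimum indicate practical
# identifiability problems.
V0_grid = np.linspace(8.0, 18.0, 15)
n_grid = np.linspace(0.10, 0.35, 15)
err = np.array([[sse([V0, n, 0.067], t_obs, h_obs, 11.0) for n in n_grid]
                for V0 in V0_grid])

# Multiple-response variant: with curves recorded at several initial
# concentrations (each with its own pre-fitted T_st), sum the per-curve
# surfaces, since the same (V0, n) must hold for all of them:
#   err_total = err_curve_1 + err_curve_2 + ...
print(err.min(), err.max())
```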
Parameter estimation from a fast dynamic single batch settling curve

For sedimentation curves with faster dynamics it was not possible to obtain Vesilind parameters that described the curve adequately. In particular, no parameter set could be found that was able to describe the transition between the zone settling phase and the compression phase substantially better than the traditional Vesilind parameters [Fig. 3(A)].

Because this Vesilind approach was not fully successful, alternative settling velocity models as proposed by Takacs et al. (1991) and Watts et al. (1996) were evaluated. Neither of these models was able to describe the data substantially better than the Vesilind model. A hybrid Vesilind settling velocity model [Eq. (3)] that was proposed by Cho et al. (1993) to describe the relation between V_ZS and X in dilution experiments was tested subsequently. It is important to realise that special attention must be directed to the lower concentration region when this Cho model is used to calculate settling velocities: when the sludge concentration approaches zero, the settling velocity goes to infinity, which is not physically possible. To prevent this, a threshold concentration below which the settling velocity is set to zero was incorporated. This constant was arbitrarily set at 0.1 kg/m³. The precise value of this threshold was found not to be critical for this particular concentration region. One should note the correspondence between this threshold concentration and the non-settleable fraction X_min in the Takacs settling velocity model.

Among all models tested, the Cho model gave the best fits to the settling curves under consideration, i.e. it gave the lowest SSE. The better ability of the Cho model to describe complete batch settling curves is obvious if one compares the fit of the Vesilind model [Fig. 3(A)] with that of the Cho model [Fig. 3(B)]. While the Cho model provided better fits, indicating its superior structural flexibility, identifiability problems similar to those observed with the Vesilind model were still encountered.
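All of the comparisons that follow are made in terms of flux curves, which derive directly from a settling velocity function as F(X) = X V_S(X). A small illustrative sketch, reusing the model functions above: the D sludge Vesilind values are from Table 1, while the Cho values are placeholders standing in for SBSC estimates, since the estimated values are not tabulated in the text.

```python
import numpy as np

X = np.linspace(0.2, 12.0, 300)                   # kg/m3
F_vesilind = X * v_vesilind(X, V0=5.5, n=0.64)    # D sludge, Table 1
F_cho = X * v_cho(X, k_prime=10.0, n_prime=0.25)  # hypothetical SBSC estimate
# Note: above the low-concentration cut-off, the Cho flux reduces to
# X * k'exp(-n'X)/X = k'exp(-n'X), a monotonically decreasing exponential,
# so a Cho flux curve has no rising limb at low concentrations.
```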
model traditionally calibrated on a set of V ZS 's (closed symbols)and the Cho model obtained from the middle concentration SBSC's(opensymbols).Fig.8.Natural logarithm of the V ZS in function of the P sludge concentration.Estimation of sludge sedimentation parameters 403ameters resulting from the SBSC approach could describe the X/V ZS relationships as observed in the dilution experiment(cross-validation).To this end the natural logarithm of the observed and par-ameter predicted V ZS were plotted as function of the sludge concentration,e.g.Fig.8.The Cho par-ameters resulting from the SBSC approach described the observed V ZS/X data less accurate than the Vesilind parameters from the dilution ex-periment.Also the observed correspondence was even lower for the poorly settling sludges as D and P compared to the well settling sludges.DISCUSSIONThe Vesilind parameters calculated from a di-lution experiment V ZS's allow a49layer sedimen-tation model to roughly describe complete experimental batch settling curves.However,when the settling dynamics observed in the experiment increase,the discrepancy between the simulation result and the data enlarges,especially at the tran-sition between zone settling and compression phases.On this observation di erent comments can be made:.When the sludge blanket reaches the bottom of the batch settler,compression phenomena start (or might start)to determine the descent of the sludge blanket in a direct way.In the com-pression phase sludge particles lean on each other.This is a fundamentally di erent situation than in the zone settling phase which is the only phenomenon characterised in the dilution exper-iment(Ekama et al.,1997).Here,the descent of the blanket is depending only on the equilibrium between gravitational and hydraulic friction forces..When the sludge blanket is low,it is also close to the conical bottom of the settling column.Some boundary processes related to the conical bottom and scraper could in¯uence the observed pro-cesses.A hampering of the blanket movement at the bottom could be substantiated by the®nding that in nearly all cases the¯ux curves obtained from parameter estimation on a SBSC are situ-ated below the traditional Vesilind¯ux curves at the high concentration end.The Cho model[Eq.(3)]was found to be the most e ective in®tting the observed settling curves. Grijspeerdt et al.(1995)identi®ed the Takacs model [Eq.(2)]to be the best,although they didn't include the Cho model in their comparison.Here,it has to be mentioned that the values Grijspeerdt et al. (1995)found for the Takacs r p and r h parameters are rather close to each other indicating that both exponential terms interfere with each other over quite a broad range.This is in contradiction with the mechanistic explanation Takacs et al.(1991) give to their model.Further,Grijspeerdt et al.(1995),like Takacs et al.(1991)and Watts et al. 
(1996),used not a settling curve but a discrete solids pro®le as experimental data for the parameter estimation.Considering the empirical nature of the settling models it would not be surprising that the optimal model structure is depending on the nature of the experiments to be described.In this respect cross-validation of parameters between the two types of data sets as performed by Cacossa and Vaccari(1994),is certainly a task for further research.Because data on concentration pro®les were lack-ing,the Cho model[Eq.(3)]was cross-validated in an alternative way:it was evaluated on its ability to describe the evolution of the V ZS as a function of the suspended solids concentration(dilution exper-iment).For the good settling sludges like the O and B sludge,the model was able to reasonably describe the observed behaviour.However,for the poorly settling sludges like P and D,the Cho model[Eq.(3)]was not able to describe the trends as well as the Vesilind model[Eq.(1)].Hence,the Vesilind model[Eq.(1)]appears to be better in describing the V ZS obtained from dilution experiments, whereas the Cho model[Eq.(3)]is better in describ-ing complete single batch settling curves.This points clearly towards the empirical nature of the used settling velocity models and on the fact that not all processes involved in the batch settling pro-cess are understood and taken into account.In the past,practical identi®cation problems as-sociated with the estimation of parameters from a single settling curve have only been addressed to a minor extent.Cacossa and Vaccari(1994)(vaguely) mentioned that the parameters estimated by their algorithm were``somewhat sensitive to the initial guesses used''.These authors stated further that reasonable parameter estimation was only possible if the sludge volume was less than half the initial volume after60min of sedimentation in their120-cm tall stirred column.However,no detailed analy-sis was reported.In the current study identi®ability problems were still encountered with curves that had a sludge volume less than a quarter of the in-itial one after30min of settling in a70-cm tall col-umn.Vanrolleghem et al.(1996)stated that no more than three parameters in a Takacs based batch settler model could be identi®ed from SBSC's recorded with the Settlometer.Despite these practi-cal identi®ability problems encountered while using SBSC's,the determination of Vesilind parameters through the more labour intensive,traditional di-lution experiment can also be troublesome as Ekama et al.(1997)report that scattered data can be obtained.From slow dynamic SBSC's one can obtain¯ux curves which are quite close to the ones obtained by a traditional dilution experiment.However,this is not always the case.For fast dynamic SBSC's the traditional Vesilind model is not able to describeAlexis Vanderhasselt and Peter A.Vanrolleghem 404。
Measuring absorption coefficientsin small volumes of highly scattering media:source-detector separations for which pathlengths do not depend on scattering propertiesJudith R.Mourant,Irving J.Bigio,Darren A.Jack,Tamara M.Johnson,and Heather lerThe noninvasive measurement of variations in absorption that are due to changes in concentrations ofbiochemically relevant compounds in tissue is important in many clinical settings.One problem withsuch measurements is that the path length traveled by the collected light through the tissue depends onthe scattering properties of the tissue.We demonstrate,using both Monte Carlo simulations andexperimental measurements,that for an appropriate separation between light-delivery and light-collectionfibers the path length of the collected photons does not depend on scattering parameters for therange of parameters typically found in tissue.This is important for developing rapid,noninvasive,andinexpensive methods for measuring absorption changes in tissue.©1997Optical Society of AmericaKey words:Absorbance measurements,turbid media,tissue absorption,concentration measure-ments.1.IntroductionNoninvasive,in vivo methods for measuring absorp-tion coefficients of tissue are potentially very useful biomedical tools.Applications include measure-ments of endogenous compounds such as hemoglobin, bilirubin,and cytochrome oxidase,as well as deter-mining concentrations of exogenous chromophores such as photodynamic therapy and chemotherapy drugs.The specific case of chemotherapy drugs can be used to illustrate the potential benefits of a non-invasive or minimally invasive method for measuring local tissue concentrations.The therapeutic benefit of chemotherapy drugs is determined by the tissue concentration of the drug in the targeted site.The only minimally invasive check available to the oncol-ogist is to track the blood-serum concentration and assume a relationship to the tissue concentration.1 This assumption is unreliable.2Other methods for tracking pharmicokinetics locally,such as microdi-alysis,are invasive.3The ability to track compound concentrations by examining a change in absorption that is due to the presence of the drug in tissue non-invasively would be a tremendous advantage in clin-ical pharmacology,especially for the development of new drugs.4Most work to date has concentrated on making optical measurements in a geometry for which the diffusion approximation is applicable,5–8although neural-network analysis methods have also been used.9,10In this paper we describe a method for measuring small absorption changes by means of a backscattering geometry for which the diffusion ap-proximation is not valid.Rather than a frequency domain or time resolved approach,a simpler and con-sequently less expensive steady-state method is ap-plied.A further advantage of making steady-state measurements is that a wavelength range greater than500nm can be measured in less than1s.The underlying idea in this paper is that,when light delivery and collection are performed withfiber optics in contact with the tissue surface,there is an optimum separation between the source and the de-tectorfibers for which the dependence of the path length on scattering parameters is minimal.This idea can arise from examination of the path lengthsThe authors are with Bioscience and Biotechnology Group CST-4,MS E535,Los Alamos National Laboratory,Los Alamos, New Mexico87545.Received27September1996;revised manuscript received23 December1996.0003-6935͞97͞225655-07$10.00͞0©1997Optical Society of 
Americafor very small andvery large source–detector sepa-rations.In general,for very small fiber separations,the average path length of the collected photons is longer for a less-scattering media with higher g val-ues than for a highly scattering media with lower g values because the photons need to undergo a certain number of high-angle scattering events to nearly re-verse their direction of travel.11At very large fiber separations one expects the path length to be longer for the more highly scattering media,as is obvious in the case of light transmission.Therefore there must be some crossover regime in the middle for which the average path lengths for different scattering media are quite similar.The first part of the approach taken in developing this method was to investigate general characteris-tics of light-scattering measurements at small fiber separations using Monte Carlo simulations.Based on general trends observed in the Monte Carlo sim-ulations measurement techniques were then devel-oped by experimentation on tissue phantoms.2.General PrinciplesKnowledge of the path length L traveled by collected photons permits the use of Beer’s law,given in Eq.͑1͒,where I is the collected light,a is the absorption coefficient of the medium,and I o is the incident light intensity:I ϭI o exp ͑Ϫa L ͒.(1)For the purpose of discussion we assume that the path length does not depend on wavelength.Then,by comparing the signals at two different wave-lengths,it is possible to determine the difference in absorption coefficients at the two wavelengths:a ͑1͒Ϫa ͑2͒ϭϪlnͫI ͑1͒͞I o ͑1͒I ͑2͒͞I o ͑2͒ͬͲL .(2)For a chromophore with a sharp absorption band ͑e.g.,absorbing at 1but not 2͒,Eq.͑2͒should have great utility for measuring changes in the concentra-tion of a chromophore when L is known.We will show below that,for certain source–detector separa-tions,the path length depends only weakly on the scattering properties over the range found in tissue.We also find that small changes in absorption have a negligible effect on path length.Therefore,if the difference in absorption at 1and 2is small ͑we are only concerned with small changes in absorption ͒and appropriate fiber separations are chosen,the above assumption that the path length does not depend on wavelength is valid.3.Monte Carlo SimulationsTo investigate characteristics of light scattering with small source–detector separations,we performed Monte Carlo simulations by using a previously de-scribed code.10Reflection and refraction at inter-faces between different media and the numerical aperture of the light-delivery and light-collection fi-bers are accounted for in the code.Henyey–Greenstein phase functions were used in the simulation,and the effects of different values of g ϭ͗cos ͘were assessed.The delivery and collection fibers were modeled as being 200m in diameter with a numerical aperture of 0.34.In Fig.1the average path length traveled by the collected photons is plotted versus the scattering coef-ficient,s ,of the medium with the anisotropy factor,g ,adjusted to keep s Јϭs ͑1Ϫg ͒within the range 7.5–15cm Ϫ1.Results are shown for three sets of source-and detector-fiber center-to-center separations,d ,and values of a .For d ϭ1.75and a ϭ0.5cm Ϫ1the greatest percent variation in path length for dif-ferent scattering parameters is 9.5%.This demon-strates that for the correct choice of fiber separation the average path length traveled by the collected pho-tons depends only weakly on scattering characteristics.Interesting issues with regard to Fig.1are 
͑1͒whether the essential lack of dependence on scattering properties is valid for the range of absorption param-eters found in tissue and ͑2͒whether the choice of fiber separation and the variation in path length depend on properties of the optical delivery and collection system.With regard to the second issue,fiber numerical aper-ture was found to have a negligible effect on the path length.Path lengths determined from Monte Carlo simulations with numerical apertures of 0.22and 0.34agreed within errors for all but two data points for d ϭ1.75and a ϭ0.5cm Ϫ1.The average path lengths of photons traveling from a source to a collection fiber may not be the same in media with the same scattering coefficients but dif-Fig.1.Average path lengths of photons traveling from a delivery fiber to a collection fiber ͑in a backscatter geometry such as that shown in Fig.1͒as calculated by Monte Carlo simulations.Re-sults are shown for three sets of source-and detector–fiber center-to-center separations,d ,and values of a .For d ϭ1.75and a ϭ0.5cm Ϫ1the percent difference in path lengths is 9.5%.For d ϭ1.75and a ϭ0.1cm Ϫ1the difference in path lengths is ϳ16%.For d ϭ3.0and a ϭ0.1cm Ϫ1the percent difference in path lengths is 30%.The error bars are smaller than the symbols and were determined by several simulations with different numbers of incident photons.Simulations in which more than 400photons were collected all resulted in the same value ͑to within 0.1%͒of the path length.Deviations from this value when smaller numbers of photons were collected were used to estimate the errors.ferent absorption coefficients.To investigate how this might affect the choice of optimal fiber separa-tion,simulations with the same scattering parame-ters as in Fig.1were performed,but with absorption coefficients of 2.0cm Ϫ1and 0.1cm Ϫ1.For an absorp-tion coefficient of 2cm Ϫ1,it was found that at source–detector separations of 1.25,1.75,and 2.25mm the variation in path lengths was 17%,7%,and 14%,respectively.For an absorption coefficient of 0.1cm Ϫ1there is a 40%variation between the longest and the shortest average path lengths at a separation of 1.0mm.At fiber separations of 1.5,1.75͑see Fig.1͒,1.87,and 2.0mm the variation in path lengths is 16%–20%.When fiber separation is increased to 3.0mm,the variation in path lengths increases back to ϳ40%͑see Fig.1͒.This suggests that there is a range of fiber separations for which the path length is nearly independent of scattering parameters.Fur-thermore,the choice of optimum source–detector sep-aration may be valid for a range of absorption coefficients.For example,the 1.75-mm separation found to have less than a 10%variation in path length as a function of scattering properties when the absorption coefficient is 0.5cm Ϫ1would also work well for measurements in a medium with an absorp-tion coefficient of 0.1or 2.0cm Ϫ1.4.Experimental MethodsA cw xenon arc lamp ͑Optical Radiation Corporationcompact lamp,Azusa,California ͒was used as a light source.All optical fibers were 200m in diameter with a numerical aperture of 0.22͑C Technologies,Verona,New Jersey,part number MM0002͒.One fiber was used to deliver light to the tissue phantom,and several fibers were used for light collection at different distances from the delivery fiber.The in-cident light power on the sample was ϳ1.5mW.The source–detector separations were 0.7,0.95,1.2,1.42,1.7,2.18,2.4,2.65,2.98,and 3.2mm.Light from eight of the collection fibers was simultaneously dis-persed by a spectrograph ͑Acton Research 
4. Experimental Methods

A cw xenon arc lamp (Optical Radiation Corporation compact lamp, Azusa, California) was used as a light source. All optical fibers were 200 μm in diameter with a numerical aperture of 0.22 (C Technologies, Verona, New Jersey, part number MM0002). One fiber was used to deliver light to the tissue phantom, and several fibers were used for light collection at different distances from the delivery fiber. The incident light power on the sample was ~1.5 mW. The source–detector separations were 0.7, 0.95, 1.2, 1.42, 1.7, 2.18, 2.4, 2.65, 2.98, and 3.2 mm. Light from eight of the collection fibers was simultaneously dispersed by a spectrograph (Acton Research Corporation, Acton, Massachusetts, spectrometer number 275i) and focused onto a two-dimensional thermoelectrically cooled CCD array (Princeton Instruments, Princeton, New Jersey). In this manner one axis of the CCD array corresponded to wavelength while the other axis corresponded to collection fibers at different distances from the source fiber. Signal integration times were less than or equal to 1 s. To account for any fluctuations in lamp intensity, some light from the lamp was attenuated and sent directly to the spectrometer. A schematic of the experimental setup is shown in Fig. 2.

Fig. 2. Schematic of the experimental setup.

Aqueous suspensions of polystyrene spheres (Duke Scientific Corporation, Palo Alto, California) and Intralipid-10% (Pharmacia, Inc., Clayton, New Jersey) were used as scattering media. Measurements were made in 5%, 10%, and 15% concentrations of Intralipid-10%. Based on our previous measurements of the scattering coefficient of Intralipid-10%, these concentrations correspond to reduced scattering coefficients of 7, 15, and 22 cm⁻¹ at 550 nm, and reduced scattering coefficients of 5, 10, and 16 cm⁻¹ at 650 nm.12 The value of g at 550 nm is ~0.80, and at 650 nm it is ~0.79.13 The polystyrene spheres, 0.890 or 0.913 μm in diameter, were used at concentrations of 0.2%, 0.4%, and 0.6% by weight. This corresponds to reduced scattering coefficients of ~7, ~14, and ~21 cm⁻¹ at 550 nm, where g = 0.92, and reduced scattering coefficients of ~6, ~12, and ~18 cm⁻¹ at 650 nm, where g = 0.91, as calculated by Mie theory. (An index of refraction of 1.59 was used for the polystyrene spheres.) Fifty milliliters of the suspensions were placed in standard 50-ml beakers, and the probe was placed on or near the surface of the suspensions for the measurements. For absorbers, India ink and the dye Direct Blue 71 (Aldrich Chemicals, Milwaukee, Wisconsin, ACS number 4399-55-7) were used. The Direct Blue dye was added to the scattering solutions in aliquots so as to change the absorption coefficient each time by ~0.011 cm⁻¹ at 585 nm. Spectra were taken with no Direct Blue dye present and after each addition of Direct Blue dye.

Data were analyzed by division of each spectrum taken with Direct Blue dye present by the spectrum taken before Direct Blue dye was added. The spectra were then normalized to 1 at 800 nm, a wavelength at which Direct Blue dye does not absorb. (This normalization was done because the level of the solution changed during the course of adding blue dye, which slightly changes the total amount of light detected, although it has no measurable effect on the wavelength dependence of the collected light.) Finally, the negative natural log was taken over the entire wavelength range. These spectra will be referred to as analyzed spectra, and examples are shown in Fig. 3. The areas under the curves from 575 to 595 nm (approximately the peak of the absorption) were then plotted as a function of the absorption coefficient of the added blue dye at 585 nm. Figure 4(a) is an example of such a plot.

The absorbances of the Direct Blue dye and of the India ink in nonscattering, aqueous solutions were measured on a Cary 5E spectrophotometer (Varian Instruments, Sugar Land, Texas), and the spectral shape of the Direct Blue dye was determined to be independent of dye concentration. Spectra of the blue dye and India ink are shown in Fig. 5.

The final part of the experimental investigations was to employ knowledge of the optimum fiber separation for deducing the amount of Direct Blue dye added to a scattering solution.
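A sketch of the spectral-analysis pipeline just described (divide by the dye-free spectrum, normalize at 800 nm, take the negative log, integrate the 575–595-nm band); the function and variable names are ours, not from the original analysis software.

```python
import numpy as np

def analyzed_spectrum(I_dye, I_ref, wl, norm_wl=800.0):
    """Form an 'analyzed spectrum': divide by the dye-free reference,
    normalize to 1 at norm_wl (nm), then take -ln.
    wl, I_dye, I_ref are 1-D arrays over wavelength."""
    ratio = I_dye / I_ref
    ratio = ratio / ratio[np.argmin(np.abs(wl - norm_wl))]
    return -np.log(ratio)

def band_area(spec, wl, lo=575.0, hi=595.0):
    """Area under the analyzed spectrum over the 575-595 nm band."""
    m = (wl >= lo) & (wl <= hi)
    return np.trapz(spec[m], wl[m])
```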
The protocol was as follows: measurements of scattering solutions of spheres with India ink were made before and after the addition of aliquots of Direct Blue dye. The data were analyzed without knowledge of the amount of dye that was added.

5. Results

In Fig. 4(a) the areas under the traces of analyzed spectra (see Fig. 3) from 575 to 595 nm are plotted versus the absorption coefficient of the added blue dye at 585 nm for a source-fiber to detector-fiber separation of 1.42 mm. The different curves are for the six different scattering solutions. (India ink was not added to these scattering solutions.) Despite the large range of scattering properties in the different tissue phantoms, the curves in Fig. 4(a) are all very similar. For both larger and smaller fiber separations the slopes of curves like those plotted in Fig. 4(a) depend strongly on scattering properties. This is shown for a fiber separation of 2.98 mm in Fig. 4(b).

To determine the best choice of fiber separation, the percent difference in the area from 575 to 595 nm of the analyzed spectra for the different scattering solutions was plotted versus the fiber separation in Fig. 6. This percent variation in area was calculated according to Eq. (3), where A_min is the minimum value of the area from 575 to 595 nm and A_max is the maximum value of the area from 575 to 595 nm for the set of scattering suspensions measured:

percent variation = [(A_max − A_min)/A_min] × 100.   (3)

Values for A_min and A_max were determined by fitting the area curves (e.g., the curves in Fig. 4) to third-order polynomials and use of the values of the fits at an absorbance of 0.15 cm⁻¹. In Fig. 6 the percent variations in the area from 575 to 595 nm of the analyzed spectra are shown for the Intralipid suspensions, for the sphere suspensions, and for the Intralipid and sphere suspensions combined. The percent variation in area is lowest at fiber separations from 1.2 to 1.7 mm.

Fig. 3. Spectra taken after the addition of Direct Blue dye, which have been divided by a spectrum with no blue dye, normalized to 1 at 800 nm, and with their negative natural log taken. (The source–detector separation was 1.42 mm. The scattering medium was 10% Intralipid-10%. The values of μa given in the figure caption are the average absorption coefficients of the scattering solution from 575 to 595 nm that are due to the addition of the Direct Blue dye.)

Fig. 4. (a) Areas under the curve from 575 to 595 nm of analyzed spectra plotted versus the average absorption coefficient of the blue dye from 575 to 595 nm; dye was added to the scattering solutions for a source–detector separation of 1.42 mm. (b) Same for a fiber separation of 2.98 mm.

Fig. 5. Absorption spectra of India ink and blue dye.

The effect of the absorption of the starting medium on the optimum choice of source–detector fiber separation was determined. The above measurements on the three different concentrations of polystyrene spheres were repeated, except that India ink was added to the original solutions to provide a background absorption. The percent difference in the areas from 575 to 595 nm of the analyzed spectra is plotted versus fiber separation in Fig. 7. The percent variation in area is smaller for the data with absorption coefficients of 0.4 and 0.8 cm⁻¹ than for the data taken with a background absorption of nearly zero for fiber separations greater than ~1.5 mm.

The final stage of the experimental investigations was to estimate the amount of Direct Blue dye added to a scattering solution. Measurements were made before and after the addition of Direct Blue dye to solutions of spheres to which India ink had been added to yield an absorption coefficient of 0.4 cm⁻¹ at 585 nm.
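The percent-variation metric of Eq. (3) above, together with the cubic fits used for A_min and A_max, could be computed along the following lines (a sketch; the array layouts are assumptions):

```python
import numpy as np

def area_at(mu_add, areas, mu_eval=0.15):
    """Fit one area-vs-added-absorption curve (cf. Fig. 4) with a cubic
    and evaluate it at an added absorption of 0.15 cm^-1."""
    return np.polyval(np.polyfit(mu_add, areas, 3), mu_eval)

def percent_variation(curves, mu_add):
    """Eq. (3): spread of the fitted areas across scattering suspensions.
    'curves' is a list of area arrays, one per suspension."""
    vals = [area_at(mu_add, a) for a in curves]
    return (max(vals) - min(vals)) / min(vals) * 100.0
```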
The area under the analyzed spectra from 575 to 595 nm for a source–detector separation of 1.7 mm was then computed as before and compared with a calibration curve to determine the change in absorption coefficient. The calibration curve was obtained by fitting a third-order polynomial simultaneously to the results of earlier measurements taken at a fiber separation of 1.7 mm on three polystyrene-sphere suspensions (0.2%, 0.4%, and 0.6% by weight) with a background absorption of 0.4 cm⁻¹. This curve is shown in Fig. 8. The results of the measurements of absorption coefficients are given in Table 1. Differences between the measured and the actual absorption-coefficient changes varied from 0 to 20%.

Fig. 6. Percent difference between the largest and smallest areas under the curve from 575 to 595 nm of the analyzed spectra plotted versus the source–detector fiber separation. Analyzed spectra were used for which the amount of added Direct Blue dye had an absorption of 0.15 cm⁻¹ at 585 nm. The three scattering suspensions composed of 0.2%, 0.4%, and 0.6% 0.890-μm-diameter spheres had reduced scattering coefficients of ~6, ~12, and ~18 cm⁻¹ at 650 nm, where g = 0.91. The three scattering suspensions composed of 5%, 10%, and 15% Intralipid-10% had reduced scattering coefficients of 5, 10, and 16 cm⁻¹ at 650 nm, where g = 0.79.

Fig. 7. Percent difference between the largest and the smallest areas under the curve from 575 to 595 nm of the analyzed spectra for the three scattering solutions composed of 0.2%, 0.4%, and 0.6% by weight 0.913-μm-diameter polystyrene spheres plotted versus the source–detector fiber separation. Analyzed spectra were used for which the amount of added Direct Blue dye had an absorption of 0.15 cm⁻¹ at 585 nm. Data are given for three different values of μa.

Fig. 8. Best values of the area of analyzed spectra from 575 to 595 nm versus amount of added absorber for a fiber separation of 1.7 mm for three different values of background absorption. These curves were obtained by fitting raw data [similar to that shown in Fig. 4(a)] to a third-order polynomial.

6. Discussion

The Monte Carlo and experimental results both indicate that for a particular choice of fiber separation the average path length of the collected photons is quite insensitive to the scattering parameters. The Monte Carlo results show explicitly that the average path length is nearly independent of scattering properties. The experimental results, however, do not directly measure the average path length, ⟨L⟩. In a semi-infinite medium there are an infinite number of routes that can be taken from the source to the detector (although very few photons travel some of the routes). Let I_i be the number of photons that start on route i of length L_i, and let c be any constant. Then

−ln(I/I0) = −ln[ Σ_{i=1}^{∞} I_i exp(−μa L_i) / I0 ] ≠ c μa ⟨L⟩,   Σ_{i=1}^{∞} I_i < I0.   (4)

The fact that the path length is independent of scattering properties can only be inferred from the experimental data, based on the fact that the change in signal due to the addition of an absorber does not depend on scattering properties. Nonetheless, we believe that, on the basis of the experimental and computational results, it is a good approximation to say that for the correct choice of fiber separation the average path length essentially does not depend on the scattering properties.
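The calibration step described above amounts to fitting a cubic to the area-versus-absorption data of Fig. 8 and inverting it for a measured area. A minimal sketch, assuming the added absorption lies inside the calibrated range:

```python
import numpy as np

def calibrate(mu_added, areas):
    """Cubic calibration curve (cf. Fig. 8): added absorption -> band area."""
    return np.polyfit(mu_added, areas, 3)

def invert(coeffs, area_meas, lo=0.0, hi=0.2):
    """Solve the cubic for the added absorption reproducing area_meas,
    keeping the real root inside the calibrated range (assumed unique)."""
    c = coeffs.copy()
    c[-1] -= area_meas                      # constant term is last in polyfit
    roots = np.roots(c)
    real = roots[np.abs(roots.imag) < 1e-9].real
    return real[(real >= lo) & (real <= hi)][0]
```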
For a range of scattering parameters relevant to tissue, and for the optical geometry studied here, the optimal fiber separation was found experimentally to be approximately 1.7 mm for absorption coefficients of 0–0.86 cm⁻¹. This is in reasonable agreement with the Monte Carlo results, which showed that for absorption coefficients of 0.1, 0.5, and 2 cm⁻¹ a fiber separation of 1.75 mm was within the range of optimal fiber separations for each absorption coefficient. For clinical applications the ability to make measurements through an endoscope is advantageous. The source–detector separation of 1.7 mm is less than the diameter of typical endoscope working channels. Therefore endoscopic measurements can take advantage of the fact that at a separation of ~1.7 mm the path length does not depend significantly on the scattering parameters.

What are typical changes in the values of absorption coefficients in tissue that are of clinical interest? One area of interest is the measurement of the change in absorption due to the administration of a photodynamic therapy drug. The absorption coefficient of Photofrin in tissue is expected to be roughly 0.01–0.1 cm⁻¹ at 630 nm.14 The technique shown above is therefore applicable to the determination of photodynamic therapy drug concentrations in tissue. Other chromophores of potential interest are hemoglobin and cytochrome oxidase. Based on the expected concentration of cytochrome oxidase in brain tissue and the absorption coefficient of reduced minus oxidized cytochrome,15 changes in absorption due to changes in the oxygenation of cytochrome oxidase will be approximately 0.003 cm⁻¹. Therefore this technique is probably not applicable to the measurement of changes in the oxidation state of cytochrome oxidase in the brain. This technique is, however, applicable to measurements of changes in blood oxygenation. Oxyhemoglobin and deoxyhemoglobin have several absorption bands with different molar extinction coefficients. Depending on how much the concentration of hemoglobin changes, an appropriate absorption band can be used to measure the change in concentration. The above is by no means a conclusive list of absorbing compounds of clinical interest; others include bilirubin, NADH, and chemotherapy drugs. The techniques described in this paper may also be applicable to environmental monitoring and materials-science applications.

To take advantage of the fact that the path length is independent of the scattering parameters and to use Eq. (2) to determine the absorption coefficient, it is necessary to know the path length or to have a calibration relating the measurement to an absorption coefficient. The curves in Fig. 8 can be used to determine the amount of absorber added to a scattering solution that has a background absorption of either ~0, 0.4, or 0.8 cm⁻¹. This approach was tested for μa = 0.4 cm⁻¹, and the results are presented in Table 1. In all cases the errors in the measured change in absorption were less than 20%.

The curves for background absorption coefficients of μa ≈ 0 and μa = 0.4 cm⁻¹ in Fig. 8 are different from each other, although the curves for μa = 0.4 cm⁻¹ and μa = 0.8 cm⁻¹ are very similar. This can be understood by examining the results in Fig. 9, in which the area of analyzed spectra from 575 to 595 nm is plotted against the absorption coefficient of added Direct Blue dye (background μa ≈ 0). The first part of this curve is steepest, consistent with the trace for μa ≈ 0 having the steepest slope of the traces in Fig. 8.
For values of μa greater than 0.4 cm⁻¹ the slope of the curve in Fig. 9 does not change very much, and this slope is smaller than the initial slope. Similarly, in Fig. 8 the slopes of the curves for background μa = 0.4 cm⁻¹ and μa = 0.8 cm⁻¹ are very similar, and both are less than that for μa ≈ 0. (The slope of the curve for μa > 0.4 cm⁻¹ is what one would expect when adding blue dye starting from an average background absorption of 0.4 cm⁻¹ at 585 nm.) The fact that the curves for background μa ≈ 0 and μa = 0.4 cm⁻¹ are different means that, to take advantage of the fact that the average path length traveled by collected photons is insensitive to the scattering properties, it is necessary to know the original background absorption approximately. Based on the 20% variation [see Fig. 4(a)] in the areas from 575 to 595 nm of the analyzed spectra, the background absorption coefficient must be known to an accuracy of ~0.2 cm⁻¹ if it is between 0 and 0.4 cm⁻¹. For larger values of background μa one does not need to know μa as accurately, but only to an accuracy of ~2 cm⁻¹, as estimated from Fig. 9. In some cases it may be possible to estimate the background absorption based on a priori knowledge of the tissue being measured. For other situations a method for roughly estimating the absorption needs to be developed. Furthermore, if the background absorption varies by more than 0.2 cm⁻¹ across the wavelength range in which the species of interest absorbs, the approximation made in Section 2 that the path length is independent of wavelength becomes invalid.

Table 1. Results of Measurements of Absorption Changes due to the Addition of Direct Blue Dye

Solution type (% 0.913-μm spheres) | Measured change in absorbance (cm⁻¹) | Actual change in absorbance (cm⁻¹) | Error (%)
0.2 | 0.027 | 0.033 | 18
0.2 | 0.070 | 0.076 | 7.9
0.4 | 0.018 | 0.022 | 18
0.4 | 0.077 | 0.065 | 18
0.6 | 0.128 | 0.108 | 19
0.6 | 0.152 | 0.129 | 18

Fig. 9. Areas under the curve from 575 to 595 nm of analyzed spectra plotted versus the average absorption coefficient of the blue dye from 575 to 595 nm; dye was added to the scattering solutions for a source–detector separation of 1.7 mm. The scattering medium was 10% Intralipid-10%, and the background μa ≈ 0. The nonlinearity of the curve for small values of μa and the linearity of the curve at larger values of μa have important consequences for how accurately the background absorbance must be known.

Finally, it is noted that, although the actual measurements were performed in 50-ml beakers, the actual volume of the scatterer sampled by the light was much smaller. Monte Carlo simulations show that for a fiber separation of 1.48 mm the actual volume interrogated is ~0.1 cm³ for g = 0.9 and μs = 100 cm⁻¹.

7. Conclusions

We have shown, using both computational methods and experimental data, that with the correct choice of source–detector separation the average photon path length is essentially independent of scattering properties. For the Monte Carlo simulations the tissue-relevant range of parameters investigated was 7.5 < μs′ < 15 cm⁻¹, with g kept in the range 0.8 < g < 0.95. In the experimental measurements a larger range of parameters was investigated, with measurements made in suspensions with μs′ as large as 22 cm⁻¹. The optimum choice of center-to-center separation of the source and the detector fibers was found experimentally to be ~1.7 mm. This distance is smaller than the working channel of most endoscopes.
Therefore, with a carefully designed probe, it should be possible to take advantage of this effect in endoscopic optical measurements.

To test the utility of the average path length being insensitive to scattering parameters, measurements were made of suspensions to which unknown amounts of absorber had been added, and the absolute concentrations of the added absorber were deduced. All the measured values were within 20% of the correct values. This demonstrates the utility of this technique, which does not require the use of a reflectance standard, for measuring small absorption changes.

That the average path length is essentially independent of scattering properties is important for making measurements of small changes in absorption. Possible applications include concentration measurements of NADH, bilirubin, oxyhemoglobin, deoxyhemoglobin, photodynamic therapy drugs, and chemotherapy drugs.

This work was funded by the Laboratory Directed Research and Development program at Los Alamos National Laboratory. Support for Darren A. Jack and Heather Miller through the Science and Engineering Research Semester Program sponsored by the Department of Energy is gratefully acknowledged. The authors also acknowledge useful conversations with Bruce J. Tromberg and Andreas H. Hielscher and assistance with the Monte Carlo simulations from James Boyer.

References
1. J. L. Pujol, D. Cupissol, C. Gestinboyer, J. Bres, B. Serrou, and F. B. Michel, "Tumor-tissue and plasma concentrations of platinum during chemotherapy of non-small-cell lung cancer patients," Cancer Chemother. Pharmacol. 27, 72–75 (1990).
2. M. G. Donelli, M. Zucchetti, and M. Dincalci, "Do anticancer agents reach the tumor target in the human brain?" Cancer Chemother. Pharmacol. 30, 251–260 (1992).
3. L. Stahle, "Microdialysis in pharmacokinetics," Eur. J. Drug Metab. Pharmacokinet. 18, 89–96 (1993).
4. D. J. Kerr and G. Los, "Pharmacokinetic principles of locoregional chemotherapy," Cancer Surv. 17, 105–122 (1993).
5. S. Fantini, M. A. Francescini, J. B. Fishkin, and B. Barbieri, "Quantitative determination of the absorption spectra of chromophores in strongly scattering media: a light-emitting-diode based technique," Appl. Opt. 33, 5204–5213 (1994).
6. B. W. Pogue and M. S. Patterson, "Frequency-domain optical absorption spectroscopy of finite tissue volumes using diffusion theory," Phys. Med. Biol. 39, 1157–1180 (1994).
7. B.
C. Wilson, "Measurement of tissue optical properties: methods and theories," in Optical-Thermal Response of Laser-Irradiated Tissue, A. J. Welch and M. J. C. van Gemert, eds. (Plenum, New York, 1995), pp. 233–274.
8. T. J. Farrell and M. S. Patterson, "A diffusion theory model of spatially resolved, steady-state diffuse reflectance for the noninvasive determination of tissue optical properties in vivo," Med. Phys. 19, 879–888 (1992).
9. A. Kienle, L. Lilge, M. S. Patterson, R. Hibst, R. Steiner, and B. C. Wilson, "Spatially resolved absolute diffuse reflectance measurements for noninvasive determination of the optical scattering and absorption coefficients of biological tissue," Appl. Opt. 35, 2304–2314 (1996).
10. P. Marquet, F. Bevilacqua, C. Depeursinge, and E. B. Haller, "Determination of reduced scattering and absorption coefficients by a single charge-coupled-device array measurement, part I: comparison between experiments and simulations," Opt. Eng. 34, 2055–2063 (1995).
11. J. R. Mourant, J. Boyer, A. H. Hielscher, and I. J. Bigio, "Influence of the scattering phase function on light transport measurements in turbid media performed with small source–detector separations," Opt. Lett. 21, 546–548 (1996).
12. J. R. Mourant, T. Fuselier, J. Boyer, T. M. Johnson, and I. J. Bigio, "Predictions and measurements of scattering and absorption over broad wavelength ranges in tissue phantoms," Appl. Opt. 36, 949–957 (1997).
13. H. J. van Staveren, C. J. M. Moes, J. van Marle, S. A. Prahl, and M. J. C. van Gemert, "Light scattering in Intralipid-10% in the wavelength range of 400–1100 nm," Appl. Opt. 30, 4507–4514 (1991).
14. M. S. Patterson, B. C. Wilson, J. W. Feather, D. M. Burns, and W. Pushka, "The measurement of dihematoporphyrin ether concentration in tissue by reflectance spectrophotometry," Photochem. Photobiol. 46, 337–343 (1987).
15. C. E. Cooper, S. J. Matcher, J. S. Wyatt, M. Cope, G. C. Brown, E. M. Nemota, and D. P. Delpy, "Near-infrared spectroscopy of the brain: relevance to cytochrome oxidase bioenergetics," Biochem. Soc. Trans. 22, 974–980 (1994).
Effect of Re on directional γ′-coarsening in commercial single crystal Ni-base superalloys: A phase field study

L. T. Mushongera,a,* M. Fleck,a J. Kundin,a Y. Wangb and H. Emmericha
a Materials & Process Simulation, University of Bayreuth, Germany
b Department of Materials Science & Engineering, Ohio State University, USA
* Corresponding author; e-mail: leslie.mushongera@uni-bayreuth.de

Acta Materialia 93 (2015) 60–72. doi:10.1016/j.actamat.2015.03.048
Received 16 December 2014; revised 24 March 2015; accepted 27 March 2015

Abstract — We study the contributions from solute segregation to rafting in CMSX4 and CMSX6 superalloys. For this purpose, we use a phase-field model that takes into account the cross interaction between all solutes in Ni-base superalloys. The thermodynamic formulation in the phase-field model is validated by comparing phase-field simulations to ThermoCalc equilibrium calculations and DICTRA sharp-interface simulations. Additionally, an elastic term is coupled to the model to account for the misfit, elastic inhomogeneity and applied loading. A technique to quantify the kinetics of microstructures with anisotropic γ′-precipitates is integrated in the model. Our simulations reveal that the rafting process is considerably slower in CMSX4 than in CMSX6. The presence of slow diffusing elements in CMSX4, particularly the refractory element Re, plays a major role in reducing the rate of rafting. The results suggest that Re slows interface mobility by accumulating along the growth path, a behavior we attribute to its low diffusivity and low solubility in the γ′-precipitate. © 2015 Acta Materialia Inc. Published by Elsevier Ltd. All rights reserved.

Keywords: Phase-field; Ni-base superalloys; Coarsening; Rhenium

1. Introduction

Modern turbine blades, which are used in the hot regions of gas turbines for aerospace or stationary power-plant applications, are usually manufactured as cast single-crystalline parts. However, even though grain boundaries are excluded, the degradation behavior of the single-crystal nickel-base superalloys developed for this purpose is still quite complex, involving a number of very different microscopic effects. One of these is the diffusion-mediated coarsening of the γ′-precipitates. During creep loading along the <100> crystallographic orientation, the precipitates coarsen anisotropically and form plate-like structures, which are also called rafts. In the end, the rafted microstructure contains quite large and dispersed plate-like precipitates aligned either parallel (P-type rafts) or perpendicular (N-type rafts) to the loading direction [1,2]. This is detrimental to the properties of these materials, since their superior properties emanate from the size, morphology, distribution and volume fraction of the γ′-precipitates.

Generally, the modeling of coarsening of coherent phases in elastically anisotropic and inhomogeneous situations has been reviewed by Fratzl et al. [3]. For two precipitates as shown in Fig. 1 to raft along the [010] direction, elements that are barely soluble in the γ′-precipitate (Co, Cr, Mo, Re, etc.) should diffuse away from the growth path, while γ′-formers (Al, Ta and Ti) should diffuse into the precipitates. For various reasons, technically used single-crystal superalloys contain significant additions of a number of different refractory elements such as Mo, W, Ta and Re. Experimental studies [4–6] have revealed that Re, which is added mainly as a solid-solution strengthener, significantly reduces the kinetics of coarsening as well. Rüsing et al. [5] observed the formation of Re clusters in the γ-matrix during coarsening in the Ni–Al–Ta–Re superalloy. This segregation behavior of Re plays a major role in determining the kinetics of
rafting. An experimental study by Giamei et al. [4] of model alloys based on the MAR-M200 alloy concluded that for coarsening to occur, Re solutes should diffuse away from the paths of moving γ–γ′ interfaces. Furthermore, simulations by [7] identified that γ–γ′ interface migration is controlled by the atomic diffusion of Re.

So far, a number of phase-field approaches have been used to study the directional coarsening of rafting in Ni-base superalloys. For example, the rafting phenomenon has been studied using phase-field models formulated within a purely elastic framework [8,9]. Phase-field models with contributions from plastic strains [10,12,14,27] have recently been developed to study rafting. Furthermore, [15] used an elasto-viscoplastic model to study microstructure evolution under creep loading in Ni-base superalloys. All the models formulated using these various frameworks converge in identifying the lattice misfit, elastic inhomogeneity and applied load as the driving forces for the directional coarsening of γ′-precipitates. The resulting microstructures obtained using the different approaches are also qualitatively similar, although quantitative comparisons [13] have revealed that the kinetics differ. While it is acknowledged that mechanics plays a significant role in determining the orientation of rafts, the influence of diffusion on coarsening is crucial for understanding the kinetics of the process [16]. Most phase-field studies, however, focus on the technically less interesting case of binary Ni–Al systems. However, a ternary phase-field model that is linked directly to commercial CALPHAD software to provide quantitative thermodynamic driving forces has been developed [17]. Our previous work [18] on non-directional coarsening in CMSX4 and CMSX6 during long-term aging revealed that slow diffusing elements play an important role in determining the nature and kinetics of coarsening.

The objective of this paper is to study coarsening in the commercial CMSX4 and CMSX6 Ni-base superalloys in the presence of applied loading along the [010] direction.
We go beyond that objective by studying the contributions from solute segregation to coarsening. Of particular interest is to understand the mechanisms by which solutes with low diffusivities, e.g., Re, affect coarsening.

The basis of our investigation is a multicomponent phase-field model formulated based on [18,19]. The extended multicomponent phase-field model uses inputs from CALPHAD and kinetic databases for the relative driving forces. The uniqueness of the extended multicomponent model lies in that it takes into account the cross interaction of all the solutes in Ni-base superalloys, even the slowest diffusing ones. The thermodynamic formulation in the phase-field model is validated against DICTRA sharp-interface simulations as well as ThermoCalc equilibrium calculations. Additionally, to account for the misfit between the two types of lattices, elastic inhomogeneity and applied loading, the model is coupled to an elastic term based on previous works [18,20,21]. Furthermore, a technique to quantify the kinetics of rafting is integrated in the model. The occurrence of Ostwald ripening and coalescence during the rafting process results in vastly irregularly shaped precipitates. As a result, it is quite difficult to find a global measure that quantifies the evolution of such a microstructure. The approach presented in this paper uses the aspect ratio calculated on the basis of interface orientations as a quantitative measure. A unique feature of this technique is that it quantifies the coarsening of anisotropic and coalesced precipitates.

In the following section, we provide the formulation of our extended multicomponent phase-field model. In Section 3, the thermodynamic formulation is validated against sharp-interface calculations. Studies of γ′-rafting morphological evolution and kinetics are then presented in Section 4. In Section 5, simulation results on the contribution of solute diffusivity to γ′-rafting are provided before concluding in the last section.

2. Model description

In this section, we present a multicomponent phase-field model coupled to an elastic term that accounts for the contributions from the misfit and elastic inhomogeneity. We neglect the contributions from plastic strains. The incorporation of plastic effects is in principle possible but extremely complicated; at least, more complicated than the incorporation of elastic heterogeneity, which has been shown to be a good approximation to the influence of the micromechanics on the coarsening kinetics in Ni-base superalloys [8,9,13].

Neglecting the anti-phase domains, a single non-conserved field variable φ(x⃗,t) is introduced to describe the coherent γ/γ′ two-phase microstructure. To distinguish between the two distinct bulk phases, fixed values are assigned to represent them: 1 for the γ-phase and 0 for the γ′-phase. Additionally, it is postulated that φ(x⃗,t) varies smoothly within an interface of a finite width.
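A phase field of this kind might be initialized as in the following sketch; the domain size, precipitate positions and square shapes are illustrative assumptions, not the initial conditions used in the simulations reported below.

```python
import numpy as np

def init_phi(nx, ny, centers, half_width):
    """Initialize a 2-D phase field: phi = 1 in the gamma matrix and
    phi = 0 inside square gamma' precipitates centered at 'centers'."""
    phi = np.ones((nx, ny))
    for cx, cy in centers:
        phi[cx - half_width:cx + half_width,
            cy - half_width:cy + half_width] = 0.0
    return phi

# Four gamma' squares in a 256 x 256 gamma matrix (illustrative only)
phi0 = init_phi(256, 256, [(64, 64), (64, 192), (192, 64), (192, 192)], 32)
```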
We neglect the appearance of anti-phase boundaries, which form between different translational variants of the ordered γ′-phase. The emphasis of the current model is to find a quantitative description of the coarsening kinetics of the γ′-phase. The current description is solely based on the assumption that the kinetics of coarsening is limited by the complex interdiffusion kinetics of multiple components in the γ-phase. Strictly speaking, we develop a model for coarsening kinetics in superalloys with low γ′-fractions. It is possible to include the influence of anti-phase boundaries by extending the description of the ordered γ′-phase via multiple order parameters, as has been demonstrated before [11,12,15,22]. However, a quantitative kinetic description of the γ′-coarsening behavior including the effect of anti-phase boundary migration requires reliable values for the anti-phase mobilities, which are certainly difficult to measure directly in experiment.

In the following subsection, we give a detailed description of the grand-potential functional, which is the basis of our model formulation. This is followed by the evolution equations in Section 2.2. The determination and rescaling of the model parameters is then presented in the last subsection.

Fig. 1. Schematic illustration of the solute flux around γ′-precipitates under a compressive load in a superalloy with a negative misfit.
p i :ð6ÞThe elastic energy x el in Eq.(1)is defined in terms of interpolated elastic constants C ijkl and the eigenstrain e 0ik ,x el /;~u ðÞ¼12X iklme ik À e 0ik ÀÁC iklm e lm À e 0lm ÀÁ:ð7ÞThe elastic constants and eigenstrain are generally con-centration dependent.However,in this paper,we limit tothe case where these parameters are constant in concentra-tion but vary for the two phases.The mechanical descrip-tion of the model is done by coupling the elastic constants and eigenstrain to the phase fields using the inter-polation function,C iklm ð/Þ¼C c iklm h ð/ÞþC ciklm h ð1À/Þ;ð8Þ e 0ik ¼h 1À/ðÞ2a c 0Àa c a c 0þa cd ik:ð9Þwhere d ik is the Kronecker-delta function and a c and a c 0arethe lattice parameters of the two phases.Since we discuss here a two phase situation,we define the c -phase to be the reference phase so we simply have to obtain for the eigenstrain of the c 0-phase.The elastic constants and the misfit for the studied superalloys are presented in Section 2.3.2.2.Kinetic equationsThe evolution of the phase field is given by a non-linear partial differential equation @/@t ¼ÀK 3rn d Xd/;ð10Þwhere K is the kinetic coefficient.In explicit form,Eq.(10)becomes1K @/@t ¼r 2/À2n2g 0dw ð/ÞÀ13rn @h ð/Þ@/@x ch @h þ@x el@h:ð11ÞNow we formulate the equations of motion for the non-equilibrium diffusion potential,based on the concentration conservation condition @ c i ð/Þ@tþ~r Á~j i ¼0;ð12Þwhere c i ð/Þis the local concentration and ~j i is the respective flux of species i .The thermodynamic driving force for solute diffusion is the gradient of the diffusion potential,so that the flux is given by~j i ¼ÀXjM ij ð/Þ~rl j ;ð13Þwhere M ij ð/Þis the atomic mobility tensor weighted overthe phases.Inserting Eq.(5)together with the definition of the flux Eq.(13)into Eq.(12),we obtain the equations for the evolution of the diffusion potential fields@l i @t ¼X j v ðÞÀ1ij ~r X kM jk ð/Þ~r l k !ÀX j v ðÞÀ1ij D A j @h @/@/@t ;ð14Þwhere the concentration difference D A j ¼A c j ÀA cj is inde-pendent from the phase field and we defined the interpo-lated susceptibility matrix v ij ð/Þas v ij ð/Þ¼h ð/Þv c ij þh ð1À/Þv c 0ij ,with v p ij ¼@c pi =@l j .Here,we point out that in general we have to take the inverse of the susceptibility matrix at each grid-point and each time-step.This is obvi-ously computationally expensive.To simplify Eq.(14),we62L.T.Mushongera et al./Acta Materialia 93(2015)60–72adopt equal matrices for the two phases,that:v cik ¼v c0ikandM c ij¼M c0ij.In the case of equal mobility and thermody-namic matrices for the two phases,the phase-field depen-dence drops and Eq.(14)simplifies to@l i @t ¼XjD ij r2l jÀÁÀD A iv i@h@/@/@t;ð15Þwhere the diffusion coefficient is defined as D ij¼PkvÀ1ikM kj.We restrict to major terms,v i since the off-diagonal ele-ments of the thermodynamic factor matrix obtained from DICTRA are several orders of magnitude smaller than the major terms.The evolution Eqs.(11)and(14)are solved together with the solution of the elastic problem, with the strains e ij being the dependent variables.For coherent phase transformations in a purely elastic frame-work,the mechanical equilibrium can be written asX j @r ijð/Þ@x j¼0;ð16Þwhere the stresses are given by r ij/ðÞ¼@x el=@e ij.The assumption of mechanical equilibrium is legitimate because the information about changes in the elastic state of deformation is transported with the materials’sound speed,which is much faster than the transport of any solute.The influence of dislocation motion on coarsening has been discussed 
earlier(see review by[26]).The incor-poration of dislocation motion is in principle possible but expensive.It is more expensive than the incorporation of elastic heterogeneity as done in this paper,which has been shown to be a goodfirst approximation to the influ-ence of the micro-mechanics on the coarsening kinetics in Ni-base superalloys[8,9,13].Generally,we expect that plastic deformations due to the presence of dislocations in the c-phase or at the c=c0-phase boundary will lead to a reduction of the local stresses.As a consequence of this,within our elastic framework,we will systemati-cally overestimate the contribution from the elastic energy.Therefore,since the coarsening kinetics increase with increasing elastic driving force,the coarsening kinet-ics are similarly overestimated.Nevertheless,even though the predictions of our model with respect to the kinetics of coarsening may deviate in the absolute value,we still believe that the relative predictions are quantitatively cor-rect even without the explicit inclusion of dislocation motion.Here,the term relative prediction should be understood as follows:Consider for instance the case that the model would predict an increase of the coarsen-ing kinetics by10%upon a certain change in the initial alloying composition,then we believe that the coarsening kinetics of two corresponding experiments with alloys of two correspondingly different initial compositions should give the same10%difference in the coarsening kinetics, provided that all other model limitations are negligible.Thefinal three evolution Eqs.(11),(15)and(16)are all solved on uniform square grids using explicitfinite-differ-ence methods.Second order discretization schemes are used for the spatial derivatives andfirst order schemes for the time derivatives.The elastic equilibrium(16)is solved iter-atively in each time-step by means of a Jacobi relaxation. The advantage of such an explicitfinite difference scheme is that a parallelization by means of domain decomposition is quite straight forward[20,28].2.3.Determination of model parametersThe model parameters are split into three categories; thermodynamic,kinetic and material parameters.The thermodynamic parameters are extracted from the TTNi8 database using the CALPHAD software,ThermoCalc. The kinetic parameters are obtained from the MobNi1 database using the DICTRA software while the material parameters(elastic parameters and interfacial properties) are from the literature.The aim is to perform simulations on c=c0-phase transformations in the CMSX4and CMSX6commercial superalloys with compositions pre-sented in Table1.In the two kinetic Eqs.(11)and(15)we defined the fol-lowing set of thermodynamic parameters v p ij;A p i and B p i. 
2.3. Determination of model parameters

The model parameters are split into three categories: thermodynamic, kinetic and material parameters. The thermodynamic parameters are extracted from the TTNi8 database using the CALPHAD software ThermoCalc. The kinetic parameters are obtained from the MobNi1 database using the DICTRA software, while the material parameters (elastic parameters and interfacial properties) are taken from the literature. The aim is to perform simulations of γ/γ′ phase transformations in the CMSX4 and CMSX6 commercial superalloys with the compositions presented in Table 1.

In the two kinetic Eqs. (11) and (15) we defined the following set of thermodynamic parameters: v^p_ij, A^p_i and B^p. Now we describe how these parameters can be related to the respective γ/γ′ two-phase equilibrium data, as obtained from CALPHAD calculations using ThermoCalc in conjunction with the thermodynamic database TTNi8. When two connecting phases are in chemical equilibrium with each other, the diffusion potentials are continuous at the common interface, which leads to the condition

μ^eq_i|_γ = μ^eq_i|_γ′,   (17)

that results in the common tangent construction. Note that this equilibrium condition has already been applied in terms of a local equilibrium condition during the modeling in Section 2.1, where we omitted the phase index of the non-equilibrium diffusion potential. The full set of thermodynamic two-phase equilibrium data required as input for our multicomponent description is presented in our previous paper [18]. The equilibrium values c^{p,eq}_i and f^{p,eq}_ch(c^{p,eq}_i) are found by the common tangent construction and are extracted from the ThermoCalc/TTNi8 database for a specific alloy composition at a specific temperature.

Now we show how the CALPHAD input data relate to our model parameters. Let us start with the generalized susceptibility v_i, as required in Eq. (11) as well as in (15), which relates to the thermodynamic factor matrix

X^p_ij = ∂²f^p_ch/(∂c_i ∂c_j)|_{T,p,c_k} = ∂μ_i/∂c_j.   (18)

We adopt only the diagonal terms X_ii of the thermodynamic factor matrix, because the diagonal terms are several orders of magnitude larger than the off-diagonal terms. Furthermore, we assume equal terms for both phases, X^γ_ii = X^γ′_ii = X_i. Using Eq. (6) we can obtain the parameter A^p_i from the equilibrium concentration of phase p,

A^p_i = c^{p,eq}_i − Σ_j v_ij μ^eq_j.   (19)

If the thermodynamic factor matrices are diagonal and equal for the two phases, Eq. (19) simplifies to A^p_i = c^{p,eq}_i − μ^eq_i/X_i.

Table 1. The composition in wt.% of the superalloys under study.

        Al   Co   Cr   Mo   Re   Ta   Ti   W    Ni
CMSX4   5.6  9.6  6.4  0.6  3.0  6.5  1.0  6.4  base
CMSX6   4.8  5.0  9.8  3.0  –    2.0  4.7  –    base
compressive external strain,i.e.the elongated c0-precipitates are preferentially aligned parallel to loading direction[8].In the simulations,the interfacial energy r is assumed to be isotropic,with a value of80mJ=m2for CMSX4and 120mJ=m2for CMSX6.The interfacial energy is estimated by the sharp interface approximation[32].The kinetic coef-ficient K is related to the interdiffusion coefficients obtained from DICTRA.Since we study interdiffusion-limited phase transformations,the kinetic coefficient is chosen to be lar-ger than any term in the diffusion matrix.The kinetic coef-ficient is chosen to be1:2Â10À14m2sÀ1for CMSX4and 0:8Â10À14m2sÀ1for CMSX6.3.Model validation:Application to c!c0transformationAs a basic validation of the model,as presented in Section2,its predictions are compared with ThermoCalc equilibrium calculations as well as DICTRA one-dimen-sional sharp interface simulations.The phase-field simula-tions are performed without the contributions from elasticity since the ThermoCalc and DICTRA models do not include elasticity.This implies that the elastic driving force@x el=@h in Eq.(11)is excluded in the simulations. In the phase-field model(PFM),a thin strip of c-matrix of length L x¼1000D x and width L y¼50D x is used.A c0-precipitate of size50D xÂ50D x is embedded on the left side of the c-matrix.A uniform grid spacing M x¼8:6nm and interface width n¼5M x are chosen.Furthermore,for the phase-field set-up to resemble a one-dimensional system, appropriate boundary conditions are chosen.For the phase fields/,Dirichlet boundary conditions are applied on all domain boundaries.For the diffusion potentialfield l, Von Neumann boundary conditions are applied on the left boundary while Dirichlet boundary conditions are set for the rest of the boundaries.A time step of D t¼5Â10À3s is used.The interfacial energy r is isotropic with a value of80mJ=m2.In the following subsection,the ability of the PFM in predicting solute partitioning in multicomponent alloys is validated against ThermoCalc equilibrium calculations. 
In the following subsection, the ability of the PFM to predict solute partitioning in multicomponent alloys is validated against ThermoCalc equilibrium calculations. The ability of the PFM to capture the correct interface kinetics during phase transformations is validated in Section 3.2 against DICTRA sharp-interface calculations.

3.1. Validation of equilibrium multicomponent concentration profiles

Here we validate our extended multicomponent model via ThermoCalc calculations with respect to the equilibrium phase concentrations. Quasi-one-dimensional phase-field simulations of the γ → γ′ phase transformation at 1273.15 K for the CMSX4 and CMSX6 superalloys with the compositions given in Table 1 are performed. The initial composition of each species is set to be uniform in both the γ′-precipitate and the γ-matrix. The precipitate grows limited by solute diffusion until the equilibrium volume fraction of about 56% is reached. Fig. 2 shows PFM predictions of the time evolution of (a) Ta and (b) Cr concentration profiles.

Table 2. The scaled diffusion matrix D^γ_ik/D₀ for CMSX4 obtained from DICTRA using the MobNi1 database (scaling factor D₀ = 10⁻¹⁵ m²s⁻¹).

      Al      Co      Cr      Mo       Re       Ta      Ti      W
Al    5.98   −1.03    0.155  −0.132   −0.132    0.0222  0.0777 −0.165
Co    0.78    0.575  −0.255  −0.0262  −0.0026   0.0289  0.0323 −0.0363
Cr    0.18   −0.066   1.01   −0.0122  −0.0556   0.0477  0.100  −0.0642
Mo    1.72   −0.046  −0.054   0.657   −0.049    0.0305  0.0098 −0.0593
Re    1.30   −0.035  −0.011  −0.0381  −0.0056   0.0305  0.0716 −0.0218
Ta    2.40   −0.280   1.16   −0.0009  −0.0882   0.716   0.0131 −0.106
Ti    2.31   −0.810   1.03    0.0148  −0.101    0.0536  1.8     0.119
W     2.62   −0.776   0.560  −0.0089  −0.0575   0.0394  0.112   0.0004

Table 3. The scaled diffusion matrix D^γ_ik/D₀ for CMSX6 obtained from DICTRA using the MobNi1 database (scaling factor D₀ = 10⁻¹⁵ m²s⁻¹).

      Al       Co       Cr       Mo        Ta       Ti
Al    3.96    −0.320    0.644    0.0188    0.0165   0.367
Co   −0.320    0.989   −0.296   −0.0249    0.0149   0.155
Cr    0.644   −0.291    1.2     −0.0245    0.0240   0.436
Mo    0.0188  −0.186    0.0732   0.744     0.0162   0.422
Ta    0.0165   0.0762   1.94     0.00698   0.710    0.583
Ti    0.367   −0.294    1.60     0.0138    0.0270   2.47
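Reading off the diagonal entries of Table 2, one can estimate characteristic diffusion lengths √(D·t) per element, which makes the role of the slow diffusers explicit. A sketch (the one-hour time is an arbitrary choice for illustration):

```python
import numpy as np

# Diagonal entries of Table 2 (CMSX4), scaled by D0 = 1e-15 m^2/s.
species = ["Al", "Co", "Cr", "Mo", "Re", "Ta", "Ti", "W"]
D_diag = np.array([5.98, 0.575, 1.01, 0.657, -0.0056, 0.716, 1.8, 0.0004]) * 1e-15

t = 3600.0  # one hour, an arbitrary illustrative time
for s, D in zip(species, D_diag):
    L = np.sqrt(abs(D) * t) * 1e9   # diffusion length in nm; |D| guards
                                    # Re's slightly negative diagonal entry
    print(f"{s}: {L:.0f} nm")
# Re and W come out orders of magnitude shorter than Al, consistent with
# the slow-diffuser argument made in the text.
```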
arXiv:quant-ph/0702010v2 16 Feb 2007

Field quantization in inhomogeneous anisotropic dielectrics with spatio-temporal dispersion

L G Suttorp
Instituut voor Theoretische Fysica, Universiteit van Amsterdam, Valckenierstraat 65, 1018 XE Amsterdam, The Netherlands

Abstract. A quantum damped-polariton model is constructed for an inhomogeneous anisotropic linear dielectric with arbitrary dispersion in space and time. The model Hamiltonian is completely diagonalized by determining the creation and annihilation operators for the fundamental polariton modes as specific linear combinations of the basic dynamical variables. Explicit expressions are derived for the time-dependent operators describing the electromagnetic field, the dielectric polarization and the noise term in the latter. It is shown how to identify bath variables that generate the dissipative dynamics of the medium.

PACS numbers: 42.50.Nn, 71.36.+c, 03.70.+k

Submitted to: J. Phys. A: Math. Gen.

1. Introduction

Quantization of the electromagnetic field in a linear dielectric medium is a nontrivial task for various reasons. First of all, since the response of a dielectric to external fields is frequency-dependent in general, temporal dispersion should be taken into account. The well-known Kramers-Kronig relation implies that dispersion is necessarily accompanied by dissipation, so that the quantization procedure has to describe an electromagnetic field that is subject to damping. Furthermore, since the transverse and the longitudinal parts of the electromagnetic field play different roles in the dynamics, the quantization scheme should treat these parts separately. For inhomogeneous and spatially dispersive media this leads to complications in the quantization procedure, which further increase in the presence of anisotropy.

When the losses in a specific range of frequencies are small, temporal dispersion can be neglected. Field quantization in an inhomogeneous isotropic dielectric medium without spatio-temporal dispersion has been accomplished by employing a generalized transverse gauge, which depends on the dielectric constant [1]–[6].

A phenomenological scheme for field quantization in lossy dielectrics has been formulated on the basis of the fluctuation-dissipation theorem [7]–[10]. By adding a fluctuating noise term to the Maxwell equations and postulating specific commutation relations for the operator associated with the noise, one arrives at a quantization procedure that has been quite successful in describing the electromagnetic field in lossy dielectrics. An equivalent description in terms of auxiliary fields has been given as well [11,12], while a related formalism has been presented recently [13]. However, all of these quantization schemes have the drawback that the precise physical nature of the noise term is not obvious, since its connection to the basic dynamical variables of the system is left unspecified. As a consequence, the status of the commutation relations for the noise operator is that of a postulate.

A justification of the above phenomenological quantization scheme has been sought by adopting a suitable model for lossy dielectrics. To that end use has been made of an extended version of the Hopfield polariton model [14] in which damping effects are accounted for by adding a dynamical coupling to a bath environment.
Huttner and Barnett [15,16] were the first to employ such a damped-polariton model in order to achieve field quantization for a lossy dielectric. Their treatment, which is confined to a spatially homogeneous medium, yields an explicit expression for the noise term as a linear combination of the canonical variables of the model. In a later development, an alternative formulation of the quantization procedure in terms of path integrals has been given [17], while Laplace transformations have been used to simplify the original formalism [18]. More recently, the effects of spatial inhomogeneities in the medium have been incorporated by solving an inhomogeneous version of the damped-polariton model [19]–[21]. In this way a full understanding of the phenomenological quantization scheme has been reached, at least for those dielectrics that can be represented by the damped-polariton models mentioned above.

The latter proviso implies a limitation in various ways. First, one would like to include in a general model not only the effects of spatial inhomogeneity, but also those of spatial dispersion. Furthermore, it would be desirable to incorporate the consequences of spatial anisotropy, so that the theory encompasses crystalline media as well. Finally, while treating temporal dispersion and the associated damping, we would like to refrain from introducing a bath environment in the Hamiltonian from the start. Instead, we wish to formulate the Hamiltonian in terms of a full set of material variables, from which the dielectric polarization emerges by a suitable projection. In this way we will be able to account for any temporal dispersion that is compatible with a few fundamental principles like causality and net dielectric loss. For a homogeneous isotropic dielectric without spatial dispersion such an approach has been suggested before [16,22].

Recently, several attempts have been made to remove some of the limitations that are inherent to the earlier treatments. In [23] the effects of spatial dispersion are considered in a path-integral formalism for a model that is a generalization of that of the original Huttner-Barnett approach. The discussion is confined to homogeneous dielectrics and to leading orders in the wavenumber, so that an analysis of the effects of arbitrary spatial dispersion in an inhomogeneous medium is out of reach. In [24] crystalline media have been discussed in the framework of a damped-polariton model with an anisotropic tensorial bath coupling. A complete diagonalization of the model along the lines of [15,16] turned out to face difficulties due to the tensorial complexity, so that the full dynamics of the model is not presented. Both spatial dispersion and anisotropy are incorporated in the quantization scheme discussed in [25]. Use is made of a Langevin approach in which a damping term of a specific form is introduced. The commutation relations for the noise operator are postulated, as in the phenomenological quantization scheme. Finally, several treatments have appeared in which a dielectric model is formulated while avoiding the explicit introduction of a bath [26,27]. However, a complete expression for the noise polarization operator in terms of the basic dynamical variables of the model is not presented in these papers. A direct proof of the algebraic properties of the latter operator is not furnished either.

In the present paper, we shall show how the damped-polariton model can be generalized in such a way that all of the above restrictions are removed. As we shall see, our general model describes the quantization and the time
evolution of the electromagnetic field in an inhomogeneous anisotropic lossy dielectric with arbitrary spatio-temporal dispersion. A crucial step in arriving at our goals will be the complete diagonalization of the Hamiltonian. It will lead to explicit expressions for the operators describing the electromagnetic field and the dielectric polarization, and for the noise contribution contained in the latter. In this way the commutation relations for the noise operator will be derived rigorously from our general model, instead of being postulated along the lines of the phenomenological scheme. Finally, we shall make contact with previous treatments by showing how to construct a bath that generates damping phenomena in the dynamical evolution of the model.

2. Model Hamiltonian

In this section we shall construct the general form of the Hamiltonian for a polariton model describing an anisotropic inhomogeneous dispersive dielectric. The result, which we shall obtain by starting from a few general principles, will contain several coefficients that can be chosen at will. As we shall see in a subsequent section, these coefficients can be adjusted in such a way that the susceptibility gets the appropriate form for any causal lossy dielectric that we would like to describe.

The Hamiltonian of the electromagnetic field is taken to have the standard form:

H_f = ∫ dr { [Π(r)]²/(2ε₀) + [∇∧A(r)]²/(2μ₀) }   (1)

with the Hermitian vector potential A(r) and its associated Hermitian canonical momentum Π(r). We use the Coulomb gauge ∇·A = 0. In this gauge both Π and A are transverse. The canonical commutation relations read

[Π(r), A(r′)] = −iℏ δ_T(r−r′),   [Π(r), Π(r′)] = 0,   [A(r), A(r′)] = 0   (2)

where the transverse delta function is defined as δ_T(r) = I δ(r) + ∇∇(4πr)⁻¹, with I the unit tensor.

The Hamiltonian of the dielectric material medium is supposed to have the general form

H_m = ℏ ∫ dr ∫₀^∞ dω ω C†_m(r,ω)·C_m(r,ω)   (3)

with the standard commutation relations for the creation and annihilation operators:

[C_m(r,ω), C†_m(r′,ω′)] = I δ(r−r′) δ(ω−ω′),   [C_m(r,ω), C_m(r′,ω′)] = 0.   (4)

The medium operators commute with the field operators.

The material creation and annihilation operators are assumed to form a complete set describing all material degrees of freedom. Hence, any material dynamical variable, for instance the dielectric polarization density, can be expressed in terms of these operators. For a linear dielectric medium, the Hermitian polarization density is a linear combination of the medium operators, which has the general form:

P(r) = −iℏ ∫ dr′ ∫₀^∞ dω′ C_m(r′,ω′)·T(r′,r,ω′) + h.c.   (5)

The complex tensorial coefficient T appearing in this expression will be determined later on, when the dielectric susceptibility is properly identified. On a par with P we define its associated canonical momentum density W, again as a linear combination of the medium operators,

W(r) = −∫ dr′ ∫₀^∞ dω′ ω′ C_m(r′,ω′)·S(r′,r,ω′) + h.c.   (6)

with a new complex tensorial coefficient S that is closely related to T, as we shall see below. For future convenience we inserted a factor ω′ in the integrand and a minus sign in front of the integral. As W and P are a canonical pair, they must satisfy the standard commutation relations

[W(r), P(r′)] = −iℏ I δ(r−r′),   [W(r), W(r′)] = 0,   [P(r), P(r′)] = 0.   (7)

Hence, the coefficients S and T have to fulfill the requirements:

∫ dr″ ∫₀^∞ dω″ T̃(r″,r,ω″)·T*(r″,r′,ω″) − c.c. = 0   (8)

∫ dr″ ∫₀^∞ dω″ ω″ S̃(r″,r,ω″)·T*(r″,r′,ω″) + c.c. = I δ(r−r′)   (9)

∫ dr″ ∫₀^∞ dω″ ω″² S̃(r″,r,ω″)·S*(r″,r′,ω″) − c.c. = 0   (10)

where the tilde denotes the transpose of a tensor and the asterisk the complex conjugate.
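The canonical-pair conditions (7)–(10) can be made concrete in a toy, single-mode version, where (8) and (10) hold trivially for scalar coefficients and (9) reduces to ω(S T* + S* T) = 1. The following sketch checks [W, P] = −iℏ numerically in a truncated Fock space; the coefficient values are arbitrary assumptions.

```python
import numpy as np

N = 40                                    # Fock-space truncation
a = np.diag(np.sqrt(np.arange(1, N)), 1)  # annihilation operator
ad = a.conj().T
hbar, w = 1.0, 2.3                        # units with hbar = 1

T = 0.7 + 0.4j                            # arbitrary complex coefficient
S = 1.0 / (2 * w * np.conj(T))            # solves the one-mode Eq. (9)

P = -1j * hbar * T * a + 1j * hbar * np.conj(T) * ad   # one-mode Eq. (5)
W = -w * S * a - w * np.conj(S) * ad                   # one-mode Eq. (6)

comm = W @ P - P @ W
print(comm[0, 0])   # ~ -1j * hbar, as Eq. (7) requires
# (the bottom-right matrix element deviates; that is a truncation artifact)
```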
interaction between thefield and the medium.Two contributions can be distinguished:a transverse part and a longitudinal part.In a minimal-coupling scheme,which we shall adopt here,the transverse part is a bilinear expression involving the transversevector potential A and the canonical momentum density W.To ensure compatibility with Maxwell’s equations an expression quadratic in A should be present as well,as we shall see in the following.For dielectrics with spatial dispersion both expressions are non-local.The general form of the transverse contribution to the interaction Hamiltonian isH i=−¯h d r d r′W(r)·F1(r,r′)·A(r′)+1d r d r′∇·P(r)∇′·P(r′)2ε0 d r{[P(r)]L}2=Π(r,t)(13)ε0˙Π(r,t)=1ε0 d r′T∗(r,r′,ω)·[P(r′,t)]L′(15) where all operators now depend on time.The subscript L′denotes the longitudinal part with respect to r′.The time derivative of the polarization density follows by combining(5)and(15):˙P(r,t)=−¯h d r′ ∞0dω′ω′C m(r′,ω′,t)·T(r′,r,ω′)−¯h d r′ d r′′ d r′′′ ∞0dω′ω′˜T(r′,r,ω′)·S∗(r′,r′′,ω′)·F1(r′′,r′′′)·A(r′′′,t)¯h−iEliminatingΠfrom(13)and(14)wefind an inhomogeneous wave equation for the vector potential∆A(r,t)−12ε0[Π(r)]2+12¯h d r d r′A(r)·F(r,r′)·A(r′)+1The complex tensorial coefficient T can be chosen freely.It has to satisfy two constraints,thefirst of which has been written already in(8).The second one follows by substituting(20)in(10):d r′′ ∞0dω′′ω′′2˜T(r′′,r,ω′′)·T∗(r′′,r′,ω′′)−c.c.=0.(22)Finally,insertion of(20)in(9)leads to the equalityd r′′ ∞0dω′′ω′′˜T(r′′,r,ω′′)·T∗(r′′,r′,ω′′)+c.c.=F(r,r′).(23)This relation defines the real tensor F in terms of T.It shows that F(r,r′)satisfies the symmetry property˜F(r,r′)=F(r′,r),as we know already from the way F2occurs in (11).As an integral kernel the tensor F(r,r′)is positive-definite.This is established by taking the scalar products of(23)with real vectors v(r)and v(r′),and integrating over r and r′.The result is positive for any choice of v.As a consequence,the inverse of F is well defined.The polarization density is given by(5),while the canonical momentum density reads according to(6)with(20):W(r)=− d r′ d r′′ ∞0dω′ω′C m(r′,ω′)·T(r′,r′′,ω′)·F−1(r′′,r)+h.c.(24) where the right-hand side contains the inverse of F.The Hamiltonian(21)has been constructed by starting from general forms for its parts H f,H m,H i and H es and requiring consistency with Maxwell’s equations.It may be related to a Lagrange formalism,as is shown in Appendix A.In the following we shall investigate the dynamics of the model defined by(21). 
As the Hamiltonian is quadratic in the dynamical variables it is possible to accomplish a complete diagonalization.This will be the subject of the next section.3.Diagonalization of the HamiltonianWe wish tofind a diagonal representation of the Hamiltonian(21)in the formH=¯h d r ∞0dωωC†(r,ω)·C(r,ω).(25) The creation and annihilation operators satisfy the standard commutation relations of the form(4).They are linear combinations of the dynamical variables in(21): C(r,ω)= d r′ f1(r,r′,ω)·A(r′)+f2(r,r′,ω)·Π(r′)+ ∞0dω′ f3(r,r′,ω,ω′)·C m(r′,ω′)+f4(r,r′,ω,ω′)·C†m(r′,ω′) (26) with as-yet unknown tensorial coefficients f i,thefirst two of which are taken to be transverse in their second argument.To determine f i we use Fano’s method[28]:we evaluate the commutator[C(r,ω),H]and equate the result to¯hωC(r,ω).Comparing the contributions involving the various canonical operators we arrive at the four equationsii ε0 d r ′′ d r ′′′ ∞0dω′′{f 3(r ,r ′′,ω,ω′′)·[T ∗(r ′′,r ′′′,ω′′)]L ′′′+f 4(r ,r ′′,ω,ω′′)·[T (r ′′,r ′′′,ω′′)]L ′′′}·˜T (r ′,r ′′′,ω′)=ωf 3(r ,r ′,ω,ω′)(29)−i ¯h ω′ d r ′′f 2(r ,r ′′,ω)·˜T∗(r ′,r ′′,ω′)−ω′f 4(r ,r ′,ω,ω′)−¯h c 2d r ′′T ∗(r ,r ′′,ω)·[G (r ′′,r ′,ω−i 0)]T ′(31)f 2(r ,r ′,ω)=i µ0ω d r ′′T ∗(r ,r ′′,ω)·[G (r ′′,r ′,ω−i 0)]T ′(32)f 3(r ,r ′,ω,ω′)=I δ(r −r ′)δ(ω−ω′)−µ0¯h ω d r ′′ d r ′′′T ∗(r ,r ′′,ω)·[G (r ′′,r ′′′,ω−i 0)]T ′′′·˜T(r ′,r ′′′,ω′)+µ0¯h ω2ω+ω′d r ′′ d r ′′′T ∗(r ,r ′′,ω)·G (r ′′,r ′′′,ω−i 0)·˜T ∗(r ′,r ′′′,ω′).(34)The Green function G (r ,r ′,z )occurring in these expressions is defined as the solution of the differential equation:− G (r ,r ′,z )×←−∇′ ×←−∇′+z 2c2 d r ′′G (r ,r ′′,z )·χ(r ′′,r ′,z )=I δ(r −r ′)(35)The spatial derivative operator ∇′acts to the left on the argument r ′of G (r ,r ′,z ).According to this inhomogeneous wave equation the Green function determines the propagation of waves through a medium that is characterized by a tensor χ(r ,r ′,z).The latter plays the role of a non-local anisotropic susceptibility,as will become clear in the next section.It is defined in terms of T and its complex conjugate asχ(r,r′,z)≡¯hω−z˜T(r′′,r,ω)·T∗(r′′,r′,ω)+1ε0 d r′′˜T(r′′,r,ω)·T∗(r′′,r′,ω)(39) for positiveωand byχ(r,r′,ω+i0)−χ(r,r′,ω−i0)=−2πi¯h2πi ∞−∞dω1ε0F(r,r′)(43)∞−∞dωω2[χ(r,r′,ω+i0)−χ(r,r′,ω−i0)]=0.(44)Incidentally,we remark that for large|z|the asymptotic behaviour ofχfollows from (41)with(42)–(44)asχ(r,r′,z)≃−¯hz2+O 1c2G(r,r′,z)+ z2The medium operator C m (r ,ω)is a linear combination of the diagonalizing operator and its Hermitian conjugate:C m (r ,ω)= d r ′ ∞0dω′ ˜f ∗3(r ′,r ,ω′,ω)·C (r ′,ω′)−˜f 4(r ′,r ,ω′,ω)·C †(r ′,ω′) (53)asfollowsbytaking the inverse of (26).Substituting (33)–(34)and inserting the result in (5)we get after some algebraP (r ,t )=i ¯h c 2 d r′ d r ′′ ∞0dωω2[G (r ,r ′,ω+i 0)]L ·˜T(r ′′,r ′,ω)·C (r ′′,ω)e −i ωt +h .c .(55)From the Maxwell equation ∇·(ε0E +P )=0it follows that the left-hand side is proportional to the longitudinal part [E (r ,t )]L of the electric field.The ensuing expression for the latter is analogous to (52),so that we arrive at the following result for the complete electric field:E (r ,t )=i µ0¯h d r ′ d r ′′ ∞0dωω2G (r ,r ′,ω+i 0)·˜T(r ′′,r ′,ω)·C (r ′′,ω)e −i ωt +h .c .(56)Inspection of (54)shows that the polarization consists of two terms.The first term is proportional to the electric field,at least in Fourier space and after taking a spatial convolution integral.The proportionality factor is χ(r ,r ′,ω),which plays the role of a susceptibility tensor,as we anticipated in the previous section.The second term in (54)is 
not related to the electric field.It represents a noise polarization density P n (r ,t )defined as P n (r ,t )=−i ¯h d r ′ ∞0dω˜T (r ′,r ,ω)·C (r ′,ω)e −i ωt +h .c .(57)that has to be present so as to yield a quantization scheme in which the validity of the canonical commutation relations in the presence of dissipation is guaranteed.Introducing the Fourier transform P n (r ,ω)via P n (r ,t )= ∞dωP n (r ,ω)e −i ωt +h .c .(58)and its counterparts E (r ,ω)and P (r ,ω),we get from (54)with (56):P (r ,ω)= d r ′χ(r ,r ′,ω+i 0)·E (r ′,ω)+P n (r ,ω).(59)The Fourier-transformed noise polarization density is proportional to the diagonalizing operator:P n (r ,ω)=−i ¯h d r ′˜T(r ′,r ,ω)·C (r ′,ω).(60)as follows from (57)and(58).As wehavegot now an explicit expression for P n (r ,ω)we can derive its commutation relation.By employing (39)we obtain:P n (r ,ω),P †n(r ′,ω′) =−i ¯h ε0c 2 d r′ d r ′′ ∞0dωω2χ(r ,r ′,ω+i 0)·G (r ′,r ′′,ω+i 0)·P n (r ′′,ω)e −i ωt+ ∞0dωP n (r ,ω)e −i ωt +h .c .(64)By adding (62)and (64)we get an expression for the dielectric displacement D (r ,t ).Upon using (47)we may write it as D (r ,t )=− d r ′ ∞0dω∇×[∇×G (r ,r ′,ω+i 0)]·P n (r ′,ω)e −i ωt +h .c .(65)Clearly,the dielectric displacement is purely paring with (63)we find that Maxwell’s equation ∇×B (r ,t )=µ0∂D (r ,t )/∂t is satisfied.It is instructive to return to the time-dependent representation of the linear constitutive relation (59):P (r ,t )= d r ′ t−∞dt ′χ(r ,r ′,t −t ′)·E (r ′,t ′)+P n (r ,t )(66)with the time-dependent susceptibility tensor defined by writing:χ(r ,r ′,ω+i 0)= ∞0dt χ(r ,r ′,t )e i ωt .(67)The convolution integral in the first term of (66),which expresses the causal response of the medium,depends on the electric field at all times t ′preceding t and at all positions r ′,whereas the second contribution is the noise term,which in classical theory does not appear.Sometimes [26]a different splitting of the various contributions to the polarization density is proposed,by writing an equation of the general form of (66)in which the response term covers only a limited range of values of t ′,for instance t ′∈[0,t ]for t >0.In such a formulation the convolution integral does not represent the full causal response of the medium,so that part of the response is hidden in the second term.As a consequence,the latter is no longer a pure noise term,so that it cannot be omitted in the classical version of the theory.The above expressions for the fields and the polarization density in terms of the Fourier-transformed noise polarization density satisfying the commutation relations (61)are the central results in the present formalism for field quantization in inhomogeneous anisotropic dielectric media with spatio-temporal dispersion.Although we are describing dissipative media,it has not been necessary to explicitlyintroduce a bath,as is commonly done in the context of damped-polariton treatments [15,16,19,20].In the next section,we shall show how a bath may be identified in the present model.5.Bath degrees of freedomIn the Hamiltonian(21)the dielectric medium is described by the operators C m(r,ω) and C†m(r,ω).The polarization density P(r)and its canonical conjugate W(r)are given in(5)and(24)as suitable linear combinations of the medium operators C m and C†m.Since the latter depend on the continuous variableω,they describe many more degrees of freedom than P and W.The extra degrees of freedom can be taken together to define a so-called‘bath’,which is independent of P and W.Although the name might suggest otherwise,the 
bath as introduced in this way is part of the medium itself,and not some external environment.Its role is to account for the dissipative effects in the dispersive medium,which may arise for instance through a leak of energy by heat production.In the following we shall identify the operators associated to the bath.Subsequently,we shall show how the Hamiltonian can be rewritten so as to give an explicit description of the coupling between the polarization and the bath.In this way,we will be able to compare our model to its counterparts in previous papers [15,16,19,20].The bath will be described by operators C b(r,ω)and C†b (r,ω)satisfying the usualcommutation relations.These bath operators are linear combinations of the medium operators:C b(r,ω)= d r′ ∞0dω′ H1(r,r′,ω,ω′)·C m(r′,ω′)+H2(r,r′,ω,ω′)·C†m(r′,ω′) (68) with tensor coefficients H i that will be determined presently.Since the bath variables are by definition independent of both P(r′)and W(r′)for all r′,they have to commute with the latter.With the use of(5)and(24)we get from these commutation relations the following conditions:d r′′ ∞0dω′′[H1(r,r′′,ω,ω′′)·T∗(r′′,r′,ω′′)+H2(r,r′′,ω,ω′′)·T(r′′,r′,ω′′)]=0(69)d r′′ ∞0ω′′ω′′[H1(r,r′′,ω,ω′′)·T∗(r′′,r′,ω′′)−H2(r,r′′,ω,ω′′)·T(r′′,r′,ω′′)]=0.(70)To determine H i we start from the following Ansatz:H1(r,r′,ω,ω′)= d r′′ δ(ω−ω′)h1(r,r′′,ω)+1ω+ω′h2(r,r′′,ω)·˜T∗(r′,r′′,ω′)(72) with new tensor coefficients h i.Substituting these expressions in(69)–(70)and using (36)and(39),wefind that both of these conditions are simultaneously satisfied whenh 1and h 2are related as d r ′′h 2(r ,r ′′,ω)·χ(r ′′,r ′,ω+i 0)==12i d r ′′ d r ′′′h 1(r ,r ′′,ω)·[χ(r ′′,r ′′′,ω+i 0)−χ(r ′′,r ′′′,ω−i 0)]·˜h ∗1(r ′,r ′′′,ω)==π¯h ε0d r ′′T ∗(r ,r ′′,ω)·χ−1(r ′′,r ′,ω+i 0).(76)It should be noted that the coefficients h i are determined up to a unitary transformation.This freedom,which is available to H i as well,corresponds to a natural arbitrariness in the choice of the bath operators themselves.As the bath operators have been identified now,we can rewrite the Hamiltonian so as to clarify their role in the dynamics of our model.To that end we have to eliminate the medium operators C m in favor of the bath operators C b .Employing(5),(24)and (68)we can write the medium operators as:C m (r ,ω)= d r ′T ∗(r ,r ′,ω)· i 2ε0[Π(r )]2+12πi ¯h 2 d r d r ′ d r ′′ d r ′′′P (r )·F −1(r ,r ′)· ∞0dωω3[χ(r ′,r ′′,ω+i 0)−χ(r ′,r ′′,ω−i 0)] ·F −1(r ′′,r ′′′)·P (r ′′′)+12¯hd r d r ′W (r )·F (r ,r ′)·W (r ′)−¯h d r d r ′W (r )·F (r ,r ′)·A (r ′)+1i¯h −causality and positive-definiteness of the dissipative energy loss.Incidentally,it may be remarked that amplifying dielectric media,which have been treated in the context of the phenomenological quantization scheme as well[10,29],are not covered by the present damped-polariton model.To describe media with a sustained gain,e.g.a laser above threshold,one has to incorporate a driving mechanism in the Hamiltonian, which accounts for the ongoing input of energy that is indispensable for a stationary gain.As we have shown,the time evolution of the dynamical variables forfield and matter can be determined completely by deriving the operators that diagonalize the Hamiltonian.The diagonalizing operators are closely related to the noise part of the polarization density,which plays an important role in the phenomenological quantization scheme.The proof of the commutation properties of the noise polarization density follows from its relation to the diagonalizing operators.In setting up our model Hamiltonian we 
have avoided to introduce a bath environment from the beginning.The subsequent formalism could be developed without ever discussing such a bath.Nevertheless,one may be interested in an analysis of the complete set of degrees of freedom of the dielectric medium in our model.If that analysis is carried out,onefinds,as we have seen above,that specific combinations of medium variables can be associated to what may be called a bath.The coupling of the polarization to this bath can be held responsible for the dissipative losses that characterize a dispersive dielectric.AcknowledgmentsI would like to thank dr.A.J.van Wonderen for numerous discussions and critical comments.Appendix grangian formulationIn this appendix we shall show how the Hamiltonian(21)can be related to a Lagrange formalism.We start by postulating the following Lagrangian for an anisotropic linear dielectric with spatio-temporal dispersion that interacts with the electromagneticfield: L= d r 12µ0[∇∧A(r)]2+12ε0 d r{[P(r)]L}2.(A.1) Here A(r)is the transverse vector potential and Q m(r,ω)are material coordinates depending on position and frequency.The polarization density P(r)is taken to be an anisotropic and non-local linear combination of these material coordinates of the form P(r)= d r′ ∞0dω′Q m(r′,ω′)·T0(r′,r,ω′)(A.2) with a real tensor coefficient T0(r,r′,ω).One easily verifies that the Lagrangian equations have the form1∆A(r,t)−¨Q m (r ,ω,t )+ω2Q m (r ,ω,t )= d r T 0(r ,r ′,ω)·E (r ′,t )(A.4)with the electric field given as E (r ,t )=−˙A (r ,t )−(1/ε0)[P (r ,t )]L .The first Lagrangian differential equation is consistent with Maxwell’s equation,as it should.The second Lagrangian equation shows that the material coordinates are harmonic variables that are driven by the electric field in an anisotropic and non-local way.Introducing the momenta Π(r )and P m (r ,ω)associated to A and Q m asΠ(r )=ε0˙A (r )(A.5)P m (r ,ω)=˙Q m (r ,ω)+ d r ′T 0(r ,r ′,ω)·A (r ′)(A.6)we obtain the Hamiltonian corresponding to (A.1)in the standard fashion.The result is:H = d r 12µ0[∇∧A (r )]2+12 d r d r ′ d r ′′ ∞dωA (r )·˜T 0(r ′,r ,ω)·T 0(r ′,r ′′,ω)·A (r ′′)+12 1/2d r ′ C m (r ′,ω)·U (r ′,r ,ω)+C †m (r ′,ω)·U ∗(r ′,r ,ω) (A.8)Q m (r ,ω)=i ¯hAppendix B.Evaluation of the tensorial coefficients f i In thisappendix wewill showhowtheequations (27-30)can be solved.We start by using (27)to eliminate f 1from (28).As a result we obtain the differential equation:∆′f 2(r ,r ′,ω)+ω2ε0 d r′′ d r ′′′ d r ′′′′ ∞0dω′′{f 3(r ,r ′′,ω,ω′′)·[T ∗(r ′′,r ′′′,ω′′)]L ′′′+f 4(r ,r ′′,ω,ω′′)·[T (r ′′,r ′′′,ω′′)]L ′′′}·˜T(r ′′′′,r ′′′,ω′)·T ∗(r ′′′′,r ′,ω′)=0.(B.2)A similar relation is obtained by multiplying (30)by T (r ′,r ′′′′,ω′)and integrating over r ′:−i ¯h ω′ d r ′′ d r ′′′f 2(r ,r ′′,ω)·˜T∗(r ′′′,r ,′′,ω′)·T (r ′′′,r ′,ω′)−(ω+ω′) d r ′′f 4(r ,r ′′,ω,ω′)·T (r ′′,r ′,ω′)−¯h c 2f 2(r ,r ′,ω)−i µ0ωd r ′′ ∞0dω′′{f 3(r ,r ′′,ω,ω′′)·[T ∗(r ′′,r ′,ω′′)]T ′+f 4(r ,r ′′,ω,ω′′)·[T (r ′′,r ′,ω′′)]T ′}=0.(B.5)Here we used the transversality of f 2(r ,r ′,ω)in its second argument to write the first term as a repeated vector product,with the spatial derivative operator ∇′acting to the left on theargument r ′of thefunction f 2(r ,r ′,ω).The integral in (B.5)contains the transverse parts of T and T ∗only.A more natural form of the differential equation,with the full tensors T and T ∗,is obtained by introducing instead of f 2a new tensor g defined as:g (r ,r ′,ω)≡i ωf 2(r ,r ′,ω)−1c 2g (r ,r ′,ω)+µ0ω2d r ′′ ∞0dω′′[f 3(r ,r ′′,ω,ω′′)·T ∗(r ′′,r ′,ω′′)+f 4(r ,r ′′,ω,ω′′)·T (r ′′,r 
′,ω′′)]=0.(B.7)The integral contribution still depends on f 3and f 4,so that the differential equation is not yet in closed form.However,we may rewrite the integral in such a way that its relation to g becomes obvious.This can be achieved with the help of the identity: d r ′′ ∞0dω′′[f 3(r ,r ′′,ω,ω′′)·T ∗(r ′′,r ′,ω′′)+f 4(r ,r ′′,ω,ω′′)·T (r ′′,r ′,ω′′)]==ε0 d r ′′g (r ,r ′′,ω)·χ(r ′′,r ′,ω−i 0)+s (r ,r ′,ω).(B.8)which contains a tensor s (r ,r ′,ω)that arises while avoiding a pole in the complex frequency plane,as we shall see below.Furthermore the right-hand side contains the susceptibility tensor χthat has been defined in (36).In (B.8)the frequency is chosen to be in the lower half of the complex plane just below the real axis.Correspondingly,the term −i 0is an infinitesimally small number on the negative imaginary axis.To prove (B.8)we divide (B.2)by ω′−ω+i 0,with i 0an infinitesimally small imaginary number.The result is:−i ¯h ω′ε01ω+ω′ d r ′′ d r ′′′f 2(r ,r ′′,ω)·˜T∗(r ′′′,r ′′,ω′)·T (r ′′′,r ′,ω′)。
反应扩散模型在图灵斑图中的应用及数值模拟张荣培;王震;王语;韩子健【摘要】反应扩散方程模型常被用于描述生物学中斑图的形成.从反应扩散模型出发,理论推导得到Gierer-Meinhardt模型的斑图形成机理,解释了非线性常微分方程系统的稳定常数平衡态在加入扩散项后会发生失稳并产生图灵斑图的过程.通过计算该模型,得到图灵斑图产生的参数条件.数值方法中采用一类有效的高精度数值格式,即在空间离散条件下采用Chebyshev谱配置方法,在时间离散条件下采用紧致隐积分因子方法.该方法结合了谱方法和紧致隐积分因子方法的优点,具有精度高、稳定性好、存储量小等优点.数值模拟表明,在其他条件一定的情况下,系统控制参数κ 取不同值对于斑图的产生具有重要的影响,数值结果验证了理论结果.%Turing proposed a model for the development of patterns found in nature in 1952. Turing instability is known as diffusion-driven instability, which states that a stable spatially homogeneous equilibrium may lose its stability dueto the unequal spatial diffusion coefficients. The Gierer–Mainhardt modelis an activator and inhibitor system to model the generating mechanism of biological patterns. The reaction-diffusion system is often used to describe the pattern formation model arising in biology. In this paper, the mechanism of the pattern formation of the Gierer-Meinhardt model is deduced from the reactive diffusion model. It is explained that the steady equilibrium state of the nonlinear ordinary differential equation system will be unstable after adding of the diffusion term and produce the Turing pattern. The parameters of the Turing pattern are obtained by calculating the model. There are a variety of numerical methods including finitedifference method and finite element method. Compared with the finite difference method and finite element method, which have low order precision, the spectral method can achieve the convergence of the exponential order with only a small number of nodes and the discretization of the suitable orthogonal polynomials. In the present work, an efficient high-precision numerical scheme is used in the numerical simulation of the reaction-diffusion equations. In spatial discretization, we construct Chebyshev differentiation matrices based on the Chebyshev points and use these matrices to differentiate the second derivative in the reaction-diffusion equation. After the spatial discretization, we obtain the nonlinear ordinary differential equations. Since the spectral differential matrix obtained by the spectral collocation method is full and cannot use the fast solution of algebraic linear equations, we choose the compact implicit integration factor method to solve the nonlinear ordinary differential equations. By introducing a compact representation for the spectral differential matrix, the compact implicit integration factor method uses matrix exponential operations sequentially in every spatial direction. As a result, exponential matrices which are calculated and stored have small sizes, as those in the one-dimensional problem. This method decouples the exact evaluation of the linear part from the implicit treatment of the nonlinear reaction terms. We only solve a local nonlinear system at each spatial grid point. This method combines with the advantages of the spectral method and the compact implicit integration factor method, i.e., high precision, good stability, and small storage and soon. Numerical simulations show that it can have a great influence on the generation of patterns that the system control parameters take different values under otherwise identical conditions. 
The numerical results verify the theoretical results.【期刊名称】《物理学报》【年(卷),期】2018(067)005【总页数】10页(P50-59)【关键词】反应扩散方程;Gierer-Meinhardt模型;图灵斑图;Chebyshev谱方法【作者】张荣培;王震;王语;韩子健【作者单位】沈阳师范大学数学与系统科学学院,沈阳 110034;山东科技大学数学与系统科学学院,青岛 266590;沈阳师范大学数学与系统科学学院,沈阳 110034;沈阳师范大学数学与系统科学学院,沈阳 110034【正文语种】中文1 引言斑图是在空间或时间上具有某种规律性的非均匀宏观结构,普遍存在于自然界.1952年,著名的英国数学家图灵把他的目光转向生物学领域,用一个反应扩散系统成功地说明了某些生物体表面图纹产生的原理[1].图灵从数学角度表明,在反应扩散系统中,稳定状态会在某些条件下失稳,并自发产生空间定态图纹,此斑图通常称为图灵斑图. 经过多年的研究,各界学者利用反应扩散系统预测得到了更多的图灵斑图,在理论和实验方面取得了许多重要成果.他们证实了化学系统中图灵斑图的形成[2],讨论自催化反应中的动力学行为,探讨此类耦合反应扩散体系中影响图灵斑图的因素[3].给出Gray-Scott模型、Brusselator模型等系统扩散引起不稳定的数学机理[4],并描述了Gierer-Meinhardt,Lengyel-Epstein等模型的某些动力学行为(性质)[5,6].最近几年,图灵斑图在实验方面取得一系列最新的进展,Copie等[7]运用实验在一个双稳态被动非线性共振器中探讨了图灵调制和法拉第参数不稳定性的相互作用;Tompkins等[8]利用微流体化学室证实图灵理论体系,并观测到第七种时空模式;Lacitignola[9]研究了图灵不稳定现象的发生条件,论述了具体形态的电化学反应扩散模型在一个球面上的图案形成的特性;Gaskins等[10]在二氧化氯碘丙二酸反应实验中,通过添加卤化钠盐溶液得到新的图灵斑图.在这些系统中存在两种化学反应物质,它们不仅能相互作用,而且还能进行独自扩散.事实上,图灵斑图的产生对应的是一个非线性反应动力学过程与一种特殊扩散过程的耦合.这个特殊的扩散过程由于两种因子的扩散速度不同会发生失稳,这就是图灵斑图产生的机理.在数学上,图灵斑图可以用无量纲化的反应扩散方程组描述[11],即式中u和v是系统变量,分别代表参与化学反应的两种物质的浓度;c和d是扩散系数,t是时间变量,f(u,v)和g(u,v)表示反应项.设Ω为RN中带有光滑边界的有界区域,Ω=[0,a]×[0,b],边界为∂Ω,边界条件为齐次Neumann边界条件,即其中n表示边界上单位外法向.由于(1)式为耦合的非线性反应扩散方程,很难得到其精确解.近年来,许多学者用有限差分方法、有限元方法、谱方法等[12−14]多种数值方法求解(1)式,这些方法各有特点.相比于有限元方法和有限差分方法的低阶精度,谱方法[14]仅用少量的节点,采用Legendre,Chebyshev等适合的正交多项式离散即可达到指数阶收敛的谱精度.图灵斑图在空间上的结构具有一定的规律,且解比较光滑,因此采用谱方法离散是可行的.常用的谱配置方法主要有Fourier配置法[15],Chebyshev配置法[16],Hermite配置法等[17].由于本文考虑的(1)式边界条件为齐次Neumann边界条件,因此采用Chebyshev配置方法求解(1)式.对(1)式进行空间离散后,得到的是刚性的非线性常微分方程组(ODEs).显式时间离散方法虽可以用迭代的方法求解,但其对时间步长有严格的约束;隐式方法虽然可以允许大的时间步长,但是对于阶数非常大的非线性方程组的求解问题十分复杂,这对于全隐式方法来说是一个巨大的挑战.由于谱配置法所得到的谱微分矩阵是满的,显然利用追赶法等代数线性方程组的快速解法是不合适的,因此交替方向隐式方法在这里并不适用.本文采用紧致隐积分因子(compact implicit integration factor,cIIF)方法求解ODEs.2006年Nie等[18]以隐积分因子(IIF)方法为基础发展了cIIF方法.传统的隐积分因子方法在求解高维问题时,离散矩阵的指数运算的存储量和运算量非常大,导致运算速度缓慢.紧致隐积分因子方法[19]通过引入离散矩阵的紧致表达式并在各个方向进行矩阵的指数运算,使得中央处理器(CPU)的存储大大降低,计算速度也得到了显著提高.本文内容安排如下:第2节对反应扩散方程组进行线性分析,通过特征值解释图灵斑图的数学机理,然后以Gierer-Meinhardt模型为例分析系统处于稳定状态和不稳定状态时各参数需要满足的条件,进而探索斑图形成需要满足的条件;第3节研究数值方法,在空间离散条件下采用Chebyshev谱方法,时间离散条件下采用紧致隐积分因子方法,用MATLAB进行编程求解;第4节给出大量数值实验并对理论分析结果进行验证.2 图灵斑图的形成2.1 斑图形成的数学机理首先考虑(1)式没有扩散项,假设存在惟一的均匀定态解(u0,v0),即常数u0,v0满足令U=u−u0,V=v−v0,并在(u0,v0)处线性化后得到如下系统:式中c11=fu(u0,v0), c12=fv(u0,v0),c21=gu(u0,v0),c22=gv(u0,v0).均匀定态解(u0,v0)在没有扩散时是稳定的,这等价于相应的特征值问题的矩阵的特征值实部是负数.考虑加入扩散项后的反应扩散方程组((1)式).如果此时产生斑图,即(u0,v0)是不稳定的,要求特征值有正实部.所谓不稳定,体现为两种反应物的扩散速度不同,从而引起失稳.对(1)式作线性化处理,研究特征值正实部引起的线性不稳定性,进而推导出原方程的不稳定性.对均匀定态解(u0,v0)作一个微扰,可得线性微扰方程为求解如下方程可得相应的特征值:式中λ为特征值.只要(5)式中的特征值有正实部,则(u0,v0)对于(1)式是不稳定的.考虑到齐次Neumann边界条件,得到(5)式所对应的特征值为具体推导过程见附录A.2.2 Gierer-Meinhardt模型生物的发育过程是复杂的,其中重要的是形态形成阶段,与之对应的是生物体内器官的形成.由于该阶段的重要性,渐渐形成一个新的领域——形态学,主要研究导致细胞分化和定位因素的浓度对组织器官的影响.Gierer-Meinhardt模型是由Gierer和Meinhardt在研究激活物和抑制剂两种不同物质的产生和扩散时建立的[20],之后Gierer和Meinhardt利用数值方法导出一维和二维空间区域中上述系统产生多样斑图的条件.Gierer-Meinhardt模型被广泛应用于形态形成过程中一些基本现象的研究,最近的一些工作可以参见文献[21—23].以Gierer-Meinhardt模型为例,结合上述理论分析,计算产生斑图时需要满足的条件.取(1)式中其中系数κ,η,ε为系统的控制参数,固定η=0.1,ε=0.04.由此得到线性化系统(3)式中的系数为易得该系统的特征值为λ1= −1.2984,λ2=−7.7016,此时系统是稳定的.加入扩散项后,原方程组对应的特征问题为相应的特征方程为为使(8)式含有正实部的特征值,需要考虑两种情况.第一种情况是两个特征值异号,则应满足图1 特征值的实部Re(λ)随参数的变化(a)κ=0.0128;(b)κ=0.0152;(c)κ=0.008Fig.1.Real part Re(λ)of eigenvalues varying with parameters:(a)κ=0.0128;(b)κ=0.0152;(c)κ=0.008.经过化简可以得到此时应满足得0.0093248.第二种情况是两个特征值都是正的,应满足此时κ无解.由于反应扩散方程组联系于解析半群,所以线性化后的正实部特征值引起的不稳定性可以推导出原方程组的不稳定性.故当κ>κ0=0.0093248时,系统处于不稳定状态,因而系统能够产生斑图.特征值的实部Re(λ)在参数κ取不同值时的变化如图1所示.从图1可以看出,当κ = 0.0128>κ0和κ=0.0152>κ0时,特征值的实部会出现正值,此时系统不稳定;当κ=0.008<κ0时,特征值的实部始终为负,系统最后会达到稳定状态.第3节将用数值算例验证该结论.3 数值方法3.1 
Chebyshev谱配置法将求解区域[−1,1]2离散为Gauss-Lobatto网格,即其中Nx和Ny是正整数.对于一般的求解区域Ω=[a,b]×[c,d],可以采用公式将区域转化为[−1,1]2.在网格Th中将u(x,y)数值解定义为矩阵形式,U∈R(Nx−1)×(Ny−1),式中ui,j表示u在网格点(xi,xj)的数值解.引入Chebyshev一阶微分矩阵和二阶微分矩阵(具体推导过程见附录B).则u(x,y)关于x的二阶偏导数在配置点的值,可以用矩阵乘积的形式近似为矩阵Ax是在Chebyshev二阶微分矩阵基础上考虑Neumann边界条件得到的,其中同样地,对于y的二阶偏导数,有UAy,其中矩阵Ay定义同Ax.借助谱微分矩阵,可将方程中的Laplace算子离散成矩阵乘积的形式,即将Chebyshev谱配置方法应用于反应扩散方程,得到其半离散形式为式中3.2 紧致隐积分因子法将对空间离散后得到的非线性常微分方程组((12)式)采用紧致隐积分因子方法进行时间离散.定义时间步长为τ=Δt,第n层时间步为tn=nτ,n=0,1,2,···. 在(12)式两端同时左乘指数矩阵e−Axt,右乘指数矩阵e−Ayt.为描述方便,取(10)式c=1,d=1,可将(12)式中第一个等式写为将时间离散为0=t0<t1<···,将(13)式在一个时间步长内关于时间积分,并用梯形公式近似可得二阶紧致隐积分因子格式为进一步化简得在非线性方程组(13)式中,右端第一项可以通过矩阵乘积得到,右端第二项采用Picard迭代方法求解:同理处理(12)式中第二个等式可得该方法中矩阵eAxΔt和eAyΔt的阶数分别为Nx×Nx和Ny×Ny.在空间网格剖分量很大时,该方法可以降低存储量和运算量,使计算速度更快.4 数值算例对于前述Gierer-Meinhardt模型,取Ω =(−1,1)× (−1,1),η=0.1,c=0.04,κ 是不固定的参数.设其中图2 取κ=0.0128时Gierer-Meinhardt模型形成的斑图(a)t=20;(b)t=80;(c)t=170;(d)t=270;(e)t=320;(f)t=340;(g)t=500;(h)t=600;(i)t= 900Fig.2.Turing patterns in Gierer-Meinhardt model whenκ=0.0128:(a)t=20;(b)t=80;(c)t=170;(d)t=270;(e)t=320;(f)t=340;(g)t=500;(h)t =600;(i)t=900.4.1 数值算例I取κ=0.0128,N=100,h=2/100=0.02,τ=0.1h,t取图2所示各值时,得到对应的图像.由图2可知,随着时间的推移,初始扰动不断增强扩大,最终形成清晰的斑图.4.2 数值算例II取κ=0.0152,t取图3所示各值,其他参数与算例I相同,可得到t取不同值时对应的图像.由图3可知,随着时间的推移,初始扰动不断增强扩大,最终形成清晰的斑图. 图3 取κ=0.0152时Gierer-Meinhardt模型形成的斑图(a)t=30;(b)t=80;(c)t=90;(d)t=140;(e)t=160;(f)t=290;(g)t=520;(h)t=620;(i)t=9 90Fig.3.Turing patterns in Gierer-Meinhardt model whenκ=0.0152:(a)t=30;(b)t=80;(c)t=90;(d)t=140;(e)t=160;(f)t=290;(g)t=520;(h)t= 620;(i)t=990.4.3 数值算例III取κ=0.008,其他取值与算例II相同,t取不同值时对应的图像如图4所示.由图4可知,随着时间的推移,系统达到稳定状态,反应扩散模型不能形成斑图.由数值模拟结果来看,其他条件一定的情况下,κ取不同值对于产生斑图有重要的影响.数值模拟结果与理论结果一致.此外,我们也对周期性边界条件的Gierer-Meinhardt模型采用Fourier谱方法进行数值求解,结果显示周期边界条件对斑图的形状几乎没有影响.5 结论介绍了图灵斑图形成的数学机理,并结合Gierer-Meinhardt模型,分析系统不稳定状态的各系数需要满足的条件,即产生斑图的条件.运用紧致隐积分因子方法大大减少了存储和CPU运算时间,该方法对于大时间数值模拟是一个高效、高精度的数值方法.数值算例模拟了斑图形成的过程,验证了理论分析结果.这些结论还可应用于求解带有分数阶的反应扩散方程组.图4 取κ=0.008时Gierer-Meinhardt模型形成的斑图(a)t=30;(b)t=80;(c)t=90;(d)t=140;(e)t=160;(f)t=220;(g)t=290;(h)t=270;(i)t=9 90Fig.4.Turing patterns in Gierer-Meinhardt model whenκ=0.008:(a)t=30;(b)t=80;(c)t=90;(d)t=140;(e)t=160;(f)t=220;(g)t=290;(h)t=2 70;(i)t=990.附录A 图灵斑图的形成机理首先在区域Ω⊂RN(N=1,2)内考虑带有齐次Neumann边界条件的Laplace算子的特征值问题.一维情况下,特征值问题为式中a∈R+.特征值问题可表示为µ2−λ=0,解得只有λ< 0时可解得特征值λk= −(kπ/a)2,且特征值所对应的特征函数为在二维情况下,特征值问题为式中a,b∈R+,应采用分离变量法求解特征值. 设u=X(x)Y(y),代入方程得设解得故特征值为λk,l=且特征值所对应的特征函数为考虑方程组的特征值问题,令代入原方程组可得设则方程组可化为当方程组(A3)有非零解,满足此时方程组所对应的特征值为附录B 谱微分矩阵定义在[−1,1]上的标准k阶Chebyshev多项式Tk(x)为 Tk(x)=cos(k arccosx),k=0,1,2,···. 
令x=cosz,则有Tk=coskz,满足如下递推关系:Tk(x)在[−1,1]上的N+1个Gauss-Lobatto点值为零:设N阶多项式uN(x)∈PN在上述配置点xj满足uN(xj)=u(xj),则有式中hj(x)为N阶Lagrange基函数.用配置法求解未知量在网格点处的值,需要表示配置点处的导数值.对(B3)式求p阶导数,得式中系数从而可得一阶谱微分矩阵其中这里二阶谱微分矩阵可以由一阶谱微分矩阵平方得到,即参考文献[1]Turing A M 1952 Philos.Trans.R.Soc.Lond.B 2 37[2]Li X Z,Bai Z G,Li Y,Zhao K,He Y F 2013 Acta Phys.Sin.62 220503(inChinese)[李新政,白占国,李燕,赵昆,贺亚峰2013物理学报62 220503][3]Zhang L,Liu S Y 2007 Appl.Math.Mec.28 1102(in Chinese)[张丽,刘三阳2007应用数学和力学28 1102][4]Li B,Wang M X 2008 Appl.Math.Mec.29 749(in Chinese)[李波,王明新2008应用数学和力学29 749][5]Hu W Y,Shao Y Z 2014 Acta Phys.Sin.63 238202(in Chinese)[胡文勇,邵元智2014物理学报 63 238202][6]Peng R Wang M 2007 Sci.China A 50 377[7]Copie F,Conforti M,Kudlinski A,Mussot A,Trillo S 2016 Phys.Rev.Lett.116 143901[8]Tompkins N,Li N,Girabawe C,Heymann M,Ermentrout G B,Epstein IR,Fraden S 2014 A 111 4397[9]Lacitignola D,Bozzini B,Frittelli M,Sgura I 2017 Commun.Nonlinear Sci.Numer.Simul.48 484[10]Gaskins D K,Pruc E E,Epstein I R,Dolnik M 2016 Phys.Rev.Lett.117 056001[11]Zhang R P,Yu X J,Zhu J,Loula A 2014 Appl.Math.Model.38 1612[12]Zhang R P,Zhu J,Loula A,Yu X J 2016 put.Appl.Math.302 312[13]Bai Z G,Dong L F,Li Y H,Fan W L 2011 Acta Phys.Sin.60 118201(in Chinese)[白占国,董丽芳,李永辉,范伟丽2011物理学报60 118201][14]Zhang R,Zhu J,Yu X,Li M,Loula A F D 2017 put.310 194[15]Lv Z Q,Zhang L M,Wang Y S 2014 Chin.Phys.B 23 120203[16]Wang H 2010 mun.181 325[17]Hoz F D L,Vadillo F 2013 put.Phys.14 1001[18]Nie Q,Zhang Y T,Zhao R 2006 put.Phys.214 521[19]Nie Q,Wan F Y M,Zhang Y T,Liu X F 2008 put.Phys.227 5238[20]Gierer A,Meinhardt H 1972 Kybernetik 12 30[21]Ward M J,Wei J 2003 J.Nonlinear Sci.13 209[22]Wei J,Winter M 2004 J.Math.Pures Appl.83 433[23]Li H X 2015 J.Northeast Normal University 3 26(in Chinese)[李海侠2015东北师大学报3 26]。
一、概念介绍diffusion model(扩散模型)是指在不同领域中用于描述物质、能量、信息传播过程的数学模型,通常通过二阶偏微分方程描述。
扩散模型在物理学、生物学、化学、经济学等领域中都有广泛的应用。
它可以帮助人们理解和预测在各种条件下的扩散过程,对于解决一些现实问题具有重要的意义。
二、扩散模型的基本原理扩散模型的基本原理是描述在时间和空间上的物质传播过程。
它假设被扩散的物质以一定的速率在空间中传播,并且在不同位置上会出现浓度的差异。
扩散模型可以用数学方程来描述这种浓度的变化,通常是由一个偏微分方程来表示。
三、数学描述扩散模型最常用的数学描述是扩散方程(diffusion equation)。
在一维情况下,扩散方程通常写作:∂u/∂t = D ∂2u/∂x2其中,u是随时间和空间变化的物质浓度,t是时间,x是空间坐标,D是扩散系数。
这个方程描述的是浓度随时间和空间的变化规律,扩散系数D反映了物质扩散的速率。
四、常见的扩散模型1. Fick定律Fick定律是描述物质扩散的最基本的定律之一。
它表明了物质浓度梯度的方向和大小与物质的扩散速率成正比。
Fick定律可以用扩散方程来描述,在一维情况下可以写作:J = -D ∂u/∂x其中,J是物质的扩散通量,D是扩散系数,u是物质的浓度。
这个定律对于描述被扩散物质的传播速率和规律有着重要的意义。
2. 热传导方程热传导方程是扩散模型在热传导领域的应用。
它描述了物体内部温度分布随时间的变化规律。
热传导方程在一维情况下可以写作:∂u/∂t= α ∂2u/∂x2其中,u是物体内部的温度分布,α是热扩散系数。
热传导方程对于热能在物体内部的传播和分布规律有着重要的作用。
五、应用领域扩散模型在自然科学和社会科学的各个领域都有广泛的应用。
在物理领域,扩散模型可以用于描述物质和能量的传播规律;在生物领域,可以用于描述细胞内物质的传输过程;在化学领域,可以用于描述溶液中各种物质的扩散规律;在经济学领域,可以用于描述市场信息的传播过程等。
一、项目名称:非线性反应扩散方程理论及应用二、提名者及提名意见(专家提名项目还应公示提名专家的姓名、工作单位、职称和学科专业)提名单位:陕西省教育厅提名意见:本项目是系统研究非线性反应扩散方程稳态解及大时间行为的原创性成果。
针对具有重要生物、几何背景的反应扩散方程开展了系列深入细致的研究。
率先提出了多资源空间异质恒化器模型,完整刻画了模型共存态的整体分歧结构,提出了扩散驱动的物种共存机制,激发和推动了空间异质恒化器模型的研究。
建立了系列具有抑制剂的空间异质恒化器模型,发展了相关模型稳态解性态的研究,揭示了抑制剂对物种共存的调节作用。
研究了有关二维Minkowski问题的非线性偏微分方程正解的存在性,建立了负指数的Sobolev不等式。
研究了相关反应扩散模型稳态解及大时间行为,揭示了趋化、化感、周期介质等因素对反应扩散系统整体解、行波解及空间斑图形成机制的影响。
8篇代表性论文主要发表于美国工业与应用数学学会会刊《SIAM J. Appl. Math.》、《SIAM J. Math. Anal.》、国际数学著名期刊《Adv. Math.》及《J. Differential Equations》、《Indiana Univ. Math. J.》等国际知名SCI刊物。
8篇代表性论文总他引232次,SCI他引162次;1篇代表性论文入选SCI高被引论文。
研究成果丰富了非线性反应扩散方程理论,多项研究工作处于国际领先水平,获得了2017年陕西高等学校科学技术一等奖。
提名该项目为陕西省自然科学一等奖。
三、项目简介本项目属于非线性反应扩散方程研究领域,是项目组近二十年来在该领域研究工作的总结。
项目组先后承担国家自然科学基金项目8项,承担教育部高等学校博士学科点专项科研基金、教育部新世纪优秀人才支持计划、教育部骨干教师资助计划、教育部优秀青年教师资助计划等项目5项。
反应扩散方程不仅具有强烈的实际背景,而且对数学分析也提出了许多挑战性的问题,一直是偏微分方程的重要研究方向。
Advances in Geosciences,4,17–22,2005 SRef-ID:1680-7359/adgeo/2005-4-17 European Geosciences Union©2005Author(s).This work is licensed under a Creative CommonsLicense.Advances in GeosciencesIncorporating level set methods in Geographical Information Systems(GIS)for land-surface process modelingD.PullarGeography Planning and Architecture,The University of Queensland,Brisbane QLD4072,Australia Received:1August2004–Revised:1November2004–Accepted:15November2004–Published:9August2005nd-surface processes include a broad class of models that operate at a landscape scale.Current modelling approaches tend to be specialised towards one type of pro-cess,yet it is the interaction of processes that is increasing seen as important to obtain a more integrated approach to land management.This paper presents a technique and a tool that may be applied generically to landscape processes. The technique tracks moving interfaces across landscapes for processes such as waterflow,biochemical diffusion,and plant dispersal.Its theoretical development applies a La-grangian approach to motion over a Eulerian grid space by tracking quantities across a landscape as an evolving front. An algorithm for this technique,called level set method,is implemented in a geographical information system(GIS).It fits with afield data model in GIS and is implemented as operators in map algebra.The paper describes an implemen-tation of the level set methods in a map algebra program-ming language,called MapScript,and gives example pro-gram scripts for applications in ecology and hydrology.1IntroductionOver the past decade there has been an explosion in the ap-plication of models to solve environmental issues.Many of these models are specific to one physical process and of-ten require expert knowledge to use.Increasingly generic modeling frameworks are being sought to provide analyti-cal tools to examine and resolve complex environmental and natural resource problems.These systems consider a vari-ety of land condition characteristics,interactions and driv-ing physical processes.Variables accounted for include cli-mate,topography,soils,geology,land cover,vegetation and hydro-geography(Moore et al.,1993).Physical interactions include processes for climatology,hydrology,topographic landsurface/sub-surfacefluxes and biological/ecological sys-Correspondence to:D.Pullar(d.pullar@.au)tems(Sklar and Costanza,1991).Progress has been made in linking model-specific systems with tools used by environ-mental managers,for instance geographical information sys-tems(GIS).While this approach,commonly referred to as loose coupling,provides a practical solution it still does not improve the scientific foundation of these models nor their integration with other models and related systems,such as decision support systems(Argent,2003).The alternative ap-proach is for tightly coupled systems which build functional-ity into a system or interface to domain libraries from which a user may build custom solutions using a macro language or program scripts.The approach supports integrated models through interface specifications which articulate the funda-mental assumptions and simplifications within these models. 
The problem is that there are no environmental modelling systems which are widely used by engineers and scientists that offer this level of interoperability,and the more com-monly used GIS systems do not currently support space and time representations and operations suitable for modelling environmental processes(Burrough,1998)(Sui and Magio, 1999).Providing a generic environmental modeling framework for practical environmental issues is challenging.It does not exist now despite an overwhelming demand because there are deep technical challenges to build integrated modeling frameworks in a scientifically rigorous manner.It is this chal-lenge this research addresses.1.1Background for ApproachThe paper describes a generic environmental modeling lan-guage integrated with a Geographical Information System (GIS)which supports spatial-temporal operators to model physical interactions occurring in two ways.The trivial case where interactions are isolated to a location,and the more common and complex case where interactions propa-gate spatially across landscape surfaces.The programming language has a strong theoretical and algorithmic basis.The-oretically,it assumes a Eulerian representation of state space,Fig.1.Shows a)a propagating interface parameterised by differ-ential equations,b)interface fronts have variable intensity and may expand or contract based onfield gradients and driving process. but propagates quantities across landscapes using Lagrangian equations of motion.In physics,a Lagrangian view focuses on how a quantity(water volume or particle)moves through space,whereas an Eulerian view focuses on a localfixed area of space and accounts for quantities moving through it.The benefit of this approach is that an Eulerian perspective is em-inently suited to representing the variation of environmen-tal phenomena across space,but it is difficult to conceptu-alise solutions for the equations of motion and has compu-tational drawbacks(Press et al.,1992).On the other hand, the Lagrangian view is often not favoured because it requires a global solution that makes it difficult to account for local variations,but has the advantage of solving equations of mo-tion in an intuitive and numerically direct way.The research will address this dilemma by adopting a novel approach from the image processing discipline that uses a Lagrangian ap-proach over an Eulerian grid.The approach,called level set methods,provides an efficient algorithm for modeling a natural advancing front in a host of settings(Sethian,1999). 
The reason the method works well over other approaches is that the advancing front is described by equations of motion (Lagrangian view),but computationally the front propagates over a vectorfield(Eulerian view).Hence,we have a very generic way to describe the motion of quantities,but can ex-plicitly solve their advancing properties locally as propagat-ing zones.The research work will adapt this technique for modeling the motion of environmental variables across time and space.Specifically,it will add new data models and op-erators to a geographical information system(GIS)for envi-ronmental modeling.This is considered to be a significant research imperative in spatial information science and tech-nology(Goodchild,2001).The main focus of this paper is to evaluate if the level set method(Sethian,1999)can:–provide a theoretically and empirically supportable methodology for modeling a range of integral landscape processes,–provide an algorithmic solution that is not sensitive to process timing,is computationally stable and efficient as compared to conventional explicit solutions to diffu-sive processes models,–be developed as part of a generic modelling language in GIS to express integrated models for natural resource and environmental problems?The outline for the paper is as follow.The next section will describe the theory for spatial-temporal processing us-ing level sets.Section3describes how this is implemented in a map algebra programming language.Two application examples are given–an ecological and a hydrological ex-ample–to demonstrate the use of operators for computing reactive-diffusive interactions in landscapes.Section4sum-marises the contribution of this research.2Theory2.1IntroductionLevel set methods(Sethian,1999)have been applied in a large collection of applications including,physics,chemistry,fluid dynamics,combustion,material science,fabrication of microelectronics,and computer vision.Level set methods compute an advancing interface using an Eulerian grid and the Lagrangian equations of motion.They are similar to cost distance modeling used in GIS(Burroughs and McDonnell, 1998)in that they compute the spread of a variable across space,but the motion is based upon partial differential equa-tions related to the physical process.The advancement of the interface is computed through time along a spatial gradient, and it may expand or contract in its extent.See Fig.1.2.2TheoryThe advantage of the level set method is that it models mo-tion along a state-space gradient.Level set methods start with the equation of motion,i.e.an advancing front with velocity F is characterised by an arrival surface T(x,y).Note that F is a velocityfield in a spatial sense.If F was constant this would result in an expanding series of circular fronts,but for different values in a velocityfield the front will have a more contorted appearance as shown in Fig.1b.The motion of thisinterface is always normal to the interface boundary,and its progress is regulated by several factors:F=f(L,G,I)(1)where L=local properties that determine the shape of advanc-ing front,G=global properties related to governing forces for its motion,I=independent properties that regulate and influ-ence the motion.If the advancing front is modeled strictly in terms of the movement of entity particles,then a straightfor-ward velocity equation describes its motion:|∇T|F=1given T0=0(2) where the arrival function T(x,y)is a travel cost surface,and T0is the initial position of the interface.Instead we use level sets to describe the interface 
as a complex function.The level set functionφis an evolving front consistent with the under-lying viscosity solution defined by partial differential equa-tions.This is expressed by the equation:φt+F|∇φ|=0givenφ(x,y,t=0)(3)whereφt is a complex interface function over time period 0..n,i.e.φ(x,y,t)=t0..tn,∇φis the spatial and temporal derivatives for viscosity equations.The Eulerian view over a spatial domain imposes a discretisation of space,i.e.the raster grid,which records changes in value z.Hence,the level set function becomesφ(x,y,z,t)to describe an evolv-ing surface over time.Further details are given in Sethian (1999)along with efficient algorithms.The next section de-scribes the integration of the level set methods with GIS.3Map algebra modelling3.1Map algebraSpatial models are written in a map algebra programming language.Map algebra is a function-oriented language that operates on four implicit spatial data types:point,neighbour-hood,zonal and whole landscape surfaces.Surfaces are typ-ically represented as a discrete raster where a point is a cell, a neighbourhood is a kernel centred on a cell,and zones are groups of mon examples of raster data include ter-rain models,categorical land cover maps,and scalar temper-ature surfaces.Map algebra is used to program many types of landscape models ranging from land suitability models to mineral exploration in the geosciences(Burrough and Mc-Donnell,1998;Bonham-Carter,1994).The syntax for map algebra follows a mathematical style with statements expressed as equations.These equations use operators to manipulate spatial data types for point and neighbourhoods.Expressions that manipulate a raster sur-face may use a global operation or alternatively iterate over the cells in a raster.For instance the GRID map algebra (Gao et al.,1993)defines an iteration construct,called do-cell,to apply equations on a cell-by-cell basis.This is triv-ially performed on columns and rows in a clockwork manner. However,for environmental phenomena there aresituations Fig.2.Spatial processing orders for raster.where the order of computations has a special significance. For instance,processes that involve spreading or transport acting along environmental gradients within the landscape. 
Therefore special control needs to be exercised on the order of execution.Burrough(1998)describes two extra control mechanisms for diffusion and directed topology.Figure2 shows the three principle types of processing orders,and they are:–row scan order governed by the clockwork lattice struc-ture,–spread order governed by the spreading or scattering ofa material from a more concentrated region,–flow order governed by advection which is the transport of a material due to velocity.Our implementation of map algebra,called MapScript (Pullar,2001),includes a special iteration construct that sup-ports these processing orders.MapScript is a lightweight lan-guage for processing raster-based GIS data using map alge-bra.The language parser and engine are built as a software component to interoperate with the IDRISI GIS(Eastman, 1997).MapScript is built in C++with a class hierarchy based upon a value type.Variants for value types include numeri-cal,boolean,template,cells,or a grid.MapScript supports combinations of these data types within equations with basic arithmetic and relational comparison operators.Algebra op-erations on templates typically result in an aggregate value assigned to a cell(Pullar,2001);this is similar to the con-volution integral in image algebras(Ritter et al.,1990).The language supports iteration to execute a block of statements in three ways:a)docell construct to process raster in a row scan order,b)dospread construct to process raster in a spreadwhile(time<100)dospreadpop=pop+(diffuse(kernel*pop))pop=pop+(r*pop*dt*(1-(pop/K)) enddoendwhere the diffusive constant is stored in thekernel:Fig.3.Map algebra script and convolution kernel for population dispersion.The variable pop is a raster,r,K and D are constants, dt is the model time step,and the kernel is a3×3template.It is assumed a time step is defined and the script is run in a simulation. Thefirst line contained in the nested cell processing construct(i.e. 
dospread)is the diffusive term and the second line is the population growth term.order,c)doflow to process raster byflow order.Examples are given in subsequent sections.Process models will also involve a timing loop which may be handled as a general while(<condition>)..end construct in MapScript where the condition expression includes a system time variable.This time variable is used in a specific fashion along with a system time step by certain operators,namely diffuse()andfluxflow() described in the next section,to model diffusion and advec-tion as a time evolving front.The evolving front represents quantities such as vegetation growth or surface runoff.3.2Ecological exampleThis section presents an ecological example based upon plant dispersal in a landscape.The population of a species follows a controlled growth rate and at the same time spreads across landscapes.The theory of the rate of spread of an organism is given in Tilman and Kareiva(1997).The area occupied by a species grows log-linear with time.This may be modelled by coupling a spatial diffusion term with an exponential pop-ulation growth term;the combination produces the familiar reaction-diffusion model.A simple growth population model is used where the reac-tion term considers one population controlled by births and mortalities is:dN dt =r·N1−NK(4)where N is the size of the population,r is the rate of change of population given in terms of the difference between birth and mortality rates,and K is the carrying capacity.Further dis-cussion of population models can be found in Jrgensen and Bendoricchio(2001).The diffusive term spreads a quantity through space at a specified rate:dudt=Dd2udx2(5) where u is the quantity which in our case is population size, and D is the diffusive coefficient.The model is operated as a coupled computation.Over a discretized space,or raster,the diffusive term is estimated using a numerical scheme(Press et al.,1992).The distance over which diffusion takes place in time step dt is minimally constrained by the raster resolution.For a stable computa-tional process the following condition must be satisfied:2Ddtdx2≤1(6) This basically states that to account for the diffusive pro-cess,the term2D·dx is less than the velocity of the advancing front.This would not be difficult to compute if D is constant, but is problematic if D is variable with respect to landscape conditions.This problem may be overcome by progressing along a diffusive front over the discrete raster based upon distance rather than being constrained by the cell resolution.The pro-cessing and diffusive operator is implemented in a map al-gebra programming language.The code fragment in Fig.3 shows a map algebra script for a single time step for the cou-pled reactive-diffusion model for population growth.The operator of interest in the script shown in Fig.3is the diffuse operator.It is assumed that the script is run with a given time step.The operator uses a system time step which is computed to balance the effect of process errors with effi-cient computation.With knowledge of the time step the it-erative construct applies an appropriate distance propagation such that the condition in Eq.(3)is not violated.The level set algorithm(Sethian,1999)is used to do this in a stable and accurate way.As a diffusive front propagates through the raster,a cost distance kernel assigns the proper time to each raster cell.The time assigned to the cell corresponds to the minimal cost it takes to reach that cell.Hence cell pro-cessing is controlled by 
propagating the kernel outward at a speed adaptive to the local context rather than meeting an arbitrary global constraint.3.3Hydrological exampleThis section presents a hydrological example based upon sur-face dispersal of excess rainfall across the terrain.The move-ment of water is described by the continuity equation:∂h∂t=e t−∇·q t(7) where h is the water depth(m),e t is the rainfall excess(m/s), q is the discharge(m/hr)at time t.Discharge is assumed to have steady uniformflow conditions,and is determined by Manning’s equation:q t=v t h t=1nh5/3ts1/2(8)putation of current cell(x+ x,t,t+ ).where q t is theflow velocity(m/s),h t is water depth,and s is the surface slope(m/m).An explicit method of calcula-tion is used to compute velocity and depth over raster cells, and equations are solved at each time step.A conservative form of afinite difference method solves for q t in Eq.(5). To simplify discussions we describe quasi-one-dimensional equations for theflow problem.The actual numerical com-putations are normally performed on an Eulerian grid(Julien et al.,1995).Finite-element approximations are made to solve the above partial differential equations for the one-dimensional case offlow along a strip of unit width.This leads to a cou-pled model with one term to maintain the continuity offlow and another term to compute theflow.In addition,all calcu-lations must progress from an uphill cell to the down slope cell.This is implemented in map algebra by a iteration con-struct,called doflow,which processes a raster byflow order. Flow distance is measured in cell size x per unit length. One strip is processed during a time interval t(Fig.4).The conservative solution for the continuity term using afirst or-der approximation for Eq.(5)is derived as:h x+ x,t+ t=h x+ x,t−q x+ x,t−q x,txt(9)where the inflow q x,t and outflow q x+x,t are calculated in the second term using Equation6as:q x,t=v x,t·h t(10) The calculations approximate discharge from previous time interval.Discharge is dynamically determined within the continuity equation by water depth.The rate of change in state variables for Equation6needs to satisfy a stability condition where v· t/ x≤1to maintain numerical stabil-ity.The physical interpretation of this is that afinite volume of water wouldflow across and out of a cell within the time step t.Typically the cell resolution isfixed for the raster, and adjusting the time step requires restarting the simulation while(time<120)doflow(dem)fvel=1/n*pow(depth,m)*sqrt(grade)depth=depth+(depth*fluxflow(fvel)) enddoendFig.5.Map algebra script for excess rainfallflow computed over a 120minute event.The variables depth and grade are rasters,fvel is theflow velocity,n and m are constants in Manning’s equation.It is assumed a time step is defined and the script is run in a simulation. 
Thefirst line in the nested cell processing(i.e.doflow)computes theflow velocity and the second line computes the change in depth from the previous value plus any net change(inflow–outflow)due to velocityflux across the cell.cycle.Flow velocities change dramatically over the course of a storm event,and it is problematic to set an appropriate time step which is efficient and yields a stable result.The hydrological model has been implemented in a map algebra programming language Pullar(2003).To overcome the problem mentioned above we have added high level oper-ators to compute theflow as an advancing front over a land-scape.The time step advances this front adaptively across the landscape based upon theflow velocity.The level set algorithm(Sethian,1999)is used to do this in a stable and accurate way.The map algebra script is given in Fig.5.The important operator is thefluxflow operator.It computes the advancing front for waterflow across a DEM by hydrologi-cal principles,and computes the local drainageflux rate for each cell.Theflux rate is used to compute the net change in a cell in terms offlow depth over an adaptive time step.4ConclusionsThe paper has described an approach to extend the function-ality of tightly coupled environmental models in GIS(Ar-gent,2004).A long standing criticism of GIS has been its in-ability to handle dynamic spatial models.Other researchers have also addressed this issue(Burrough,1998).The con-tribution of this paper is to describe how level set methods are:i)an appropriate scientific basis,and ii)able to perform stable time-space computations for modelling landscape pro-cesses.The level set method provides the following benefits:–it more directly models motion of spatial phenomena and may handle both expanding and contracting inter-faces,–is based upon differential equations related to the spatial dynamics of physical processes.Despite the potential for using level set methods in GIS and land-surface process modeling,there are no commercial or research systems that use this mercial sys-tems such as GRID(Gao et al.,1993),and research systems such as PCRaster(Wesseling et al.,1996)offerflexible andpowerful map algebra programming languages.But opera-tions that involve reaction-diffusive processing are specific to one context,such as groundwaterflow.We believe the level set method offers a more generic approach that allows a user to programflow and diffusive landscape processes for a variety of application contexts.We have shown that it pro-vides an appropriate theoretical underpinning and may be ef-ficiently implemented in a GIS.We have demonstrated its application for two landscape processes–albeit relatively simple examples–but these may be extended to deal with more complex and dynamic circumstances.The validation for improved environmental modeling tools ultimately rests in their uptake and usage by scientists and engineers.The tool may be accessed from the web site .au/projects/mapscript/(version with enhancements available April2005)for use with IDRSIS GIS(Eastman,1997)and in the future with ArcGIS. It is hoped that a larger community of users will make use of the methodology and implementation for a variety of environmental modeling applications.Edited by:P.Krause,S.Kralisch,and W.Fl¨u gelReviewed by:anonymous refereesReferencesArgent,R.:An Overview of Model Integration for Environmental Applications,Environmental Modelling and Software,19,219–234,2004.Bonham-Carter,G.F.:Geographic Information Systems for Geo-scientists,Elsevier Science Inc.,New York,1994. 
Burrough,P.A.:Dynamic Modelling and Geocomputation,in: Geocomputation:A Primer,edited by:Longley,P.A.,et al., Wiley,England,165-191,1998.Burrough,P.A.and McDonnell,R.:Principles of Geographic In-formation Systems,Oxford University Press,New York,1998. Gao,P.,Zhan,C.,and Menon,S.:An Overview of Cell-Based Mod-eling with GIS,in:Environmental Modeling with GIS,edited by: Goodchild,M.F.,et al.,Oxford University Press,325–331,1993.Goodchild,M.:A Geographer Looks at Spatial Information Theory, in:COSIT–Spatial Information Theory,edited by:Goos,G., Hertmanis,J.,and van Leeuwen,J.,LNCS2205,1–13,2001.Jørgensen,S.and Bendoricchio,G.:Fundamentals of Ecological Modelling,Elsevier,New York,2001.Julien,P.Y.,Saghafian,B.,and Ogden,F.:Raster-Based Hydro-logic Modelling of Spatially-Varied Surface Runoff,Water Re-sources Bulletin,31(3),523–536,1995.Moore,I.D.,Turner,A.,Wilson,J.,Jenson,S.,and Band,L.:GIS and Land-Surface-Subsurface Process Modeling,in:Environ-mental Modeling with GIS,edited by:Goodchild,M.F.,et al., Oxford University Press,New York,1993.Press,W.,Flannery,B.,Teukolsky,S.,and Vetterling,W.:Numeri-cal Recipes in C:The Art of Scientific Computing,2nd Ed.Cam-bridge University Press,Cambridge,1992.Pullar,D.:MapScript:A Map Algebra Programming Language Incorporating Neighborhood Analysis,GeoInformatica,5(2), 145–163,2001.Pullar,D.:Simulation Modelling Applied To Runoff Modelling Us-ing MapScript,Transactions in GIS,7(2),267–283,2003. Ritter,G.,Wilson,J.,and Davidson,J.:Image Algebra:An Overview,Computer Vision,Graphics,and Image Processing, 4,297–331,1990.Sethian,J.A.:Level Set Methods and Fast Marching Methods, Cambridge University Press,Cambridge,1999.Sklar,F.H.and Costanza,R.:The Development of Dynamic Spa-tial Models for Landscape Ecology:A Review and Progress,in: Quantitative Methods in Ecology,Springer-Verlag,New York, 239–288,1991.Sui,D.and R.Maggio:Integrating GIS with Hydrological Mod-eling:Practices,Problems,and Prospects,Computers,Environ-ment and Urban Systems,23(1),33–51,1999.Tilman,D.and P.Kareiva:Spatial Ecology:The Role of Space in Population Dynamics and Interspecific Interactions.Princeton University Press,Princeton,New Jersey,USA,1997. Wesseling C.G.,Karssenberg, D.,Burrough,P. A.,and van Deursen,W.P.:Integrating Dynamic Environmental Models in GIS:The Development of a Dynamic Modelling Language, Transactions in GIS,1(1),40–48,1996.。
A Deformable Mixture Parsing Model with ParseletsJian Dong1,Qiang Chen1,Wei Xia1,Zhongyang Huang2,Shuicheng Yan11Department of Electrical and Computer Engineering,National University of Singapore,Singapore2Panasonic Singapore Laboratories,Singapore{a0068947,chenqiang,weixia,eleyans}@.sg,{zhongyang.huang}@ AbstractIn this work,we address the problem of human pars-ing,namely partitioning the human body into semantic re-gions,by using the novel Parselet representation.Previousworks often consider solving the problem of human poseestimation as the prerequisite of human parsing.We ar-gue that these approaches cannot obtain optimal pixel levelparsing due to the inconsistent targets between these tasks.In this paper,we propose to use Parselets as the build-ing blocks of our parsing model.Parselets are a group ofparsable segments which can generally be obtained by low-level over-segmentation algorithms and bear strong seman-tic meaning.We then build a Deformable Mixture Pars-ing Model(DMPM)for human parsing to simultaneously handle the deformation and multi-modalities of Parselets. The proposed model has two unique characteristics:(1)the possible numerous modalities of Parselet ensembles are ex-hibited as the“And-Or”structure of sub-trees;(2)to fur-ther solve the practical problem of Parselet occlusion or absence,we directly model the visibility property at some leaf nodes.The DMPM thus directly solves the problem of human parsing by searching for the best graph configura-tion from a pool of Parselet hypotheses without intermediate prehensive evaluations demonstrate the encour-aging performance of the proposed approach.1.IntroductionHuman parsing[31]has drawn much attention recently for its wide applications in human-centric analysis,such as person identification[16]and clothing analysis[7,21].The success of human parsing relies on the seamless coopera-tion of human pose estimation[32],segmentation[2],and region labeling[31].However,previous works often con-sider solving the problem of human pose estimation as the prerequisite of human parsing[31].We argue that these ap-proaches cannot obtain optimal pixel level parsing due to the inconsistent targets of these tasks.In this paper we aim to develop a unified framework for human parsing.To this end,we reconsider the basic levelParseletspants ts s…facece…skirtki t…hair rightarmbags……upperclothesHeadHalf bodyclothesUpperbodyLowbody…hairDeformable Mixture Parsing Modelrupperclothesrs skirt tleftarmrightshoe……LeafAndOr…Heaghta f Human ParsingFigure1:Parselets are image segments that can generally be obtained by low-level segmentation techniques and bear strong semantic meaning.The instantiated Parselets,which are activated by our Deformable Mixture Parsing Model, provide accurate semantic labeling for human parsing. representation.Although the key points[33]or rigid tem-plates[32,12]representation can facilitate the localization of human parts,leading to great success in human detec-tion and pose estimation[32],it fails to provide accurate pixel-level labeling.This limitation hinders key points or templates to be the ideal building blocks for human parsing. On the other hand,there exists exciting progress of bottom-up region hypotheses based segmentation methods[5,10], which have achieved the state-of-the-art performance[11]. 
More specifically, region-hypothesis-based segmentation is performed by first generating extensive object hypotheses based on bottom-up information and then ranking them, with the critical assumption that the object has a large probability to be tightly covered by at least one of the generated hypotheses. This assumption usually holds well for objects with homogeneous appearance. However, for objects with large appearance variance, finding a single region hypothesis that tightly covers the whole object is very difficult.

Based on the above observation, we propose to use Parselets as the building blocks for human parsing, as shown in Fig. 1. The Parselets are a group of semantic image segments with the following characteristics: (1) they can generally be obtained by low-level over-segmentation algorithms [3, 1], i.e. they are parsable by bottom-up techniques; (2) they have strong and consistent semantic meaning, i.e. they are parsable by human knowledge. An object consisting of parts with large variance usually cannot be well segmented out by low-level segmentation methods, e.g. a human body cannot be perfectly segmented by edge-based segmentation [3]. However, we argue that localized semantic regions, e.g. the skirt or hair area of the human in Fig. 1, often show homogeneous appearance and can be segmented out as segments. Such image segments, denoted as Parselets, explicitly encode segmentation-level and semantic-level information.

With the Parselet representation, we propose the Deformable Mixture Parsing Model (DMPM) for human parsing. DMPM is represented as an "And-Or" graph [33] based hierarchical model to simultaneously handle the deformation and multi-modalities of Parselets. The joint learning and inference of the best configuration for both appearance and structure in our DMPM guarantee the overall performance.
We perform human parsing by generating extensive hypotheses for Parselets and subsequently assembling them by DMPM. The major contributions of this work can be summarized as follows:

• We propose the novel Parselet representation. By explicitly encoding segmentation and semantic information, Parselets serve as ideal building blocks for human parsing models. Human parsing is then performed with the Parselet representation, rather than with the key point [33] or rigid template [32, 12] representation. The instantiated Parselets directly provide accurate pixel-level semantic information. In practice, several over-segmentation techniques are utilized to ensure a high recall rate of Parselets.

• We build a novel Deformable Mixture Parsing Model (DMPM) for human parsing. The "co-occurrence" and "exclusive" modalities of Parselets are exhibited as the "And-Or" structure of sub-trees. To further solve the problem of Parselet occlusion or absence, we directly add the "visibility" property at the corresponding nodes. Joint learning and inference of appearance and structure parameters guarantee the overall performance. In addition, the tree structure of our DMPM allows efficient inference.

• In order to verify the effectiveness of the proposed framework, we construct a high-resolution human parsing dataset consisting of 2,500 images. All the pixels in the images are thoroughly annotated with 18 types of Parselets. As far as we know, this is the largest human dataset with full parsing labels. It could serve as a benchmark for segmentation-based human analysis in the research community.

2. Related Work

Selective Search for Recognition: Selective search approaches for object recognition have achieved great success in the past few years [24, 10, 27, 5, 2, 4]. This line of work first generates a set of object hypotheses based on bottom-up information and then converts the recognition problem into a ranking problem. Compared with exhaustive sliding window scanning [9, 12], selective search usually enables more expensive and potentially more powerful recognition techniques [28, 27]. Our work differs from the above works significantly as we focus on parts instead of whole objects. We claim that region hypotheses are better hypotheses for parts than for objects for categories with heterogeneous appearance. Gu et al. [17] also addressed the problem of segmenting and recognizing objects based on their parts. They generated part hypotheses and then formulated the problem in the generalized Hough transformation framework. Our work differs from this work significantly as their work focuses on segmentation and is unable to exploit the hierarchical structure of the object.

Part Based Model: Hierarchical part-based models can better grasp complicated structure than rigid models and thus usually achieve better performance for articulated objects [12, 32, 34]. Pictorial Structure (PS) based methods [13, 12, 32] are the most common approaches for pose estimation and object recognition. However, unlike our DMPM, part templates are usually spread over all nodes of PS-based models, which makes it inconvenient to model complicated composite relations. The stochastic image grammar model [33, 8] is also effective for modeling hierarchical structure. However, these models rely on complex learning and inference procedures which can only be made tractable using approximate algorithms [25]. On the contrary, despite the sophisticated structure of DMPM, we show that a tractable and exact inference algorithm exists.
Human Parsing: Human parsing, namely partitioning the human body into semantic regions, plays an important role in many human-centric applications [7, 21, 22, 30, 20]. Torr and Zisserman proposed an approach for simultaneous human pose estimation and body part labeling under the CRF framework [26], which can be regarded as a continuation of combining segmentation and human pose estimation [19]. Yamaguchi et al. [31] performed human pose estimation and attribute labeling sequentially for clothing parsing. Our method differs from these methods as previous research on human parsing tends to first align human parts [32] due to the large pose variations or the complexity of the models. However, such sequential approaches may fail to capture the correlations between human appearance and structure, leading to unsatisfactory results. The proposed DMPM, which can solve human parsing in a unified framework, significantly distinguishes our work from others.

Figure 2: Human decomposition based on different basic elements. The original image, Parselet based decomposition and joint based decomposition are shown sequentially.

3. Parselets

Parselets lie at the heart of our human parsing framework. In this section, we first give the definition of human Parselets. Then we present the details of hypothesis generation and feature representation for Parselets. Finally, we briefly introduce the modalities of Parselet ensembles.

3.1. Parselet Definition

We notice that classical part-based models [13, 32] usually divide the body into parts based on joints. However, such decomposition is unsuitable for segment hypotheses, because joint-based parts usually do not correspond to the segments obtained from bottom-up cues. Considering the left image in Fig. 2, the whole dress is likely to be captured by a single segment from bottom-up techniques. But for the right image, the upper clothes, coat and pants should intuitively correspond to three separate segments. This difference is hard to grasp with joint-based decomposition. To overcome this limitation, we propose the Parselets to serve as the building elements for our parsing model. Formally, the Parselets are a group of semantic image segments which have the following characteristics: (1) they can generally be obtained by low-level segmentation algorithms [3, 1, 5], i.e. they are parsable by bottom-up techniques. This characteristic guarantees that Parselets can be retrieved with high probability by the bottom-up hypothesis generation schemes. (2) They bear strong and consistent semantic meaning, i.e. they are parsable by human knowledge. Since our ultimate goal is to perform human parsing, the basic elements of the parsing model should have clear semantic meaning.

We now decompose the human body into homogeneous regions based on low-level cues. The homogeneous regions, which have clear semantic meaning and appear in many different images, are defined as Parselets. Through careful design, each defined Parselet will have a high probability of forming a single segment. Specifically, we define 18 types of Parselets as described in Table 1. These Parselets are representative and can properly cover most of the human body. They engage about 98.4% of the human body in our labeled datasets and can be obtained with a high recall rate using the method introduced in Section 3.2. Detailed statistics are shown in the experiment section. It is worth noting that the Parselet definition is flexible and can be redesigned for different applications. The only assumption here is that those semantic regions can be segmented out with high probability.

Table 1: 18 types of Parselets for human

  Head:      hat, hair, sunglasses
  Body:      upper clothes, coat, full body clothes, skirt, pants
  Foot:      left/right shoe
  Skin:      face, left/right arm, left/right leg
  Accessory: bag, scarf, belt
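For readers who want to work with this taxonomy programmatically, Table 1 translates directly into a small lookup structure. The sketch below is purely illustrative; the grouping and spellings follow Table 1, while the variable name and encoding are ours, not taken from the authors' code:

    # Illustrative Python encoding of the Table 1 taxonomy (names ours).
    PARSELET_GROUPS = {
        "Head":      ["hat", "hair", "sunglasses"],
        "Body":      ["upper clothes", "coat", "full body clothes", "skirt", "pants"],
        "Foot":      ["left shoe", "right shoe"],
        "Skin":      ["face", "left arm", "right arm", "left leg", "right leg"],
        "Accessory": ["bag", "scarf", "belt"],
    }
    assert sum(map(len, PARSELET_GROUPS.values())) == 18  # 18 Parselet types in total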
3.2. Hypothesis Generation for Parselets

In order to obtain Parselet hypotheses with a high recall rate, we combine several low-level segmentation methods. As Parselets usually appear at different scales, a hierarchical segmentation algorithm is a natural way to generate hypotheses. Here, we choose the Ultrametric Contour Map (UCM) [3], which works well to preserve boundary information. However, the merging scheme of UCM proceeds by removing the edge with the smallest probability, and thus only neighboring super-pixels can be merged. This may prevent non-adjacent segments from merging into a single segment and lead to unsatisfactory results for those Parselets which are separated by noise segments. For example, the dress in the left image of Fig. 2 is split into separate segments by the stripe pattern with strong edges. Hence UCM fails to merge them in the early stage. In addition, some garments, such as a belt, may also divide a Parselet into separate segments. To handle these difficulties, we add another appearance-based segmentation and merging scheme. Specifically, we first use the fast appearance-based over-segmentation method [1] and sequentially merge the nearby (not necessarily adjacent) regions with the smallest similarity score, in a similar manner as in [27]. We define the similarity score S between segments a and b as S(a, b) = S_size(a, b) + S_appearance(a, b), both of which are normalized to [0, 1]. S_size(a, b) is defined as the fraction of the image that regions a and b jointly occupy. This factor encourages small regions to be merged early. S_appearance(a, b) is defined as the χ² distance of the color and SIFT [23] histograms of segments a and b [29]. Finally, we utilize another complementary scheme, namely CPMC [5], which directly generates many segments of different scales. The segments from the above three methods are combined into the final set of Parselet hypotheses.
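To make the merging scheme above concrete, here is a minimal sketch of the greedy procedure. It assumes each region carries a pixel count and an L1-normalized color/SIFT histogram, treats all pairs as "nearby", and keeps every intermediate region as a hypothesis; the histogram update rule and all names are our assumptions, not the authors' implementation:

    import numpy as np
    from itertools import combinations

    def chi2(h1, h2, eps=1e-10):
        # Chi-square distance between two L1-normalized histograms.
        return 0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + eps))

    def score(a, b, image_area):
        # S(a, b) = S_size(a, b) + S_appearance(a, b), cf. Section 3.2:
        # a small joint area and similar appearance give a small score.
        s_size = (a["area"] + b["area"]) / image_area
        return s_size + chi2(a["hist"], b["hist"])

    def greedy_merge(regions, image_area):
        """Sequentially merge the pair with the smallest similarity score;
        every region ever formed is kept as a Parselet hypothesis."""
        active, hypotheses = list(regions), list(regions)
        while len(active) > 1:
            i, j = min(combinations(range(len(active)), 2),
                       key=lambda p: score(active[p[0]], active[p[1]], image_area))
            a, b = active[i], active[j]
            merged = {
                "area": a["area"] + b["area"],
                # area-weighted histogram of the union (our choice of update rule)
                "hist": (a["area"] * a["hist"] + b["area"] * b["hist"])
                        / (a["area"] + b["area"]),
            }
            active = [r for k, r in enumerate(active) if k not in (i, j)]
            active.append(merged)
            hypotheses.append(merged)
        return hypotheses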
3.3. Feature Representation

Compared with exhaustive sliding window scanning [9, 12], our Parselet-based representation enables complex and expensive feature design. It has been shown that the bag-of-words feature performs better than the rigid template for categories with large pose and view variance [28, 27, 5]. As our Parselet categorization is essentially a classification problem, we follow the state-of-the-art feature extraction-coding-pooling classification pipeline [15, 6, 4]. In this work, we adopt Fisher Kernel (FK) + average pooling [15] and enhanced feature + second-order pooling [4], which have been shown to have the best performance among current BoW encoding methods. In addition, as our algorithm only employs size and appearance features, which can be efficiently propagated throughout the hierarchical structure embedded in the pools of segments, the feature extraction is reasonably fast.

3.4. Parselet Ensemble

Parselets serve as the building blocks of our human parsing model. The Parselets are low-level parts by definition. In practice, several Parselets are often grouped together to form a middle-level human body part, e.g. head, body, etc. Those middle-level parts cannot be represented by a single type of Parselet but can be modeled by ensembles of Parselets. More specifically, the ensembles of Parselets show two kinds of modalities, as follows. (1) Co-occurrence. The modality of co-occurrence represents the relation that several types of Parselets coexist and are merged to form a larger middle-level human part. This is the most typical modality of Parselet ensembles. For example, the "hair" usually comes with "face" to form the "head". (2) Exclusivity. The modality of exclusivity models the relationship of different types of Parselets that cannot coexist logically. For example, for the "lower-body" area, there are two possible Parselets, i.e. "skirts" and "pants". However, "skirts" and "pants" usually cannot coexist. The exclusivity for the middle-level concept "lower-body" means that only one of the two exclusive Parselets, i.e. "skirts" and "pants", can exist for the "lower-body".

The middle-level concepts formed from Parselet ensembles can be further merged with Parselet(s) or other middle-level concepts. They also exhibit co-occurrence or exclusivity modalities to form an even higher-level concept. This higher-level concept thus inherits all the information from its sub-components. This inheritance property guarantees that we can model complex objects (e.g. human) with multiple levels of concepts.
4. Human Parsing over Parselets

With the Parselets and their ensembles, we propose the Deformable Mixture Parsing Model (DMPM) for human parsing. Specifically, we employ an "And-Or" graph [33] based hierarchical model to simultaneously handle the deformation and multi-modalities of Parselets. The "co-occurrence" modality is modeled as the "And" relation, while the "exclusivity" modality is modeled as the "Or" relation in the graph. The deformation is modeled as the pairwise parent-child distance. We construct a hierarchical model, as hierarchical models have been shown to be effective for grasping the structure of objects in part-based approaches [32, 33]. In addition, absence/occlusion is common for some Parselets. Hence we explicitly model this by utilizing a special structure called the virtual "Leaf" node. Fig. 3 shows a subgraph of our human graph, while the full graph of our parsing model is given in the supplemental file. In the next subsections, we introduce our DMPM, followed by the inference and learning algorithms.

Figure 3: (a) The "And-Or" graph: a subgraph of our human model. The diamonds, rectangles, ellipses and ellipses with boundary represent "Or" nodes, "And" nodes, "Leaf" nodes and virtual "Leaf" nodes, respectively.

4.1. Deformable Mixture Parsing Model

We first define the notation used in the following sections. P represents the Parselet hypothesis segments in an image, generated according to Section 3.2. For a hypothesis segment with index i, its scale (the square root of its area) and centroid are denoted as s_i and c_i = (x_i, y_i). Formally, a DMPM model is represented as a graph G = (V, E), where V is the set of nodes and E is the set of edges. The edges are defined by the parent-child structure, and kids(ν) denotes the children of node ν. There are three basic types of nodes, "And", "Or" and "Leaf" nodes, which specify different parent-child relationships, as depicted in Fig. 3 by diamonds, rectangles and ellipses respectively. Each "Leaf" node corresponds to one type of Parselet.

The state variables of the graph specify the graph configuration. Specifically, the graph topology is instantiated by a switch variable t at "Or" nodes, which indicates the set of active nodes V(t). Starting from the top level, an active "Or" node ν ∈ V_O(t) selects a child t_ν ∈ kids(ν). The active "And" or "Or" nodes have the state variables g_ν = (s_ν, c_ν), which specify the virtual scale and centroid of the node. The active "Leaf" nodes ν ∈ V_L(t) have the state variables d_ν, which specify the index of the segments for Parselets. In summary, we specify the configuration of the graph by the states z = {(t_ν, g_ν) : ν ∈ V_O(t)} ∪ {g_ν : ν ∈ V_A(t)} ∪ {d_ν : ν ∈ V_L(t)}, where the active nodes V(t) are determined from the {t_ν : ν ∈ V_O(t)}. We then let z_kids(ν) = {z_μ : μ ∈ kids(ν)} denote the states of all the child nodes of an "And" node ν ∈ V_A, and let z_{t_ν} denote the state of the selected child node of an "Or" node ν ∈ V_O.

Invisibility Modeling: Some Parselets, such as bags and scarves, have a high probability of being absent or occluded, namely invisible. In other words, these "Leaf" nodes should carry a visibility property. We explicitly model these nodes by using a special structure, denoted as the virtual "Leaf" node. More specifically, we introduce an auxiliary "Invisible" type of node which has no appearance representation. The virtual "Leaf" node is then represented as a structure consisting of an "Or" node, an ordinary "Leaf" node and an "Invisible" node, as shown in Fig. 3. The activated nodes in the virtual "Leaf" node structure thus explicitly indicate whether the corresponding "Leaf" node (Parselet) is visible or not. For a standard "Leaf" node μ, the corresponding score is w^L_μ · Φ^L(P, z_μ), where Φ^L(P, z_μ) is the feature vector extracted from the segment d_μ as described in Section 3.3. For the virtual "Leaf" node with "Or" node ν, "Leaf" node μ and "Invisible" node ρ, the score is w^L_μ · Φ^L(P, z_μ) + w^O_{ν,μ} or w^O_{ν,ρ}, depending on the visibility of the corresponding Parselet. w^O_{ν,μ} and w^O_{ν,ρ} are the learned weights for the visibility property, which are embedded in the "Or" node of the virtual "Leaf" node. It is worth noting that the state of the "Invisible" node fully depends on its weight in the "Or" node, and its own score is always 0.

We can now write the full score associated with a state variable z:

  S(P, z) = Σ_{μ∈V_L(t)} w^L_μ · Φ^L(P, z_μ) + Σ_{μ∈V_O(t)} w^O_{μ,t_μ} + Σ_{μ∈V_A(t)} w^A_μ · Φ^A(z_μ, z_kids(μ))    (1)

The first term in Eqn. (1) is an appearance model that computes the local score of assigning the segment d_μ to Parselet μ. The last two terms are independent of the data and can be considered as priors on occurrence and spatial geometry. Based on the graph structure, we can further decompose the last term of Eqn. (1) as follows:

  S(P, z) = Σ_{μ∈V_L(t)} w^L_μ · Φ^L(P, z_μ) + Σ_{μ∈V_O(t)} w^O_{μ,t_μ} + Σ_{μ∈V_A(t)} Σ_{ν∈kids(μ)} w^A_{μ,ν} · ψ(d_μ, d_ν)    (2)

ψ(d_μ, d_ν) = [dx, dx², dy, dy², ds]^T measures the geometric difference between parts μ and ν, where dx = (x_ν − x_μ)/√(s_ν · s_μ), dy = (y_ν − y_μ)/√(s_ν · s_μ) and ds = s_ν/s_μ are the relative location and scale of part ν with respect to μ (a small worked example is given at the end of this subsection).

Compared with the most prevalent hierarchical modeling approaches [32, 12], the proposed model has the following distinctive characteristics:

• We use Parselets as the basic elements of our parsing model. The parsing problem is thus transformed into searching for the best configuration of the hierarchical model. Once the maximization is obtained, we can directly read off the accurate pixel-level segmentation and semantic labels from the corresponding Parselets.

• The "And-Or" graph structure allows both co-occurrence and exclusivity relations between different parts. Unlike previous methods [32, 12], which often use "Or" nodes to model the multi-view properties of the same part, the "Or" node here plays the role of selecting the best configuration among a mixture of subgraphs, which is more flexible.

• We explicitly model the visibility property of the "Leaf" node, which is practical and critical for some Parselets. The introduction of a special node, i.e. the "Invisible" node, brings flexibility for real-life situations without adding extra model complexity.
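As a small worked example, the geometry feature ψ(d_μ, d_ν) of Eqn. (2) can be computed directly from the scales and centroids of two segments; this sketch merely restates the formula in code (the function name and the example numbers are ours):

    import math

    def psi(parent, child):
        """psi(d_mu, d_nu) = [dx, dx^2, dy, dy^2, ds]^T from Eqn. (2).
        Each segment is (x, y, s) with s the square root of its area."""
        (xp, yp, sp), (xc, yc, sc) = parent, child
        norm = math.sqrt(sp * sc)
        dx, dy, ds = (xc - xp) / norm, (yc - yp) / norm, sc / sp
        return [dx, dx ** 2, dy, dy ** 2, ds]

    # e.g. a "head" segment above an "upper clothes" segment:
    print(psi((100.0, 80.0, 40.0), (105.0, 150.0, 60.0)))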
4.2. Inference

Inference corresponds to maximizing S(P, z) from Eqn. (2) over z. As the graph G = (V, E) is a tree, inference can be done efficiently with dynamic programming. More specifically, we can simply iterate over all subparts, starting from the leaves and moving "upstream" to the root. The messages from children to their parent can be computed as follows:

  score^I_τ(z_τ) = 0    (3)
  score^L_τ(z_τ) = w^L_τ · Φ^L(P, z_τ)    (4)
  score^O_ν(z_ν) = max_{ρ∈kids(ν)} m_ρ(z_ν)    (5)
  m_ρ(z_ν) = max_{z_ρ} score_ρ(z_ρ) + w^O_{ν,ρ}    (6)
  score^A_μ(z_μ) = Σ_{ρ∈kids(μ)} n_ρ(z_μ)    (7)
  n_ρ(z_μ) = max_{z_ρ} [score_ρ(z_ρ) + w^A_{μ,ρ} · ψ(d_μ, d_ρ)]    (8)

At the bottom level, the scores of "Invisible" nodes and "Leaf" nodes are calculated as in Eqn. (3) and Eqn. (4). An "Or" node selects the maximal response from its children as its score, as in Eqn. (5) and Eqn. (6). The score of an "And" node is calculated by accumulating the scores of its children plus the corresponding deformation, as in Eqn. (7) and Eqn. (8). The above equations show that we can express the energy function recursively and hence find the optimal z using dynamic programming. In addition, the maximization over z can be partially accelerated by the generalized distance transform, which makes the whole algorithm more efficient [14, 12].
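The recursions (3)-(8) map directly onto a bottom-up pass over the tree. The sketch below enumerates candidate states exhaustively instead of using the generalized distance transform, and omits the backpointers needed to recover the argmax configuration; the class layout and all names are ours, intended only to mirror the structure of the equations, not the authors' optimized implementation:

    from dataclasses import dataclass, field
    from typing import Callable, Dict, List

    @dataclass
    class Node:
        kind: str                      # "and" | "or" | "leaf" | "invisible"
        children: List["Node"] = field(default_factory=list)
        leaf_score: Callable = None    # z -> w^L . Phi^L(P, z), for "leaf" nodes
        or_bias: List[float] = None    # w^O_{nu,rho} per child, for "or" nodes
        pair_score: Callable = None    # (z_mu, z_rho) -> w^A_{mu,rho} . psi(d_mu, d_rho)

    def infer(node: Node, states) -> Dict:
        """Return {state z: best score of the subtree rooted at node}, Eqns (3)-(8)."""
        if node.kind == "invisible":
            return {z: 0.0 for z in states}                      # Eqn (3)
        if node.kind == "leaf":
            return {z: node.leaf_score(z) for z in states}       # Eqn (4)
        tables = [infer(c, states) for c in node.children]
        if node.kind == "or":                                    # Eqns (5), (6)
            best = max(max(t.values()) + b for t, b in zip(tables, node.or_bias))
            return {z: best for z in states}
        # "and" node: sum over children of the best deformed child, Eqns (7), (8)
        return {z: sum(max(t[zc] + node.pair_score(z, zc) for zc in t)
                       for t in tables)
                for z in states}

    # The optimal configuration score is max(infer(root, states).values()).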
4.3. Learning

Given the labeled examples {P_i, z_i}, the max-margin framework is arguably preferable to maximum-likelihood estimation, as our final goal is discrimination. Note that the scoring function of Eqn. (2) is linear in the model parameters w = (w^L, w^O, w^A), and can be written compactly as S(P, z) = w · Φ(P, z). Thus both appearance and structure parameters can be learned in a unified framework, which is critical for achieving state-of-the-art performance in many applications [12, 32]. Here, we formulate the structured learning problem in a max-margin framework as in [12]:

  min_w  (1/2)‖w‖² + C Σ_i ξ_i
  s.t.  w · (Φ(P_i, z_i) − Φ(P_i, z)) ≥ Δ(z_i, z) − ξ_i,  ∀z    (9)

where Δ(z_i, z_j) is a loss function which penalizes incorrect estimates of z. This loss function gives partial credit to states which differ from the ground truth only slightly. It is defined as follows:

  Δ(z_i, z_j) = Σ_{ν ∈ V_L(t_i) ∪ V_L(t_j)} δ(z^i_ν, z^j_ν)    (10)

where δ(z^i_ν, z^j_ν) = 1 if ν ∉ V_L(t_i) ∩ V_L(t_j) or sim(d^i_ν, d^j_ν) ≤ σ. sim(·,·) is the intersection-over-union ratio of two segments d^i_ν and d^j_ν, and σ is the threshold, which is set to 0.8 in the experiments. This loss function penalizes both configurations with a "wrong" topology and leaf nodes with wrong segments. The optimization problem of Eqn. (9) is known as a structural SVM, which can be efficiently solved by the cutting-plane solver of SVMStruct [18] and the stochastic gradient descent solver in [12].

5. Experiments

5.1. Experimental Settings

Dataset: Our experiments are conducted on two datasets. The first one is the Fashionista (FS) dataset [31], which has 685 annotated samples with 56 different clothing labels. This dataset was originally designed for fine-grained clothing parsing. To adapt this dataset to our human parsing, we merge its labels according to our Parselet definition. As there is no direct link between their annotation and our "coat" Parselet, we ignore the "coat" Parselet and merge all upper-body clothing into the "upper clothes" Parselet. The second dataset, called Daily Photos (DP), contains 2,500 high-resolution images, which were crawled following the same strategy as the FS dataset [31]. In order to obtain quantitative evaluation results, we thoroughly annotate the semantic labels at pixel level. Compared with FS, the DP dataset contains many more images and has labels consistent with the Parselet definition for human parsing.

Evaluation Criterion: The parsing result is evaluated based on two complementary metrics. The first one is Average Pixel Accuracy (APA) [31], which is defined as the proportion of correctly labeled pixels in the whole image. This metric mainly measures the overall performance. The second metric is Intersection over Union (IoU) [11], which is widely used in evaluating segmentation and is suitable for measuring the performance of each Parselet separately. We also devise two variants of IoU for Parselets to make Parselets comparable with objects. The first one is the "Merging IoU" (mIoU), which merges the hypotheses for each Parselet into an object hypothesis to obtain the object-level IoU. The second one is the "Weighted IoU" (wIoU), which is calculated by accumulating each Parselet's IoU score weighted by the ratio of its pixels to those of the whole object. Note that generally mIoU is higher than wIoU.

Implementation Details: We extract dense SIFT [23], HOG [9] and color moments as low-level features for Parselets. The size of the Gaussian Mixture Model in FK is set to 128. The training:testing ratio is 2:1 for both datasets. The penalty parameter C is determined by 3-fold cross-validation on the training set.

5.2. Hypotheses Comparison: Parselets vs. Objects

We first validate the assumption that segmentation can provide better hypotheses for Parselets than for objects with heterogeneous appearance (e.g. human) by comparing the best IoU scores of Parselets and objects. The best IoU score for a segmentation method is defined as the maximal IoU score between the segments produced by that method and the ground-truth segments. The same hypothesis segments, which are generated through the methods introduced in Section 3.2, are used for both Parselets and objects. We calculate the best IoU of Parselets and objects for the different methods on the two datasets. The comparison results are displayed in Table 3, from which it can be observed that the best IoU of Parselets is much higher than that of objects. This trend is consistent among different algorithms and datasets, which makes the usage of segments as Parselet hypotheses more convincing. In addition, combining all three complementary algorithms leads to the best performance, and we use this setting thereafter.

Table 3: Comparison of Parselets versus objects in terms of the best IoU score on the FS and DP datasets.

               CPMC [5]  SLIC [1]  UCM [3]  Combined
  Obj IoU  FS  0.830     0.559     0.430    0.831
  Par mIoU FS  0.895     0.725     0.604    0.917
  Par wIoU FS  0.844     0.621     0.546    0.860
  Obj IoU  DP  0.815     0.534     0.443    0.816
  Par mIoU DP  0.896     0.722     0.638    0.928
  Par wIoU DP  0.831     0.614     0.608    0.862

The detailed best IoU scores for each type of Parselet based on the combined hypotheses are shown in Table 2.

Table 2: The best IoU scores for each type of Parselet on the FS and DP datasets.

  dataset  hat   hair  s-gls  u-cloth  coat  f-cloth  skirt  pants  belt
  FS       84.0  78.2  56.6   84.1     null  90.8     91.6   92.8   65.7
  DP       83.5  81.0  58.8   88.6     71.9  93.9     89.3   92.5   71.0

  dataset  l-shoe  r-shoe  face  l-arm  r-arm  l-leg  r-leg  bag   scarf
  FS       72.4    71.9    83.4  79.8   79.8   79.2   79.9   81.8  76.1
  DP       73.2    73.8    85.6  93.4   92.9   86.7   86.5   84.8  78.2
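For reference, the three scores reported in Tables 2 and 3 can be sketched as follows, with ground truth and hypotheses given as boolean pixel masks; the function names and dict-based bookkeeping are ours:

    import numpy as np

    def iou(a, b):
        return np.logical_and(a, b).sum() / np.logical_or(a, b).sum()

    def best_iou(gt, hypotheses):
        # "best IoU": maximal overlap between a ground-truth segment
        # and any hypothesis segment produced by the method.
        return max(iou(gt, h) for h in hypotheses)

    def merging_iou(gt_parselets, hypotheses):
        # mIoU: merge each Parselet's best hypothesis into one object mask.
        best = [max(hypotheses, key=lambda h: iou(g, h)) for g in gt_parselets.values()]
        obj_gt = np.any(np.stack(list(gt_parselets.values())), axis=0)
        return iou(obj_gt, np.any(np.stack(best), axis=0))

    def weighted_iou(gt_parselets, hypotheses):
        # wIoU: per-Parselet best IoU, weighted by the Parselet's pixel share.
        total = sum(g.sum() for g in gt_parselets.values())
        return sum(g.sum() / total * best_iou(g, hypotheses)
                   for g in gt_parselets.values())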
5.3. Evaluation for Human Parsing

Human Parsing: We now compare our proposed framework with the work of Yamaguchi et al. [31] for human parsing. This baseline works by first estimating the human pose and then labeling the super-pixels based on the pose estimation results. We use the publicly available implementation of version 0.2 and carefully tune the parameters according to [31]. The baseline method achieves 83% on the FS dataset and 82% on the DP dataset in terms of APA, which…
diffusion model paper list - Reply

A paper list on diffusion models is a way of organizing and summarizing research papers in the area; it helps researchers follow the development of the field, identify research directions, and pose new research questions. This article takes the bracketed phrase as its topic, answers the related questions step by step, and illustrates the discussion with relevant papers.

A diffusion model is a mathematical model that describes how matter or information spreads, and it is widely applied across many fields, such as social network analysis, virus propagation, and marketing. The questions under this topic are answered one by one below.
1. What is a diffusion model?

A diffusion model is a mathematical model used to describe how matter or information propagates under given conditions. Building on a set of assumptions and laws, it predicts and explains propagation phenomena through mathematical equations or computational simulation. The basic idea is to abstract the propagation process as interactions between individuals, while accounting for the various factors that influence the spread.
2. Which research areas do diffusion models involve?

The study of diffusion models spans a wide range of disciplines, including sociology, physics, biology, and information science. Common research areas include social network analysis, virus propagation, epidemic modeling, marketing, and the spread of public opinion.
3. Why do we need diffusion models?

Diffusion models play an important role in their application domains. First, they help us understand the patterns and laws of propagation in complex systems and provide a way to gain insight into the spreading process. Second, they can predict and control propagation phenomena, providing a basis for decision making. In addition, research grounded in diffusion models offers new perspectives and frameworks for the theories and methods of related disciplines.

Next, we answer the following questions, using more concrete models as examples to develop the discussion.
4. What are the classic diffusion models?

Classic diffusion models include the SIR, SI, and SEIR models. The SIR model is a common epidemic model that describes the transitions between susceptible (S), infectious (I), and recovered (R) individuals. The SI model instead assumes that infected individuals remain infectious indefinitely, with no recovery. The SEIR model extends the SIR model with an exposed compartment (E), which captures the latent period before an infected individual becomes infectious.
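As a minimal illustration of the compartmental dynamics described above, the SIR equations can be integrated numerically; the transmission rate beta and recovery rate gamma below are arbitrary example values, and the state is expressed as population fractions:

    import numpy as np
    from scipy.integrate import odeint

    def sir(y, t, beta, gamma):
        # dS/dt = -beta*S*I;  dI/dt = beta*S*I - gamma*I;  dR/dt = gamma*I
        s, i, r = y
        return [-beta * s * i, beta * s * i - gamma * i, gamma * i]

    t = np.linspace(0, 160, 800)                       # time grid (arbitrary units)
    s, i, r = odeint(sir, [0.99, 0.01, 0.0], t, args=(0.3, 0.1)).T
    print(f"peak infected fraction: {i.max():.3f} at t = {t[i.argmax()]:.1f}")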
5. What are the applications of diffusion models?

Diffusion models are widely applied in social network analysis, virus propagation, marketing, and other areas. For example, in social network analysis they can be used to predict the paths along which information spreads through a network and how it is reshared; in virus propagation they can be used to estimate the speed and scale of an epidemic; and in marketing they can be used to forecast how effectively a product or message will spread through the market and which factors influence that spread.
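To make the social-network case concrete, one common discrete-time formulation is the independent cascade model (our choice of illustration; the text above does not single out a specific model). Starting from a seed set, each newly activated node gets one chance to activate each neighbor with probability p:

    import random

    def independent_cascade(adj, seeds, p=0.1, seed=0):
        """Simulate one cascade on a directed graph {node: [neighbors]}.
        Returns the activated nodes in activation order."""
        rng = random.Random(seed)
        active, frontier, order = set(seeds), list(seeds), list(seeds)
        while frontier:
            new = []
            for u in frontier:
                for v in adj.get(u, []):
                    # each edge (u, v) is tried at most once
                    if v not in active and rng.random() < p:
                        active.add(v)
                        new.append(v)
                        order.append(v)
            frontier = new
        return order

    toy = {0: [1, 2], 1: [2, 3], 2: [3], 3: [4], 4: []}
    print(independent_cascade(toy, seeds=[0], p=0.5))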