The clustering instability of inertial particles spatial distribution in turbulent flows
- 格式:pdf
- 大小:304.49 KB
- 文档页数:15
国际自动化与计算杂志.英文版.1.Improved Exponential Stability Criteria for Uncertain Neutral System with Nonlinear Parameter PerturbationsFang Qiu,Ban-Tong Cui2.Robust Active Suspension Design Subject to Vehicle Inertial Parameter VariationsHai-Ping Du,Nong Zhang3.Delay-dependent Non-fragile H∞ Filtering for Uncertain Fuzzy Systems Based on Switching Fuzzy Model and Piecewise Lyapunov FunctionZhi-Le Xia,Jun-Min Li,Jiang-Rong Li4.Observer-based Adaptive Iterative Learning Control for Nonlinear Systems with Time-varying DelaysWei-Sheng Chen,Rui-Hong Li,Jing Li5.H∞ Output Feedback Control for Stochastic Systems with Mode-dependent Time-varying Delays and Markovian Jump ParametersXu-Dong Zhao,Qing-Shuang Zeng6.Delay and Its Time-derivative Dependent Robust Stability of Uncertain Neutral Systems with Saturating ActuatorsFatima El Haoussi,El Houssaine Tissir7.Parallel Fuzzy P+Fuzzy I+Fuzzy D Controller:Design and Performance EvaluationVineet Kumar,A.P.Mittal8.Observers for Descriptor Systems with Slope-restricted NonlinearitiesLin-Na Zhou,Chun-Yu Yang,Qing-Ling Zhang9.Parameterized Solution to a Class of Sylvester MatrixEquationsYu-Peng Qiao,Hong-Sheng Qi,Dai-Zhan Cheng10.Indirect Adaptive Fuzzy and Impulsive Control of Nonlinear SystemsHai-Bo Jiang11.Robust Fuzzy Tracking Control for Nonlinear Networked Control Systems with Integral Quadratic ConstraintsZhi-Sheng Chen,Yong He,Min Wu12.A Power-and Coverage-aware Clustering Scheme for Wireless Sensor NetworksLiang Xue,Xin-Ping Guan,Zhi-Xin Liu,Qing-Chao Zheng13.Guaranteed Cost Active Fault-tolerant Control of Networked Control System with Packet Dropout and Transmission DelayXiao-Yuan Luo,Mei-Jie Shang,Cai-Lian Chen,Xin-Ping Guanparison of Two Novel MRAS Based Strategies for Identifying Parameters in Permanent Magnet Synchronous MotorsKan Liu,Qiao Zhang,Zi-Qiang Zhu,Jing Zhang,An-Wen Shen,Paul Stewart15.Modeling and Analysis of Scheduling for Distributed Real-time Embedded SystemsHai-Tao Zhang,Gui-Fang Wu16.Passive Steganalysis Based on Higher Order Image Statistics of Curvelet TransformS.Geetha,Siva S.Sivatha Sindhu,N.Kamaraj17.Movement Invariants-based Algorithm for Medical Image Tilt CorrectionMei-Sen Pan,Jing-Tian Tang,Xiao-Li Yang18.Target Tracking and Obstacle Avoidance for Multi-agent SystemsJing Yan,Xin-Ping Guan,Fu-Xiao Tan19.Automatic Generation of Optimally Rigid Formations Using Decentralized MethodsRui Ren,Yu-Yan Zhang,Xiao-Yuan Luo,Shao-Bao Li20.Semi-blind Adaptive Beamforming for High-throughput Quadrature Amplitude Modulation SystemsSheng Chen,Wang Yao,Lajos Hanzo21.Throughput Analysis of IEEE 802.11 Multirate WLANs with Collision Aware Rate Adaptation AlgorithmDhanasekaran Senthilkumar,A. Krishnan22.Innovative Product Design Based on Customer Requirement Weight Calculation ModelChen-Guang Guo,Yong-Xian Liu,Shou-Ming Hou,Wei Wang23.A Service Composition Approach Based on Sequence Mining for Migrating E-learning Legacy System to SOAZhuo Zhang,Dong-Dai Zhou,Hong-Ji Yang,Shao-Chun Zhong24.Modeling of Agile Intelligent Manufacturing-oriented Production Scheduling SystemZhong-Qi Sheng,Chang-Ping Tang,Ci-Xing Lv25.Estimation of Reliability and Cost Relationship for Architecture-based SoftwareHui Guan,Wei-Ru Chen,Ning Huang,Hong-Ji Yang1.A Computer-aided Design System for Framed-mould in Autoclave ProcessingTian-Guo Jin,Feng-Yang Bi2.Wear State Recognition of Drills Based on K-means Cluster and Radial Basis Function Neural NetworkXu Yang3.The Knee Joint Design and Control of Above-knee Intelligent Bionic Leg Based on Magneto-rheological DamperHua-Long Xie,Ze-Zhong Liang,Fei Li,Li-Xin Guo4.Modeling of Pneumatic Muscle with Shape Memory Alloy and Braided SleeveBin-Rui Wang,Ying-Lian Jin,Dong Wei5.Extended Object Model for Product Configuration DesignZhi-Wei Xu,Ze-Zhong Liang,Zhong-Qi Sheng6.Analysis of Sheet Metal Extrusion Process Using Finite Element MethodXin-Cun Zhuang,Hua Xiang,Zhen Zhao7.Implementation of Enterprises' Interoperation Based on OntologyXiao-Feng Di,Yu-Shun Fan8.Path Planning Approach in Unknown EnvironmentTing-Kai Wang,Quan Dang,Pei-Yuan Pan9.Sliding Mode Variable Structure Control for Visual Servoing SystemFei Li,Hua-Long Xie10.Correlation of Direct Piezoelectric Effect on EAPap under Ambient FactorsLi-Jie Zhao,Chang-Ping Tang,Peng Gong11.XML-based Data Processing in Network Supported Collaborative DesignQi Wang,Zhong-Wei Ren,Zhong-Feng Guo12.Production Management Modelling Based on MASLi He,Zheng-Hao Wang,Ke-Long Zhang13.Experimental Tests of Autonomous Ground Vehicles with PreviewCunjia Liu,Wen-Hua Chen,John Andrews14.Modelling and Remote Control of an ExcavatorYang Liu,Mohammad Shahidul Hasan,Hong-Nian Yu15.TOPSIS with Belief Structure for Group Belief Multiple Criteria Decision MakingJiang Jiang,Ying-Wu Chen,Da-Wei Tang,Yu-Wang Chen16.Video Analysis Based on Volumetric Event DetectionJing Wang,Zhi-Jie Xu17.Improving Decision Tree Performance by Exception HandlingAppavu Alias Balamurugan Subramanian,S.Pramala,B.Rajalakshmi,Ramasamy Rajaram18.Robustness Analysis of Discrete-time Indirect Model Reference Adaptive Control with Normalized Adaptive LawsQing-Zheng Gao,Xue-Jun Xie19.A Novel Lifecycle Model for Web-based Application Development in Small and Medium EnterprisesWei Huang,Ru Li,Carsten Maple,Hong-Ji Yang,David Foskett,Vince Cleaver20.Design of a Two-dimensional Recursive Filter Using the Bees AlgorithmD. T. Pham,Ebubekir Ko(c)21.Designing Genetic Regulatory Networks Using Fuzzy Petri Nets ApproachRaed I. Hamed,Syed I. Ahson,Rafat Parveen1.State of the Art and Emerging Trends in Operations and Maintenance of Offshore Oil and Gas Production Facilities: Some Experiences and ObservationsJayantha P.Liyanage2.Statistical Safety Analysis of Maintenance Management Process of Excavator UnitsLjubisa Papic,Milorad Pantelic,Joseph Aronov,Ajit Kumar Verma3.Improving Energy and Power Efficiency Using NComputing and Approaches for Predicting Reliability of Complex Computing SystemsHoang Pham,Hoang Pham Jr.4.Running Temperature and Mechanical Stability of Grease as Maintenance Parameters of Railway BearingsJan Lundberg,Aditya Parida,Peter S(o)derholm5.Subsea Maintenance Service Delivery: Mapping Factors Influencing Scheduled Service DurationEfosa Emmanuel Uyiomendo,Tore Markeset6.A Systemic Approach to Integrated E-maintenance of Large Engineering PlantsAjit Kumar Verma,A.Srividya,P.G.Ramesh7.Authentication and Access Control in RFID Based Logistics-customs Clearance Service PlatformHui-Fang Deng,Wen Deng,Han Li,Hong-Ji Yang8.Evolutionary Trajectory Planning for an Industrial RobotR.Saravanan,S.Ramabalan,C.Balamurugan,A.Subash9.Improved Exponential Stability Criteria for Recurrent Neural Networks with Time-varying Discrete and Distributed DelaysYuan-Yuan Wu,Tao Li,Yu-Qiang Wu10.An Improved Approach to Delay-dependent Robust Stabilization for Uncertain Singular Time-delay SystemsXin Sun,Qing-Ling Zhang,Chun-Yu Yang,Zhan Su,Yong-Yun Shao11.Robust Stability of Nonlinear Plants with a Non-symmetric Prandtl-Ishlinskii Hysteresis ModelChang-An Jiang,Ming-Cong Deng,Akira Inoue12.Stability Analysis of Discrete-time Systems with Additive Time-varying DelaysXian-Ming Tang,Jin-Shou Yu13.Delay-dependent Stability Analysis for Markovian Jump Systems with Interval Time-varying-delaysXu-Dong Zhao,Qing-Shuang Zeng14.H∞ Synchronization of Chaotic Systems via Delayed Feedback ControlLi Sheng,Hui-Zhong Yang15.Adaptive Fuzzy Observer Backstepping Control for a Class of Uncertain Nonlinear Systems with Unknown Time-delayShao-Cheng Tong,Ning Sheng16.Simulation-based Optimal Design of α-β-γ-δ FilterChun-Mu Wu,Paul P.Lin,Zhen-Yu Han,Shu-Rong Li17.Independent Cycle Time Assignment for Min-max SystemsWen-De Chen,Yue-Gang Tao,Hong-Nian Yu1.An Assessment Tool for Land Reuse with Artificial Intelligence MethodDieter D. Genske,Dongbin Huang,Ariane Ruff2.Interpolation of Images Using Discrete Wavelet Transform to Simulate Image Resizing as in Human VisionRohini S. Asamwar,Kishor M. Bhurchandi,Abhay S. Gandhi3.Watermarking of Digital Images in Frequency DomainSami E. I. Baba,Lala Z. Krikor,Thawar Arif,Zyad Shaaban4.An Effective Image Retrieval Mechanism Using Family-based Spatial Consistency Filtration with Object RegionJing Sun,Ying-Jie Xing5.Robust Object Tracking under Appearance Change ConditionsQi-Cong Wang,Yuan-Hao Gong,Chen-Hui Yang,Cui-Hua Li6.A Visual Attention Model for Robot Object TrackingJin-Kui Chu,Rong-Hua Li,Qing-Ying Li,Hong-Qing Wang7.SVM-based Identification and Un-calibrated Visual Servoing for Micro-manipulationXin-Han Huang,Xiang-Jin Zeng,Min Wang8.Action Control of Soccer Robots Based on Simulated Human IntelligenceTie-Jun Li,Gui-Qiang Chen,Gui-Fang Shao9.Emotional Gait Generation for a Humanoid RobotLun Xie,Zhi-Liang Wang,Wei Wang,Guo-Chen Yu10.Cultural Algorithm for Minimization of Binary Decision Diagram and Its Application in Crosstalk Fault DetectionZhong-Liang Pan,Ling Chen,Guang-Zhao Zhang11.A Novel Fuzzy Direct Torque Control System for Three-level Inverter-fed Induction MachineShu-Xi Liu,Ming-Yu Wang,Yu-Guang Chen,Shan Li12.Statistic Learning-based Defect Detection for Twill FabricsLi-Wei Han,De Xu13.Nonsaturation Throughput Enhancement of IEEE 802.11b Distributed Coordination Function for Heterogeneous Traffic under Noisy EnvironmentDhanasekaran Senthilkumar,A. Krishnan14.Structure and Dynamics of Artificial Regulatory Networks Evolved by Segmental Duplication and Divergence ModelXiang-Hong Lin,Tian-Wen Zhang15.Random Fuzzy Chance-constrained Programming Based on Adaptive Chaos Quantum Honey Bee Algorithm and Robustness AnalysisHan Xue,Xun Li,Hong-Xu Ma16.A Bit-level Text Compression Scheme Based on the ACW AlgorithmHussein A1-Bahadili,Shakir M. Hussain17.A Note on an Economic Lot-sizing Problem with Perishable Inventory and Economies of Scale Costs:Approximation Solutions and Worst Case AnalysisQing-Guo Bai,Yu-Zhong Zhang,Guang-Long Dong1.Virtual Reality: A State-of-the-Art SurveyNing-Ning Zhou,Yu-Long Deng2.Real-time Virtual Environment Signal Extraction and DenoisingUsing Programmable Graphics HardwareYang Su,Zhi-Jie Xu,Xiang-Qian Jiang3.Effective Virtual Reality Based Building Navigation Using Dynamic Loading and Path OptimizationQing-Jin Peng,Xiu-Mei Kang,Ting-Ting Zhao4.The Skin Deformation of a 3D Virtual HumanXiao-Jing Zhou,Zheng-Xu Zhao5.Technology for Simulating Crowd Evacuation BehaviorsWen-Hu Qin,Guo-Hui Su,Xiao-Na Li6.Research on Modelling Digital Paper-cut PreservationXiao-Fen Wang,Ying-Rui Liu,Wen-Sheng Zhang7.On Problems of Multicomponent System Maintenance ModellingTomasz Nowakowski,Sylwia Werbinka8.Soft Sensing Modelling Based on Optimal Selection of Secondary Variables and Its ApplicationQi Li,Cheng Shao9.Adaptive Fuzzy Dynamic Surface Control for Uncertain Nonlinear SystemsXiao-Yuan Luo,Zhi-Hao Zhu,Xin-Ping Guan10.Output Feedback for Stochastic Nonlinear Systems with Unmeasurable Inverse DynamicsXin Yu,Na Duan11.Kalman Filtering with Partial Markovian Packet LossesBao-Feng Wang,Ge Guo12.A Modified Projection Method for Linear FeasibilityProblemsYi-Ju Wang,Hong-Yu Zhang13.A Neuro-genetic Based Short-term Forecasting Framework for Network Intrusion Prediction SystemSiva S. Sivatha Sindhu,S. Geetha,M. Marikannan,A. Kannan14.New Delay-dependent Global Asymptotic Stability Condition for Hopfield Neural Networks with Time-varying DelaysGuang-Deng Zong,Jia Liu hHTTp://15.Crosscumulants Based Approaches for the Structure Identification of Volterra ModelsHouda Mathlouthi,Kamel Abederrahim,Faouzi Msahli,Gerard Favier1.Coalition Formation in Weighted Simple-majority Games under Proportional Payoff Allocation RulesZhi-Gang Cao,Xiao-Guang Yang2.Stability Analysis for Recurrent Neural Networks with Time-varying DelayYuan-Yuan Wu,Yu-Qiang Wu3.A New Type of Solution Method for the Generalized Linear Complementarity Problem over a Polyhedral ConeHong-Chun Sun,Yan-Liang Dong4.An Improved Control Algorithm for High-order Nonlinear Systems with Unmodelled DynamicsNa Duan,Fu-Nian Hu,Xin Yu5.Controller Design of High Order Nonholonomic System with Nonlinear DriftsXiu-Yun Zheng,Yu-Qiang Wu6.Directional Filter for SAR Images Based on NonsubsampledContourlet Transform and Immune Clonal SelectionXiao-Hui Yang,Li-Cheng Jiao,Deng-Feng Li7.Text Extraction and Enhancement of Binary Images Using Cellular AutomataG. Sahoo,Tapas Kumar,B.L. Rains,C.M. Bhatia8.GH2 Control for Uncertain Discrete-time-delay Fuzzy Systems Based on a Switching Fuzzy Model and Piecewise Lyapunov FunctionZhi-Le Xia,Jun-Min Li9.A New Energy Optimal Control Scheme for a Separately Excited DC Motor Based Incremental Motion DriveMilan A.Sheta,Vivek Agarwal,Paluri S.V.Nataraj10.Nonlinear Backstepping Ship Course ControllerAnna Witkowska,Roman Smierzchalski11.A New Method of Embedded Fourth Order with Four Stages to Study Raster CNN SimulationR. Ponalagusamy,S. Senthilkumar12.A Minimum-energy Path-preserving Topology Control Algorithm for Wireless Sensor NetworksJin-Zhao Lin,Xian Zhou,Yun Li13.Synchronization and Exponential Estimates of Complex Networks with Mixed Time-varying Coupling DelaysYang Dai,YunZe Cai,Xiao-Ming Xu14.Step-coordination Algorithm of Traffic Control Based on Multi-agent SystemHai-Tao Zhang,Fang Yu,Wen Li15.A Research of the Employment Problem on Common Job-seekersand GraduatesBai-Da Qu。
摘要摘要基于植保无人机的低空药液喷施技术具有安全、高效的作业优势,已广泛应用于北方大田种植环境下的农作物精准施药。
南方农作物种植环境多以丘陵山地为主,主要特征是农作物种植不规整、随意性大,不适合直接采用大田种植环境下的均匀药液喷施模式。
为了提高山地果园种植环境中植保无人机药液喷施的作业效果,以柑橘果树为研究对象,提出一种复杂果园环境下基于单目视觉的果树检测方法,实现无人机在果园环境下获取果树大小、位置等基本的作物信息,可以为农药的精准对靶施药提供技术支持;为检验果园的施药效果,提出一种基于地—空信息融合的柑橘产量估计方法,实现单棵果树的产量估计。
本文主要研究内容和结论如下:1、基于单目机器视觉的果园环境下柑橘果树检测。
获取了山地柑橘果园环境下果树分布的低空RGB (Red/green/blue)图像数据,为了减少亮度对果树检测的干扰,采用选择性直方图改善果树区域的亮度,由于果树与杂草区域存在较为显著的视觉差异性,通过色差图方法实现潜在果树区域(Regions of Interest, RoI)的分割;利用色彩空间变换技术(Color spaces transformation technology, CST)提取了表征果树颜色信息的14种颜色特征,分别采用灰度共生矩阵(Gray-level co-occurrence matrix, GLCM)和局部二值模式(Local binary pattern, LBP)获取了RoIs的6种纹理特征,并建立支持向量机(Support vector machine, SVM)模型实现果树分割和半径检测。
结果表明,本文方法可在维持图像色彩信息不变的情况下实现果树亮度的自适应调整,提高了前背景间的对比度,确保了更准确的果树区域分割,平均分割准确率为83.09%;基于果树外观特征和SVM的柑橘果树检测模型能进一步抑制复杂果园环境中干扰场景参与物,检测准确率达到了85.27%。
a r X i v :0802.1452v 1 [a s t r o -p h ] 11 F eb 2008accepted in ApJPreprint typeset using L A T E X style emulateapj v.04/21/05THE SHAPE OF THE INITIAL CLUSTER MASS FUNCTION:WHAT IT TELLS US ABOUT THE LOCAL STAR FORMATION EFFICIENCYG.Parmentier 1,2,4,S.P.Goodwin 3,P.Kroupa 1and H.Baumgardt 1accepted in ApJABSTRACTWe explore how the expulsion of gas from star-cluster forming cloud-cores due to supernova explo-sions affects the shape of the initial cluster mass function,that is,the mass function of star clusters when effects of gas expulsion are over.We demonstrate that if the radii of cluster-forming gas cores are roughly constant over the core mass range,as supported by observations,then more massive cores undergo slower gas expulsion.Therefore,for a given star formation efficiency,more massive cores retain a larger fraction of stars after gas expulsion.The initial cluster mass function may thus dif-fer from the core mass function substantially,with the final shape depending on the star formation efficiency.A mass-independent star formation efficiency of about 20per cent turns a power-law core mass function into a bell-shaped initial cluster mass function,while mass-independent efficiencies of order 40per cent preserve the shape of the core mass function.Subject headings:galaxies:star clusters —stars:formation —stellar dynamics1.INTRODUCTIONOne of the greatest discoveries made by the Hubble Space Telescope is that the formation of star clusters with mass and compactness comparable to those of old globular clusters is still ongoing in violent star forming environments,such as starbursts and galaxy mergers.The young cluster mass function (CMF:the number of objects per logarithmic cluster mass interval,d N/dlog m )is often reported to be a power-law of spectral index α=−2(i.e.equivalent to d N ∝m −2d m )down to the detection limit (Fall &Zhang 2001;Bik et al.2003;Hunter et al.2003).Similar results are obtained for the cluster luminosity function (e.g.,Miller et al.1997;Whitmore et al.1999).However,old globular cluster systems are found to have a CMF which is Gaussian-like with a peak mass of ∼105M ⊙(Kavelaars &Hanes 1997).These very different CMFs between young and old systems present the greatest dissimilarity between the two populations,and perhaps the greatest barrier to identifying young massive clusters as analogues of young globular clusters.Yet,it is worth keeping in mind that the luminos-ity/mass distribution of some young star cluster systems does not obey a power-law.Based on high resolution Very Large Array observations,Johnson (2008)rules out a power-law cluster luminosity function for the young star clusters formed in the starburst galaxy Henize 2-10.In NGC 5253,an irregular dwarf galaxy in the Cen-taurus Group,Cresci et al.(2005)detect a turnover at 5×104M ⊙in the mass function of the young massive star clusters in the central st but not least,the case of the Antennae merger of galaxies NGC 4038/39remains1Argelander-Institut f¨u r Astronomie,University of Bonn,Auf dem H¨u gel 71,D-53121Bonn,Germany2Institute of Astrophysics &Geophysics,University of Li`e ge,All´e e du 6Aoˆu t 17,B-4000Li`e ge,Belgium3Department of Physics and Astronomy,University of Sheffield,Hicks Building,Hounsfield Road,Sheffield S37RH,United King-dom4Alexander von Humboldt Fellow and Research Fellow of Bel-gian Science Policyheavily debated.While Whitmore et al.(1999)infer from their HST/WFPC2imaging data a power-law clus-ter luminosity function with a spectral index α≃−2.1,Anders et al.(2007)report the first statistically robust detection of a turnover at M V ≃−8.5mag,correspond-ing to a mass of ≃16×103M ⊙.The origin of that discrepancy likely resides in differences in the data re-duction and the statistical analysis aimed at rejecting bright-star contamination and disentangling complete-ness effects from intrinsic cluster luminosity function sub-structures.These findings raise the highly interesting prospect that the shape of the CMF at young ages does vary from one galaxy to another.If it were conclusively proven that the shape of the initial cluster mass function (ICMF)is not universal among galaxies but,instead,depends on environmental conditions and/or on the star formation process,consequences would be far reaching since this would imply that the young CMF may help us decipher physical conditions prevailing in external galaxies.In this contribution,we report the first results of a project investigating the evolution of the mass function of star forming cores into that of bound gas-free star clusters which have survived expulsion of their resid-ual star forming gas.A similar study was carried out by Parmentier &Gilmore (2007)in which they showed that protoglobular cloud mass functions characterized by a mass-scale of ≃106M ⊙evolve into the observed globular cluster mass function.They propose that the origin of the universal Gaussian mass function of old globular clusters thus likely resides in a protoglobular cloud mass-scale common among galaxies and imposed in the protogalactic era.Present-day star forming cores,however,do not show any characteristic mass-scale as their mass function is a power-law down to the detec-tion limit.Previously,Kroupa &Boily (2002)showed that a power-law embedded-cluster mass function can evolve into a mass function with a turnover near 105M ⊙if the physics of gas removal is taken into account.Tak-ing advantage of the grid of N -body models recently compiled by Baumgardt &Kroupa (2007),we explore2Parmentier et al.in greater detail how a featureless core mass function evolves into the ICMF as a result of gas removal,which we model as the supernova-driven expansion of a su-pershell of gas.We also explore how the ICMF re-sponds to model parameter variations so as to under-stand what may make the mass function of young star clusters vary from one galaxy to another.In a companion paper (Baumgardt,Kroupa &Parmentier 2008),we ex-plore this issue from a different angle,that is,by defining an energy criterion,and we apply that alternative model to the mass function of the Milky Way old globular clus-ters.2.NOTION OF INITIAL CLUSTER MASSFUNCTIONBefore proceeding any further,it is worth defining the notion of the ICMF properly.Following the onset of massive star activity at an age of 0.5to 5Myr,a gas-embedded cluster expels its residual star forming gas.As a result,the potential in which the stars reside changes rapidly causing the cluster to violently relax in an attempt to reach a new equilibrium.This violent relaxation takes 10–50Myr during which time the cluster loses stars (”infant weight-loss”)and may be completely destroyed (”infant-mortality”)(this process has been studied in detail by Hills (1980);Mathieu (1983);Elmegreen (1983);Lada,Margulis &Dearborn (1984);Elmegreen &Pinto (1985);Pinto (1987);Verschueren &David (1989);Goodwin (1997a);Goodwin (1997b);Kroupa,Aarseth &Hurley (2001);Geyer &Burkert (2001);Boily &Kroupa (2003a);Boily &Kroupa (2003b);Goodwin &Bastian (2006);Baumgardt &Kroupa (2007)).After ∼50Myr the evolution of clusters is driven by internal two-body relaxation processes and tidal interactions with the host galaxy (see Spitzer 1987).Following Kroupa &Boily (2002)we define the ICMF as the mass function of clusters at the end of the violent relaxation phase ,that is,when their age is ≃50Myr.The effect of gas expulsion depends strongly on the virial ratio of the cluster immediately before the gas is removed.If the stars and gas are in virial equilibrium before gas expulsion the virial ratio Q ⋆of the stars de-pends entirely on the star formation efficiency (SFE,or ε)such that Q ⋆=1/(2ε)(where we define virial equi-librium to be Q =1/2)(Goodwin 2008).Formally,εis the effective SFE (eSFE:see Verschueren (1990);Goodwin &Bastian (2006)),that is,a measure of how far from virial equilibrium the cluster is at the onset of gas expulsion,rather than the actual efficiency with which the gas has formed stars.We note that even quite small deviations from virial equilibrium can equate to large differences between the eSFE and the SFE.How-ever,in this paper we will make the assumption that the stars have had sufficient time to come into virial equi-librium with the gas potential (ie.gas expulsion occurs after a few crossing times),and so the eSFE very closely matches the SFE.We discuss this assumption further in the conclusions.In this case,the radius of the embedded cluster is the same as that of the core and the three-dimensional ve-locity dispersion of the stars in the embedded cluster is dictated by thedepth of the core gravitational poten-tial,namely:σ=(Gm c /r c )1/2,where m c and r c are themass and radius of the core,respectively,and G is the gravitational constant.A star-cluster forming ‘core’of mass m c (ie.the region of a molecular cloud that forms a cluster)with a SFE of εwill form a cluster of mass ε×m c .However,gas expulsion will unbind a fraction (1−F bound )of the stars in that cluster such that the mass of the cluster at ∼50Myr will bem init =F bound ×ε×m c .(1)(Note that when F bound ∼zero the cluster is destroyed.)It is important to keep in mind that,for clusters whose age does not exceed,say,10-20Myr,the mass frac-tion of stars that have escaped after gas removal is still small,as most stars do not have a high enough veloc-ity to have become spatially dissociated from the cluster (see fig.1in Goodwin &Bastian (2006)and fig.1in Kroupa,Aarseth &Hurley (2001)).The instantaneous mass of these young clusters thus follows:m cl ε×m c and,in the absence of any explicit dependence of εon m c ,the CMF at such a young age mirrors the core mass func-tion.Any meaningful comparison of our model ICMFs therefore has to be limited to populations of clusters which have either re-virialised or been destroyed.In our Monte-Carlo simulations,we assume that the SFE of a core is independent of its mass.The core mass function is assumed to follow a power-law of spectral in-dex α=−2and we describe the probability distribution function of the SFE as a Gaussian of mean ¯εand stan-dard deviation σε,which we denote P (ε)=G (¯ε,σε).Clearly,if the bound fraction of stars after gas ex-pulsion,F bound ,depends upon the core mass,then the shape of the ICMF will differ from the (power-law)core mass function,an effect whose study was pioneered by Kroupa &Boily (2002).We now take this point fur-ther by combining the detailed N -body model grid of Baumgardt &Kroupa (2007),which provides the bound fraction F bound as a function of the SFE εand of the ratio between the gas removal time-scale τGR and the proto-cluster crossing-time τcross .3.THE GAS EXPULSION TIME-SCALEWe model the gas expulsion from an (embedded)clus-ter as an outwardly expanding supershell whose radius r s varies with time t as (Castor,McCray &Weaver 1975):r s (t )=125ρg 1/5t 3/5.(2)Here ˙E0is the energy input rate from Type II supernovae and ρg is the mass density of the residual star forming gas in the core.We assume the mass density profiles of the core and of the newly formed stars to be alike,that is,ρg =(1−ε)ρc .The energy input rate from massive stars expelling the gas is˙E0=N GR ×E SNInitial cluster mass function3˙E 0=N SN×E SN1253N SN E SN(1−ε)m c 1/3.(5)While gas-embedded cluster masses vary by many or-ders of magnitude,their radii are remarkably constant, always of order1pc(see table1in Kroupa(2005)).As N SN∝m c,it follows from eq.5that,for a givenε,the gas removal time-scaleτGR is independent of the core mass m c.However,the impact of gas expulsion upon a cluster is not governed by the absolute value ofτGR,but by its ratio with the core crossing timeτcross,ie.F bound is a function ofτGR/τcross.The dynamical crossing-time of the star forming core is given byτcross=(15/πGρc)0.5 which,combined with eq.5,gives:τGRE511−ε1M⊙ 1/2r c4Parmentier et al.100101102103104105106100101102103104105106l o g ( d N c l / d l o g m c l )10101102103104105106log( m cl . M o -1)Fig.2.—How the form of the ICMF produced from a −2power-law cluster mass function responds to variations in the SFE Gaus-sian distribution G(¯ε,σε),where ¯εand σεare the mean SFE and the standard deviation,respectively.Solid lines depict the original core mass function.These different behaviours stem from how large the variations of the bound fraction F bound with the core mass m c are.In order to illustrate this,let us focus on the top panel of Fig.2for which σε=0.A SFE as high as ε=0.4guarantees that F bound is only a weak function of the core mass m c as 0.35 F bound 0.95over the core mass range 103M ⊙ m c 108M ⊙(see Fig.1).As a result,the shape of the ICMF mirrors that of the core mass function and is close to a power-law of spectral index α=−2.In the case of a SFE of ¯ε=0.33,the same core mass range corresponds to an increase of the bound fraction with the core mass of almost two orders of magnitude (i.e.0.01 F bound 0.90),making the ICMF significantly shallower than the core mass function.It is worth keep-ing in mind that the detailed form of this ICMF is uncer-tain,as the determination of the bound fraction F bound depends on the fine details of the N -body modelling ofgas expulsion when F bound 0.1.As for ε=0.33,F bound ≃0.01−0.04when log(τGR /τcross ) −0.5(see Fig.1).If the bound fraction were larger over that range of gas removal time-scale to crossing time ratio,the shape of the ICMF would be closer to that of the core mass function,while if F bound were zero before increasing at,say,m core ≃105−3×105M ⊙,this would result in a bell-shaped ICMF (see the case of ¯ε=0.20below).The core mass function is most affected when ¯ε=0.20.Fig.1shows that the formation of a bound gas-free star cluster requires its parent core to be more massive than 106M ⊙,since the bound fraction is zero otherwise (note that the top x -axis is rightward-shifted by ≃0.1in log(τGR /τcross )when ε=0.20,see eq.6).Owing to the wide variations of F bound over the core mass range (0 F bound 0.85),the transformation of the core mass function into the ICMF is,in this case,heavily core mass dependent:the featureless power-law core mass func-tion evolves into a bell-shaped ICMF.The time-scale for this evolution is that required for the exposed cluster to get back into virial equilibrium following gas expul-sion,that is,50Myr.Examination of the model grid computed by Baumgardt &Kroupa (2007)for the SFE and the gas removal time-scale of relevance (ε=0.20and τGR τcross )shows that the end of the violent re-laxation phase occurs at about that age and the cluster mass is then as defined in eq.1.As indicated by the arrows in Fig.1and in the top panel of Fig.2,the turnover location is determined by the core mass at which the bound fraction is a few per cent.For instance,a core mass of 2×106M ⊙with ε=0.20leads to log(τGR /τcross )=0.13and F bound =0.06.The corresponding initial cluster mass is thus m init =(0.06)×(0.20)×(2×106M ⊙)≃2.4×104M ⊙,roughly the mass of the ICMF turnover.That ε=0.20leads to a bell-shaped ICMF directly re-sults from the zero bound fraction of stars for core masses less than ≃106M ⊙.A SFE of ε=0.25would result in the same effect (see Fig.1).This is equivalent to creating a truncated power-law for the cluster forming cores,even though the star forming core mass function is a feature-less power-law down to the low-mass regime.That result is reminiscent of the detailed investigation performed by Parmentier &Gilmore (2007)who found that a system of cluster parent clouds devoid of low-mass objects,such as a power-law mass function truncated at low-mass,re-sults in a bell-shaped ICMF.We emphasize that,if these clusters were observed at an age of 50Myr,their observed mass would closely match the mass predicted by our model.Actually,all clusters of the bell-shaped ICMF,irrespective of their mass,stem from gaseous progenitors more massive than 106M ⊙.This implies that unbound stars formed in the core leave the exposed cluster with velocities of order 60km.s −1and do not linger as part of a halo around the bound part of the cluster.As a related consequence,note that if the very early Milky Way disc went through a violent star formation phase producing such massive clusters,then the corresponding high veloc-ity dispersions lead naturally to the creation of a thick disc component (Kroupa 2002).In the middle and bottom panels of Fig.2we show the ICMF that results from the core mass function for SFE distributions with variance σε=0.03and 0.05re-spectively.The general features of the panels are theInitial cluster mass function5100 200 300 400 500 2 2.5 3 3.5 4 4.5 5 5.5 6 6.5 7 7.5d N c l / d l o g m c llog( m cl . M o -1)Fig. 3.—How the shape and position of the ICMF produced by a −2power-law core mass function with a SFE of ¯ε=0.20,responds to variations in the width of the SFE distribution,and in the core radiussame as for zero variance,however there are increasing differences,especially at the low-mass end of the ICMF.These differences are due to the effect of the SFE not being symmetric around the mean ¯ε.For ¯ε=0.2,low-mass gas-embedded clusters with ε=0.2are completely destroyed (F bound ∼0)and this is the case for any clus-ter selected below the mean.However,a low-mass gas-embedded cluster can be drawn with ε=0.3–0.4,in which case it will be able to survive and retain a sig-nificant fraction of its mass.For this reason,the ICMF flattens rather than turning-over in the bottom panel of Fig.2.A key assumption underlying the above analysis is the hypothesis of a constant radius for all star forming cores,regardless of their mass.In Fig.3,we examine the effect of the core radius on the form of the ICMF produced from a core mass function with ε=0.20(ie.a bell-shaped ICMF).We explore how the cluster mass at the turnover responds to core radius variations.A linear scaling for the y -axis is adopted as it enables us to pinpoint the turnover location more accurately.For the sake of com-parison,the ICMFs obtained with σε=0.00and 0.03and with r c =1pc (dotted curves with crosses in top and middle panels of Fig.2,respectively)are shown as solid lines.ICMFs depicted with open symbols correspond to constant core radius of r c =0.5pc and r c =2pc.A larger core radius implies smaller core densities,thereby increasing the disruptive effect of gas expulsion and low-ering the number of bound gas-free star clusters.That the turnover is at higher cluster mass arises from the leftward-shift of the mass-scaling of the top x -axis of Fig.1when the core radius is larger (see eq.6).As stated in Section 3,observations support the as-sumption of an almost constant core radius.This,how-ever,is at variance with theoretical expectations that themass-radius relation of virialised cores obeys r c ∝m 1/2c (e.g.Harris &Pudritz 1994).ICMFs obtained with r c [pc]=10−3(m c /1M ⊙)1/2are shown as lines with filled symbols in Fig.3.The normalisation is such that a core of mass 106M ⊙has a radius of 1pc.Higher normal-isations,corresponding to less dense cores,lead to the disruption of practically the whole original population ofembedded clusters.While,for a constant core radius of r c =1pc,the bell-shape of the ICMF is preserved when σεis increased from 0.00to 0.03,the virial mass-radius relation leads to a power-law ICMF for σε=0.03.That is,the sensitivity of the ICMF shape to the width of the distribution function for the SFE is greater than in case of a constant core radius.5.SUMMARYThe shape of the mass function of young star clusters remains much debated.Observational results are sorted into two distinct,often opposed,categories:power-law or bell-shaped mass functions.We have shown in this con-tribution that both shapes can arise from the same phys-ical process,namely,the supernova-driven gas expulsion from newly formed star clusters on a core mass dependent time-scale.Specifically,more massive cores have a higher gas removal time-scale to dynamical crossing-time ratio by virtue of their deeper potential wells (assuming the near constancy of core radii over the core mass range).For a given SFE,this results in larger bound fractions F bound of stars at the end of the violent relaxation which follows gas expulsion.The shape of the ICMF is chiefly governed by the probability distribution of the SFE ε.Bell-shaped ICMFs arise when most star forming cores have ε 0.25.In that case,however,bound gas-free star clusters only form out of cores more massive than 106M ⊙.This implies that bell-shaped ICMFs are also associated with gas-rich environments so as to guarantee that the core mass function is sampled to such a high mass.In contrast,power-law ICMFs mirroring the core mass function arise when ε∼0.4for most star forming cores.This implies that the shape of the ICMF also de-pends on the width σεof the SFE distribution.When ¯ε 0.25,an increasing σεenhances the contribution of cores with ε≃0.40and turns a bell-shaped ICMF into a power-law.The amplitude of that evolution is greater when core radii are assumed to follow the mass-radius relation of virialised cores than when the core radius is assumed to be constant.As illustrated by Figs.2and 3,that the mass function of young clusters observed in a variety of environments appears not to be universal is not surprising .Vesperini (1998)pointed out that a power-law ICMF of spectral index (-2)evolves to a bell-shaped mass func-tion the turnover of which is located at a significantly lower cluster mass than what is observed for the uni-versal globular cluster mass function.For such models,Parmentier &Gilmore (2005)showed that a power-law ICMF with α=−2also leads to a radial gradient in the peak of the evolved mass function contrary to what is observed.Baumgardt,Kroupa &Parmentier (2008)demonstrated that these two issues do not arise if the ICMF is depleted in low-mass clusters compared to a power-law of spectral index (-2)as a result of gas expul-sion.In that case,it evolves within 13Gyr into a cluster mass function similar to that of old globular clusters both in the inner and outer Galactic halo (see their fig.4).While a mean SFE of ¯ε 0.40preserves the shape of the core mass function,a mean SFE not exceeding ¯ε≃0.25leads to a low-mass cluster depleted ICMF (see Fig.2).The universality of the globular cluster mass function may thus stem from the existence of an upper limit of,say,25per cent on the mean SFE in all protogalaxies.6Parmentier et al.The origin of that constraint on the mean SFE in the protogalactic era remains to be explained,however.In this preliminary study,the energy input rate from massive stars encompasses the contribution made by su-pernovae only,i.e.that of stellar winds has been ne-glected.Consequently,the above results apply to the sole case of metal-poor environments.As noted in Section2, we have assumed that the effect of gas expulsion depends on the actual star formation efficiency-i.e.we assume that the stars and gas(potential)are in virial equilib-rium at the onset of gas expulsion.This assumption is reasonable if(the stellar component of)the cluster has had time to relax.A cluster will have been able to relax if the onset of gas expulsion occurs after a few crossing times.In a low-metallicity regime such as we are con-sidering in this paper,gas expulsion will be driven by supernovae rather than stellar winds and so will not be-gin until a few Myr after the stars have formed.Such a time-scale is several crossing times,and so we could ex-pect the eSFE to closely match the SFE.We note how-ever,that there may well be a mass-eSFE dependence in higher metallicity cluster populations that may alter our conclusions(which would be applicable to populations such as those in the Antennae merger of disc galaxies). In a follow-up paper,we will consider the additional impact of stellar winds and the case of star clusters form-ing out of dense environments where strong tidalfields prevail.GP acknowledges support from the Alexander von Humboldt Foundation in the form of a Research Fellow-ship and from the Belgian Science Policy Office in the form of a Return Grant.GP and SG are grateful for re-search support and hospitality at the International Space Science Institute in Bern(Switzerland),as part of an In-ternational Team Programme.We acknowledge partial financial support from the UK’s Royal Society through an International Joint Project grant aimed at facilitating networking activities between the universities of Sheffield and Bonn.REFERENCESAnders,P.,Bissantz,N.,Boysen,L.,de Grijs,R.,Fritze-v.Alvensleben,U.2007,MNRAS,377,91Baumgardt,H.,Kroupa,P.2007,MNRAS,380,1589 Baumgardt,H.,Kroupa,P.,Parmentier,G.,accepted in MNRAS, arXiv:0712.1591Bik,A.,Lamers,H.J.G.L.M.,Bastian,N.,Panagia,N.,Romaniello, M.2003,A&A,397,473Boily,C.M.&Kroupa,P.2003a,MNRAS,338,643Boily,C.M.&Kroupa,P.2003b,MNRAS,338,673Castor,J.,McCray,R.,Weaver,R.1975ApJ,200,107Cresci,G.,Vanzi,L.,Sauvage,M.2005,A&A,433,447 Elmegreen,B.G.1983,MNRAS,203,1011Elmegreen,B.G.&Clemens,C.1985,ApJ,294,523Fall,S.M.,Zhang,Q.2001,ApJ,561,751Geyer,M.P.,Burkert,A.2001,MNRAS,323,988Goodwin,S.P.1997a,MNRAS,284,785Goodwin,S.P.1997b,MNRAS,286,669Goodwin,S.P.,Bastian,N.2006,MNRAS,373,752Goodwin,S.P.2008,in:Young massive star clusters:Initial conditions and environments,Perez, E.,de Grijs R.,and Gonzalez Delgado,R.(eds),Springer,in pressHarris W.E.,Pudritz R.E.1994,ApJ,429,177Hills,J.,G.1980,ApJ,235,986Hunter,D.A.,Elmegreen,B.G.,Dupuy,T.J.,Mortonson,M. 2003,AJ,126,1836Johnson,K.E.2008,in:Young massive star clusters:Initial conditions and environments,Perez, E.,de Grijs R.,and Gonzalez Delgado,R.(eds),Springer,in press Kavelaars,J;J.,Hanes D.A.1997,MNRAS,285,L31Kroupa P.2001,MNRAS,322,231Kroupa,P.,Aarseth,S.,Hurley,J.,2001,MNRAS,321,699 Kroupa P.2002,MNRAS,330,707Kroupa,P.,Boily,C.M.2002,MNRAS,336,1188Kroupa,P.2005,In:Proceedings of”The Three-Dimensional Universe with Gaia”(ESA SP-576)C.Turon,K.S.O’Flaherty, M.A.C.Perryman(eds),p.629Lada,C.J.,Margulis,M.,Dearborn,D.1984,ApJ,285,141 Mathieu,R.D.1983,ApJ,267,97Miller,B..,Whitmore,B.C.,Schweizer,F.,Fall,S.M.1997,AJ, 114,2381Parmentier,G.,Gilmore,G.F.2005,MNRAS,363,326 Parmentier,G.,Gilmore,G.F.2007,MNRAS,377,352Pinto,F.1987,PASP,99,1161Spitzer,L.,1987,in:Dynamical evolution of globular clusters, Princeton University PressVerschueren,W.&David,M.1989,A&A,219,105 Verschueren,W.1990,A&A,234,156Vesperini E.1998,MNRAS,299,1019Weidner,C.,Kroupa,P.2004,MNRAS,348,187Whitmore,B.C.,Zhang Q.,Leitherer,C.,Fall,S.M.,Schweizer, F.,;Miller,B.W.,1999,AJ,118,1551。
a rXiv:as tr o-ph/98528v115May1998The evolution of clustering and bias in the galaxy distribution B y J.A.Peacock Institute for Astronomy,Royal Observatory,Edinburgh EH93HJ,UK This paper reviews the measurements of galaxy correlations at high redshifts,and discusses how these may be understood in models of hierarchical gravita-tional collapse.The clustering of galaxies at redshift one is much weaker than at present,and this is consistent with the rate of growth of structure expected in an open universe.If Ω=1,this observation would imply that bias increases at high redshift,in conflict with observed M/L values for known high-z clusters.At redshift 3,the population of Lyman-limit galaxies displays clustering which is of similar amplitude to that seen today.This is most naturally understood if the Lyman-limit population is a set of rare recently-formed objects.Knowing both the clustering and the abundance of these objects,it is possible to deduce em-pirically the fluctuation spectrum required on scales which cannot be measured today owing to gravitational nonlinearities.Of existing physical models for the fluctuation spectrum,the results are most closely matched by a low-density spa-tially flat universe.This conclusion is reinforced by an empirical analysis of CMB anisotropies,in which the present-day fluctuation spectrum is forced to have the observed form.Open models are strongly disfavoured,leaving ΛCDM as the most successful simple model for structure formation.2J.A.Peacockcommon parameterization for the correlation function in comoving coordinates:ξ(r,z)=[r/r0]−γ(1+z)−(3−γ+ǫ),(1.2) whereǫ=0is stable clustering;ǫ=γ−3is constant comoving clustering;ǫ=γ−1isΩ=1linear-theory evolution.Although this equation is frequently encountered,it is probably not appli-cable to the real world,because most data inhabit the intermediate regime of 1<∼ξ<∼100.Peacock(1997)showed that the expected evolution in this quasilin-ear regime is significantly more rapid:up toǫ≃3.(b)General aspects of biasOf course,there are good reasons to expect that the galaxy distribution will not follow that of the dark matter.The main empirical argument in this direction comes from the masses of rich clusters of galaxies.It has long been known that attempts to‘weigh’the universe by multiplying the overall luminosity density by cluster M/L ratios give apparent density parameters in the rangeΩ≃0.2to0.3 (e.g.Carlberg et al.1996).An alternative argument is to use the abundance of rich clusters of galaxies in order to infer the rms fractional density contrast in spheres of radius8h−1Mpc. This calculation has been carried out several different ways,with general agree-ment on afigure close to(1.3)σ8≃0.57Ω−0.56m(White,Efstathiou&Frenk1993;Eke,Cole&Frenk1996;Viana&Liddle1996). The observed apparent value ofσ8in,for example,APM galaxies(Maddox,Efs-tathiou&Sutherland1996)is about0.95(ignoring nonlinear corrections,which are small in practice,although this is not obvious in advance).This says that Ω=1needs substantial positive bias,but thatΩ<∼0.4needs anti bias.Although this cluster normalization argument depends on the assumption that the density field obeys Gaussian statistics,the result is in reasonable agreement with what is inferred from cluster M/L ratios.What effect does bias have on common statistical measures of clustering such as correlation functions?We could be perverse and assume that the mass and lightfields are completely unrelated.If however we are prepared to make the more sensible assumption that the light density is a nonlinear but local function of the mass density,then there is a very nice result due to Coles(1993):the bias is a monotonic function of scale.Explicitly,if scale-dependent bias is defined asb(r)≡[ξgalaxy(r)/ξmass(r)]1/2,(1.4) then b(r)varies monotonically with scale under rather general assumptions about the densityfield.Furthermore,at large r,the bias will tend to a constant value which is the linear response of the galaxy-formation process.There is certainly empirical evidence that bias in the real universe does work this way.Consider Fig.1,taken from Peacock(1997).This compares dimen-sionless power spectra(∆2(k)=dσ2/d ln k)for IRAS and APM galaxies.The comparison is made in real space,so as to avoid distortions due to peculiar veloc-ities.For IRAS galaxies,the real-space power was obtained from the the projectedThe evolution of galaxy clustering and bias3Figure1.The real-space power spectra of optically-selected APM galaxies(solid circles)and IRAS galaxies(open circles),taken from Peacock(1997).IRAS galaxies show weaker clustering, consistent with their suppression in high-density regions relative to optical galaxies.The relative bias is a monotonic but slowly-varying function of scale.correlation function:Ξ(r)= ∞−∞ξ[(r2+x2)1/2]dx.(1.5)Saunders,Rowan-Robinson&Lawrence(1992)describe how this statistic can be converted to other measures of real-space correlation.For the APM galaxies, Baugh&Efstathiou(1993;1994)deprojected Limber’s equation for the angular correlation function w(θ)(discussed below).These different methods yield rather similar power spectra,with a relative bias that is perhaps only about1.2on large scale,increasing to about1.5on small scales.The power-law portion for k>∼0.2h Mpc−1is the clear signature of nonlinear gravitational evolution,and the slow scale-dependence of bias gives encouragement that the galaxy correlations give a good measure of the shape of the underlying massfluctuation spectrum.2.Observations of high-redshift clustering(a)Clustering at redshift1At z=0,there is a degeneracy betweenΩand the true normalization of the spectrum.Since the evolution of clustering with redshift depends onΩ,studies at higher redshifts should be capable of breaking this degeneracy.This can be done without using a complete faint redshift survey,by using the angular clustering of aflux-limited survey.If the form of the redshift distribution is known,the projection effects can be disentangled in order to estimate the3D clustering at the average redshift of the sample.For small angles,and where the redshift shell being studied is thicker than the scale of any clustering,the spatial and angular4J.A.Peacockcorrelation functions are related by Limber’s equation(e.g.Peebles1980): w(θ)= ∞0y4φ2(y)C(y)dy ∞−∞ξ([x2+y2θ2]1/2,z)dx,(2.1)where y is dimensionless comoving distance(transverse part of the FRW metric is[R(t)y dθ]2),and C(y)=[1−ky2]−1/2;the selection function for radius y is normalized so that y2φ(y)C(y)dy=1.Less well known,but simpler,is the Fourier analogue of this relation:π∆2θ(K)=The evolution of galaxy clustering and bias5 ever,the M/L argument is more powerful since only a single cluster is required, and a complete survey is not necessary.Two particularly good candidates at z≃0.8are described by Clowe et al.(1998);these are clusters where significant weak gravitational-lensing distortions are seen,allowing a robust determination of the total cluster mass.The mean V-band M/L in these clusters is230Solar units,which is close to typical values in z=0clusters.However,the comoving V-band luminosity density of the universe is higher at early times than at present by about a factor(1+z)2.5(Lilly et al.1996),so this is equivalent to M/L≃1000, implying an apparent‘Ω’of close to unity.In summary,the known degree of bias today coupled with the moderate evolution in correlation function back to z=1 implies that,forΩ=1,the galaxy distribution at this time would have to consist very nearly of a‘painted-on’pattern that is not accompanied by significant mass fluctuations.Such a picture cannot be reconciled with the healthy M/L ratios that are observed in real clusters at these redshifts,and this seems to be a strong argument that we do not live in an Einstein-de Sitter universe.(b)Clustering of Lyman-limit galaxies at redshift3The most exciting recent development in observational studies of galaxy clus-tering is the detection by Steidel et al.(1997)of strong clustering in the popula-tion of Lyman-limit galaxies at z≃3.The evidence takes the form of a redshift histogram binned at∆z=0.04resolution over afield8.7′×17.6′in extent.For Ω=1and z=3,this probes the densityfield using a cell with dimensionscell=15.4×7.6×15.0[h−1Mpc]3.(2.3) Conveniently,this has a volume equivalent to a sphere of radius7.5h−1Mpc,so it is easy to measure the bias directly by reference to the known value ofσ8.Since the degree of bias is large,redshift-space distortions from coherent infall are small; the cell is also large enough that the distortions of small-scale random velocities at the few hundred km s−1level are also ing the model of equation (11)of Peacock(1997)for the anisotropic redshift-space power spectrum and integrating over the exact anisotropic window function,the above simple volume argument is found to be accurate to a few per cent for reasonable power spectra:σcell≃b(z=3)σ7.5(z=3),(2.4) defining the bias factor at this scale.The results of section1(see also Mo& White1996)suggest that the scale-dependence of bias should be weak.In order to estimateσcell,simulations of synthetic redshift histograms were made,using the method of Poisson-sampled lognormal realizations described by Broadhurst,Taylor&Peacock(1995):using aχ2statistic to quantify the nonuni-formity of the redshift histogram,it appears thatσcell≃0.9is required in order for thefield of Steidel et al.(1997)to be typical.It is then straightforward to ob-tain the bias parameter since,for a present-day correlation functionξ(r)∝r−1.8,σ7.5(z=3)=σ8×[8/7.5]1.8/2×1/4≃0.146,(2.5) implyingb(z=3|Ω=1)≃0.9/0.146≃6.2.(2.6) Steidel et al.(1997)use a rather different analysis which concentrates on the highest peak alone,and obtain a minimum bias of6,with a preferred value of8.6J.A.PeacockThey use the Eke et al.(1996)value ofσ8=0.52,which is on the low side of the published range of ingσ8=0.55would lower their preferred b to 7.6.Note that,with both these methods,it is much easier to rule out a low value of b than a high one;given a singlefield,it is possible that a relatively‘quiet’region of space has been sampled,and that much larger spikes remain to be found elsewhere.A more detailed analysis of several furtherfields by Adelberger et al. (1998)in fact yields a biasfigure very close to that given above,so thefirstfield was apparently not unrepresentative.Having arrived at afigure for bias ifΩ=1,it is easy to translate to other models,sinceσcell is observed,independent of cosmology.For lowΩmodels, the cell volume will increase by a factor[S2k(r)dr]/[S2k(r1)dr1];comparing with present-dayfluctuations on this larger scale will tend to increase the bias.How-ever,for lowΩ,two other effects increase the predicted densityfluctuation at z=3:the cluster constraint increases the present-dayfluctuation by a factor Ω−0.56,and the growth between redshift3and the present will be less than a factor of4.Applying these corrections givesb(z=3|Ω=0.3)The evolution of galaxy clustering and bias7 87GB survey(Loan,Lahav&Wall1997),but these were of only bare significance (although,in retrospect,the level of clustering in87GB is consistent with the FIRST measurement).Discussion of the87GB and FIRST results in terms of Limber’s equation has tended to focus on values ofǫin the region of0.Cress et al.(1996)concluded that the w(θ)results were consistent with the PN91 value of r0≃10h−1Mpc(although they were not very specific aboutǫ).Loan et al.(1997)measured w(1◦)≃0.005for a5-GHz limit of50mJy,and inferred r0≃12h−1Mpc forǫ=0,falling to r0≃9h−1Mpc forǫ=−1.The reason for this strong degeneracy between r0andǫis that r0parame-terizes the z=0clustering,whereas the observations refer to a typical redshift of around unity.This means that r0(z=1)can be inferred quite robustly to be about7.5h−1Mpc,without much dependence on the rate of evolution.Since the strength of clustering for optical galaxies at z=1is known to correspond to the much smaller number of r0≃2h−1Mpc(e.g.Le F`e vre et al.1996),we see that radio galaxies at this redshift have a relative bias parameter of close to 3.The explanation for this high degree of bias is probably similar to that which applies in the case of QSOs:in both cases we are dealing with AGN hosted by rare massive galaxies.3.Formation and bias of high-redshift galaxiesThe challenge now is to ask how these results can be understood in cur-rent models for cosmological structure formation.It is widely believed that the sequence of cosmological structure formation was hierarchical,originating in a density power spectrum with increasingfluctuations on small scales.The large-wavelength portion of this spectrum is accessible to observation today through studies of galaxy clustering in the linear and quasilinear regimes.However,non-linear evolution has effectively erased any information on the initial spectrum for wavelengths below about1Mpc.The most sensitive way of measuring the spectrum on smaller scales is via the abundances of high-redshift objects;the amplitude offluctuations on scales of individual galaxies governs the redshift at which these objectsfirst undergo gravitational collapse.The small-scale am-plitude also influences clustering,since rare early-forming objects are strongly correlated,asfirst realized by Kaiser(1984).It is therefore possible to use obser-vations of the abundances and clustering of high-redshift galaxies to estimate the power spectrum on small scales,and the following section summarizes the results of this exercise,as given by Peacock et al.(1998).(a)Press-Schechter apparatusThe standard framework for interpreting the abundances of high-redshift objects in terms of structure-formation models,was outlined by Efstathiou& Rees(1988).The formalism of Press&Schechter(1974)gives a way of calculating the fraction F c of the mass in the universe which has collapsed into objects more massive than some limit M:F c(>M,z)=1−erf δc2σ(M) .(3.1)8J.A.PeacockHere,σ(M)is the rms fractional density contrast obtained byfiltering the linear-theory densityfield on the required scale.In practice,thisfiltering is usually performed with a spherical‘top hat’filter of radius R,with a corresponding mass of4πρb R3/3,whereρb is the background density.The numberδc is the linear-theory critical overdensity,which for a‘top-hat’overdensity undergoing spherical collapse is1.686–virtually independent ofΩ.This form describes numerical simulations very well(see e.g.Ma&Bertschinger1994).The main assumption is that the densityfield obeys Gaussian statistics,which is true in most inflationary models.Given some estimate of F c,the numberσ(R)can then be inferred.Note that for rare objects this is a pleasingly robust process:a large error in F c will give only a small error inσ(R),because the abundance is exponentially sensitive toσ.Total masses are of course ill-defined,and a better quantity to use is the velocity dispersion.Virial equilibrium for a halo of mass M and proper radius r demands a circular orbital velocity ofV2c=GMΩ1/2m(1+z c)1/2f 1/6c.(3.3)Here,z c is the redshift of virialization;Ωm is the present value of the matter density parameter;f c is the density contrast at virialization of the newly-collapsed object relative to the background,which is adequately approximated byf c=178/Ω0.6m(z c),(3.4) with only a slight sensitivity to whetherΛis non-zero(Eke,Cole&Frenk1996).For isothermal-sphere haloes,the velocity dispersion isσv=V c/√The evolution of galaxy clustering and bias9 and the more recent estimate of0.025from Tytler et al.(1996),thenΩHIF c=2for the dark halo.A more recent measurement of the velocity width of the Hαemission line in one of these objects gives a dispersion of closer to100km s−1(Pettini,private communication),consistent with the median velocity width for Lyαof140km s−1 measured in similar galaxies in the HDF(Lowenthal et al.1997).Of course,these figures could underestimate the total velocity dispersion,since they are dominated by emission from the central regions only.For the present,the range of values σv=100to320km s−1will be adopted,and the sensitivity to the assumed velocity will be indicated.In practice,this uncertainty in the velocity does not produce an important uncertainty in the conclusions.(3)Red radio galaxies An especially interesting set of objects are the reddest optical identifications of1-mJy radio galaxies,for which deep absorption-line spectroscopy has proved that the red colours result from a well-evolved stellar population,with a minimum stellar age of3.5Gyr for53W091at z=1.55(Dun-10J.A.Peacocklop et al.1996;Spinrad et al.1997),and4.0Gyr for53W069at z=1.43(Dunlop 1998;Dey et al.1998).Such ages push the formation era for these galaxies back to extremely high redshifts,and it is of interest to ask what level of small-scale power is needed in order to allow this early formation.Two extremely red galaxies were found at z=1.43and1.55,over an area 1.68×10−3sr,so a minimal comoving density is from one galaxy in this redshift range:N(Ω=1)>∼10−5.87(h−1Mpc)−3.(3.9) Thisfigure is comparable to the density of the richest Abell clusters,and is thus in reasonable agreement with the discovery that rich high-redshift clusters appear to contain radio-quiet examples of similarly red galaxies(Dickinson1995).Since the velocity dispersions of these galaxies are not observed,they must be inferred indirectly.This is possible because of the known present-day Faber-Jackson relation for ellipticals.For53W091,the large-aperture absolute magni-tude isM V(z=1.55|Ω=1)≃−21.62−5log10h(3.10) (measured direct in the rest frame).According to Solar-metallicity spectral syn-thesis models,this would be expected to fade by about0.9mag.between z=1.55 and the present,for anΩ=1model of present age14Gyr(note that Bender et al.1996have observed a shift in the zero-point of the M−σv relation out to z=0.37of a consistent size).If we compare these numbers with theσv–M V relation for Coma(m−M=34.3for h=1)taken from Dressler(1984),this predicts velocity dispersions in the rangeσv=222to292km s−1.(3.11) This is a very reasonable range for a giant elliptical,and it adopted in the following analysis.Having established an abundance and an equivalent circular velocity for these galaxies,the treatment of them will differ in one critical way from the Lyman-αand Lyman-limit galaxies.For these,the normal Press-Schechter approach as-sumes the systems under study to be newly born.For the Lyman-αand Lyman-limit galaxies,this may not be a bad approximation,since they are evolving rapidly and/or display high levels of star-formation activity.For the radio galax-ies,conversely,their inactivity suggests that they may have existed as discrete systems at redshifts much higher than z≃1.5.The strategy will therefore be to apply the Press-Schechter machinery at some unknown formation redshift,and see what range of redshift gives a consistent degree of inhomogeneity.4.The small-scalefluctuation spectrum(a)The empirical spectrumFig.2shows theσ(R)data which result from the Press-Schechter analysis, for three cosmologies.Theσ(R)numbers measured at various high redshifts have been translated to z=0using the appropriate linear growth law for density perturbations.The open symbols give the results for the Lyman-limit(largest R)and Lyman-α(smallest R)systems.The approximately horizontal error bars showThe evolution of galaxy clustering and bias11Figure2.Theradius R.Thecircles)Theredshifts2,4,...The horizontal errors correspond to different choices for the circular velocities of the dark-matter haloes that host the galaxies.The shaded region at large R gives the results inferred from galaxy clustering.The lines show CDM and MDM predictions,with a large-scale normalization ofσ8=0.55forΩ=1orσ8=1for the low-density models.the effect of the quoted range of velocity dispersions for afixed abundance;the vertical errors show the effect of changing the abundance by a factor2atfixed velocity dispersion.The locus implied by the red radio galaxies sits in between. The different points show the effects of varying collapse redshift:z c=2,4,...,12 [lowest redshift gives lowestσ(R)].Clearly,collapse redshifts of6–8are favoured12J.A.Peacockfor consistency with the other data on high-redshift galaxies,independent of the-oretical preconceptions and independent of the age of these galaxies.This level of power(σ[R]≃2for R≃1h−1Mpc)is also in very close agreement with the level of power required to produce the observed structure in the Lyman alpha forest(Croft et al.1998),so there is a good case to be made that thefluctu-ation spectrum has now been measured in a consistent fashion down to below R≃1h−1Mpc.The shaded region at larger R shows the results deduced from clustering data (Peacock1997).It is clear anΩ=1universe requires the power spectrum at small scales to be higher than would be expected on the basis of an extrapolation from the large-scale spectrum.Depending on assumptions about the scale-dependence of bias,such a‘feature’in the linear spectrum may also be required in order to satisfy the small-scale present-day nonlinear galaxy clustering(Peacock1997). Conversely,for low-density models,the empirical small-scale spectrum appears to match reasonably smoothly onto the large-scale data.Fig.2also compares the empirical data with various physical power spectra.A CDM model(using the transfer function of Bardeen et al.1986)with shape parameterΓ=Ωh=0.25is shown as a reference for all models.This appears to have approximately the correct shape,although it overpredicts the level of small-scale power somewhat in the low-density cases.A better empirical shape is given by MDM withΩh≃0.4andΩν≃0.3.However,this model only makes physical sense in a universe with highΩ,and so it is only shown as the lowest curve in Fig.2c,reproduced from thefitting formula of Pogosyan&Starobinsky(1995; see also Ma1996).This curve fails to supply the required small-scale power,by about a factor3inσ;loweringΩνto0.2still leaves a very large discrepancy. This conclusion is in agreement with e.g.Mo&Miralda-Escud´e(1994),Ma& Bertschinger(1994),Ma et al.(1997)and Gardner et al.(1997).All the models in Fig.2assume n=1;in fact,consistency with the COBE results for this choice ofσ8andΩh requires a significant tilt forflat low-density CDM models,n≃0.9(whereas open CDM models require n substantially above unity).Over the range of scales probed by LSS,changes in n are largely degenerate with changes inΩh,but the small-scale power is more sensitive to tilt than to Ωh.Tilting theΩ=1models is not attractive,since it increases the tendency for model predictions to lie below the data.However,a tilted low-Ωflat CDM model would agree moderately well with the data on all scales,with the exception of the ‘bump’around R≃30h−1Mpc.Testing the reality of this feature will therefore be an important task for future generations of redshift survey.(b)Collapse redshifts and ages for red radio galaxiesAre the collapse redshifts inferred above consistent with the age data on the red radio galaxies?First bear in mind that in a hierarchy some of the stars in a galaxy will inevitably form in sub-units before the epoch of collapse.At the time offinal collapse,the typical stellar age will be some fractionαof the age of the universe at that time:age=t(z obs)−t(z c)+αt(z c).(4.1) We can rule outα=1(i.e.all stars forming in small subunits just after the big bang).For present-day ellipticals,the tight colour-magnitude relation only allows an approximate doubling of the mass through mergers since the termination ofThe evolution of galaxy clustering and bias13Figure3.The age of a galaxy at z=1.5,as a function of its collapse redshift(assuming an instantaneous burst of star formation).The various lines showΩ=1[solid];openΩ=0.3 [dotted];flatΩ=0.3[dashed].In all cases,the present age of the universe is forced to be14 Gyr.star formation(Bower at al.1992).This corresponds toα≃0.3(Peacock1991).A non-zeroαjust corresponds to scaling the collapse redshift asapparent(1+z c)∝(1−α)−2/3,(4.2) since t∝(1+z)−3/2at high redshifts for all cosmologies.For example,a galaxy which collapsed at z=6would have an apparent age corresponding to a collapse redshift of7.9forα=0.3.Converting the ages for the galaxies to an apparent collapse redshift depends on the cosmological model,but particularly on H0.Some of this uncertainty may be circumvented byfixing the age of the universe.After all,it is of no interest to ask about formation redshifts in a model with e.g.Ω=1,h=0.7when the whole universe then has an age of only9.5Gyr.IfΩ=1is to be tenable then either h<0.5against all the evidence or there must be an error in the stellar evolution timescale.If the stellar timescales are wrong by afixed factor,then these two possibilities are degenerate.It therefore makes sense to measure galaxy ages only in units of the age of the universe–or,equivalently,to choose freely an apparent Hubble constant which gives the universe an age comparable to that inferred for globular clusters.In this spirit,Fig.3gives apparent ages as a function of effective collapse redshift for models in which the age of the universe is forced to be14 Gyr(e.g.Jimenez et al.1996).This plot shows that the ages of the red radio galaxies are not permitted very much freedom.Formation redshifts in the range6to8predict an age of close to 3.0Gyr forΩ=1,or3.7Gyr for low-density models,irrespective of whetherΛis nonzero.The age-z c relation is ratherflat,and this gives a robust estimate of age once we have some idea of z c through the abundance arguments.It is therefore14J.A.Peacockrather satisfying that the ages inferred from matching the rest-frame UV spectra of these galaxies are close to the abovefigures.(c)The global picture of galaxy formationIt is interesting to note that it has been possible to construct a consistent picture which incorporates both the large numbers of star-forming galaxies at z<∼3and the existence of old systems which must have formed at very much larger redshifts.A recent conclusion from the numbers of Lyman-limit galaxies and the star-formation rates seen at z≃1has been that the global history of star formation peaked at z≃2(Madau et al.1996).This leaves open two possibilities for the very old systems:either they are the rare precursors of this process,and form unusually early,or they are a relic of a second peak in activity at higher redshift,such as is commonly invoked for the origin of all spheroidal components. While such a bimodal history of star formation cannot be rejected,the rareness of the red radio galaxies indicates that there is no difficulty with the former picture. This can be demonstrated quantitatively by integrating the total amount of star formation at high redshift.According to Madau et al.,The star-formation rate at z=4is˙ρ∗≃107.3h M⊙Gyr−1Mpc−3,(4.3) declining roughly as(1+z)−4.This is probably a underestimate by a factor of at least3,as indicated by suggestions of dust in the Lyman-limit galaxies(Pettini et al.1997),and by the prediction of Pei&Fall(1995),based on high-z element abundances.If we scale by a factor3,and integrate tofind the total density in stars produced at z>6,this yieldsρ∗(z f>6)≃106.2M⊙Mpc−3.(4.4) Since the red mJy galaxies have a density of10−5.87h3Mpc−3and stellar masses of order1011M⊙,there is clearly no conflict with the idea that these galaxies are thefirst stellar systems of L∗size which form en route to the general era of star and galaxy formation.(d)Predictions for biased clustering at high redshiftsAn interesting aspect of these results is that the level of power on1-Mpc scales is only moderate:σ(1h−1Mpc)≃2.At z≃3,the correspondingfigure would have been much lower,making systems like the Lyman-limit galaxies rather rare.For Gaussianfluctuations,as assumed in the Press-Schechter analysis,such systems will be expected to display spatial correlations which are strongly biased with respect to the underlying mass.The linear bias parameter depends on the rareness of thefluctuation and the rms of the underlyingfield asb=1+ν2−1δc(4.5)(Kaiser1984;Cole&Kaiser1989;Mo&White1996),whereν=δc/σ,andσ2is the fractional mass variance at the redshift of interest.In this analysis,δc=1.686is assumed.Variations in this number of order10 per cent have been suggested by authors who have studied thefit of the Press-Schechter model to numerical data.These changes would merely scale b−1by a small amount;the key parameter isν,which is set entirely by the collapsed。
华南农业大学学报 Journal of South China Agricultural University 2023, 44(4): 628-637DOI: 10.7671/j.issn.1001-411X.202208008赵润茂, 朱政, 陈建能, 等. 3D LiDAR感知的植物行信息提取方法与试验[J]. 华南农业大学学报, 2023, 44(4): 628-637.ZHAO Runmao, ZHU Zheng, CHEN Jianneng, et al. 3D LiDAR sensing method and experiment of plant row information extraction[J]. Journal of South China Agricultural University, 2023, 44(4): 628-637.3D LiDAR 感知的植物行信息提取方法与试验赵润茂1,2 ,朱 政1,陈建能1,2 ,范国帅1,王麒程1,黄培奎3(1 浙江理工大学 机械与自动控制学院, 浙江 杭州 310018; 2 浙江省种植装备技术重点实验室,浙江 杭州 310018; 3 华南农业大学 工程学院, 广东 广州 510642)摘要: 【目的】针对林间或冠层下等卫星信号严重遮挡的区域,提出一种面向农业机器人导航环境感知的低成本3D激光雷达(LiDAR)点云信息处理与植物行估计方法。
【方法】利用直通滤波器滤除感兴趣区域外的目标无关点;提出均值漂移聚类、扫描区域自适应的方法分割每棵植物主干,垂直投影主干点云估算中心点;利用最小二乘法拟合主干中心,估计植物行。
分别在开阔地的仿真果园与水杉树林进行模拟试验与田间试验,以植物行向量与正东方夹角为指标,计算本研究提出的方法识别的植物行信息与GNSS卫星天线定位测得的植物行真值间的角度误差。
【结果】采用提出的3D LiDAR点云信息处理与植物行估计方法,模拟试验和田间试验对植物行识别误差平均值分别为0.79°和1.48°,最小值分别为0.12°和0.88°,最大值分别为1.49°和2.33°。
8 Cluster Analysis:Basic Concepts andAlgorithmsCluster analysis divides data into groups(clusters)that are meaningful,useful, or both.If meaningful groups are the goal,then the clusters should capture the natural structure of the data.In some cases,however,cluster analysis is only a useful starting point for other purposes,such as data summarization.Whether for understanding or utility,cluster analysis has long played an important role in a wide variety offields:psychology and other social sciences,biology, statistics,pattern recognition,information retrieval,machine learning,and data mining.There have been many applications of cluster analysis to practical prob-lems.We provide some specific examples,organized by whether the purpose of the clustering is understanding or utility.Clustering for Understanding Classes,or conceptually meaningful groups of objects that share common characteristics,play an important role in how people analyze and describe the world.Indeed,human beings are skilled at dividing objects into groups(clustering)and assigning particular objects to these groups(classification).For example,even relatively young children can quickly label the objects in a photograph as buildings,vehicles,people,ani-mals,plants,etc.In the context of understanding data,clusters are potential classes and cluster analysis is the study of techniques for automaticallyfinding classes.The following are some examples:488Chapter8Cluster Analysis:Basic Concepts and Algorithms •Biology.Biologists have spent many years creating a taxonomy(hi-erarchical classification)of all living things:kingdom,phylum,class, order,family,genus,and species.Thus,it is perhaps not surprising that much of the early work in cluster analysis sought to create a discipline of mathematical taxonomy that could automaticallyfind such classifi-cation structures.More recently,biologists have applied clustering to analyze the large amounts of genetic information that are now available.For example,clustering has been used tofind groups of genes that have similar functions.•Information Retrieval.The World Wide Web consists of billions of Web pages,and the results of a query to a search engine can return thousands of pages.Clustering can be used to group these search re-sults into a small number of clusters,each of which captures a particular aspect of the query.For instance,a query of“movie”might return Web pages grouped into categories such as reviews,trailers,stars,and theaters.Each category(cluster)can be broken into subcategories(sub-clusters),producing a hierarchical structure that further assists a user’s exploration of the query results.•Climate.Understanding the Earth’s climate requiresfinding patterns in the atmosphere and ocean.To that end,cluster analysis has been applied tofind patterns in the atmospheric pressure of polar regions and areas of the ocean that have a significant impact on land climate.•Psychology and Medicine.An illness or condition frequently has a number of variations,and cluster analysis can be used to identify these different subcategories.For example,clustering has been used to identify different types of depression.Cluster analysis can also be used to detect patterns in the spatial or temporal distribution of a disease.•Business.Businesses collect large amounts of information on current and potential customers.Clustering can be used to segment customers into a small number of groups for additional analysis and marketing activities.Clustering for Utility Cluster analysis provides an abstraction from in-dividual data objects to the clusters in which those data objects reside.Ad-ditionally,some clustering techniques characterize each cluster in terms of a cluster prototype;i.e.,a data object that is representative of the other ob-jects in the cluster.These cluster prototypes can be used as the basis for a489 number of data analysis or data processing techniques.Therefore,in the con-text of utility,cluster analysis is the study of techniques forfinding the most representative cluster prototypes.•Summarization.Many data analysis techniques,such as regression or PCA,have a time or space complexity of O(m2)or higher(where m is the number of objects),and thus,are not practical for large data sets.However,instead of applying the algorithm to the entire data set,it can be applied to a reduced data set consisting only of cluster prototypes.Depending on the type of analysis,the number of prototypes,and the accuracy with which the prototypes represent the data,the results can be comparable to those that would have been obtained if all the data could have been used.•Compression.Cluster prototypes can also be used for data compres-sion.In particular,a table is created that consists of the prototypes for each cluster;i.e.,each prototype is assigned an integer value that is its position(index)in the table.Each object is represented by the index of the prototype associated with its cluster.This type of compression is known as vector quantization and is often applied to image,sound, and video data,where(1)many of the data objects are highly similar to one another,(2)some loss of information is acceptable,and(3)a substantial reduction in the data size is desired.•Efficiently Finding Nearest Neighbors.Finding nearest neighbors can require computing the pairwise distance between all points.Often clusters and their cluster prototypes can be found much more efficiently.If objects are relatively close to the prototype of their cluster,then we can use the prototypes to reduce the number of distance computations that are necessary tofind the nearest neighbors of an object.Intuitively,if two cluster prototypes are far apart,then the objects in the corresponding clusters cannot be nearest neighbors of each other.Consequently,to find an object’s nearest neighbors it is only necessary to compute the distance to objects in nearby clusters,where the nearness of two clusters is measured by the distance between their prototypes.This idea is made more precise in Exercise25on page94.This chapter provides an introduction to cluster analysis.We begin with a high-level overview of clustering,including a discussion of the various ap-proaches to dividing objects into sets of clusters and the different types of clusters.We then describe three specific clustering techniques that represent490Chapter8Cluster Analysis:Basic Concepts and Algorithms broad categories of algorithms and illustrate a variety of concepts:K-means, agglomerative hierarchical clustering,and DBSCAN.Thefinal section of this chapter is devoted to cluster validity—methods for evaluating the goodness of the clusters produced by a clustering algorithm.More advanced clustering concepts and algorithms will be discussed in Chapter9.Whenever possible, we discuss the strengths and weaknesses of different schemes.In addition, the bibliographic notes provide references to relevant books and papers that explore cluster analysis in greater depth.8.1OverviewBefore discussing specific clustering techniques,we provide some necessary background.First,we further define cluster analysis,illustrating why it is difficult and explaining its relationship to other techniques that group data. Then we explore two important topics:(1)different ways to group a set of objects into a set of clusters,and(2)types of clusters.8.1.1What Is Cluster Analysis?Cluster analysis groups data objects based only on information found in the data that describes the objects and their relationships.The goal is that the objects within a group be similar(or related)to one another and different from (or unrelated to)the objects in other groups.The greater the similarity(or homogeneity)within a group and the greater the difference between groups, the better or more distinct the clustering.In many applications,the notion of a cluster is not well defined.To better understand the difficulty of deciding what constitutes a cluster,consider Figure 8.1,which shows twenty points and three different ways of dividing them into clusters.The shapes of the markers indicate cluster membership.Figures 8.1(b)and8.1(d)divide the data into two and six parts,respectively.However, the apparent division of each of the two larger clusters into three subclusters may simply be an artifact of the human visual system.Also,it may not be unreasonable to say that the points form four clusters,as shown in Figure 8.1(c).Thisfigure illustrates that the definition of a cluster is imprecise and that the best definition depends on the nature of data and the desired results.Cluster analysis is related to other techniques that are used to divide data objects into groups.For instance,clustering can be regarded as a form of classification in that it creates a labeling of objects with class(cluster)labels. However,it derives these labels only from the data.In contrast,classification8.1Overview491(a)Original points.(b)Two clusters.(c)Four clusters.(d)Six clusters.Figure8.1.Different ways of clustering the same set of points.in the sense of Chapter4is supervised classification;i.e.,new,unlabeled objects are assigned a class label using a model developed from objects with known class labels.For this reason,cluster analysis is sometimes referred to as unsupervised classification.When the term classification is used without any qualification within data mining,it typically refers to supervised classification.Also,while the terms segmentation and partitioning are sometimes used as synonyms for clustering,these terms are frequently used for approaches outside the traditional bounds of cluster analysis.For example,the term partitioning is often used in connection with techniques that divide graphs into subgraphs and that are not strongly connected to clustering.Segmentation often refers to the division of data into groups using simple techniques;e.g., an image can be split into segments based only on pixel intensity and color,or people can be divided into groups based on their income.Nonetheless,some work in graph partitioning and in image and market segmentation is related to cluster analysis.8.1.2Different Types of ClusteringsAn entire collection of clusters is commonly referred to as a clustering,and in this section,we distinguish various types of clusterings:hierarchical(nested) versus partitional(unnested),exclusive versus overlapping versus fuzzy,and complete versus partial.Hierarchical versus Partitional The most commonly discussed distinc-tion among different types of clusterings is whether the set of clusters is nested492Chapter8Cluster Analysis:Basic Concepts and Algorithmsor unnested,or in more traditional terminology,hierarchical or partitional.A partitional clustering is simply a division of the set of data objects into non-overlapping subsets(clusters)such that each data object is in exactly one subset.Taken individually,each collection of clusters in Figures8.1(b–d)is a partitional clustering.If we permit clusters to have subclusters,then we obtain a hierarchical clustering,which is a set of nested clusters that are organized as a tree.Each node(cluster)in the tree(except for the leaf nodes)is the union of its children (subclusters),and the root of the tree is the cluster containing all the objects. Often,but not always,the leaves of the tree are singleton clusters of individual data objects.If we allow clusters to be nested,then one interpretation of Figure8.1(a)is that it has two subclusters(Figure8.1(b)),each of which,in turn,has three subclusters(Figure8.1(d)).The clusters shown in Figures8.1 (a–d),when taken in that order,also form a hierarchical(nested)clustering with,respectively,1,2,4,and6clusters on each level.Finally,note that a hierarchical clustering can be viewed as a sequence of partitional clusterings and a partitional clustering can be obtained by taking any member of that sequence;i.e.,by cutting the hierarchical tree at a particular level. Exclusive versus Overlapping versus Fuzzy The clusterings shown in Figure8.1are all exclusive,as they assign each object to a single cluster. There are many situations in which a point could reasonably be placed in more than one cluster,and these situations are better addressed by non-exclusive clustering.In the most general sense,an overlapping or non-exclusive clustering is used to reflect the fact that an object can simultaneously belong to more than one group(class).For instance,a person at a university can be both an enrolled student and an employee of the university.A non-exclusive clustering is also often used when,for example,an object is“between”two or more clusters and could reasonably be assigned to any of these clusters. Imagine a point halfway between two of the clusters of Figure8.1.Rather than make a somewhat arbitrary assignment of the object to a single cluster, it is placed in all of the“equally good”clusters.In a fuzzy clustering,every object belongs to every cluster with a mem-bership weight that is between0(absolutely doesn’t belong)and1(absolutely belongs).In other words,clusters are treated as fuzzy sets.(Mathematically, a fuzzy set is one in which an object belongs to any set with a weight that is between0and1.In fuzzy clustering,we often impose the additional con-straint that the sum of the weights for each object must equal1.)Similarly, probabilistic clustering techniques compute the probability with which each8.1Overview493 point belongs to each cluster,and these probabilities must also sum to1.Be-cause the membership weights or probabilities for any object sum to1,a fuzzy or probabilistic clustering does not address true multiclass situations,such as the case of a student employee,where an object belongs to multiple classes. Instead,these approaches are most appropriate for avoiding the arbitrariness of assigning an object to only one cluster when it may be close to several.In practice,a fuzzy or probabilistic clustering is often converted to an exclusive clustering by assigning each object to the cluster in which its membership weight or probability is highest.Complete versus Partial A complete clustering assigns every object to a cluster,whereas a partial clustering does not.The motivation for a partial clustering is that some objects in a data set may not belong to well-defined groups.Many times objects in the data set may represent noise,outliers,or “uninteresting background.”For example,some newspaper stories may share a common theme,such as global warming,while other stories are more generic or one-of-a-kind.Thus,tofind the important topics in last month’s stories,we may want to search only for clusters of documents that are tightly related by a common theme.In other cases,a complete clustering of the objects is desired. For example,an application that uses clustering to organize documents for browsing needs to guarantee that all documents can be browsed.8.1.3Different Types of ClustersClustering aims tofind useful groups of objects(clusters),where usefulness is defined by the goals of the data analysis.Not surprisingly,there are several different notions of a cluster that prove useful in practice.In order to visually illustrate the differences among these types of clusters,we use two-dimensional points,as shown in Figure8.2,as our data objects.We stress,however,that the types of clusters described here are equally valid for other kinds of data. Well-Separated A cluster is a set of objects in which each object is closer (or more similar)to every other object in the cluster than to any object not in the cluster.Sometimes a threshold is used to specify that all the objects in a cluster must be sufficiently close(or similar)to one another.This idealistic definition of a cluster is satisfied only when the data contains natural clusters that are quite far from each other.Figure8.2(a)gives an example of well-separated clusters that consists of two groups of points in a two-dimensional space.The distance between any two points in different groups is larger than494Chapter8Cluster Analysis:Basic Concepts and Algorithmsthe distance between any two points within a group.Well-separated clusters do not need to be globular,but can have any shape.Prototype-Based A cluster is a set of objects in which each object is closer (more similar)to the prototype that defines the cluster than to the prototype of any other cluster.For data with continuous attributes,the prototype of a cluster is often a centroid,i.e.,the average(mean)of all the points in the clus-ter.When a centroid is not meaningful,such as when the data has categorical attributes,the prototype is often a medoid,i.e.,the most representative point of a cluster.For many types of data,the prototype can be regarded as the most central point,and in such instances,we commonly refer to prototype-based clusters as center-based clusters.Not surprisingly,such clusters tend to be globular.Figure8.2(b)shows an example of center-based clusters. Graph-Based If the data is represented as a graph,where the nodes are objects and the links represent connections among objects(see Section2.1.2), then a cluster can be defined as a connected component;i.e.,a group of objects that are connected to one another,but that have no connection to objects outside the group.An important example of graph-based clusters are contiguity-based clusters,where two objects are connected only if they are within a specified distance of each other.This implies that each object in a contiguity-based cluster is closer to some other object in the cluster than to any point in a different cluster.Figure8.2(c)shows an example of such clusters for two-dimensional points.This definition of a cluster is useful when clusters are irregular or intertwined,but can have trouble when noise is present since, as illustrated by the two spherical clusters of Figure8.2(c),a small bridge of points can merge two distinct clusters.Other types of graph-based clusters are also possible.One such approach (Section8.3.2)defines a cluster as a clique;i.e.,a set of nodes in a graph that are completely connected to each other.Specifically,if we add connections between objects in the order of their distance from one another,a cluster is formed when a set of objects forms a clique.Like prototype-based clusters, such clusters tend to be globular.Density-Based A cluster is a dense region of objects that is surrounded by a region of low density.Figure8.2(d)shows some density-based clusters for data created by adding noise to the data of Figure8.2(c).The two circular clusters are not merged,as in Figure8.2(c),because the bridge between them fades into the noise.Likewise,the curve that is present in Figure8.2(c)also8.1Overview495 fades into the noise and does not form a cluster in Figure8.2(d).A density-based definition of a cluster is often employed when the clusters are irregular or intertwined,and when noise and outliers are present.By contrast,a contiguity-based definition of a cluster would not work well for the data of Figure8.2(d) since the noise would tend to form bridges between clusters.Shared-Property(Conceptual Clusters)More generally,we can define a cluster as a set of objects that share some property.This definition encom-passes all the previous definitions of a cluster;e.g.,objects in a center-based cluster share the property that they are all closest to the same centroid or medoid.However,the shared-property approach also includes new types of clusters.Consider the clusters shown in Figure8.2(e).A triangular area (cluster)is adjacent to a rectangular one,and there are two intertwined circles (clusters).In both cases,a clustering algorithm would need a very specific concept of a cluster to successfully detect these clusters.The process offind-ing such clusters is called conceptual clustering.However,too sophisticated a notion of a cluster would take us into the area of pattern recognition,and thus,we only consider simpler types of clusters in this book.Road MapIn this chapter,we use the following three simple,but important techniques to introduce many of the concepts involved in cluster analysis.•K-means.This is a prototype-based,partitional clustering technique that attempts tofind a user-specified number of clusters(K),which are represented by their centroids.•Agglomerative Hierarchical Clustering.This clustering approach refers to a collection of closely related clustering techniques that producea hierarchical clustering by starting with each point as a singleton clusterand then repeatedly merging the two closest clusters until a single,all-encompassing cluster remains.Some of these techniques have a natural interpretation in terms of graph-based clustering,while others have an interpretation in terms of a prototype-based approach.•DBSCAN.This is a density-based clustering algorithm that producesa partitional clustering,in which the number of clusters is automaticallydetermined by the algorithm.Points in low-density regions are classi-fied as noise and omitted;thus,DBSCAN does not produce a complete clustering.Chapter 8Cluster Analysis:Basic Concepts and Algorithms (a)Well-separated clusters.Eachpoint is closer to all of the points in itscluster than to any point in anothercluster.(b)Center-based clusters.Each point is closer to the center of its cluster than to the center of any other cluster.(c)Contiguity-based clusters.Eachpoint is closer to at least one pointin its cluster than to any point inanother cluster.(d)Density-based clusters.Clus-ters are regions of high density sep-arated by regions of low density.(e)Conceptual clusters.Points in a cluster share some generalproperty that derives from the entire set of points.(Points in theintersection of the circles belong to both.)Figure 8.2.Different types of clusters as illustrated by sets of two-dimensional points.8.2K-meansPrototype-based clustering techniques create a one-level partitioning of the data objects.There are a number of such techniques,but two of the most prominent are K-means and K-medoid.K-means defines a prototype in terms of a centroid,which is usually the mean of a group of points,and is typically8.2K-means497 applied to objects in a continuous n-dimensional space.K-medoid defines a prototype in terms of a medoid,which is the most representative point for a group of points,and can be applied to a wide range of data since it requires only a proximity measure for a pair of objects.While a centroid almost never corresponds to an actual data point,a medoid,by its definition,must be an actual data point.In this section,we will focus solely on K-means,which is one of the oldest and most widely used clustering algorithms.8.2.1The Basic K-means AlgorithmThe K-means clustering technique is simple,and we begin with a description of the basic algorithm.Wefirst choose K initial centroids,where K is a user-specified parameter,namely,the number of clusters desired.Each point is then assigned to the closest centroid,and each collection of points assigned to a centroid is a cluster.The centroid of each cluster is then updated based on the points assigned to the cluster.We repeat the assignment and update steps until no point changes clusters,or equivalently,until the centroids remain the same.K-means is formally described by Algorithm8.1.The operation of K-means is illustrated in Figure8.3,which shows how,starting from three centroids,the final clusters are found in four assignment-update steps.In these and other figures displaying K-means clustering,each subfigure shows(1)the centroids at the start of the iteration and(2)the assignment of the points to those centroids.The centroids are indicated by the“+”symbol;all points belonging to the same cluster have the same marker shape.1:Select K points as initial centroids.2:repeat3:Form K clusters by assigning each point to its closest centroid.4:Recompute the centroid of each cluster.5:until Centroids do not change.In thefirst step,shown in Figure8.3(a),points are assigned to the initial centroids,which are all in the larger group of points.For this example,we use the mean as the centroid.After points are assigned to a centroid,the centroid is then updated.Again,thefigure for each step shows the centroid at the beginning of the step and the assignment of points to those centroids.In the second step,points are assigned to the updated centroids,and the centroids498Chapter8Cluster Analysis:Basic Concepts and Algorithms(a)Iteration1.(b)Iteration2.(c)Iteration3.(d)Iteration4.ing the K-means algorithm tofind three clusters in sample data.are updated again.In steps2,3,and4,which are shown in Figures8.3(b), (c),and(d),respectively,two of the centroids move to the two small groups of points at the bottom of thefigures.When the K-means algorithm terminates in Figure8.3(d),because no more changes occur,the centroids have identified the natural groupings of points.For some combinations of proximity functions and types of centroids,K-means always converges to a solution;i.e.,K-means reaches a state in which no points are shifting from one cluster to another,and hence,the centroids don’t change.Because most of the convergence occurs in the early steps,however, the condition on line5of Algorithm8.1is often replaced by a weaker condition, e.g.,repeat until only1%of the points change clusters.We consider each of the steps in the basic K-means algorithm in more detail and then provide an analysis of the algorithm’s space and time complexity. Assigning Points to the Closest CentroidTo assign a point to the closest centroid,we need a proximity measure that quantifies the notion of“closest”for the specific data under consideration. Euclidean(L2)distance is often used for data points in Euclidean space,while cosine similarity is more appropriate for documents.However,there may be several types of proximity measures that are appropriate for a given type of data.For example,Manhattan(L1)distance can be used for Euclidean data, while the Jaccard measure is often employed for documents.Usually,the similarity measures used for K-means are relatively simple since the algorithm repeatedly calculates the similarity of each point to each centroid.In some cases,however,such as when the data is in low-dimensional8.2K-means499Table8.1.Table of notation.Symbol Descriptionx An object.C i The i th cluster.c i The centroid of cluster C i.c The centroid of all points.m i The number of objects in the i th cluster.m The number of objects in the data set.K The number of clusters.Euclidean space,it is possible to avoid computing many of the similarities, thus significantly speeding up the K-means algorithm.Bisecting K-means (described in Section8.2.3)is another approach that speeds up K-means by reducing the number of similarities computed.Centroids and Objective FunctionsStep4of the K-means algorithm was stated rather generally as“recompute the centroid of each cluster,”since the centroid can vary,depending on the proximity measure for the data and the goal of the clustering.The goal of the clustering is typically expressed by an objective function that depends on the proximities of the points to one another or to the cluster centroids;e.g., minimize the squared distance of each point to its closest centroid.We illus-trate this with two examples.However,the key point is this:once we have specified a proximity measure and an objective function,the centroid that we should choose can often be determined mathematically.We provide mathe-matical details in Section8.2.6,and provide a non-mathematical discussion of this observation here.Data in Euclidean Space Consider data whose proximity measure is Eu-clidean distance.For our objective function,which measures the quality of a clustering,we use the sum of the squared error(SSE),which is also known as scatter.In other words,we calculate the error of each data point,i.e.,its Euclidean distance to the closest centroid,and then compute the total sum of the squared errors.Given two different sets of clusters that are produced by two different runs of K-means,we prefer the one with the smallest squared error since this means that the prototypes(centroids)of this clustering are a better representation of the points in their ing the notation in Table8.1,the SSE is formally defined as follows:。
数据库英⽂版第六版课后答案(32)C H A P T E R11Indexing and HashingThis chapter covers indexing techniques ranging from the most basic one tohighly specialized ones.Due to the extensive use of indices in database systems,this chapter constitutes an important part of a database course.A class that has already had a course on data-structures would likely be famil-iar with hashing and perhaps even B+-trees.However,this chapter is necessaryreading even for those students since data structures courses typically cover in-dexing in main memory.Although the concepts carry over to database accessmethods,the details(e.g.,block-sized nodes),will be new to such students.The sections on B-trees(Sections11.4.5)and bitmap indexing(Section11.9) may be omitted if desired. Exercises11.15When is it preferable to use a dense index rather than a sparse index?Explain your answer.Answer:It is preferable to use a dense index instead of a sparse indexwhen the?le is not sorted on the indexed?eld(such as when the indexis a secondary index)or when the index?le is small compared to the sizeof memory.11.16What is the difference between a clustering index and a secondary index?Answer:The clustering index is on the?eld which speci?es the sequentialorder of the?le.There can be only one clustering index while there canbe many secondary indices.11.17For each B+-tree of Practice Exercise11.3,show the steps involved in thefollowing queries:a.Find records with a search-key value of11.b.Find records with a search-key value between7and17,inclusive.Answer:With the structure provided by the solution to Practice Exer-cise11.3a:9798Chapter11Indexing and Hashinga.Find records with a value of11i.Search the?rst level index;follow the?rst pointer.ii.Search next level;follow the third pointer.iii.Search leaf node;follow?rst pointer to records with key value11.b.Find records with value between7and17(inclusive)i.Search top index;follow?rst pointer.ii.Search next level;follow second pointer.iii.Search third level;follow second pointer to records with keyvalue7,and after accessing them,return to leaf node.iv.Follow fourth pointer to next leaf block in the chain.v.Follow?rst pointer to records with key value11,then return.vi.Follow second pointer to records with with key value17.With the structure provided by the solution to Practice Exercise12.3b:a.Find records with a value of11i.Search top level;follow second pointer.ii.Search next level;follow second pointer to records with key value11.b.Find records with value between7and17(inclusive)i.Search top level;follow second pointer.ii.Search next level;follow?rst pointer to records with key value7,then return.iii.Follow second pointer to records with key value11,then return.iv.Follow third pointer to records with key value17.With the structure provided by the solution to Practice Exercise12.3c:a.Find records with a value of11i.Search top level;follow second pointer.ii.Search next level;follow?rst pointer to records with key value11.b.Find records with value between7and17(inclusive)i.Search top level;follow?rst pointer.ii.Search next level;follow fourth pointer to records with key value7,then return.iii.Follow eighth pointer to next leaf block in chain.iv.Follow?rst pointer to records with key value11,then return.v.Follow second pointer to records with key value17.Exercises99 11.18The solution presented in Section11.3.4to deal with nonunique searchkeys added an extra attribute to the search key.What effect could this change have on the height of the B+-tree?Answer:The resultant B-tree’s extended search key is unique.This results in more number of nodes.A single node(which points to mutiple records with the same key)in the original tree may correspond to multiple nodes in the result tree.Dependingon how they are organized the height of the tree may increase;it might be more than that of the original tree.11.19Explain the distinction between closed and open hashing.Discuss therelative merits of each technique in database applications.Answer:Open hashing may place keys with the same hash function value in different buckets.Closed hashing always places such keys together in the same bucket.Thus in this case,different buckets can be of different sizes,though the implementation may be by linking together?xed size buckets using over?ow chains.Deletion is dif?cult with open hashing as all the buckets may have to inspected before we can ascertain that a key value has been deleted,whereas in closed hashing only that bucket whose address is obtained by hashing the key value need be inspected.Deletions are more common in databases and hence closed hashing is more appropriate for them.For a small,static set of data lookups may be more ef?cient using open hashing.The symbol table of a compiler would be a good example.11.20What are the causes of bucket over?ow in a hash?le organization?Whatcan be done to reduce the occurrence of bucket over?ows?Answer:The causes of bucket over?ow are:-a.Our estimate of the number of records that the relation will have wastoo low,and hence the number of buckets allotted was not suf?cient.b.Skew in the distribution of records to buckets.This may happeneither because there are many records with the same search keyvalue,or because the the hash function chosen did not have thedesirable properties of uniformity and randomness.To reduce the occurrence of over?ows,we can:-a.Choose the hash function more carefully,and make better estimatesof the relation size.b.If the estimated size of the relation is n r and number of records perblock is f r,allocate(n r/f r)?(1+d)buckets instead of(n r/f r)buckets.Here d is a fudge factor,typically around0.2.Some space is wasted:About20percent of the space in the buckets will be empty.But thebene?t is that some of the skew is handled and the probability ofover?ow is reduced.11.21Why is a hash structure not the best choice for a search key on whichrange queries are likely?100Chapter11Indexing and HashingAnswer:A range query cannot be answered ef?ciently using a hashindex,we will have to read all the buckets.This is because key valuesin the range do not occupy consecutive locations in the buckets,they aredistributed uniformly and randomly throughout all the buckets.11.22Suppose there is a relation r(A,B,C),with a B+-tree index with searchkey(A,B).a.What is the worst-case cost of?nding records satisfying10using this index,in terms of the number of records retrieved n1and the height h of the tree?b.What is the worst-case cost of?nding records satisfying1050∧5n2that satisfy this selection,as well as n1and h de?ned above?c.Under what conditions on n1and n2would the index be an ef?cient way of?nding records satisfying10Answer:a.What is the worst case cost of?nding records satisfying10using this index,in terms of the number of records retrieved n1and the height h of the tree?This query does not correspond to a range query on the search key as the condition on the?rst attribute if the search key is a comparison condition.It looks up records which have the value of A between10and50.However,each record is likely to be in a different block, because of the ordering of records in the?le,leading to many I/O operation.In the worst case,for each record,it needs to traverse the whole tree(cost is h),so the total cost is n1?h.b.What is the worst case cost of?nding records satisfying1050∧5n2that satisfy this selection,as well as n1and h de?ned above.This query can be answered by using an ordered index on the search key(A,B).For each value of A this is between10and50,the system located records with B value between5and10.However,each record could is likely to be in a different disk block.This amounts to exe-cuting the query based on the condition on A,this costs n1?h.Then these records are checked to see if the condition on B is satis?ed.So, the total cost in the worst case is n1?h.c.Under what conditions on n1and n2would the index be an ef?cient way of?nding records satisfying10n1records satisfy the?rst condition and n2records satisfy the second condition.When both the conditions are queried,n1records are output in the?rst stage.So,in the case where n1=n2,no extra records are output in the furst stage.Otherwise,the records whichExercises101 dont satisfy the second condition are also output with an additionalcost of h each(worst case).11.23Suppose you have to create a B+-tree index on a large number of names,where the maximum size of a name may be quite large(say40characters) and the average name is itselflarge(say10characters).Explain how pre?x compression can be used to maximize the average fanout of nonleaf nodes. Answer:There arises2problems in the given scenario.The?rst prob-lem is names can be of variable length.The second problem is names can be long(maximum is40characters),leading to a low fanout and a correspondingly increased tree height.With variable-length search keys, different nodes can have different fanouts even if they are full.The fanout of nodes can be increased bu using a technique called pre?x compres-sion.With pre?x compression,the entire search key value is not stored at internal nodes.Only a pre?x of each search key which is suf?ceint to distinguish between the key values in the subtrees that it seperates.The full name can be stored in the leaf nodes,this way we dont lose any information and also maximize the average fanout of internal nodes. 11.24Suppose a relation is stored in a B+-tree?le organization.Suppose sec-ondary indices stored record identi?ers that are pointers to records on disk.a.What would be the effect on the secondary indices if a node splithappens in the?le organization?b.What would be the cost of updating all affected records in a sec-ondary index?c.How does using the search key of the?le organization as a logicalrecord identi?er solve this problem?d.What is the extra cost due to the use of such logical record identi?ers?Answer:a.When a leaf page is split in a B+-tree?le organization,a numberof records are moved to a new page.In such cases,all secondaryindices that store pointers to the relocated records would have tobe updated,even though the values in the records may not havechanged.b.Each leaf page may contain a fairly large number of records,andeach of them may be in different locations on each secondary index.Thus,a leaf-page split may require tens or even hundreds of I/Ooperations to update all affected secondary indices,making it a veryexpensive operation.c.One solution is to store the values of the primary-index search keyattributes in secondary indices,in place of pointers to the indexed102Chapter11Indexing and Hashingrecords.Relocation of records because of leaf-page splits then doesnot require any update on any secondary index.d.Locating a record using the secondary index now requires two steps:First we use the secondary index to?nd the primary index search-key values,and then we use the primary index to?nd the corre-sponding records.This approach reduces the cost of index updatedue to?le reorganization,although it increases the cost of accesingdata using a secondary index.11.25Show how to compute existence bitmaps from other bitmaps.Make sure that your technique works even in the presence of null values,by using a bitmap for the value null.Answer:The existence bitmap for a relation can be calculated by takingthe union(logical-or)of all the bitmaps on that attribute,including thebitmap for value null.11.26How does data encryption affect index schemes?In particular,how might it affect schemes that attempt to store data in sorted order?Answer:Note that indices must operate on the encrypted data orsomeone could gain access to the index to interpret the data.Otherwise,the index would have to be restricted so that only certain users could access it.To keep the data in sorted order,the index scheme would haveto decrypt the data at each level in a tree.Note that hash systems wouldnot be affected.11.27Our description of static hashing assumes that a large contiguous stretch of disk blocks can be allocated to a static hash table.Suppose you can allocate only C contiguous blocks.Suggest how to implement the hash table,if it can be much larger than C blocks.Access to a block should stillbe ef?cient.Answer:A separate list/table as shown below can be created.Starting address of?rst set of C blocksCStarting address of next set of C blocks2Cand so onDesired block address=Starting adress(from the table depending onthe block number)+blocksize*(blocknumber%C)For each set of C blocks,a single entry is added to the table.In thiscase,locating a block requires2steps:First we use the block number tond the actual block address,and then we can access the desired block.。
a rX iv:c ond-ma t/37186v1[c ond-m at.str-el]9J ul23Charge stripes due to electron correlations in the two-dimensional spinless Falicov-Kimball model R.Lema´n ski ∗Institute of Low Temperature and Structure Research,Polish Academy of Sciences,Wroc l aw,Poland J.K.Freericks †Department of Physics,Georgetown University,Washington,DC 20057G.Banach Daresbury Laboratory,Cheshire WA44AD,Daresbury,United Kingdom (Dated:February 2,2008)Abstract We calculate the restricted phase diagram for the Falicov-Kimball model on a two-dimensional square lattice.We consider the limit where the conduction electron density is equal to the localized electron density,which is the limit related to the S z =0states of the Hubbard model.After considering over 20,000different candidate phases (with a unit cell of 16sites or less)and their thermodynamic mixtures,we find only about 100stable phases in the ground-state phase diagram.We analyze these phases to describe where stripe phases occur and relate these discoveries to the physics behind stripe formation in the Hubbard model.Keywords:charge-stripes–Falicov-Kimball–Hubbard–phase-diagramI.INTRODUCTIONWefind itfitting to write a paper on the spinless Falicov-Kimball(FK) model[1]to celebrate Elliott Lieb’s seventieth birthday.Elliott,and his collaborators,provided two seminal results on this model:(i)thefirst,with Tom Kennedy,proved that there was afinite temperature phase transition to a checkerboard charge-density-wave(CDW)phase in two or more di-mensions for the symmetric half-filled case[2,3],and(ii)the second,with Daniel Ueltschi and Jim Freericks,proved that the segregation principle holds for all dimensions[4,5](which states that if the total particle density is less than one,then the ground state is phase separated if the interac-tion strength is large enough[6]).The Kennedy-Lieb result(along with an independent Brandt-Schmidt paper[7,8])inspired dozens of follow-up papers by researchers across the world.The Freericks-Lieb-Ueltschi paper generalized Lemberger’s proof[9]from one dimension to all dimensions, whichfinally proved the decade old Freericks-Falicov conjecture[6].Both papers are important,because they are the only examples where long-range order and phase separation can be proved to occur in a correlated electronic system.The Falicov-Kimball model has an interesting history too.Leo Falicov and John Kimball invented the spin-one-half version of the model in1969to describe metal-insulator transitions of rare-earth compounds[1].It turns out that John Hubbard actually“discovered”the spinless version of the FK model four years earlier in1965[10],when he developed the alloy-analogy solution to the Hubbard model[11](the so-called Hubbard III solution). This latter version was rediscovered by Kennedy and Lieb in1986[2]when they formulated it as a simple model for how crystallization can be driven by the Pauli exclusion principle.In this contribution,we focus on another problem that can be analyzed in the FK model—the problem of stripe formation in two dimensions.The question of the relation between charge stripes,correlated electrons,and high-temperature superconductivity has been asked ever since static stripes werefirst seen in the nickelate[12,13,14,15]and cuprate[16,17,18,19] materials starting in1993.Two schools of thought emerged to describe the theoretical basis for stripe formation in the Hubbard model.The Kivelson-Emery scenario[20,21,22,23]says that at large U the Hubbard model is close to a phase separation instability but the long-range Coulomb force restricts the phase separation on the nanoscale;a compromise results in static stripe-like order.The Scalapino-White scenario[24,25,26,27]says that stripes can form due to a subtle balance between kinetic-energy effects and potential-energy effects,mediated by spinfluctuations.No long range Coulomb interaction or phase separation is needed to form these stripes. There are numerous numerical studies that have tried to shed light onto this problem.Unfortunately,they have conflicting results.High temperature2series expansions on the related t−J model[28,29]show that phase separation exists,but only when J is large enough,so it is not present in the large-correlation-strength limit of the Hubbard model(where J→0). Monte Carlo calculations[30,31,32]and exact diagonalization studies[33, 34]give different results:some calculations predict the stripe formation to occur,others show a linkage between the stripe formation and the boundary conditions selected for the problem.Mean-field-theory analyses[35,36] seem to predict stripe formation without any phase separation.One way to make sense of these disparate results is that both the energy of the intrinsic stripe phases and the energy for phase separation are quite close to one another,so any small change(induced byfinite-size effects,statistical errors,effects of correlations not included in the perturbative expansions,or due to terms dropped or added to the Hamiltonian)can have a large effect on the phase diagram by producing a small relative change in the energetics of the different many-body states(because of their near degeneracy).We take an alternate point of view here.We choose to examine a model that can be analyzed rigorously,and can be continuously connected to the Hubbard model.We choose the regime that connects directly with the S z=0states of the Hubbard model.The model we analyze is the spinless Falicov-Kimball model on a square latticeH=−t ij (c†i c j+c†j c i)+U|Λ| i=1w i c†i c i,(1) where c†i(c i)creates(destroys)a spinless conduction electron at site i, t is the hopping matrix element( ij denotes a summation over nearest-neighbor pairs on a square lattice),w i=0or1is a classical variable denot-ing the localized electron number at site i,and U is the on-site Coulomb interaction energy.The Fermionic operators satisfy anticommutation re-lations(c†i,c†j)+=0,(c i,c j)+=0,and(c†i,c j)+=δij.The symbol|Λ| denotes the total number of lattice sites in the square latticeΛ.We will always be dealing with periodic configurations of localized electrons,which means we can always consider our lattice to have a large butfinite number of lattice sites and periodic boundary conditions.A short presentation of these results has already appeared[37].The Falicov-Kimball model can be viewed as a Fermionic quantum ana-logue of the Ising model,while the Hubbard model can be viewed as the Fermionic quantum analogue of the Heisenberg model(indeed in the large-U limit at halffilling,the Falicov-Kimball model maps onto an effective nearest-neighbor Ising model,while the Hubbard model maps onto an ef-fective nearest-neighbor Heisenberg model).The way to link the Falicov-Kimball model to the Hubbard model is to imagine a generalization of the Hubbard model where the down-spin hopping matrix element differs from the up-spin hopping matrix element.Then as t↓→0,the down spins3become heavy and are localized on the lattice;the quantum-mechanical ground state is determined by the configuration of down-spin electrons that minimizes the energy of the up-spin electrons.This is precisely the Falicov-Kimball model!In order to maintain the connection to the Hubbard model in zero magneticfield,we must choose the conduction electron densityρe= |Λ|i=1 c†i c i /|Λ|to be equal to the localized electron densityρf= |Λ|i=1w i/|Λ|,which we do here.We study the evolution from the checker-board phase at halffilling(ρe=ρf=1/2)to the segregated phase,which appears whenρe=ρf is small enough.Since these two phases are dras-tically different from each other,the transition is likely to include many different intermediate phases.Indeed,the ground state phase diagram of the Falicov-Kimball model can be quite complex.There are many different periodic phases that can be stabilized for different values of U orρe=ρf. As U becomes large though,the phase diagram simplifies,as the segregated phase becomes the ground state for wider and wider ranges of the electron densities.II.FORMALISMOur strategy to examine the FK model is a brute-force approach which is straightforward to describe,but tedious to carry out.We employ the so-called restricted phase diagram approach,where we consider the grand-canonical thermodynamic potential of the system for all possible periodic phases of the localized electrons,selected from afinite set of candidate phases.In this work,we consider23,755phases,which corresponds to the set of all inequivalent phases with a unit cell that includes16or fewer lattice sites.In order to calculate the thermodynamic potential,wefirst must determine the electronic band structure for the conduction electrons for each candidate periodic phase.We employ a Brillouin-zone grid of 110×110momentum points for each bandstructure.This requires us to diagonalize up to16×16matrices at each discrete momentum point in the Brillouin zone and results in at most16different energy bands.Hence, our calculations can be viewed asfinite-size cluster calculations with cluster sizes ranging from110×110×1up to110×110×16depending on the number of atoms in the unit cell.An example of such a bandstructure is shown in Fig.1.The eigenvalues of the band structure are summed to determine the ground-state energy for each number of conduction electrons.The Gibbs thermodynamic potential is then calculated for all possible values of the chemical potentials of the conduction and localized electrons through the4formula1G({w i})=We separate the different stable phases into10different groups.Every stable phase is labeled by a number and depicted in Fig.3.The small dots indicate the absence of a localized electron,while the large dots indicate the positions of the localized electrons.In the lower left corner,we shade in the unit cell of the configuration and we show with the two solid lines the translation vectors of the unit cell that allow the square lattice to be tiled by the unit cell.The different families of configurations are as follows: (i)the empty lattice(ρe=0andρf=0)denoted E which contains no lo-calized electrons(configuration1);(ii)the full lattice(ρe=0andρf=1) denoted F which contains a localized electron at each site(configuration 2);(iii)the checkerboard phase(ρe=ρf=1/2)denoted Ch which has the localized electrons occupying the A sublattice only of the square lattice in a checkerboard arrangement(configuration3);(iv)diagonal non-neutral stripe phases(ρe=1−ρf)denoted DS which consist of diagonal checker-board phases separated by empty diagonal stripes of slope1(configuration 4);(v)axial non-neutral checkerboard stripes(ρe=1−ρf)denoted AChS which consist of checkerboard regions arranged in stripes oriented parallel to the x-axis and separated by empty stripes with slope0(configurations 5–10);(vi)diagonal neutral stripe phases(ρe=1−ρf)denoted DNS which consist of localized electrons arranged in the checkerboard phase separated by fully occupied striped regions of slope1,or equivalently,checkerboard phases with diagonal antiphase boundaries(configurations11–19);(vii)ax-ial non-neutral stripe phases(ρe=1−ρf)denoted AS which consist of fully occupied vertical(or horizontal)stripes separated by empty stripes, which are translationally invariant in the vertical(or horizontal)direction (configurations20–54);(viii)neutral phases(ρe=1−ρf)denoted N which consist of neutral phases in an arrangement that does not look like any simple stripe phase(some neutral phases can be described in a stripe pic-ture,such as configuration61which has a slope1/3empty lattice stripe, but we prefer to refer to them as non-stripe phases)(configurations55–70); (ix)four-molecule phases(ρe=1−ρf)denoted4M which can be described as a“bound”four-molecule square of empty sites tiled inside an occupied lattice framework(configurations71–74);(x)two-dimensional non-neutral phases(ρe=1−ρf)denoted2D which consist of phases with the local-ized electrons arranged in a fashion that is not stripe-like and requires a two-dimensional unit cell to describe them(once again,some phases like configuration75could be described as a slope3/2stripe,but appears to us more like a2D phase)(configurations75–111).Generically,wefind the canonical phase diagram does not contain pure phases from one of the111stable phases,but rather forms mixtures of two or three periodic phases,or one or two periodic phases and the empty lattice(which is often needed to get the conduction-electronfilling correct in the mixture).When we are doped sufficiently far from halffilling,we are in the segregated phase,which is a mixture of the E and F phases. We consider5different values of U in our computations:U=1,2,4,6,6and8.The phase diagram is quite complex,with many of the different111 phases appearing for different values of U.We summarize which phases appear in Table I.We begin our discussion with the weak-coupling value U=1where50 phases appear.The phase diagram is summarized in Fig.4.We use a solid line to indicate the region of the particle density where a particular phase appears in the ground state(either as a pure phase or as a mixture). The phases that appear in a mixture at a given density are found by de-termining the solid vertical lines that intersect a horizontal line drawn to pass through the given particle density.The phase diagram has shading included to separate the regions of the different categories of phases.The numeric labels are shown to make it easier to determine the actual phases present in the diagram.We plot similar phase diagrams for U=2(38 phases),4(42phases),6(30phases),and8(25phases)in Figs.5–8,re-spectively.A schematic phase diagram that illustrates the generic features of the phase diagram in the electron density,interaction-strength plane appears in Fig.9.As can be seen from thesefigures,the generic phase diagram is quite complex,and by looking at the different phases in Fig.3,many of the phases have stripe-like structures to them.To begin our discussion of these results,we mustfirst recall the rigorous results known for this model.When ρe=ρf=1/2,the ground state is the checkerboard phase(configuration 3)for all U.This can be seen in all of the phase diagrams plotted.When ρe=ρf=1/2,the ground state becomes the phase separated segregated phase when U is large enough.So there is a simplification in the phase diagram as we increase U,and the most complex phase diagram appears in the U→0limit.That limit is also the most difficult computationally, because the differences in the energies between different configurations also becomes small for small U,and the numerical accuracy must be huge in order to achieve trustworthy results.This is why we do not report any phase diagrams with U<1here.Looking at the U=8case shown in Fig.8,we see that as we move away from halffilling,we initiallyfind mixtures between the checkerboard phase,other diagonal stripe phases,and the empty lattice.When we ex-amine the structure factors associated with the diagonal stripe phases,we find that they tend to have more weight along the Brillouin zone diagonal than elsewhere.Hence,these diagonal stripe phases are being stabilized by a“near-nesting”instability of the noninteracting Fermi surface,and the overall mixtures are required to maintain the averagefillings of the conduction and localized electrons.As we move farther from halffilling, the checkerboard phase disappears from the mixtures,and then a series of neutral phases enter the mix which retain some appearance of diago-nal stripes,but with more and more“defects”to the stripes that make them look more two-dimensional.Wefind the localized electron density of these phases increases as we reduce the totalfilling,which is what we7expect as we move toward the segregated phase which involves a mixture of the E and F configurations.Note that the formation of many different stripe phases,occurs without needing the long-range Coulomb interaction to oppose the tendency towards phase separation,when we are close to half filling.Indeed,the ground state is often a phase separated mixture,but it is a mixture of stripe-like phases,which occur automatically,without the need to add any other physics to the system.This regime,is the closest to the Kivelson-Emery picture,but we see it has more complex behavior than what they envisioned when they examined the Hubbard model.Moving on to the U=6case in Fig.7,wefind a significant change in the phase diagram.The grouping of diagonal stripes near halffilling disappears and we insteadfind the ground state to initially be a mixture between the checkerboard phase,a truly two-dimensional screen-like phase (configuration108)and the empty lattice.Here,if we include a long-range Coulomb interaction,we would likely form diagonal stripes,but the mix-ture would be more complicated because it would include this screen-like structure as well.As we dope further away,we see a smaller number of the neutral phases,which look somewhat like diagonal stripes with a large number of defects in them,and then we go to a very different class of mixtures,dominated by the presence of the axial stripe phase in configu-ration33.As that phase becomes destabilized,wefind a cascade of many other axial stripes entering,before the segregated phase takes over.This transition from diagonal stripes to axial stripes as a function of the elec-tronfilling,also occurs because of a“near-nesting”effect.The structure factors of the axial stripe phases are peaked predominantly along the zone edge,and as we dope further from halffilling,this is where nesting is more likely to occur.The cascade of stable phases that enter after configura-tion33is destabilized,have a progression of the peaks in their structure factor moving towards the zone center,which is also expected,since they are progressively heading towards the segregated phase.A similar kind of transition from diagonal stripes to axial stripes is seen in the Hubbard model studies,with the critical density lying near0.375,as we see here too. By the time we decrease to U=4shown in Fig.6,wefind even more interesting behavior.Now,when we are near halffilling,wefind two more configurations,a nonneutral phase(configuration59)and a two-dimensional phase(configuration109)joining with the checkerboard phase and configuration108in the initial mixtures.Each of these phases looks like a“square-lattice screen”with differing size“holes”in the screen.These two-dimensional structures are not stripe-like and it would be interesting to see if they could appear in the Hubbard model.As we dope further away,we enter the axial stripe region,now dominated by configuration20first,then there is a cascade to configuration33,then a cascade to the seg-regated phase.This value of U is a truly intermediate value,where many different mechanisms for ordering are present and the system can change very rapidly in response to a modification in the density.8As U=2(Fig.5),we see more modifications in the phase diagram.Now we see other diagonal stripe phases mixing with the checkerboard phase near halffilling.This region would correspond to the Scalapino-White regime,where the stripe formation is driven more by interplays between the kinetic and potential energies and nesting effects(driven by charge fluctuations in the FK model and spinfluctuations in the Hubbard model). In addition,a much larger number of the2D phases enter also close to half filling,illustrating the prevalence of these“screen-like”phases as well.The axial stripes also enter as we dope further away from halffilling,but the configurations20and33are not nearly as stable as they are for slightly larger U.Here,we see the four-molecule phases being stabilized just before the system phase separates into the segregated phase.Finally,for U=1,shown in Fig.4,the predominance of the diagonal stripes,near halffilling increases now supplemented by the axial checker-board stripes,but then there is a plethora of different2D phases that also enter as the system is doped somewhat farther from halffilling,then we see a similar evolution,first to AS and then to4M phases before the segre-gated phase.Here there is a tremendous complexity to the phase diagram, with many different mixtures being present due to the competition be-tween kinetic energy and potential energy minimization brought about by the many-body aspects of the problem.The general picture,illustrated schematically in Fig.9,now emerges: near halffilling,we oftenfind diagonal stripes and screen-like two-dimensional phases,then a rapid transition to the segregated phase for large U.As U is reduced,we can dope farther away from halffilling before segregating,which allows many other phases to enter.In particular,there is a large region of stability for axial stripes,and as U is reduced further,we see the emergence of axial checkerboard stripes close to halffilling,near the diagonal stripes,and four-molecule phases appearing near the segregation boundary.IV.CONCLUSIONSIn this manuscript we have numerically studied how the FK model makes the transition from the checkerboard phase at halffilling to the segregated phase as the density is lowered.Since these two phases are very different from one another,there are many different pathways that one might imag-ine the system to take in making this crossover.Indeed wefind that the pathway varies dramatically as a function of U.For large U,we have a relatively simple transition between diagonal stripe-like phases which be-come more two-dimensional as the localized electron density increases,until the system gives way to the segregated phase.As U is lowered,wefirst see two-dimensional-“screen”-like phases enter,then we see axial stripes9emerge,followed by four-molecule phases and axial checkerboard stripes. The complexity of the phase diagram greatly increases as the interaction strength decreases.It is interesting to ask how we might expect these results to change if we allowed more configurations into our restricted phase diagram.We don’t know this answer in particular,but we do know,that of the23,755 candidate phases only a small fraction(111or0.5%)appear in the phase diagram,so we don’t expect too many additional phases to appear. Another interesting question to ask is how do these results for the FK model shed light on the stripe-formation problem in the Hubbard model. By continuity,we expect these results not to change too dramatically as we turn on a small hopping for the localized electrons(although now we must summarize our results in terms of correlation functions for the two kinds of electrons,since both are now mobile).But we also know for many fillings,there will be a“phase transition”as a function of the hopping, since the ground state of the Hubbard model is not ferromagnetic for all fillings and large U(which is what the segregated phase maps to in the Hubbard model).The results are likely to be closer to what happens in the Hubbard model close to halffilling,because the analogue of the anti-ferromagnetic phase is the checkerboard phase,and that is present for all U in the Hubbard model at T=0.In general,we also feel that the FK model phase diagram must be more complicated than the Hubbard model phase diagram because of the mobility of both electrons in the latter.We feel one of the most important results of this work is that there may be two-dimensional phases that are not stripe like that form ground-state con-figurations for some values of thefilling in the Hubbard model,and such configurations will be worthwhile to investigate with the numerical tech-niques that currently exist.In conclusion,we are delighted to be able to shed some light on the inter-esting question for the FK model of how one makes a transition from the checkerboard phase at halffilling to the segregated phase away from half filling.Since Elliott Lieb has had an important impact in proving the sta-bilization of these two phases,wefind itfitting to ask the questions about how the two phases inter-relate.Perhaps these numerical calculations can further inspire new rigorous work that helps to identify the pathway be-tween these two phases.V.ACKNOWLEDGMENTSR.L.and G.B.acknowledge support from the Polish State Committee for Scientific Research(KBN)under Grant No.2P03B13119and J.K.F. acknowledges support from the NSF under grant No.DMR-0210717.We also acknowledge support from Georgetown University for a travel grant in10the fall of 2001.[1] L. M. Falicov and J. C. Kimball, Simple Model for Semiconductor-Metal Transitions: SmB6 and Transition-Metal Oxides, Phys. Rev. Lett. 22, 997– 1000 (1969). [2] T. Kennedy and E. H. Lieb, An Itinerant Electron Model with Crystalline or Magnetic Long Range Order, Physica 138A, 320–358 (1986). [3] E. H. Lieb, A Model for Crystallization: A Variation on the Hubbard Model, Physica 140A, 240–250 (1986). [4] J. K. Freericks, E. H. Lieb, and D. Ueltschi, Phase Separation due to Quantum Mechanical Correlations, Phys. Rev. Lett. 88, 106401–1-4 (2002). [5] J. K. Freericks, E. H. Lieb, and D. Ueltschi, Segregation in the FalicovKimball model, Commun. Math. Phys. 227, 243–279 (2002). [6] J. K. Freericks and L. M. Falicov, Two-state one-dimensional spinless Fermi gas, Phys. Rev. B 41, 2163–2172 (1990). [7] U. Brandt and R. Schmidt, Exact Results for the Distribution of f–Level Ground State Occupation in the Spinless Falicov-Kimball Model, Z. Phys. B 63, 45–53 (1986). [8] U. Brandt and R. Schmidt, Ground State Properties of a Spinless FalicovKimball Model; Additional Features, Z. Phys. B 67, 43–51 (1987). [9] P. Lemberger, Segregation in the Falicov-Kimball model, J. Phys. A 25, 715– 733 (1992). [10] J. Hubbard, Electron correlations in narrow energy bands III. An improved solution, Proc. Roy. Soc. (London) A281, 401–419 (1965). [11] J. Hubbard, Electron correlations in narrow energy bands, Proc. Roy. Soc. (London) A276, 238–257 (1963). [12] C. H. Chen, S.-W. Cheong, and A. S. Cooper, Charge modulations in La2−x Srx NiO4+y : Ordering of polarons, Phys. Rev. Lett. 71, 2461–2464 (1993). [13] J. M. Tranquada, D. J. Buttrey, V. Sachan, and J. E. Lorenzo, Simultaneous Ordering of Holes and Spins in La2 NiO4.125 , Phys. Rev. Lett. 73, 1003–1006 (1994). [14] V. Sachan, D. J. Buttrey, J. M. Tranquada, J. E. Lorenzo, and G. Shirane, Charge and spin ordering in La2−x Srx NiO4.00 with x = 0.135 and 0.20, Phys. Rev. B 51, 12742–12746 (1995). [15] J. M. Tranquada, J. E. Lorenzo, D. J. Buttrey, and V. Sachan, Cooperative ordering of holes and spins in La2 NiO4.125 , Phys. Rev. B 52, 3581–3595 (1995). [16] J. M. Tranquada, B. J. Sternlieb, J. D. Axe, Y. Nakamura, and S. Uchida, Evidence for stripe correlations of spins and holes in copper oxide superconductors, Nature (London) 375, 561–563 (1995). [17] J. M. Tranquada, J. D. Axe, N. I. Y. Nakamura, S. Uchida, and B. Nachumi, Neutron-scattering study of stripe-phase order of holes and spins in La1.48 Nd0.4 Sr0.12 CuO4 , Phys. Rev. B 54, 7489–7499 (1996). [18] J. M. Tranquada, J. D. Axe, N. Ichikawa, A. R. Moodenbaugh, Y. Nakamura,11[19] [20] [21] [22] [23] [24][25] [26] [27][28] [29][30] [31][32] [33] [34] [35] [36] [37] [38]and S. Uchida, Coexistence of, and Competition between, Superconductivity and Charge-Stripe Order in La1.6−x Nd0.4 Srx CuO4 , Phys. Rev. Lett. 78, 338– 341 (1997). H. A. Mook, P. Dai, and F. Doˇan, Charge and Spin Structure in g YBa2 Cu3 O6.35 , Phys. Rev. Lett. 88, 097004–1-4 (2002). V. J. Emery, S. A. Kivelson, and H. Q. Lin, Phase separation in the t–J model, Phys. Rev. Lett. 64, 475–478 (1990). E. W. Carlson, S. A. Kivelson, Z. Nussinov, and V. J. Emery, Doped antiferromagnets in high dimension, Phys. Rev. B 57, 14704–14721 (1998). L. P. Pryadko, S. A. Kivelson, and D. W. Hone, Instability of Charge Ordered States in Doped Antiferromagnets, Phys. Rev. Lett. 80, 5651–5654 (1998). L. P. Pryadko, S. A. Kivelson, V. J. Emery, Y. B. Bazaliy, and E. A. Demler, Topological doping and the stability of stripe phases, Phys. Rev. B 60, 7541– 7557 (1999). S. R. White and D. J. Scalapino, Density Matrix Renormalization Group Study of the Striped Phase in the 2D t–J Model, Phys. Rev. Lett. 80, 1272– 1275 (1998). S. R. White and D. J. Scalapino, Energetics of Domain Walls in the 2D t–J Model, Phys. Rev. Lett. 81, 3227–3230 (1998). S. R. White and D. J. Scalapino, Competition between stripes and pairing in a t–t’–J model, Phys. Rev. B 60, R753–R756 (1999). S. R. White and D. J. Scalapino, Phase separation and stripe formation in the two-dimensional t–J model: A comparison of numerical results, Phys. Rev. B 61, 6320–6326 (2000). W. O. Putikka, M. U. Luchini, and T. M. Rice, Aspects of the phase diagram of the two-dimensional t–J model, Phys. Rev. Lett. 68, 538–541 (1992). W. O. Putikka and M. U. Luchini, Limits on phase separation for twodimensional strongly correlated electrons, Phys. Rev. B 62, 1684–1687 (2000). A. C. Cosentini, M. Capone, L. Guidoni, and G. B. Bachelet, Phase separation in the two-dimensional Hubbard model: A fixed-node quantum Monte Carlo study, Phys. Rev. B 58, R14685–R14688 (1998). M. Calandra, F. Becca, and S. Sorella, Charge Fluctuations Close to Phase Separation in the Two-Dimensional t–J Model, Phys. Rev. Lett. 81, 5185– 5188 (1998). C. S. Hellberg and E. Manousakis, Green’s-function Monte Carlo for lattice fermions: Application to the t–J model, Phys. Rev. B 61, 11787–11806 (2000). E. Dagotto, Correlated electrons in high-temperature superconductors, Rev. Mod. Phys. 66, 763–840 (1994). C. S. Hellberg and E. Manousakis, Stripes and the t–J Model, Phys. Rev. Lett. 83, 132–135 (1999). J. Zaanen and A. M. Ole´, Striped phase in the cuprates as a semiclassical s phenomenon, Ann. Phys. (Leipzig) 5, 224–246 (1996). D. G´ra, K. Rosciszewski, and A. M. Ole´, Electron correlations in stripe o s phases for doped antiferromagnets, Phys. Rev. B 60, 7429–7439 (1999). R. Lema´ski, J. K. Freericks, and G. Banach, Stripe Phases in the Twon Dimensional Falicov-Kimball Model, Phys. Rev. Lett. 89, 196403–1-4 (2002). G. I. Watson and R. Lema´ski, The ground-state phase diagram of the twon12。
Reading Material 12Principles of Momentum transfer1. IntroductionThe flow and behavior of fluid is important in many of the unit operations in process engineering. A fluid may be defined as a substance that does not permanently resist distortion and, hence, will change its shape. In this text gases, liquids, and vapors are considered to have the characteristics of fluids and to obey many of the same laws.2. Fluid FlowThe principles of the statics of fluids are almost an exact science. On the other hand, the principles of the motions of fluids are quite complex. The basic relations describing the motions of a fluid are the equations for the overall balances of mass, energy, and momentum, which will be covered in the following sections.The study of momentum transfer, or fluids mechanics as it is often called, can be divided into tow branches: fluid statics, or fluid at rest, and fluid dynamics, or fluids in motion. In other sections we treat fluid statics; in the remaining sections, fluid dynamics. Since in fluid dynamics momentum is being transferred, the term “momentum transfer” or “transfer” is usually used.In momentum transfer we treat the fluid as a continuous distribution of matter or as a “continuum”. This treatment as a continuum is valid when the smallest volume of fluid contains a large enough number of molecules so that a statistical average is meaningful and the macroscopic properties of the fluid such as density, pressure, and so on, vary smoothly or continuously from point to point.Like all physical matter, a fluid is composed of an extremely large number of molecules per unit volume. A theory such as the kinetic theory of gases or statistical mechanics treats the motions of molecules in terms of statistical groups and not in terms of individual molecules. In engineering we are mainly concerned with the bulk or macroscopic behavior of a fluid rather than with the individual molecular or microscopic behavior.In the process industries, many of the materials are in fluid form and must be stored, handled, pumped, and processed, so it is necessary that we become familiar with the principles that govern the flow of fluids and also with the equipment used. Typical fluids encountered include water, air, CO2, oil, slurries, and thick syrups.If a fluid is inappreciably affected by change in pressure, it is said to be incompressible. Most liquids are incompressible. Gases are considered to be compressible fluids. However, if gases are subjected to small percentage changes in pressure and temperature, their density changes will be small and they can be considered to be incompressible.3. Laminar and Turbulent FlowThe type of flow occurring in a fluid in a channel is important in fluid dynamics problems. When fluids move through a closed channel of any cross section, either of tow distinct types of flow can be observed according to the conditions present.These two types of flow can be commonly seen in a flowing open stream or river. When the velocity of flow is slow, the flow patterns are smooth. However, when the velocity is quite high, an unstable pattern is observed in which eddies or small packets of fluid particles are present moving in all directions and at all angles to the normal line of flow.The first type of flow at low velocities where the layers of fluid seem to slide by one another without eddies or swirls being present is called laminar flow and Newton ’s law of viscosity holds. The second type of flow at higher velocities where eddies are present giving the fluid a fluctuating nature is called turbulent flow. The existence of laminar and turbulent flow is most easily visualized by the experiments of Reynolds. Water was allowed to flow at steady state through a transparent pipe with the flow rate controlled by a valve at the end of the pipe.A fine steady stream of dye-colored water was introduced from a fine jet as shown and its flow pattern observed. At low rates of water flow, the dye pattern was regular and formed a single line or stream similar to a thread. There was no lateral of the fluid, and it flowed in streamlines down the tube. By putting in additional jets at other points in the pipe cross section, it was shown that there was no mixing in any parts of the tube and the fluid flowed in straight parallel lines. This type of flow is called laminar or viscous flow.As the velocity was increased, it was found that at a define velocity the thread of dye become dispersed and the pattern was very erratic. This type of flow is known as turbulent flow. The velocity at which the flow changes is known as the critical velocity.4. Reynolds NumberStudies have shown that the transition from laminar to turbulent flow in tubes is not only a function of velocity but also of density and viscosity of the fluid and the tube diameter. These variables are combined into the Reynolds number, which is dimensionless.R e =μρv DWhere Re is the Reynolds number, D the diameter in m, ρ the fluid density in kg/m3,u the fluid viscosity in Pa*s, and ν the average velocity of the fluid in m/s .(where average velocity is defined as the volumetric rate of flow divided by the cross sectional area of the pipe.)The instability of the flow that leads to disturbed or turbulent flow is determined by the ratio of the kinetic or inertial forces to the viscous forces in the fluid stream.The inertial forces are proportional to ρν2 and the viscous forces to D μν,and the ratio )(2D yv pv is the Reynolds number y vp DFor a straight circular pipe when the value of the Reynolds number is less than 2100, the flow is always laminar. When the value is over 4000, the flow will be turbulent, except in very special cases. In between, which is called the transition region, the flow can be viscous or turbulent, depending upon the apparatus details, whichcannot be predicted.5. Simple Mass BalancesIn fluid dynamics fluids are in motion. Generally, they are moved from place to place by means of mechanical devices such as pumps or blowers, by gravity head, or by pressure, and flow through systems of piping and/or process equipment .The first step in the solution of flow problem is generally to apply the conservation of mass to the whole system or to any part of the system. We will consider an elementary balance on a simple geometry. Simple mass or material balances were followed. Input=output + accumulationsince, in fluid flow, we are usually working with rates of flow and usually at steady state, the rate of accumulation is zero and we obtainrate of input = rate of outputWhen making overall balances on mass, energy, and momentum we are not interested in the details of what occurs inside the enclosure. For example, in an overall balance average inlet and outlet velocities are considered. However ,in a differential balance the velocity distribution inside an enclosure can be obtained with the use of Newton’s law of viscosity.These overall or macroscopic balances will be applied to a finite enclosure or control volume fixed in space. We use the term “overall” because we wish to describe these balances from outside the enclosure. The changes inside the enclosure are determined in terms of the properties of the streams entering and leaving and the exchanges of energy between the enclosure and its surroundings.阅读材料 12动量交换定理1. 介绍在许多工程的单元操作中流体的流量和状态是很重要的。
a r X i v :a s t r o -p h /9608068v 1 13 A u g 1996FERMILAB-Pub-96/067-AWeak Lensing and the Measurement of q 0from Type Ia SupernovaeJoshua A.Frieman NASA/Fermilab Astrophysics Center,Fermi National Accelerator Laboratory,P.O.Box 500,Batavia,IL 60510Department of Astronomy and Astrophysics,University of Chicago,Chicago,IL 60637ABSTRACT On-going projects to discover Type Ia supernovae at redshifts z ∼0.3−1,coupled with improved techniques to narrow the dispersion in SN Ia peak mag-nitudes,have renewed the prospects for determining the cosmic deceleration pa-rameter q 0.We estimate the expected uncertainty in the Hubble diagram deter-mination of q 0due to weak lensing by structure in the universe,which stochas-tically shifts the apparent brightness of distant standard candles.Although the results are sensitive to the density power spectrum on small scales,the induced flux dispersion σm <∼0.04Ω1/2m,0mag for sources at z ≤0.5,well below the “in-trinsic”spread of nearby SN Ia magnitudes,σM ≃0.2mag.Thus,density inhomogeneities do not significantly impact the current program to measure q 0,in contrast to a recent claim.If,however,light-curve shape and other calibrators can reduce the effective intrinsic spread to 0.1mag at high z ,then weak lensing could increase the observed spread by 30%in an Ωm,0=1universe for SNe atz >∼1.Subject headings:cosmology:large-scale structure of the universe,supernovae1.IntroductionThe determination of the cosmological parameters via the “classical”cosmological tests,such as the redshift-magnitude relation (Hubble diagram),remains a holy grail for observa-tional astronomy.Past attempts to measure the deceleration parameter q 0using galaxies as standard candles have foundered on the uncertainty in galaxy luminosity evolution (Tinsley1972,Peebles1993).Recently,there has been renewed hope that q0can be determined, thereby constraining the mean mass density and the cosmological constant,by using distant Type Ia supernovae(e.g.,Goobar&Perlmutter1995).Progress has come on two fronts.First,there is growing evidence that samples of SNe Ia,when suitably culled to excise peculiar lightcurves or spectra and cases of significant host-galaxy extinction,may provide reasonably good standard candles,with a dispersion σM≃0.3in peak absolute magnitude.Moreover,using the observed correlation between lightcurve shape and peak luminosity(Hamuy etal.1995,Riess,Press,&Kirshner1995)as well as spectroscopic features that have been found to correlate with luminosity(Branch, Nugent,&Fisher1996,Nugent,etal.1995),the effective dispersion of these‘standardized candles’can apparently be reduced toσM≃0.1−0.2mag.Second,several groups have begun observing campaigns to discover a large number of high-redshift Ia supernovae,with coordinated follow-up programs to measure the lightcurves of SN candidates;more than20 candidates at z∼0.4−0.6have been found so far(e.g.,Perlmutter,etal.1995,Schmidt, etal.1995,Perlmutter,etal.1996).The case for SNe Ia as a population of‘standard’candles is also on reasonable physical footing.While the details of the explosion mechanism remain poorly understood,there is general consensus that SNe Ia are thermonuclear explosions of accreting Carbon-Oxygen white dwarfs.On the other hand,it is not yet clear whether the white dwarf progenitors must be near the Chandrasekhar mass(e.g.,Hoeflich etal.1996)or not(Livne&Arnett 1995),and thus the theoretical intrinsic spread in SN Ia luminosity is still a matter of some debate.In either case,there is optimism that,based on the correlations above,SNe Ia can be used to measure extragalactic distances and thereby to determine the cosmological parameters.Recently,however,Kantowski,Vaughan,&Branch(1995)have argued that large-scale structure provides a stumbling block to the accurate determination of q0from standard can-dles.A bundle of light rays from a distant source is sheared and focused due to deflection by intervening density inhomogeneities.Since gravitational lensing conserves surface brightness, the change in the cross-section of the bundle implies that the image of the source appears magnified(or demagnified)relative to a homogeneous ing the“swiss-cheese”model of large-scale structure,Kantowski etal.find that the resulting change in received flux(apparent magnitude)could lead to a systematic underestimate of q0if one interprets the observations assuming a homogeneous universe.For example,for a source at z=0.458 (the redshift of SN1992bi(Perlmutter,etal.1995)),theyfind that the resulting error in q0 can be as large asδq0≃−0.33q0.This claim is consonant with other studies which found that large-scale structure can significantly change the proportionality between angular-sizedistance and redshift,and therefore lead to difficulties,e.g.,for the determination of H0from gravitational lens time delays(Kantowski1969,Dyer&Roeder1972,Alcock&Anderson 1985,1986,Watanabe etal.1992,Sasaki1993).In this Comment,we reevaluate this issue by estimating the rmsfluctuation in the am-plification of distant sources in a perturbed Friedmann-Robertson-Walker(FRW)universe. Wefind that the expected effects are smaller than those found by Kantowski,etal.,of order several percent at most for sources at z<∼0.5,and subdominant in comparison to the∼20 %intrinsic spread of nearby SN Ia magnitudes.Moreover,flux conservation implies that the magnification shift is random,with zero mean over all lines of sight(Weinberg1976), not a systematic offset.As a result,the amplification due to large-scale structure will not seriously impact the accuracy of q0measurements in current surveys.The primary reason for this different conclusion is that the swiss-cheese model does not conform to our current understanding of the large-scale mass distribution of the uni-verse.In particular,it does not accurately reflect the observational information gained from galaxy redshift and peculiar velocity surveys and the cosmic microwave background(CMB) anisotropy.This point has been made recently by Seljak(1994)and Bar-kana(1995)in the context of the determination of H0from QSO lens time delays.The argument that these effects should be small has also been made qualititatively by Peebles(1993).The issue was first laid out clearly by Gunn(1967),who showed that the rmsfluctuation in the apparent brightness of a distant source can be expressed as a radial integral of the two-point correlation function of the mass distribution.Gunn’s argument was updated to reflect the substantial advances in our understanding of large-scale structure in more recent numerical(Jaroszynski, etal.1990)and analytic(Babul&Lee1991)studies(unbeknownst to the present author until this work was completed).In fact,Babul&Lee’s‘DP’model is close to the model for the power spectrum that we adopt below,although there are some important quantitative differences which we outline later.From the smallness of the CMB anisotropy on large scales to observations of galaxies and galaxy clusters on small scales,we have strong indications that the spacetime metric of the universe is well-described by a weakly perturbed FRW model.In the longitudinal gauge, the line element can be writtends2=a2(τ) −(1+2φ)dτ2+(1−2φ)[dχ2+F2(χ)(dθ2+sin2θdφ2)] (1)where a(τ)is the cosmic scale factor,χdenotes the comoving radial coordinate,andτ= dt/a is the conformal time,withτ0denoting the present.The function F(χ)dependson the spatial curvature K:F(χ)=K−1/2sin K1/2χfor K>0,F=χfor K=0,andF=(−K)−1/2sinh(−K)1/2χfor K<0.The curvature can be expressed in terms of the present density parameter and the Hubble parameter H0=H(τ0),K=(Ω0−1)H20a20, whereΩ=Ωm+ΩΛincludes both non-relativistic matter(m)and vacuum energy density (the cosmological constantΛ)andΩm=¯ρ/ρc=8πG¯ρ/3H2,with¯ρ(τ)the mean density of matter.The metric perturbation variableφis the relativistic analog of the Newtonian gravitational potential;over scales less than the Hubble length H−1it obeys the Poisson equation,3∇2φ=∇⊥φ(χ,τ=τ0−χ)dχ.(3)F(χs)Here we have implicitly replaced the perturbed by the unperturbed path in the integral; while this is not a good approximation(e.g.,Frieman,Harari,&Surpi1994),we will only make use of the weaker assumption that the statistical properties of the potentialfieldφare identical along the perturbed and unperturbed paths(e.g.,Kaiser1992).The amplification of the observed image relative to the(unlensed)source is given by A=1/det M,where the2×2amplification matrix M ij=∂βi/∂θj=δij+Φij,and Φij=∂δθi/∂θj.We can decompose the deformation tensor asΦij=−κδij+γij,where the trace(κ),the expansion,describes the uniform dilation or contraction of ray bundles,and the traceless part(γij)describes their shear.In the limit of small deflection angles,we thus haveA=1/[(1−κ)2−γ2]≃1+2κ,where±γare the eigenvalues ofγij andκ=−(Φ11+Φ22)/2. Theflux perturbation is thereforeδA=A−1=−TrΦ.Using eqns.(2,3),wefindδA=3(H0a0)2Ωm,0a(τ0−χ)δ(χ,τ=τ0−χ)(4)whereΩm,0is the present matter density parameter.Tofind the rmsflux amplification along a random line of sight,we assumeδ(x,τ)canbe described as a continuous,homogeneous random process;this ignores discreteness in themass density,an excellent approximation since the dark matter very likely consists of objectsof mass less than106M⊙.It is convenient to Fourier-transform the densityfield,δ(χ)=(2π)−3 d3kδ(k)exp(i k·χ),where theflat-space transform suffices for the small angles we are considering.Defining the density power spectrum via δ(k)δ∗(k′) =(2π)3P(k)δ3D(k−k′),we have(definingσ2A= (δA)2 )σ2A=9πΩ2m,0(H0a0)4 χs0dχF2(χ)F2(χs−χ)k dkH0a0 zdz′small scales,although with the caveat that the galaxy and mass distributions may differ significantly on these scales(see below).A third more radical possibility is that the observed clustering is merely“painted on”and expands with the Hubbleflow;in this case,ǫ=0. For the numerical estimates below,we will thus consider the rangeǫ=0−2.Since the rms amplification is dominated by structure in the non-linear regime,we expectǫ≃1.2to be the most accurate representation of the evolution.Recent N-body simulations confirm that the growth factor is intermediate between the stable clustering and linear regimes on small scales at recent epochs(Colin,Carlberg,&Couchman1996),while it may be faster than linear(ǫ>2)on intermediate scales;in either case,the results presented below constitute an upper limit on the lensing effect.To model the present power-spectrum on small scales,we could simply Fourier trans-form the galaxy correlation function,ξg(r)=(r/r0)−1.8,where the galaxy correlation length r0,g≃5.4h−1Mpc.However,this does not take into account the bias between the galaxy and mass distributions,i.e.,the likelihood that light does not trace mass on these scales. To remedy this,we can incorporate dynamical information.From the(simplified version of the)cosmic virial theorem(Peebles1980,1993),the predicted pairwise velocity disper-sion on small scales is roughlyσ2v(r)∼[3/2(3−γ)]H20Ωm,0r2(r0/r)γ.Assuming the matter correlation function has the same slope as that of the galaxies,this yields an estimate for the density correlation length,r0≃5.4h−1Mpc(0.1/Ωm,0)0.55(σv(1h−1Mpc)/300km/sec)1.1. The standard estimates of the galaxy pairwise velocity dispersion at r≃1h−1Mpc sep-aration have beenσv∼300km/sec(Davis&Peebles1983,Bean,etal.1983).Recent redshift surveys,however,have yielded higher values,σv∼500−700km/sec(Guzzo,etal. 1995,Marzke,etal.1995).Moreover,Somerville,Davis,&Primack(1996)and Somerville, Primack,&Nolthenius(1996)have found that estimates ofσv(r)in both simulations and galaxy catalogs show large scatter.However,when the cores of rich clusters,which contain only a small fraction of the mass,are excluded,it appears thatσv∼300km/sec.In any case,to obtain an upper bound for the weak lensing effect,we will takeσv(1h−1Mpc)=650 km/sec,implying∆2(k)≃8.7(k/h Mpc−1)1.8Ω−1m,0(σv(1)/650)2.Note that the assumption made here that the galaxy and density correlation functions have the same small-scale slope is not well motivated on highly non-linear scales,and the results below should be interpreted with this caveat.For this model of the power spectrum,and in general forγ>1,the k-integral in(5) diverges in the ultraviolet and must be cut-off.Physically,the slope of the density correlation function must fall below unity below some length scale(even though the galaxy correlation function is an unbrokenγ≃1.8power law down to r∼10h−1kpc).Thisflattening is expected to happen at least at the scale of individual galaxy halos(Kaiser1992):below this scale,ξ(r)is dominated by correlations within individual halos,yieldingξ(r)∼r−(2ν−3)for halos with density profileρ(r)∼r−ν(Peebles1974,McClelland&Silk1977,Sheth& Jain1996);for nearly isothermal halos,ν≃2,the corresponding slope isγ≃1.Up to logarithmic corrections,we model this effect by imposing a cutoffin(5)at the approximate halo scale,k c=1/(100h−1kpc).Using this model in(5),the rmsflux perturbation atfixed z s scales approximately asΩ1/2m,0(σv(1)/650)(k c/10h Mpc−1)0.4.Fig.1shows the dispersion in apparent magnitude for distant standard candles as a function of redshift,σm=1.086σA mag,for cosmological models withΩm,0=0.1,0.3,and 1.ForΩm,0=1,we show results forǫ=0,1.2,and2to bracket the plausible range of evolution models.For the other cases,we show results forǫ=1.2only;since structure formation freezes out early inΩm,0<1models(at1+z f≃Ω−1m,0),stable clustering should be a reliable prescription here.ForΩm,0=0.3,we show results for a spatiallyflat model with non-zero cosmological constant,ΩΛ=0.7,and for an open model withΛ=0.Since the imposition of a sharp cutoffin the density power spectrum appears to be a rather crude approximation,as a check we have also explored models for the small-scale power spectrum based on the cold dark matter(CDM)model and CDM with non-zero Λ,with scale-invariant primordialfluctuations,extended into the non-linear regime using scaling formulae derived from N-body simulations(Hamilton etal.1991,Peacock&Dodds 1994,Jain,Mo,&White1995,Baugh&Gaztanaga1996,Peacock&Dodds1996).In these models,P(k)scales as k at small wavenumber but turns over to k−3at large k,yielding much better convergence for theflux dispersionσA.We normalize such models by requiring that they reproduce the observed galaxy cluster X-ray temperature distribution function according to the predictions of Press-Schechter theory(White,Efstathiou,&Frenk1993); for CDM models,this corresponds approximately to imposing the constraint that the linear,where theory rms massfluctuation in a sphere of radius8h−1Mpc isσ8=0.6Ω−C(Ωm,0,Λ)m,0C=0.36+0.31Ωm,0−0.28Ω2m,0for open models(Λ=0)and C=0.59−0.16Ωm,0+0.06Ω2m,0 for spatiallyflat(non-zeroΛ)models(Viana&Liddle1995).The results for the amplification dispersion in these cluster-normalized CDM models are shown in Fig.2for the same model parameters as in Fig.1.The results forσA agree reasonably well with those of the power-law model,although they are somewhat higher for the models withΩm,0<1.This difference is due in part to the fact that the cluster normalization yields more small-scale power in these models than the cosmic virial theorem normalization for the corresponding power-law model.Thus,while the uncertainties inσv and in the accuracy of the cosmic virial theorem are still significant,this comparison suggests that our estimate forσA should be accurate to within a factor of two.The implications of weak lensing for the determination of q0follow directly from Figs. 1and2.For sources at z≤0.7,σm≤0.06,well below the“intrinsic”0.2−0.3mag spreadin nearby SN Ia magnitudes.At z≪1,from the redshift-magnitude relation,the resulting ‘1σ’uncertainty in the deceleration parameter,q0=Ωm,0/2−ΩΛ,for a single source atredshift z isσq0≃σA/z.For example,for z=0.5andΛ=0,wefindσq≃0.1q1/20(<∼0.07forΩm,0≤1).In the future,if SN Ia searches discover sources at z>∼1,then weak lensing may be a significant factor.In particular,if light-curve shape and other correlators with peak magnitude can reduce the effective intrinsic spread to0.1mag(even at high z), then densityfluctuations could increase the observed dispersion in anΩm,0=1universe for sources at z=1by of order30%.Since the amplification is caused by the foreground mass distribution,one could in principle use the angular correlation function ofδA to probe the large-scale mass power spectrum;in practice,this will require thousands of well-measured SNe Ia at redshifts z>∼0.5spread over hundreds of square degrees.This may be possible with the Next Generation Space Telescope.The results shown here can be compared to those of Babul&Lee(1991).While they considered only the Einstein-de SitterΩm,0=1case,we have presented results for arbitrary Ωm andΛ.Quantitatively,their‘DP’model for the power spectrum is quite similar to that used here,but it overestimated the rms amplificationσA because it did not include the constraint from galaxy pairwise velocities on small scales.They also presented results for the Ω=1CDM model with bias factor b≃2.5;this normalization,which reflected earlier,lower estimates of the small-scale pairwise velocity dispersion,is now known to be unacceptably low and therefore substantially underestimatesσA.Thus,while their results bracket those here,the estimates in Figs.1and2should be more accurate.Afinal issue of concern for these surveys is amplification bias.A fraction of the SNe in a magnitude-limited survey are strongly amplified(δA≫σA)by foreground mass concentra-tions very close to the line of sight;this would cause large systematic errors in the estimate of q0.However,the lensing galaxy or cluster responsible could generally be seen and such events removed from the sample.Even if the lenses are too faint to be detected,the probability for such strong lensing events at moderate redshift,z s<∼1,is known to be very small,based on the low incidence of multiply imaged QSOs.While the optical depthτfor significant amplification by foreground galaxies(e.g.,δA>∼0.1)is much higher than that for multiple imaging,it is still quite low.For example,forΩm,0=1and assuming isothermal galaxy halos extend to r>∼50h−1kpc,integration over the galaxy luminosity function with the Faber-Jackson relation givesτ(δA>0.1,z s=1)≃6.5×10−3for amplification by individual foreground galaxies;for a non-zero cosmological constant satisfyingΩΛ=1−Ωm<0.8,τ(>0.1,1)<0.02.The result of these near encounters is a very small non-Gaussian tail in the amplification distribution at largeδA.On the other hand,if a substantial fraction of the cosmic density is in compact objects,the high amplification tail can be much more significant(Schneider&Wagoner1987,Linder,Schneider,&Wagoner1988,Rauch1991).Ifhigh amplification(>∼1mag)events are not soon found in the high-redshift SN Ia searches, one will be able to constrain the contribution toΩfrom objects of mass>∼10−3M⊙.For completeness,I note that weak lensing does not affect the shape of SN Ia lightcurves: for a source at redshift z s,the fall-offfrom the peak is given by the usual time dilation factor,∆t=∆t i(1+z s),where∆t i is the fall-offtimescale in the SN rest frame.Note added:After this work was accepted for publication,two recent numerical efforts have clarified the weak lensing effects of large-scale structure(Wambsganss,etal.1996, D.E.Holz and R.M.Wald,to be published).Using ray-shooting techniques through N-body and phenomenologically constructed universes,it has been directly confirmed that the meanflux from a distant standard candle is that given by the standard FRW luminosity distance.However,the distribution of the amplification is not symmetric about this mean. The majority of lines of sight pass through regions underdense compared to the mean, leading to modest de-amplification,while a smaller number of rays pass near dense mass concentrations and suffer substantial amplification.Thus,for a survey with afinite number of sources,there can be a small bias in the results for q0,but the effect is very small at the current redshifts(z<∼0.6)probed by the SN Ia surveys.I thank D.Holz,A.Olinto,S.Perlmutter,U.Seljak,J.Silk,A.Stebbins,and R.Wald for conversations,and the Institute for Nuclear and Particle Astrophysics(LBL)and the Center for Particle Astrophysics(UC Berkeley)for hospitality while this work was being completed.This research was supported in part by the DOE and by NASA grant NAG5-2788at Fermilab.REFERENCESAlcock,C.&Anderson,N.1985,ApJ,291,L29Alcock,C.&Anderson,N.1986,ApJ,302,43Babul,A.&Lee,M.H.1991,MNRAS,250,407Bar-kana,R.1995,ApJ,to appearBaugh,C.&Gaztanaga,E.1996,preprint,astro-ph/9601085Bean,A.J.,etal.1983,MNRAS,205,605Blandford,R.D.,Saust,A.B.,Brainerd,T.G.,&Villumsen,J.1991,MNRAS,251,600Branch,D.,Nugent,P.,&Fisher,A.1996,in“Thermonuclear Supernovae”eds.Canal,R., Ruiz-Lapuente,P.,&Isern,J.(Dordrecht,Kluwer Academic Press),astro-ph/9601006 Colin,P.,Carlberg,R.,&Couchman,H.1996,preprint astro-ph/9604071.Davis,M.&Peebles,P.J.E.1983,ApJ,267,465Dyer,C.C.&Roeder,R.C.1972,ApJ,172,L115Frieman,J.,Harari,D.,&Surpi,G.1994,Phys.Rev.D,50,4895Goobar,A.G.&Perlmutter,S.1995,ApJ,450,14Guzzo,L.,Fisher,K.B.,Strauss,M.S.,Giovanelli,R.,&Haynes,M.P.1995,in Astroph.Lett.&Communications,Proc.of3rd Italian Cosmology Meeting,astro-ph/9503114 Gunn,J.E.1967,ApJ,150,737Hamilton,A.J.S.,Kumar,P.,Lu,E.,&Matthews,A.1991,ApJ,374,L1Hamuy,M.,Phillips,M.M.,Maza,J.,Suntzeff,N.B.,Schommer,R.,&Aviles,R.1995, AJ,109,1Hoeflich,P.,Khokhlov,A.,Wheeler,J.C.,Nomoto,K.,&Thielemann,F.K.1996,in “Thermonuclear Supernovae”eds.Canal,R.,Ruiz-Lapuente,P.,&Isern,J.(Dor-drecht,Kluwer Academic Press),astro-ph/9602017Jain,B.,Mo,H.J.,&White,S.D.M.1995,MNRAS,276,L25Jaroszynski,M.,Park,C.,Paczynski,B.,&Gott,J.R.1990,ApJ,365,22Kaiser,N.1992,ApJ,388,272Kantowski,R.1969,ApJ,166,89Kantowski,R.,Vaughan,T.,&Branch,D.1995,ApJ,447,35Linder,E.,Schneider,P.,&Wagoner,R.1988,ApJ,324,786Livne,E.&Arnett,D.1995,ApJ,452,62Marzke,R.,Geller,M.,da Costa,L.,&Huchra,J.1995,AJ,110,477Mattig,W.1958,Astron.Nachr.,284,109McClelland,J.&Silk,J.1977,ApJ,217,331Nugent,P.,Phillips,M.,Baron,E.,Branch,D.,&Hauschildt,P.1995,ApJ,455,147 Peacock,J.A.&Dodds,S.J.1994,MNRAS,267,1020Peacock,J.A.&Dodds,S.J.1996,preprint,astro-ph/9603031Peebles,P.J.E.1974,A&A,32,1974Peebles,P.J.E.1980,The Large-Scale Structure of the Universe,(Princeton:Princeton University Press)Peebles,P.J.E.1993,Principles of Physical Cosmology,(Princeton:Princeton University Press)Perlmutter,S.,Pennypacker,C.,Goldhaber,G.,Goobar,A.etal.1995,ApJ,440,L41 Perlmutter,S.,etal.1996,in“Thermonuclear Supernovae”eds.Canal,R.,Ruiz-Lapuente, P.,&Isern,J.(Dordrecht,Kluwer Academic Press)Pyne,T.&Birkinshaw,M.1996,ApJ,458,46Rauch,K.1991,ApJ,374,83Riess,A.G.,Press,W.H.,&Kirshner,R.P.1995,ApJ,438,17Sasaki,M.1993,Prog.Theor.Phys.,90,753Schmidt,B.,etal.1995,IAU Circular No.6160.Seljak,U.1994,ApJ,436,509Seljak,U.1996,ApJ,463,1Sheth,R.&Jain,B.1996,preprint,astro-ph/9602103Schneider,P.&Wagoner,R.1987,ApJ,314,154Somerville,R.,Davis,M.,&Primack,J.1996,preprint astro-ph/9604041.Somerville,R.,Primack,J.,&Nolthenius,R.,preprint astro-ph/9604051.Tinsley,B.M.1972,ApJ,178,319Viana,P.&Liddle,A.1995,preprint astro-ph/9511007.Villumsen,J.,preprint,astro-ph/9503011Wambsganss,J.,Cen,R.,Hu,G.,&Ostriker,J.1996,preprint astro-ph/9607084. Watanabe,K.,Sasaki,M.,&Tomita,K.1992,ApJ,394,38Weinberg,S.1976,ApJ,208,L1White,S.,Efstathiou,G.,&Frenk,C.1993,MNRAS,262,1023.Fig.1.—Fig.1.Dispersion influx(in magnitudes)for a source at redshift z,forΩm,0=0.1, 0.3,1.0.Solid curves assume stable clustering,ǫ=1.2.ForΩm,0=1,dashed curve corresponds to linear theory evolution,ǫ=2,and dotted curve corresponds to‘painted on’structure(ǫ=0).Lower curves markedΩm,0=0.1,0.3are open models;curve marked ‘Λ’is aflat universe withΩΛ=0.7.The small-scale power spectrum has been cut offat k c=1/(100h−1kpc);the dispersion scales asσ∝k0.4c.Fig. 2.—Fig. 2.Flux dispersion vs.redshift for models with the same cosmological parameters as in Fig.1,but for the CDM model of structure formation instead of the phenomenological power-law model.The models are standard CDMΩm,0=1,h=0.5 (solid),aflatΛCDM model withΩΛ=0.7,h=0.7(dotted),and an open CDM model with Ωm,0=0.3,h=0.7(dashed).。
Le´vy flights in external force fields:Langevin and fractional Fokker-Planck equations and their solutionsSune Jespersen *Institute of Physics and Astronomy,University of Århus,8000Århus C,DenmarkRalf Metzler †School of Chemistry,Tel Aviv University,69978Tel Aviv,IsraelHans C.Fogedby ‡Institute of Physics and Astronomy,University of Århus,8000Århus C,Denmarkand NORDITA,Blegdamsvej 17,2100Ko ”benhavn O ”,Denmark͑Received 14October 1998͒We consider Le´vy flights subject to external force fields.This anomalous transport process is described by two approaches,a Langevin equation with Le´vy noise and the corresponding generalized Fokker-Planck equation containing a fractional derivative in space.The cases of free flights,constant force,and linear Hookean force are analyzed in detail,and we corroborate our findings with results from numerical simulations.We discuss the non-Gibbsian character of the stationary solution for the case of the Hookean force,i.e.,the deviation from Boltzmann equilibrium for long times.The possible connection to Tsallis’s q statistics is studied.͓S1063-651X ͑99͒09903-1͔PACS number ͑s ͒:05.40.Fb,05.60.Cd,02.50.Ey,05.70.LnI.INTRODUCTIONIn recent years there has been growing interest in anoma-lous diffusion in various fields of physics and related sci-ences.In one dimension anomalous diffusion is character-ized by a mean square displacement of the form͗͑⌬x ͒2͘ϰ2Dt ␥,͑1͒deviating from the linear dependence on time found for Brownian motion ͓1–3͔.The generalized diffusion constant has the dimension ͓D ͔ϭcm 2sec Ϫ␥.Subdiffusive transport (0Ͻ␥Ͻ1)is encountered in a di-versity of systems,including the charge carrier transport in amorphous semiconductors ͓4,5͔,NMR diffusometry on per-colation structures ͓6͔,and the motion of a bead in a polymer network ͓7͔.On fractal structures in general,subdiffusion prevails due to the occurrence of holes of all length scales ͓2͔.Examples of enhanced diffusion (␥Ͼ1)include tracer particles in vortex arrays in a rotating flow ͓8͔,layered ve-locity fields ͓9͔,and Richardson diffusion ͓10͔.Le´vy flights are used to model a variety of processes such as bulk mediated surface diffusion ͓11͔and applications in porous glasses and eye lenses ͓12͔,transport in micelle sys-tems or heterogeneous rocks ͓13͔,special problems in reac-tion dynamics ͓14͔,single molecule spectroscopy ͓15͔,and even the flight of an albatross ͓16͔.Among the different frameworks for describing anoma-lous diffusion are fractional Brownian motion ͓17͔,the con-tinuous time random walk scheme ͓4,18͔,fractional diffusion equations ͓19,20͔,generalized Langevin and Fokker-Planck equations ͑FPEs ͓͒21–25͔,and generalized thermostatistics ͓26,27͔.Common to all these approaches is the violation of the central limit theorem of probability theory ͓28,29͔,and this is achieved either by correlations or by long-tailed sta-tistics.Le´vy statistics ͓28,29͔is an example of the latter,and has been used extensively to model both enhanced and dis-persive diffusion ͓3,18,30͔.The two most fundamental prop-erties of the Le´vy distributions are the stability under addi-tion,following from the generalized central limit theoremvalid for Le´vy distributions,and the asymptotic power-law decay.These features are responsible for the anomalous character of the diffusion processes we have in mind.A fractional Fokker-Planck equation ͑FFPE ͒describing anomalous transport close to thermal equilibrium was pre-sented recently ͓25͔.Since it describes subdiffusion in the force-free case,it involves a strong,i.e.,slowly decaying,memory.In the present paper,we focus on FFPEs which areconnected with Le´vy flights,and are based on the following Langevin equation for the coordinate x (t )͓21,22͔:d dt x ͑t ͒ϭF ͑x ͒␥mϩ͑t ͒.͑2͒Here,m is the mass of the diffusing particle,and ␥denotes afriction coefficient.F (x )is the external force field.For sim-plicity we shall work in one dimension,with obvious modi-fications in the general case.The noise (t )is the source of the anomalous behavior.We assume (t )to be uncorrelatedat different times,and to obey Le´vy statistics ͓31͔.In Fourier space we thus define the characteristic function p (k )of the noise variablep ͑k ͒ϭ͵d e Ϫik p ͑͒ϭexp ͑ϪD ͉k ͉͒,͑3͒*Electronic address:sune@ifa.au.dk†Electronic address:metzler@post.tau.ac.il ‡Electronic address:fogedby@ifa.au.dkPHYSICAL REVIEW E MARCH 1999VOLUME 59,NUMBER 3PRE 591063-651X/99/59͑3͒/2736͑10͒/$15.002736©1999The American Physical Societywhere0ϽϽ2.The probability density function͑PDF͒p()has an asymptotic power-law behavior according to p()ϳ͉͉Ϫ1Ϫ͓28,29,32͔.Here and in the following,we denote the Fourier transform of a function by using the ex-plicit dependence on the wave number k,and analogously ufor the Laplace transform of a t-dependent function.For the special caseϭ2in Eq.͑3͒,i.e.,for a Gaussian noise,we are led back to the Brownian case.In this case2D is the variance of the PDF,but even in the general case the param-eter D characterizes the width of the PDF in some sense.The Langevin equation,Eq.͑2͒,is a stochastic differentialequation.Often it is more convenient to work with the de-terministic equation for the distribution function,the FPE ͓33,34͔.For the power-law noise(t)defined through Eq.͑3͒,we are led to the following FFPE͓21,22͔:ץץt W͑x,t͒ϭϪץץxͩF͑x͒W͑x,t͒␥mͪϩDٌW͑x,t͒.͑4͒Here,D denotes the generalized diffusion coefficient with the dimension͓D͔ϭcmsecϪ1.The Riesz fractional de-rivative in Eq.͑4͒is defined through its Fourier transform ͓35,36ٌ͔ϭϪ͵d d k͑2͒d e i k•x͉k͉͑5͒in d dimensions.Note that in the FFPE Eq.͑4͒thefirst order differential operator acting upon the force term is not af-fected by the introduction of the Le´vy distribution Eq.͑3͒, see Ref.͓37͔where a unifying derivation of FFPEs from a generalized master equation is discussed.In the following we will consider the FFPE Eq.͑4͒for the three cases of the freeflight Fϭ0,the constant force F(x)ϭF0,and the Hookean force F(x)ϭϪx,comparing to the Brownian case as we go along.We will discuss the differ-ences from the subdiffusive FFPE of Ref.͓25͔,where a frac-tional operator in time is encountered and the spatial part of the standard FPE remains unchanged,as well as the possible connection to Tsallis’s q statistics.Numerical simulations corroborate our theoreticalfindings.We then exemplify the method of solution for the Langevin equation,Eq.͑2͒,for a linear force with an additional drift term.Before drawing the conclusions,we give some remarks on the simulations.Some additional calculations on the nature of the correlation func-tions are presented in the Appendix.II.FREE LE´VY FLIGHTIn this case we have to solve the anomalous diffusion equationץץt W͑x,t͒ϭDٌW͑x,t͒.͑6͒Fourier transforming Eq.͑6͒and utilizing the definition of the fractional Riesz operator,Eq.͑5͒,we haveץץt W͑k,t͒ϭϪD͉k͉W͑k,t͒,͑7͒with the solutionW͑k,t͒ϭeϪDt͉k͉,͑8͒demanding the sharp initial condition x(0)ϭ0,correspond-ing to W(x,0)ϭ␦(x)or W(k,0)ϭparing to Eq.͑3͒we recognize the characteristic function of the Le´vy distri-bution,and we thusfind in real space the stable law L:W͑x,t͒ϭ͑Dt͒Ϫ1/L͉ͩx͉͓Dt͔1/ͪϭ͉x͉H2,21,1͉ͫx͉͑Dt͒1/ͯ͑1,1/͒,͑1,1/2͒͑1,1͒,͑1,1/2͒ͬ.͑9͒In Eq.͑9͒,we have expressed the Le´vy distribution exactly in terms of Fox’s H functions͓38,39͔.This result,Eq.͑9͒,is expected,due to the stable law nature of the underlying Le´vy distribution.The asymptotic behavior of the propagator W(x,t)can be derived from Eq.͑9͒and readsW͑x,t͒ϳDt͉x͉1ϩ͑10͒for͉x͉/͓Dt͔ӷ1,and thus we encounter a divergence of the mean square displacement at all times:͗x2(t)͘ϭϱ.This is intuitively clear due to the occurrence of arbitrarily long jumps in the Le´vyflight,see Fig.1.Mathematically,the divergence is evident from Eq.͑8͒by using the properties of the characteristic function:͗x n(t)͘ϭi n d n W(k,t)/dk n͉kϭ0.In order to extract the scaling form implied by Eq.͑9͒operationally,one could enclose the‘‘walker’’in an imagi-nary growing box͑see Sec.VI͒:͗x2͑t͒͘Lϳ͵L1t1/L2t1/dxx2W͑x,t͒ϳt2/.͑11͒This has been implemented numerically,and,as can be seen from Fig.2,where we have a straight line on a log-log plot of͗x2(t)͘L as a function of t,for afixed,the expected power-law index2/according to Eq.͑11͒is found.This scaling result͗x2(t)͘Lϳt2/is not to be confused with the mean square displacement͗x2(t)͘ϭ͐dxx2W(x,t) FIG.1.Typical Le´vyflight for the Le´vy indexϭ1.4.The clustering is obvious.Each cluster is statistically self-similar to the unmagnified picture.The fractal dimension of theflight is d fϭ͓3,21͔.PRE592737LE´VY FLIGHTS IN EXTERNAL FORCE FIELDS:...ϭϱfor the Le´vy flight,resulting from Eq.͑10͒.However,for Ͼ1,the squared absolute mean͉͗x ͑t ͉͒͘2ϭͩ͵dx ͉x ͉W ͑x ,t ͒ͪ2͑12͒converges,and is proportional to ͗x 2(t )͘L from Eq.͑11͒,seethe discussion in Ref.͓40͔.Figure 3shows this proportion-ality for ϭ1.5͓41͔.In Fig.4,we graph the curve 2/as a function of for avariety of values for the Le ´vy index ,and we obtain excel-lent agreement with Eq.͑11͒.In the case ϭ2we see from Eq.͑8͒and the results of the simulations in Fig.4that the usual Brownian behavior is recovered.Especially,we obtain the mean square displace-ment͗x 2͑t ͒͘ϭ2Dt .͑13͒The properties of free Le´vy flights could also be obtained directly from Eq.͑2͒employing the method of characteristic functions.This view also allows us to extract the distributionof speeds,which turns out to be Le´vy distributed.From this likewise follows the mean kinetic energy,and we have for a finite mass m of the walker in the case Ͻ2͗1m v 2͘ϭϱ.͑14͒III.CONSTANT FORCE:DRIFT AND ACCELERATIONFor a constant force F (x )ϭF 0,the FFPE Eq.͑4͒readsץץt W ͑x ,t ͒ϭϪץץx ͩF 0W ͑x ,t ͒␥mͪϩD ٌW ͑x ,t ͒.͑15͒Returning to the Fourier domain,we recover the equationץץt W ͑k ,t ͒ϭͩϪik F 0␥mϪD ͉k ͉ͪW ͑k ,t ͒,͑16͒which for the propagator,i.e.,W (k ,t )with the initial condi-tion W (k ,0)ϭ1,yieldsW ͑k ,t ͒ϭexp ͩϪt ͫikF 0␥mϩD ͉k ͉ͬͪ.͑17͒This is the same Le´vy distribution as calculated for the free Le´vy flight in Eq.͑8͒,but at the translated coordinate W ͑x ,t ͒ϭW 0ͩx ϪF 0t␥m,t ͪ.͑18͒Here W 0refers to the distribution of the free Le´vy flight.The displacement of the coordinate is due to the balancing of the friction against the imposed constant force,i.e.,␥m v ϭF 0,in the Galilei transformed system x →x ϪF 0t /͓␥m ͔.Clearly,the analytical form for the solution in x space is still given byEq.͑9͒,but now for the translated coordinate.Thus Le´vy flights in a constant force field described by Eq.͑15͒are similar to ͑anomalous ͒diffusion in a constant velocityfieldFIG.2.Function ͗x 2(t )͘L from Eq.͑11͒versus time with ϭ1in a log-log plot.The slope of the straight line is 1.990Ϯ0.028,which is to be compared to the expected value 2/ϭ2.FIG.3.Log-log plot of ͗x 2(t )͘L versus time from Eq.͑11͒͑lower curve ͒,compared to the squared absolute mean ͉͗x (t )͉͘2from Eq.͑12͒͑upper curve ͒for ϭ1.5.The fitted slopes are 1.363Ϯ0.013and 1.364Ϯ0.035,respectively,which is in good agreement with the theoretical value4/3.FIG.4.Graph of the slope 2/according to Eq.͑11͒as a func-tion of the Le´vy index .Note the bend at ϭ2marking the transition to normal diffusion.2738PRE 59SUNE JESPERSEN,RALF METZLER,AND HANS C.FOGEDBY͓42,43͔.The reason we go directly to the steady state de-scribed by ␥m v ϭF 0is due to the omission of the inertial term in the Langevin equation,Eq.͑2͒.By including this term it can be shown that we obtain a transient contribution.Thus the diffusion in a force field is only equivalent to dif-fusion in a constant velocity field for large times,according to the discussion in Ref.͓42͔,see below.In fact the same situation is encountered even for the standard diffusion-advection picture ͓42–44͔.For long waiting periods in the random walk picture for the subdiffusive model,the external force can only act upon the walker,when it is released from a trap,which leads to the sublinear dependence ͗x (t )͘ϰt ␥found in Ref.͓42͔,where ␥denotes the power-law index of the broad waiting time distribution.The Poissonian waiting time distribution,on the other hand,which is typical for the Le´vy flyer,leads to the same behavior,Eq.͑21͒below ͑for 1Ͻр2),as known from the Brownian case,due to the very short waiting periods in between jumps.The long-tailednature of the Le´vy flight,however,makes all higher mo-ments infinite.Both effects lead to an accelerated time de-pendence of the motion.Including the inertial term in the Langevin equation,Eq.͑2͒,changes the solution for short times according tox ͑t ͒ϭF 0m ͩt Ϫ1ϪeϪ␥t␥ͪϩ͵tds ͑s ͒͑1Ϫe Ϫ␥͑t Ϫs ͒͒.͑19͒At times much greater than the characteristic time ␥Ϫ1,we havex ͑t ͒ӍF 0m ␥t ϩ͵tds ͑s ͒,͑20͒which follows from the behavior of the Laplace transform of the convolution integral in the second summand of Eq.͑19͒:͓(u )/u ͔͓1Ϫu /(u ϩ␥)͔ϳ(u )/u in the limit u Ӷ␥.Thus the effect of the inertial term is negligible for times t much greater than ␥Ϫ1.If the first moment exists,that is,for 1Ͻр2,we find from Eq.͑17͒͗x ͑t ͒͘ϭF 0t␥m.͑21͒Only for ϭ2do we have a finite second moment,and the mean square displacement becomes Š͓x (t )Ϫ͗x (t )͔͘2‹ϭ2Dt ,in agreement with Eqs.͑13͒and ͑18͒.For the standard FPE as well as for the subdiffusive FFPE introduced in ͓25͔,one finds the generalized Einstein relation ͓25,45,44͔͗x (t )͘F 0ϭF 0͗x 2(t )͘0/(2k B T ),relating the first moment in the presence of the force F 0to the second mo-ment in the absence of the force.For the Le´vy flight model defined in Eq.͑15͒,only in the Brownian limit ϭ2is this relation satisfied,provided we choose the proper amplitude of the noise,i.e.,take it to be thermal noise:D ϭk B T /͓␥m ͔.Generally since we have a diverging mean square displacement,the generalized Einstein relation does not hold,and we have a violation of the classical fluctuation-dissipation theorem.IV.LINEAR FORCE AND NON-GIBBSIANSTATIONARY SOLUTIONIn the case of ordinary Brownian motion the diffusingparticles can be trapped in a harmonic potential and thus attain an equilibrium distribution with a finite variance ͓44,33͔.More precisely this equilibrium distribution is the Gibbs or Boltzmann distribution,also obtainable from maxi-mizing the Gibbs entropy under the constraints of norm and energy conservation.This property is also fulfilled for sub-diffusive transport in a harmonic potential ͓25͔.For Le´vy flights we shall see that a stationary solution does exist,how-ever,it possesses no finite variance.This deviation from theGibbs-Boltzmann equilibrium implies that Le´vy flights do not describe systems close to thermal equilibrium.For the Hookean force F (x )ϭϪx ,corresponding to theharmonic potential V (x )ϭ12x 2,the FFPE Eq.͑4͒becomesץץt W ͑x ,t ͒ϭץץx ͩ␥mxW ͑x ,t ͒ͪϩD ٌW ͑x ,t ͒.͑22͒In Fourier space,the conjugate equation readsץץt W ͑k ,t ͒ϭϪ␥m k ץץkW ͑k ,t ͒ϪD ͉k ͉W ͑k ,t ͒,͑23͒which can be easily solved by making a transformation of variables ͑applying the method of characteristics ͒W ͑k ,t ͒ϭexp ͩϪ␥mD ͉k ͉͓1Ϫe Ϫt /␥m ͔ͪ.͑24͒This is still a Le´vy distribution,only with a different ‘‘width’’D →(␥mD /)(1Ϫe Ϫt /␥m ),and the exact so-lution in real space can again be obtained from Eq.͑9͒by inserting the time-dependent width.For ϭ2we recover the Brownian results,but in the general case Ͻ2a different situation arises.We always reach a stationary distributionW st ͑k ͒ϭexp ͩϪ␥mD ͉k ͉ͪ,͑25͒but with a diverging mean square.The exact stationary solu-tion in x space can be given in terms of Fox’s H functions:W st ͑x ͒ϭ͉x ͉H 2,21,1͉ͫx ͉D ␥m ͯ͑1,1͒,͑1,/2͒͑1,͒,͑1,/2͒ͬ,͑26͒leading to the asymptotic power-law behavior W st (x )ϳD ␥m /(͉x ͉1ϩ).A numerical result for a simulation of a Le´vy flight in a harmonic potential is shown in Fig.5.The slope is in good agreement with the theoretical prediction.We pause to mention that we could have derived the so-lution,Eq.͑24͒,also by means of a separation ansatz,i.e.,assuming a particular solution of the form W n (x ,t )ϭT n (t )n (x ),as was discussed in Refs.͓25,42͔.Thus we arrive at the ordinary differential equationsddtT ͑t ͒ϭϪn T ͑t ͒,͑27a ͒PRE 592739LE´VY FLIGHTS IN EXTERNAL FORCE FIELDS:...n n ͑x ͒ϩd dx ͫ␥mx n ͑x ͒ͬϩD ٌn ͑x ͒ϭ0,͑27b ͒with the eigenvalue n .The complete solution is then givenby the sum W (x ,t )ϭ͚n ϭ0ϱW n (x ,t ).For the time behavior we find the usual exponentially decaying modes,T n (t )ϭe Ϫn t ,and for the spatial eigenfunction we haven ͑k ͒ϭc n ͉k ͉n ␥m /e Ϫ͉k ͉/.͑28͒The eigenvalues are given by n ϭ(/␥m )n ,and the com-plete solution in wave number space is given by W ͑k ,t ͒ϭeϪD ␥m ͉k ͉/[]͚n ϭ0ϱ1n !ͩD ␥m ͪn͉k ͉n e Ϫn t /[␥m ].͑29͒The sum converges to Eq.͑24͒,and can be transformed back to real space by use of the Fox functions:W ͑x ,t ͒ϭ͚n ϭ0ϱ1n !ͩD ␥mͪn͉x ͉n /[␥m ]e Ϫn t /[␥m ]ϫH 2,21,1ͫD ␥m ͉x ͉ͯ͑1Ϫn /␥m ,1͒,͑1,/2͒͑1,͒,͑1,/2͒ͬ.͑30͒In comparison to Risken’s result for the Fokker-Planck equa-tion in a harmonic potential in the Brownian case ͓33͔,alsoreferred to as the Ornstein-Uhlenbeck process,the eigenval-ues in the solution,Eq.͑30͒,for ϭ2take on only even numbers.This is due to our consideration of the start in the origin,so that all the uneven Hermite polynomials occurring in the solution given in Ref.͓33͔vanish:H 2n ϩ1(0)ϭ0.We note that the Fox functions in Eq.͑30͒can be considered as generalized Hermite polynomials.Clearly,the stationary so-lution corresponding to 0ϭ0obtained from Eq.͑30͒is thesame stable law as in Eq.͑26͒.Also it can be seen that only in the Brownian case ϭ2do we recover the Boltzmanndistribution W (x )ϰe Ϫx 2/[2k B T ].Thus Boltzmann equilib-rium with a finite variance is not reached,in spite of the fact that the system is isolated and time independent in respect to the ensemble.This can be physically understood as a conse-quence of the diverging mean kinetic energy of the free Le´vy flight.In this case we haveddtv ͑t ͒ϭϪ␥v ͑t ͒ϩ␥͑t ͒.͑31͒If we take the noise to be white according to Eq.͑3͒with ϭ2,we get from Eq.͑31͒that in the stationary state͗v 2͘ϭD ␥⇔͗E kin ͘ϭm ␥D2.͑32͒When the external harmonic potential is turned on,a length scale is introduced by the comparison ͗E kin ͘Ϸ͗E pot ͘ϭ12͗x 2͘.This means that ͗x 2͘ϷmD ␥/.In fact,solving Eq.͑2͒in the Brownian case with the harmonic potential,we obtain exactly ͗x 2͘ϭmD ␥/,in accordance with the equi-partition theorem.Similar considerations remain valid for the subdiffusive FFPE from Ref.͓42͔,which thus describes anomalous systems close to thermal equilibrium.However,in the Le´vy case we have ͗E kin ͘ϭϱ,͑33͒and therefore the length scale which appears when the po-tential is introduced is also diverging.Consequently the question arises whether other statistics could predict the equilibrium distribution in the present context.Here we con-sider the recently proposed Tsallis q entropy ͓26͔,according to which the generalized entropyS q ͓p ͑x ,v ͔͒ϭ1Ϫ͵p q ͑x ,v ͒dxd v q Ϫ1͑34͒is introduced along with the generalized constraints ͑see ͓46͔͒͵p ͑x ,v ͒dxd v ϭ1,͑35a ͒͵p q ͑x ,v ͒E ͑x ,v ͒dxd v ϭU .͑35b ͒For q →1,S q recovers the usual Boltzmann entropy.Here v is the velocity of the particle and E (x ,v )is its energy.Thus Eq.͑35b ͒is a generalized constraint of conservation of energy along with the usual norm conservation,Eq.͑35a ͒.Varying Eq.͑34͒subject to these constraints by introducing Lagrange multipliers,one obtains the stationary distributionp q ͑x ,v ͒ϳͩ1Ϫ͑1Ϫq ͒2͓x 2ϩm v 2͔ͪ1/͑1Ϫq ͒.͑36͒Integrating over all velocities to obtain the distribution of positions aloneyieldsFIG.5.Histogram for the stationary solution W st of a Le´vy flight in a harmonic potential versus ͉x ͉,as a plot of ln W st (y )ver-sus y ,where y ϭln(x )denotes the natural logarithm of the positionof the flyer,see Sec.VI.The data were produced for a Le´vy index ϭ1.4.The fit indicated by the dashed line reveals a slope of Ϫ1.408Ϯ0.108,which thus shows a good agreement with the the-oretical prediction Ϫ.2740PRE 59SUNE JESPERSEN,RALF METZLER,AND HANS C.FOGEDBYp q͑x͒ϳͩ1Ϫ͑1Ϫq͒2x2ͪ͑3Ϫq͒/͑2Ϫ2q͒,͑37͒compare to Ref.͓27͔.Matching this expression to the asymptotic behavior of the stationary solution of the FFPE, Eq.͑25͒,we obtainϭ4Ϫ2qqϪ1.͑38͒The relationship Eq.͑38͒implies that q can range in the interval(1,2).Equation͑38͒is at variance with the relation ϭ(3Ϫq free)/(q freeϪ1)found in the case of the free Le´vyflight͓27,30͔,where the allowed range is q free(5/3,3). Note that the q-relation does not involve the potential strength.Thus,the Tsallis index q makes a jump for a given Le´vy index,when the potential is switched off,ir-respectively of how slowly,e.g.,quasistatically,the limit→0is performed.Moreover,away from the asymptotic re-gime,the solution predicted by the entropy Eq.͑34͒does not agree with the solution found in Eq.͑25͒in the stationary state.Thus we conclude that the Tsallis entropy is not the appropriate framework for Le´vyflights in a harmonic poten-tial described by the generalized Fokker-Planck equation, Eq.͑4͒.This form of a generalized entropy does not give rise to the solution in Eq.͑25͒.Recently it has also been shown that Tsallis q entropy is one out of an entire family of gen-eralized entropies with similar properties͓47͔,so that it would have been rather surprising if this special case of en-tropy had led to the complete description according to the Le´vyflight model.Finally,the FFPE Eq.͑4͒being linear is not compatible with the nonextensive nature of Tsallis en-tropy͓26͔;compare with the nonlinear diffusion equation derived from Tsallis entropy in Ref.͓48͔.By comparing the distribution of the particle in the har-monic potential W(x,t)with that of the freeflight W0(x,t), we obtain the correspondenceW͑x,t͒ϭW0͑x,t eff͒,͑39͒where we definet effϵm␥͑1ϪeϪt/[␥m]͒.͑40͒Thus the distribution of the particle position in a harmonic potential can be obtained from the distribution in the free Le´vyflight case at an earlier,‘‘effective’’time t eff.This comparison illustrates the slowing down of the particle in the harmonic potential,where the restoring force is centered to-wards the origin.It characterizes in a precise way the ap-proach to stationarity,which is graphed in Fig.6.Further-more,this approach is seen to take longer the smalleris; the quickest relaxation occurs in the Brownian caseϭ2.We conclude this section by some remarks about the in-ertial term encountered for the Hookean force.There are two time scales͑decay times͒involved in this case:sϪ1ϵ␥2ͩ1Ϫͱ1Ϫ4m␥2ͪ,͑41͒fϪ1ϵ␥2ͩ1ϩͱ1Ϫ4m␥2ͪ.͑42͒Considering only the case of large overdamping,i.e.,␥2ӷ4/m,the two time scales separate into a fast and a slowdecaying mode,f→␥Ϫ1ands→␥m/,withfӶs.Itcan be easily shown that neglecting the fast mode for timeslong compared tof corresponds to neglecting the inertialterm in the original Langevin equation.Thus we have aneffective separation into three different regimes,and being inthe second withfӶtϽs,the approach to the stationarystate is not influenced by the omission of the inertial term.V.SOLUTION OF THE LANGEVIN EQUATIONAll of the above results could have been reached equallywell directly from the Langevin equation,Eq.͑2͒.To illus-trate this,we solve this equation for a Le´vyflight in a con-stant forcefield in addition to a linear force,F(x)ϭϪxϩF0corresponding,for instance,to a harmonic potentialand a superimposed gravityfield.However,it could alsocorrespond to many harmonic oscillators placed at differentpositions,i.e.,F͑x͒ϭ͚iϭ1NϪi͑xϪx i͒ϭϪ͚ͩiϭ1Niͪxϩ͚iϭ1Ni x iϵϪxϩF0.͑43͒The solution of Eq.͑2͒is given byx͑t͒ϭeϪt/[␥m]͵0t dtЈetЈ/[␥m]ͩ͑tЈ͒ϩF0␥mͪ.͑44͒The distribution can then be found using theidentityFIG.6.The linear time affecting the free walker͑dashed line͒,compared to the effective time sensed by the random walker in theharmonic potential,as a function of the laboratory time.For theLe´vyflight in the potential,the restoring Hookean force slowsdown the spreading of the diffusion,and eventually brings it to ahalt.PRE592741LE´VY FLIGHTS IN EXTERNAL FORCE FIELDS:...p ͑x ,t ͒ϭ͗␦…x Ϫx ͑t ͒…͘ϭ͵dk͑2͒͗exp ͕ik ͓x Ϫx ͑t ͔͖͒͘ϭ͵dk ͑2͒e ikxp ͑k ,t ͒.͑45͒Using the solution Eq.͑44͒to obtain,for the characteristic function p (k ,t ),p ͑k ,t ͒ϭͳexp ͩϪie Ϫt /[␥m ]k͵tdt Јe t Ј/[␥m ]ϫͫ͑t Ј͒ϩF 0␥mͬͪʹ,͑46a ͒we have after discretizing the integral p ͑k ,t ͒Ӎe ϪikF 0/͑1ϪeϪt /[␥m ]͒ϫͳ͟t Јϭ0texp ͓Ϫie Ϫ͑t Ϫt Ј͒/[␥m ]⌬k ͑t Ј͔͒ʹ͑46b ͒orp ͑k ,t ͒Ӎe ϪikF 0/͑1ϪeϪt /[␥m ]͒ϫ͟t Јϭ0t͗exp ͓Ϫie Ϫ͑t Ϫt Ј͒/[␥m ]⌬k ͑t Ј͔͒͘.͑46c ͒Using the definition of Le´vy noise from Eq.͑3͒,we obtain p ͑k ,t ͒Ӎe ϪikF 0/͑1ϪeϪt /[␥m ]͒ϫ͟t Јϭ0texp ͑ϪDe Ϫ͑t Ϫt Ј͒/[␥m ]⌬͉k ͉͒.͑46d ͒Reintroducing the integrals and using the same renormaliza-tion D ⌬Ϫ1→D as in passing from Eq.͑2͒to Eq.͑4͒,wefinally havep ͑k ,t ͒ϭe ϪikF 0/͑1ϪeϪt /[␥m ]͒ϫexp ͩϪD ␥m ͓1ϪeϪt /[␥m ]͔͉k ͉ͪ.͑47͒For ϭ0,we recover the constant force result,Eq.͑17͒.ForF 0ϭ0we get Eq.͑24͒.The free Le´vy flight result W 0(x ,t )according to Eq.͑8͒is likewise reproduced when ϭF 0ϭ0.In fact,by comparing to Eq.͑8͒we have the correspon-denceW ,F 0͑x ,t ͒ϭW 0ͩx ϪF 0͓1Ϫe Ϫt /[␥m ]͔,ϫ␥m͓1Ϫe Ϫt /[␥m ]͔ͪ.͑48͒In the presence of the harmonic potential we can no longer simply make a Galilean transformation in order to eliminate the constant force,since the presence of the linear force singles out a special reference frame.VI.SOME REMARKS ON THE NUMERICALSIMULATIONSUsing a computer code written in C ,we have simulated Le´vy flights in two dimensions in order to compare with the theoretical predictions.The amplitude of the noise has beendefined by the asymptotics of the Le´vy distribution to p ͑͒ϭ0Ϫ1Ϫ.͑49͒We have basically investigated three properties:͑1͒histo-grams of the distribution of position for the free walker,͑2͒histograms of the walker in a harmonic potential,and ͑3͒the dynamic exponent as defined via the imaginary box in Eq.͑11͒.The imaginary box according to Eq.͑11͒grows in time like the characteristic width of the stable distribution,͗x 2(t )͘L ϳt 2/,which is not the variance.It gives a measure,that a finite portion of the probability is gathered within a given interval,which we call the imaginary box.The values of L 1and L 2have been chosen so as to ͑1͒ensure that we are in the asymptotic regime,where W (x ,t )ϳt ͉x ͉Ϫ1Ϫ,and ͑2͒to produce good statistics for all values of .When fitting the results to a straight line on a log-log plot,we have se-lected a subset of equidistant points from the entire set of data,in order not to favor the high t region over the low t region.Concerning the histogram,several precautions have to be taken when working with power-law statistics.First of all,due to the occurrence of arbitrarily long steps,we have to define the interval of sampling beforehand,since this is the only way to improve the statistics when increasing the num-ber of samples.We have chosen a minimum limit for the evaluated data points to ensure the asymptotic range.The maximum limit has been chosen as the maximum of the first,say,100͑out of a total of 10000͒simulations.In this way we obtain many data points throughout the entire region,where the asymptotic power-law expression is valid.We close with a remark on the axes of the plots of the histograms.To obtain equidistant points on the log-log plots,we chose to graph the histograms of the distributions p (y )of y ,where y ϭln x .If now p (x )ϳ͉x ͉Ϫ1Ϫ,we have p (y )ϳe Ϫy .Plotting the logarithm of p (y )as a function of y ,we find a straight line with slope Ϫ,which is not to be con-fused with the slope Ϫ1Ϫfor the log-log plot of p (x ).VII.CONCLUSIONSWe have investigated Le´vy flights under the influence of external force fields.Especially for the cases of free flights,a constant and a linear force,explicit solutions are derived and the consequences shown.The solutions can be reduced to a transformation of variables in the free flight result.We have employed the approaches of FFPEs and a Langevin equation with a power-law noise term.We have shown that the clas-sical fluctuation-dissipation theorem is violated in the case of constant force,and exhibited the non-Gibbsian nature of the2742PRE 59SUNE JESPERSEN,RALF METZLER,AND HANS C.FOGEDBY。
A Survey of Clustering Data Mining TechniquesPavel BerkhinYahoo!,Inc.pberkhin@Summary.Clustering is the division of data into groups of similar objects.It dis-regards some details in exchange for data simplifirmally,clustering can be viewed as data modeling concisely summarizing the data,and,therefore,it re-lates to many disciplines from statistics to numerical analysis.Clustering plays an important role in a broad range of applications,from information retrieval to CRM. Such applications usually deal with large datasets and many attributes.Exploration of such data is a subject of data mining.This survey concentrates on clustering algorithms from a data mining perspective.1IntroductionThe goal of this survey is to provide a comprehensive review of different clus-tering techniques in data mining.Clustering is a division of data into groups of similar objects.Each group,called a cluster,consists of objects that are similar to one another and dissimilar to objects of other groups.When repre-senting data with fewer clusters necessarily loses certainfine details(akin to lossy data compression),but achieves simplification.It represents many data objects by few clusters,and hence,it models data by its clusters.Data mod-eling puts clustering in a historical perspective rooted in mathematics,sta-tistics,and numerical analysis.From a machine learning perspective clusters correspond to hidden patterns,the search for clusters is unsupervised learn-ing,and the resulting system represents a data concept.Therefore,clustering is unsupervised learning of a hidden data concept.Data mining applications add to a general picture three complications:(a)large databases,(b)many attributes,(c)attributes of different types.This imposes on a data analysis se-vere computational requirements.Data mining applications include scientific data exploration,information retrieval,text mining,spatial databases,Web analysis,CRM,marketing,medical diagnostics,computational biology,and many others.They present real challenges to classic clustering algorithms. These challenges led to the emergence of powerful broadly applicable data2Pavel Berkhinmining clustering methods developed on the foundation of classic techniques.They are subject of this survey.1.1NotationsTo fix the context and clarify terminology,consider a dataset X consisting of data points (i.e.,objects ,instances ,cases ,patterns ,tuples ,transactions )x i =(x i 1,···,x id ),i =1:N ,in attribute space A ,where each component x il ∈A l ,l =1:d ,is a numerical or nominal categorical attribute (i.e.,feature ,variable ,dimension ,component ,field ).For a discussion of attribute data types see [106].Such point-by-attribute data format conceptually corresponds to a N ×d matrix and is used by a majority of algorithms reviewed below.However,data of other formats,such as variable length sequences and heterogeneous data,are not uncommon.The simplest subset in an attribute space is a direct Cartesian product of sub-ranges C = C l ⊂A ,C l ⊂A l ,called a segment (i.e.,cube ,cell ,region ).A unit is an elementary segment whose sub-ranges consist of a single category value,or of a small numerical bin.Describing the numbers of data points per every unit represents an extreme case of clustering,a histogram .This is a very expensive representation,and not a very revealing er driven segmentation is another commonly used practice in data exploration that utilizes expert knowledge regarding the importance of certain sub-domains.Unlike segmentation,clustering is assumed to be automatic,and so it is a machine learning technique.The ultimate goal of clustering is to assign points to a finite system of k subsets (clusters).Usually (but not always)subsets do not intersect,and their union is equal to a full dataset with the possible exception of outliersX =C 1 ··· C k C outliers ,C i C j =0,i =j.1.2Clustering Bibliography at GlanceGeneral references regarding clustering include [110],[205],[116],[131],[63],[72],[165],[119],[75],[141],[107],[91].A very good introduction to contem-porary data mining clustering techniques can be found in the textbook [106].There is a close relationship between clustering and many other fields.Clustering has always been used in statistics [10]and science [158].The clas-sic introduction into pattern recognition framework is given in [64].Typical applications include speech and character recognition.Machine learning clus-tering algorithms were applied to image segmentation and computer vision[117].For statistical approaches to pattern recognition see [56]and [85].Clus-tering can be viewed as a density estimation problem.This is the subject of traditional multivariate statistical estimation [197].Clustering is also widelyA Survey of Clustering Data Mining Techniques3 used for data compression in image processing,which is also known as vec-tor quantization[89].Datafitting in numerical analysis provides still another venue in data modeling[53].This survey’s emphasis is on clustering in data mining.Such clustering is characterized by large datasets with many attributes of different types. Though we do not even try to review particular applications,many important ideas are related to the specificfields.Clustering in data mining was brought to life by intense developments in information retrieval and text mining[52], [206],[58],spatial database applications,for example,GIS or astronomical data,[223],[189],[68],sequence and heterogeneous data analysis[43],Web applications[48],[111],[81],DNA analysis in computational biology[23],and many others.They resulted in a large amount of application-specific devel-opments,but also in some general techniques.These techniques and classic clustering algorithms that relate to them are surveyed below.1.3Plan of Further PresentationClassification of clustering algorithms is neither straightforward,nor canoni-cal.In reality,different classes of algorithms overlap.Traditionally clustering techniques are broadly divided in hierarchical and partitioning.Hierarchical clustering is further subdivided into agglomerative and divisive.The basics of hierarchical clustering include Lance-Williams formula,idea of conceptual clustering,now classic algorithms SLINK,COBWEB,as well as newer algo-rithms CURE and CHAMELEON.We survey these algorithms in the section Hierarchical Clustering.While hierarchical algorithms gradually(dis)assemble points into clusters (as crystals grow),partitioning algorithms learn clusters directly.In doing so they try to discover clusters either by iteratively relocating points between subsets,or by identifying areas heavily populated with data.Algorithms of thefirst kind are called Partitioning Relocation Clustering. They are further classified into probabilistic clustering(EM framework,al-gorithms SNOB,AUTOCLASS,MCLUST),k-medoids methods(algorithms PAM,CLARA,CLARANS,and its extension),and k-means methods(differ-ent schemes,initialization,optimization,harmonic means,extensions).Such methods concentrate on how well pointsfit into their clusters and tend to build clusters of proper convex shapes.Partitioning algorithms of the second type are surveyed in the section Density-Based Partitioning.They attempt to discover dense connected com-ponents of data,which areflexible in terms of their shape.Density-based connectivity is used in the algorithms DBSCAN,OPTICS,DBCLASD,while the algorithm DENCLUE exploits space density functions.These algorithms are less sensitive to outliers and can discover clusters of irregular shape.They usually work with low-dimensional numerical data,known as spatial data. Spatial objects could include not only points,but also geometrically extended objects(algorithm GDBSCAN).4Pavel BerkhinSome algorithms work with data indirectly by constructing summaries of data over the attribute space subsets.They perform space segmentation and then aggregate appropriate segments.We discuss them in the section Grid-Based Methods.They frequently use hierarchical agglomeration as one phase of processing.Algorithms BANG,STING,WaveCluster,and FC are discussed in this section.Grid-based methods are fast and handle outliers well.Grid-based methodology is also used as an intermediate step in many other algorithms (for example,CLIQUE,MAFIA).Categorical data is intimately connected with transactional databases.The concept of a similarity alone is not sufficient for clustering such data.The idea of categorical data co-occurrence comes to the rescue.The algorithms ROCK,SNN,and CACTUS are surveyed in the section Co-Occurrence of Categorical Data.The situation gets even more aggravated with the growth of the number of items involved.To help with this problem the effort is shifted from data clustering to pre-clustering of items or categorical attribute values. Development based on hyper-graph partitioning and the algorithm STIRR exemplify this approach.Many other clustering techniques are developed,primarily in machine learning,that either have theoretical significance,are used traditionally out-side the data mining community,or do notfit in previously outlined categories. The boundary is blurred.In the section Other Developments we discuss the emerging direction of constraint-based clustering,the important researchfield of graph partitioning,and the relationship of clustering to supervised learning, gradient descent,artificial neural networks,and evolutionary methods.Data Mining primarily works with large databases.Clustering large datasets presents scalability problems reviewed in the section Scalability and VLDB Extensions.Here we talk about algorithms like DIGNET,about BIRCH and other data squashing techniques,and about Hoffding or Chernoffbounds.Another trait of real-life data is high dimensionality.Corresponding de-velopments are surveyed in the section Clustering High Dimensional Data. The trouble comes from a decrease in metric separation when the dimension grows.One approach to dimensionality reduction uses attributes transforma-tions(DFT,PCA,wavelets).Another way to address the problem is through subspace clustering(algorithms CLIQUE,MAFIA,ENCLUS,OPTIGRID, PROCLUS,ORCLUS).Still another approach clusters attributes in groups and uses their derived proxies to cluster objects.This double clustering is known as co-clustering.Issues common to different clustering methods are overviewed in the sec-tion General Algorithmic Issues.We talk about assessment of results,de-termination of appropriate number of clusters to build,data preprocessing, proximity measures,and handling of outliers.For reader’s convenience we provide a classification of clustering algorithms closely followed by this survey:•Hierarchical MethodsA Survey of Clustering Data Mining Techniques5Agglomerative AlgorithmsDivisive Algorithms•Partitioning Relocation MethodsProbabilistic ClusteringK-medoids MethodsK-means Methods•Density-Based Partitioning MethodsDensity-Based Connectivity ClusteringDensity Functions Clustering•Grid-Based Methods•Methods Based on Co-Occurrence of Categorical Data•Other Clustering TechniquesConstraint-Based ClusteringGraph PartitioningClustering Algorithms and Supervised LearningClustering Algorithms in Machine Learning•Scalable Clustering Algorithms•Algorithms For High Dimensional DataSubspace ClusteringCo-Clustering Techniques1.4Important IssuesThe properties of clustering algorithms we are primarily concerned with in data mining include:•Type of attributes algorithm can handle•Scalability to large datasets•Ability to work with high dimensional data•Ability tofind clusters of irregular shape•Handling outliers•Time complexity(we frequently simply use the term complexity)•Data order dependency•Labeling or assignment(hard or strict vs.soft or fuzzy)•Reliance on a priori knowledge and user defined parameters •Interpretability of resultsRealistically,with every algorithm we discuss only some of these properties. The list is in no way exhaustive.For example,as appropriate,we also discuss algorithms ability to work in pre-defined memory buffer,to restart,and to provide an intermediate solution.6Pavel Berkhin2Hierarchical ClusteringHierarchical clustering builds a cluster hierarchy or a tree of clusters,also known as a dendrogram.Every cluster node contains child clusters;sibling clusters partition the points covered by their common parent.Such an ap-proach allows exploring data on different levels of granularity.Hierarchical clustering methods are categorized into agglomerative(bottom-up)and divi-sive(top-down)[116],[131].An agglomerative clustering starts with one-point (singleton)clusters and recursively merges two or more of the most similar clusters.A divisive clustering starts with a single cluster containing all data points and recursively splits the most appropriate cluster.The process contin-ues until a stopping criterion(frequently,the requested number k of clusters) is achieved.Advantages of hierarchical clustering include:•Flexibility regarding the level of granularity•Ease of handling any form of similarity or distance•Applicability to any attribute typesDisadvantages of hierarchical clustering are related to:•Vagueness of termination criteria•Most hierarchical algorithms do not revisit(intermediate)clusters once constructed.The classic approaches to hierarchical clustering are presented in the sub-section Linkage Metrics.Hierarchical clustering based on linkage metrics re-sults in clusters of proper(convex)shapes.Active contemporary efforts to build cluster systems that incorporate our intuitive concept of clusters as con-nected components of arbitrary shape,including the algorithms CURE and CHAMELEON,are surveyed in the subsection Hierarchical Clusters of Arbi-trary Shapes.Divisive techniques based on binary taxonomies are presented in the subsection Binary Divisive Partitioning.The subsection Other Devel-opments contains information related to incremental learning,model-based clustering,and cluster refinement.In hierarchical clustering our regular point-by-attribute data representa-tion frequently is of secondary importance.Instead,hierarchical clustering frequently deals with the N×N matrix of distances(dissimilarities)or sim-ilarities between training points sometimes called a connectivity matrix.So-called linkage metrics are constructed from elements of this matrix.The re-quirement of keeping a connectivity matrix in memory is unrealistic.To relax this limitation different techniques are used to sparsify(introduce zeros into) the connectivity matrix.This can be done by omitting entries smaller than a certain threshold,by using only a certain subset of data representatives,or by keeping with each point only a certain number of its nearest neighbors(for nearest neighbor chains see[177]).Notice that the way we process the original (dis)similarity matrix and construct a linkage metric reflects our a priori ideas about the data model.A Survey of Clustering Data Mining Techniques7With the(sparsified)connectivity matrix we can associate the weighted connectivity graph G(X,E)whose vertices X are data points,and edges E and their weights are defined by the connectivity matrix.This establishes a connection between hierarchical clustering and graph partitioning.One of the most striking developments in hierarchical clustering is the algorithm BIRCH.It is discussed in the section Scalable VLDB Extensions.Hierarchical clustering initializes a cluster system as a set of singleton clusters(agglomerative case)or a single cluster of all points(divisive case) and proceeds iteratively merging or splitting the most appropriate cluster(s) until the stopping criterion is achieved.The appropriateness of a cluster(s) for merging or splitting depends on the(dis)similarity of cluster(s)elements. This reflects a general presumption that clusters consist of similar points.An important example of dissimilarity between two points is the distance between them.To merge or split subsets of points rather than individual points,the dis-tance between individual points has to be generalized to the distance between subsets.Such a derived proximity measure is called a linkage metric.The type of a linkage metric significantly affects hierarchical algorithms,because it re-flects a particular concept of closeness and connectivity.Major inter-cluster linkage metrics[171],[177]include single link,average link,and complete link. The underlying dissimilarity measure(usually,distance)is computed for every pair of nodes with one node in thefirst set and another node in the second set.A specific operation such as minimum(single link),average(average link),or maximum(complete link)is applied to pair-wise dissimilarity measures:d(C1,C2)=Op{d(x,y),x∈C1,y∈C2}Early examples include the algorithm SLINK[199],which implements single link(Op=min),Voorhees’method[215],which implements average link (Op=Avr),and the algorithm CLINK[55],which implements complete link (Op=max).It is related to the problem offinding the Euclidean minimal spanning tree[224]and has O(N2)complexity.The methods using inter-cluster distances defined in terms of pairs of nodes(one in each respective cluster)are called graph methods.They do not use any cluster representation other than a set of points.This name naturally relates to the connectivity graph G(X,E)introduced above,because every data partition corresponds to a graph partition.Such methods can be augmented by so-called geometric methods in which a cluster is represented by its central point.Under the assumption of numerical attributes,the center point is defined as a centroid or an average of two cluster centroids subject to agglomeration.It results in centroid,median,and minimum variance linkage metrics.All of the above linkage metrics can be derived from the Lance-Williams updating formula[145],d(C iC j,C k)=a(i)d(C i,C k)+a(j)d(C j,C k)+b·d(C i,C j)+c|d(C i,C k)−d(C j,C k)|.8Pavel BerkhinHere a,b,c are coefficients corresponding to a particular linkage.This formula expresses a linkage metric between a union of the two clusters and the third cluster in terms of underlying nodes.The Lance-Williams formula is crucial to making the dis(similarity)computations feasible.Surveys of linkage metrics can be found in [170][54].When distance is used as a base measure,linkage metrics capture inter-cluster proximity.However,a similarity-based view that results in intra-cluster connectivity considerations is also used,for example,in the original average link agglomeration (Group-Average Method)[116].Under reasonable assumptions,such as reducibility condition (graph meth-ods satisfy this condition),linkage metrics methods suffer from O N 2 time complexity [177].Despite the unfavorable time complexity,these algorithms are widely used.As an example,the algorithm AGNES (AGlomerative NESt-ing)[131]is used in S-Plus.When the connectivity N ×N matrix is sparsified,graph methods directly dealing with the connectivity graph G can be used.In particular,hierarchical divisive MST (Minimum Spanning Tree)algorithm is based on graph parti-tioning [116].2.1Hierarchical Clusters of Arbitrary ShapesFor spatial data,linkage metrics based on Euclidean distance naturally gener-ate clusters of convex shapes.Meanwhile,visual inspection of spatial images frequently discovers clusters with curvy appearance.Guha et al.[99]introduced the hierarchical agglomerative clustering algo-rithm CURE (Clustering Using REpresentatives).This algorithm has a num-ber of novel features of general importance.It takes special steps to handle outliers and to provide labeling in assignment stage.It also uses two techniques to achieve scalability:data sampling (section 8),and data partitioning.CURE creates p partitions,so that fine granularity clusters are constructed in parti-tions first.A major feature of CURE is that it represents a cluster by a fixed number,c ,of points scattered around it.The distance between two clusters used in the agglomerative process is the minimum of distances between two scattered representatives.Therefore,CURE takes a middle approach between the graph (all-points)methods and the geometric (one centroid)methods.Single and average link closeness are replaced by representatives’aggregate closeness.Selecting representatives scattered around a cluster makes it pos-sible to cover non-spherical shapes.As before,agglomeration continues until the requested number k of clusters is achieved.CURE employs one additional trick:originally selected scattered points are shrunk to the geometric centroid of the cluster by a user-specified factor α.Shrinkage suppresses the affect of outliers;outliers happen to be located further from the cluster centroid than the other scattered representatives.CURE is capable of finding clusters of different shapes and sizes,and it is insensitive to outliers.Because CURE uses sampling,estimation of its complexity is not straightforward.For low-dimensional data authors provide a complexity estimate of O (N 2sample )definedA Survey of Clustering Data Mining Techniques9 in terms of a sample size.More exact bounds depend on input parameters: shrink factorα,number of representative points c,number of partitions p,and a sample size.Figure1(a)illustrates agglomeration in CURE.Three clusters, each with three representatives,are shown before and after the merge and shrinkage.Two closest representatives are connected.While the algorithm CURE works with numerical attributes(particularly low dimensional spatial data),the algorithm ROCK developed by the same researchers[100]targets hierarchical agglomerative clustering for categorical attributes.It is reviewed in the section Co-Occurrence of Categorical Data.The hierarchical agglomerative algorithm CHAMELEON[127]uses the connectivity graph G corresponding to the K-nearest neighbor model spar-sification of the connectivity matrix:the edges of K most similar points to any given point are preserved,the rest are pruned.CHAMELEON has two stages.In thefirst stage small tight clusters are built to ignite the second stage.This involves a graph partitioning[129].In the second stage agglomer-ative process is performed.It utilizes measures of relative inter-connectivity RI(C i,C j)and relative closeness RC(C i,C j);both are locally normalized by internal interconnectivity and closeness of clusters C i and C j.In this sense the modeling is dynamic:it depends on data locally.Normalization involves certain non-obvious graph operations[129].CHAMELEON relies heavily on graph partitioning implemented in the library HMETIS(see the section6). Agglomerative process depends on user provided thresholds.A decision to merge is made based on the combinationRI(C i,C j)·RC(C i,C j)αof local measures.The algorithm does not depend on assumptions about the data model.It has been proven tofind clusters of different shapes,densities, and sizes in2D(two-dimensional)space.It has a complexity of O(Nm+ Nlog(N)+m2log(m),where m is the number of sub-clusters built during the first initialization phase.Figure1(b)(analogous to the one in[127])clarifies the difference with CURE.It presents a choice of four clusters(a)-(d)for a merge.While CURE would merge clusters(a)and(b),CHAMELEON makes intuitively better choice of merging(c)and(d).2.2Binary Divisive PartitioningIn linguistics,information retrieval,and document clustering applications bi-nary taxonomies are very useful.Linear algebra methods,based on singular value decomposition(SVD)are used for this purpose in collaborativefilter-ing and information retrieval[26].Application of SVD to hierarchical divisive clustering of document collections resulted in the PDDP(Principal Direction Divisive Partitioning)algorithm[31].In our notations,object x is a docu-ment,l th attribute corresponds to a word(index term),and a matrix X entry x il is a measure(e.g.TF-IDF)of l-term frequency in a document x.PDDP constructs SVD decomposition of the matrix10Pavel Berkhin(a)Algorithm CURE (b)Algorithm CHAMELEONFig.1.Agglomeration in Clusters of Arbitrary Shapes(X −e ¯x ),¯x =1Ni =1:N x i ,e =(1,...,1)T .This algorithm bisects data in Euclidean space by a hyperplane that passes through data centroid orthogonal to the eigenvector with the largest singular value.A k -way split is also possible if the k largest singular values are consid-ered.Bisecting is a good way to categorize documents and it yields a binary tree.When k -means (2-means)is used for bisecting,the dividing hyperplane is orthogonal to the line connecting the two centroids.The comparative study of SVD vs.k -means approaches [191]can be used for further references.Hier-archical divisive bisecting k -means was proven [206]to be preferable to PDDP for document clustering.While PDDP or 2-means are concerned with how to split a cluster,the problem of which cluster to split is also important.Simple strategies are:(1)split each node at a given level,(2)split the cluster with highest cardinality,and,(3)split the cluster with the largest intra-cluster variance.All three strategies have problems.For a more detailed analysis of this subject and better strategies,see [192].2.3Other DevelopmentsOne of early agglomerative clustering algorithms,Ward’s method [222],is based not on linkage metric,but on an objective function used in k -means.The merger decision is viewed in terms of its effect on the objective function.The popular hierarchical clustering algorithm for categorical data COB-WEB [77]has two very important qualities.First,it utilizes incremental learn-ing.Instead of following divisive or agglomerative approaches,it dynamically builds a dendrogram by processing one data point at a time.Second,COB-WEB is an example of conceptual or model-based learning.This means that each cluster is considered as a model that can be described intrinsically,rather than as a collection of points assigned to it.COBWEB’s dendrogram is calleda classification tree.Each tree node(cluster)C is associated with the condi-tional probabilities for categorical attribute-values pairs,P r(x l=νlp|C),l=1:d,p=1:|A l|.This easily can be recognized as a C-specific Na¨ıve Bayes classifier.During the classification tree construction,every new point is descended along the tree and the tree is potentially updated(by an insert/split/merge/create op-eration).Decisions are based on the category utility[49]CU{C1,...,C k}=1j=1:kCU(C j)CU(C j)=l,p(P r(x l=νlp|C j)2−(P r(x l=νlp)2.Category utility is similar to the GINI index.It rewards clusters C j for in-creases in predictability of the categorical attribute valuesνlp.Being incre-mental,COBWEB is fast with a complexity of O(tN),though it depends non-linearly on tree characteristics packed into a constant t.There is a similar incremental hierarchical algorithm for all numerical attributes called CLAS-SIT[88].CLASSIT associates normal distributions with cluster nodes.Both algorithms can result in highly unbalanced trees.Chiu et al.[47]proposed another conceptual or model-based approach to hierarchical clustering.This development contains several different use-ful features,such as the extension of scalability preprocessing to categori-cal attributes,outliers handling,and a two-step strategy for monitoring the number of clusters including BIC(defined below).A model associated with a cluster covers both numerical and categorical attributes and constitutes a blend of Gaussian and multinomial models.Denote corresponding multivari-ate parameters byθ.With every cluster C we associate a logarithm of its (classification)likelihoodl C=x i∈Clog(p(x i|θ))The algorithm uses maximum likelihood estimates for parameterθ.The dis-tance between two clusters is defined(instead of linkage metric)as a decrease in log-likelihoodd(C1,C2)=l C1+l C2−l C1∪C2caused by merging of the two clusters under consideration.The agglomerative process continues until the stopping criterion is satisfied.As such,determina-tion of the best k is automatic.This algorithm has the commercial implemen-tation(in SPSS Clementine).The complexity of the algorithm is linear in N for the summarization phase.Traditional hierarchical clustering does not change points membership in once assigned clusters due to its greedy approach:after a merge or a split is selected it is not refined.Though COBWEB does reconsider its decisions,its。
A monothetic clustering method∗Marie Chavent(*)(**)(*)INRIA Rocquencourt,Action SODAS,Domaine de Voluceau,B.P.105,78153Le Chesnay cedex,France(**)Universit´e de Paris IX Dauphine,Lise Ceremade,Place du Mar´e chal De Lattre de Tassigny,75775Paris cedex16,Francee-mail:Marie.Chavent@inria.frAbstract:The proposed divisive clustering method performs simultaneously a hierarchy of a set of objects and a monothetic characterization of each cluster of the hierarchy.A division is performed according to the within-cluster inertia criterion which is minimized among the bipartitions induced by a set of binary questions.In order to improve the clustering,the algorithm revises at each step the division which has induced the cluster chosen for division.Key Words:Hierarchical clustering methods,Monothetic cluster,Inertia criterion1.IntroductionThe objective of cluster analysis is to group a setΩof N objects into clusters having the property that objects in the same cluster are similar to another and different from objects of other clusters.In the pattern recognition literature(Duda and Hart,1973)this type of problem is referred to as unsupervised pattern recognition.The most common clustering methods are partitioning,hierarchical agglomerative and hierarchical divisive ones.A partition ofΩis a list(C1,...,C K)of clusters verifying C1∪...∪C K=Ωand C k∩C k =∅for all k=k .The essence of partitioning is the optimization an objective function measuring the homogeneity within the clusters and/or the separation between the clusters.Algorithms of the exchange type are frequently used tofind a local optimum of the objective function, because of the complexity of the exact algorithms.Well-known partitioning procedures are the Forgy’s k-means and the isodata methods,described in Anderberg(1973),and the dynamical clustering method(Diday,1974).Agglomerative and divisive hierarchical clustering methods are different,in the type of structure they are searching,from partitioning.Indeed,a hierarchy ofΩis a family H of clusters satisfying Ω∈H,{ω}∈H for allω∈Ωand A∩B∈{∅,A,B}for all A,B∈H.A hierarchy can be represented in the form of a tree or dendogram,that shows how the clusters are hierarchically organized.The general algorithm for agglomerative clustering starts with N clusters,each consisting of ∗Pattern Recognition Letters19(1998)989-9961one element ofΩ,and merges successively two clusters on the basis of a similarity measure. Well-known agglomerative hierarchical methods are described in Everitt(1974).Divisive hierarchical clustering reverses the process of agglomerative hierarchical clustering,by starting with all objects in one cluster,and dividing successively each cluster into smaller ones. Those methods are usually iterative and determine at each iteration the cluster to be divided and the subdivision of this cluster.This process is continued until suitable stopping rule arrests further division.There is a variety of divisive clustering methods(Kaufman and Rousseeuw,1990).A natural approach of dividing a cluster C of n objects into two non-empty subsets would be to consider all the possible bipartitions.In this,Edward and Cavalli-Sforza(1965)choose among the2n−1−1 possible bipartitions of C,the one having the smallest within-cluster sum of squares.It is clear that such complete enumeration procedure provides a global optimum but is computationally prohibitive.Neverless,it is possible to construct divisive clustering methods that does not consider all bipar-titions.MacNaughton-Smith(1964)proposed an iterative divisive procedure using an average dissimilarity between an object and a group of objects.Chidananda Gowda and Krishna(1978) proposed a disaggregative clustering method based on the concept of mutual nearest neighbor-hood.Other methods taking as input a dissimilarity matrix are based on the optimization of criterions like the split or the diameter of the bipartition(Gu´e noche,Hansen and Jaumard,1991; Wang,Yan and Sriskandarajah,1996).Probabilistic validation approach for divisive clustering has also been proposed(Har-even and Brailovsky,1995).Another family of divisive clustering methods is monothetic.A cluster is called monothetic if a conjunction of logical properties is both necessary and sufficient for membership in the cluster (Sneath and Sokal,1973).Indeed,each division is carried out using a single variable and by separating objects possessing some specified values of this variable from those lacking them. Monothetic divisive clustering methods havefirst been proposed in the particular case of binary data(Williams and Lambert,1959;Lance and Williams,1968).Since then,monothetic cluster-ing methods have mostly been developed in thefield of unsupervised learning and are known as descendant conceptual clustering methods(Michalski,Diday and Stepp,1981;Michalski and Stepp,1983).In thefield of discriminant analysis,monothetic divisive methods have also been widely devel-oped.However,those methods are different from clustering in which the clusters are inferred from data.Indeed,a partition ofΩis pre-defined and the problem concerns the construction of a systematic way of predicting the class membership of a new object.In the pattern recognition literature,this type of classification is referred to as supervised pattern recognition.Divisive methods of this type are usually known as tree structured classifier like cart(Breiman,Fried-man,Olshen and Stone,1984)or id3(Quinlan,1986).Recently,Ciampi(1994)insisted on the idea that trees offer a natural approach for both class formation(clustering)and development of classification rules(discrimination).The clustering method proposed in this paper was developed in the framework of symbolic data analysis(Diday,1995),which aims at bringing together data analysis and machine learning. More precisely,we propose a monothetic hierarchical clustering method performed in the spirit of cart from an unsupervised point of view.We have restricted the presentation of this method to the particular case of quantitative data.At each stage,the division of a cluster is performed according to the within-cluster inertia criterion(section??).This criterion is minimized among bipartitions induced by a set of binary questions(section??).Moreover,clusters are not sys-2tematically divided but one of them is chosen according to a specific criterion(section??).The divisions are stopped after a number of iterations given as input by the user,usually interested in few clusters partitions.The output of this divisive clustering method is an indexed hierarchy. It is also a decision tree(section??).The Ruspini’s data are given as afirst illustration of this method(section??).We propose a modification of the algorithm in order to soften the property shared by both agglomerative and divisive hierarchical methods,that efficient early partition cannot be corrected at a later stage.It consists in revising,after the division of a cluster,the previous division which has induced the cluster itself(section??).Before the conclusion(section ??),the method is performed on Fisher’s iris dataset(section??).2.The inertia criterionLet N be the number of objects inΩ.Each object is described on p real variables Y1,...,Y p by a vector x i∈R p and weighted by a real value p i(i=1,...,N).Indeed,the analyst will prefer sometimes to weight the objects differently.For instance,countries could be weighted according to the size of their population.But usually,the weights are equal to1or equal to1n.The inertia I of a cluster C k is an homogeneity measure equal to:I(C k)=xi ∈C kp i d2M(x i,x k)(1)where d M is the Euclidean distance(M is a symmetric matrix positively defined):∀x,y∈R p,d2M(x,y)=(x−y)t M(x−y)(2) and x k is the center of gravity of the cluster C k:x k=1µkxi∈C kp i x i(3)µk=xi ∈C kp i(4)The within-cluster inertia W of a K-clusters-partition P K=(C1,...,C K)is equal to:W(P K)=Kk=1I(C k)(5)According to the Huygens Theorem,minimizing the within-cluster inertia of a partition(e.g. the homogeneity within the clusters)is equivalent to maximizing the between-cluster inertia (e.g.the separation between the clusters).This equals to:B(W K)=Kk=1µk d2M(x k,x)(6)3.Bipartitioning a clusterLet C be a set of n objects.We want tofind a bipartition(C1,C2)of C such that the within-cluster inertia is minimum.In the Edward and Cavalli-Sforza method(1965)one chooses the optimal bipartition(C1,C2)among the2n−1−1possible bipartitions.It is clear that the amount of calculation needed when n is large will be prohibitive.3In our approach,to reduce the complexity,we divide C according to a binary question(Breiman, Friedman,Olshen and Stone,1984)of the form“Y i≤c?”where Y i:Ω→R is a real variable and c∈R is called the cut point.The bipartition(C1,C2)induced by the binary question is defined as follows.Letωbe an object in C.If Y i(ω)≤c thenω∈C1elseω∈C2.Those objects in C answering“yes”go to the left descendant cluster and those answering“no”to the right descendant cluster(Fig.??).Figure1:“Is height≤172?”For each variable Y i,there will be at most n−1different bipartitions(C1,C2)induced by the above procedure.Indeed,whatever the cut point c between two consecutive observations Y i(ω) may be,the bipartition induced is the same.In order to ask only n−1questions to generate all these bipartitions,we decide to use the n−1cut points c,chosen as the middle of two consecutive observations Y i(ω)∈R.Indeed,if the n observations Y i(ω)are different,there are n−1cut points on Y i.If there are p variables,we choose among the p(n−1)corresponding bipartitions(C1,C2),the bipartition having the smallest within-cluster inertia.4.Choice of the clusterLet P K=(C1,...,C K)be a K-clusters-partition ofΩ.At each stage,a new(K+1)-clusters-partition is obtained by dividing a cluster C k∈P K into two new clusters C1k and C2k.Thepurpose is to choose the cluster C k∈P K so that the new partition,P K+1=P K∪{C1k,C2k}−{C k}has minimum within-cluster inertia.We know that:W(P K+1)=W(P K)−I(C k)+I(C1k)+I(C2k)In this,minimizing W(P K+1)is equivalent to choosing the cluster C k∈P K so that the differencebetween the inertia of C k and the within-cluster inertia of its bipartition(C1k ,C2k)is maximum.The criterion used to determine the cluster that will be divided is then equal to:∆(C k)=I(C k)−I(C1k)−I(C2k)(7) Of course,it means that the bipartitions of all the clusters of the partition P K have been definedpreviously.At each stage,the bipartitions of the two new clusters C1k and C2kare defined andused in the next stage.5.The stopping rule and the outputThe divisions are stopped after a number L of iterations and L is given as input by the user, usually interested in few clusters partitions.Indeed,the last partition obtained in the last iteration is a L+1-clusters-partition.The issue of stopping the divisions before obtaining the4total hierarchy(L=N)is to ensure that the partitions of smallest within-cluster inertia of the total hierarchy are still in the hierarchy obtained after L iterations.This property is verified because the clusters are not systematically divided but one cluster is chosen according to the criterion∆given in(??)which ensures that the partition induced by this division has minimum within-cluster inertia.However,this stopping rule doesn’t solve the issue of determining the number of clusters in the dataset(Milligan and Cooper,1985).The output of this divisive clustering method is a hierarchy H which singletons are the L+1 clusters of the partition obtained in the last iteration of the algorithm.Each cluster C k∈H is indexed by∆(C k).Because∆is a non-decreasing mapping,C k⊂C k ⇒∆(C k)≤∆(C k )(8)there will be no inversions in the dendogram of the hierarchy.This hierarchy is also a decision tree.The L clusters are the leaves and the nodes are the binary questions selected by the algorithm.Each cluster is characterized by a rule defined according to the binary questions leading from the root to the corresponding leaves.6.A simple exampleThe dataset is75points of R2(Ruspini,1970).Wefind successively a partition in2,3and4 clusters(L=3).At thefirst stage,the method induces2(75−1)=148bipartitions.We choose among the 148bipartitions(C1,C2),the one of smallest within-cluster inertia.It has been induced by the binary question“Is Y1≤75.5?”.Notice that the number of subdivisions has been reduced from 275−1=3,77×1022to148.At the second stage,we have to choose whether we divide C1or C2.Here,we choose the cluster C1and its bipartition(C11,C21)because∆(C1)>∆(C2).The binary question is“Is Y2≤54?”. At the third stage,we choose the cluster C2and its bipartition(C12,C22).The binary question is“Is Y2≤75.5?”.Finally,the divisive algorithm gives the4clusters represented Fig.??.Figure2:The4-clusters partitionAccording to the dendogram of the hierarchy givenfigure??,the four clusters are characterized by four rules.For instance cluster C11is characterized by the following rule:If[Y1(ω)≤75,5]and[Y2(ω)≤54]thenω∈C11.5Figure3:The dendogram of the indexed hierarchyThis dendogram can be read as a decision tree and the rules can be read as classification rules of new objects to one of the four clusters.7.Revising a binary questionThe purpose is to enable the analyst to revise at each division of a cluster the binary question which has induced the cluster itself.Let C be a cluster which has been divided in two clusters C and C according to the binary question“Is Y1≤c1?”.Then C is chosen to be divided in two clusters C1and C2according to the binary question“Is Y2≤c2?”.Figure4:Revising a binary questionAt this stage,the binary question“Is Y1≤c1?”is revised by modifying the cut point c1.We choose a new cut point c among all possible cut points on Y1,such that the3-clusters-partition (C 1,C 2,C )induced by“Is Y1≤c ?”and“Is Y2≤c2?”has minimum within-cluster inertia (figure??).For instance,figure??gives the3-clusters-partition of320points of R2simulated from four 2-dimensional Gaussian distributions.The points have been dividedfirst according to the binary question“Is Y2≤10,9?”and then according to the binary question“Is Y1≤8?”.Thefirst cut point10.9is then modified in order tofind,with the second binary question “Is Y1≤8?”,the3-clusters-partition of minimum within-cluster inertia.The new cut point is 12.1(figure??).8.The Fisher’s iris datasetThe above clustering method has been examined with the well-known Fisher’s iris dataset.The length and breadth of both petals and sepals were measured on150flowers.There are three varieties of iris:Setoa,Versicolor and Virginia.There are50iris of each variety.6Figure5:The two3-clusters-partitionsOf course,the knowledge of this pre-defined3-clusters-partition is not used in our unsupervised clustering procedure which is performed only with four quantitative variables:the petal width (PeWi),the petal length(PeLe),the sepal width(SeWi)and the sepal length(SeLe).First,we have used the Euclidean distance d M,with M=I,the identity matrix.Figure?? gives the dendogram of the hierarchy and the3-clusters-partition(C1,C2,C3)obtained after two divisions of the dataset.Thefirst cluster is composed of53iris including50Setoa,3Versicolor and no Virginia.Wholly,the3-clusters-partition contains19iris misclassified.Thefirst binary question“Is PeLe≤3.4”is then revised in order to improve the within-cluster inertia of the3-clusters-partition.Figure??gives the dendogram of the hierarchy obtained with the revised binary question“Is PeLe≤2.45”.We can notice that the mis-classifications have been reduced to16.Indeed,the50Versicolor are all in C2.Figure6:Before the revision Figure7:After the revisionDynamical clustering and Ward agglomerative hierarchical clustering methods have also been performed on the same dataset.The same distance was used.The partitions obtained with the two clustering methods contained the same number of mis-classifications since16iris were misclassified.where U i is Secondly,we have used the normalized Euclidean distance d M,with M=D1/U2ithe length between the maximum and the minimum value for the variable Y i.Thefigure?? gives the dendogram of the hierarchy obtained with this distance and we notice a reduction of the number of mis-classifications from16to10iris.It confirms the influence of the choice of the distance in the result of a clustering.Then,before the second division,we have normalized the Euclidean distance,according to the four length U i computed locally in the cluster which7Figure8:Global normalization Figure9:Local normalizationwas divided.Thefigure??gives the dendogram of the hierarchy obtained with the locally normalized Euclidean distance.We can notice that the number of mis-classifications is now reduced to6.It corresponds to an error rate of0.04.In their comparative study of the performance of different classifiers with Fisher’s iris dataset, Weiss&Kulikowski(1991)give for the cart decision tree an error rate equal to0.04.In this,we obtain with the Fisher’s iris dataset comparable results with both unsupervised and supervised approaches.However,the goal of the proposed clustering method and the cart algorithm are different since we aim at inferring clusters from the data and cart algorithm aims at discovering classification rules.9.ConclusionThe proposed clustering method has the advantages to be simple and to give simultaneously a hierarchy and a simple interpretation of its cluster.Moreover,it deals easily with very large datasets.Indeed,is possible to construct the hierarchy on a sample of the dataset,and to use the classification rules to assign the rest of the objects.This method has also given good results on the Fisher’s iris dataset and on other real applications where it has been compared with the dynamical clustering method and the Ward agglomerative hierarchical method(Chavent,1997). However,dividing a cluster according to a single variable can also be a deficiency in some situa-tions.As for cart algorithm,in situations where the cluster structure depends on combinations of variables,the divisive method will do poorly at discovering the structure.A perspective would be on the one hand to use a local stopping rule(Milligan and Cooper,1985; Har-even and Brailovsky,1995)for deciding if a cluster should be divided into two subclusters and on the other hand to divide a cluster according to a metric locally defined in the cluster itself.ReferencesAnderberg,M.R.(1973).Cluster analysis for applications.Academic Press,New York. Breiman,L.,J.H.Friedman,R.A.Olshen and C.J.Stone(1984).Classification and regression Trees.C.A:Wadsworth.8Chavent,M.(1997).Analyse des Donn´e es Symboliques.Une m´e those divisive de classification.PhD Thesis,Universit´e Paris-IX Dauphine,France.Chidananda Gowda,K.and G.Krishna(1978).Disaggregative Clustering Using the Concept of Mutual Nearest Neighborhood.ieee Transactions on Systems,Man,and Cybernetics 8,888-895.Ciampi,A.(1994).Classification and Discrimination:the recpam Approach.In proc.of compstat’94,129-147.Diday,E.(1974).Optimization in non-hierarchical clustering.Pattern Recognition6,17-33. Diday,E.(1995).Probabilist,possibilist and belief objects for knowledge analysis.Annals of Operations Research55,227-276.Duda,R.O.and Hart,P.E.(1973).Pattern Classification and Scene Analysis.Wiley,New York.Edwards,A.W.F.and L.L.Cavalli-Sforza(1965).A method for cluster analysis.Biometrics 21,362-375.Everitt,B.(1974).Cluster Analysis.Social Sciences Research Council,Heineman Educational Books.Gu´e noche,A.,P.Hansen and B.Jaumard(1991).Efficient algorithms for divisive hierarchical clustering.Journal of Classification8,5-30.Har-even,M.and V.L.Brailvosky(1995).Probabilistic validation approach for clustering.Pattern Recognition16,1189-1196.Kaufman,L.and P.J.Rousseeuw(1990).Finding groups in data.Wiley,New York. Lance,G.N.and W.T.Williams(1968).Note on a new information statistic classification program.The Computer Journal11,195-197.MacNaughton-Smith,P.(1964).Dissimilarity analysis:A new technique of hierarchical sub-division.Nature202,1034-1035.Michalski,R.S.and E.Diday,R.Stepp(1981).A recent advance in data analysis:Clustering objects into classes characterized by conjunctive concepts.Progress in Pattern Recognition, L.N.Kanal and A.Rosenfeld(eds),North Holland,33-56.Michalski,R.S.and R.Stepp(1983).Learning from observations:Conceptual clustering.Machine Learning:An Artificial Intelligence Approach,R.S.Michalsky,J.Carbonell and T.Mitchell(eds),163-190.Milligan,G.W.and M.C.Cooper(1985).An examination of procedures for determining the number of clusters in a data set.Psychometrika50,159-179.Quinlan,J.R.(1986).Induction of decision trees.Machine Learning1,81-106.Ruspini,E.M.(1970).Numerical Methods for Fuzzy rmation Science2,319-350.Sneath,P.H.and R.R.Sokal(1973).Numerical Taxonomy.Freeman and company,San Fran-cisco.Weiss,S.M.and C.A.Kulikowski(1991).Computer systems that learn:Classification and prediction methods from statistics,neural network,machine learning,and expert systems.San Mateo,Calif:Morgan Kaufmann.Williams,W.T.and mbert(1959).Multivariate methods in plant ecology.Journal of Ecology47,83-101.Wang,Y.and H.Yan,C.Sriskandarajah(1996).The weighted Sum of Split and Diameter Clustering.Journal of Classification13,231-248.9List of Figures10。
Package‘BClustLonG’October12,2022Type PackageTitle A Dirichlet Process Mixture Model for Clustering LongitudinalGene Expression DataVersion0.1.3Author Jiehuan Sun[aut,cre],Jose D.Herazo-Maya[aut],Naftali Kaminski[aut],Hongyu Zhao[aut],and Joshua L.Warren[aut], Maintainer Jiehuan Sun<*********************>Description Many clustering methods have been proposed,butmost of them cannot work for longitudinal gene expression data.'BClustLonG'is a package that allows us to perform clustering analysis forlongitudinal gene expression data.It adopts a linear-mixed effects frameworkto model the trajectory of genes over time,while clustering is jointlyconducted based on the regression coefficients obtained from all genes.To account for the correlations among genes and alleviate thehigh dimensionality challenges,factor analysis models are adoptedfor the regression coefficients.The Dirichlet process prior distributionis utilized for the means of the regression coefficients to induce clustering.This package allows users to specify which variables to use for clustering(intercepts or slopes or both)and whether a factor analysis model is desired.More de-tails about this method can be found in Jiehuan Sun,et al.(2017)<doi:10.1002/sim.7374>. License GPL-2Encoding UTF-8LazyData trueDepends R(>=3.4.0),MASS(>=7.3-47),lme4(>=1.1-13),mcclust(>=1.0)Imports Rcpp(>=0.12.7)Suggests knitr,latticeVignetteBuilder knitrLinkingTo Rcpp,RcppArmadilloRoxygenNote7.1.0NeedsCompilation yes12BClustLonGRepository CRANDate/Publication2020-05-0704:10:02UTCR topics documented:BClustLonG (2)calSim (3)data (4)Index5 BClustLonG A Dirichlet process mixture model for clustering longitudinal gene ex-pression data.DescriptionA Dirichlet process mixture model for clustering longitudinal gene expression data.UsageBClustLonG(data=NULL,iter=20000,thin=2,savePara=FALSE,infoVar=c("both","int")[1],factor=TRUE,hyperPara=list(v1=0.1,v2=0.1,v=1.5,c=1,a=0,b=10,cd=1,aa1=2, aa2=1,alpha0=-1,alpha1=-1e-04,cutoff=1e-04,h=100) )Argumentsdata Data list with three elements:Y(gene expression data with each column being one gene),ID,and years.(The names of the elements have to be matachedexactly.See the data in the example section more info)iter Number of iterations(excluding the thinning).thin Number of thinnings.savePara Logical variable indicating if all the parameters needed to be saved.Default value is FALSE,in which case only the membership indicators are saved.infoVar Either"both"(using both intercepts and slopes for clustering)or"int"(using only intercepts for clustering)factor Logical variable indicating whether factor analysis model is wanted.hyperPara A list of hyperparameters with default values.calSim3 Valuereturns a list with following objects.e.mat Membership indicators from all iterations.All other parametersonly returned when savePara=TRUE.ReferencesJiehuan Sun,Jose D.Herazo-Maya,Naftali Kaminski,Hongyu Zhao,and Joshua L.Warren."A Dirichlet process mixture model for clustering longitudinal gene expression data."Statistics in Medicine36,No.22(2017):3495-3506.Examplesdata(data)##increase the number of iterations##to ensure convergence of the algorithmres=BClustLonG(data,iter=20,thin=2,savePara=FALSE,infoVar="both",factor=TRUE)##discard the first10burn-ins in the e.mat##and calculate similarity matrix##the number of burn-ins has be chosen s.t.the algorithm is converged.mat=calSim(t(res$e.mat[,11:20]))clust=maxpear(mat)$cl##the clustering results.##Not run:##if only want to include intercepts for clustering##set infoVar="int"res=BClustLonG(data,iter=10,thin=2,savePara=FALSE,infoVar="int",factor=TRUE)##if no factor analysis model is wanted##set factor=FALSEres=BClustLonG(data,iter=10,thin=2,savePara=FALSE,infoVar="int",factor=TRUE)##End(Not run)calSim Function to calculate the similarity matrix based on the cluster mem-bership indicator of each iteration.DescriptionFunction to calculate the similarity matrix based on the cluster membership indicator of each itera-tion.4dataUsagecalSim(mat)Argumentsmat Matrix of cluster membership indicator from all iterationsExamplesn=90##number of subjectsiters=200##number of iterations##matrix of cluster membership indicators##perfect clustering with three clustersmat=matrix(rep(1:3,each=n/3),nrow=n,ncol=iters)sim=calSim(t(mat))data Simulated dataset for testing the algorithmDescriptionSimulated dataset for testing the algorithmUsagedata(data)FormatAn object of class list of length3.Examplesdata(data)##this is the required data input formathead(data.frame(ID=data$ID,years=data$years,data$Y))Index∗datasetsdata,4BClustLonG,2calSim,3data,45。
自动化专业英语词汇表acceleration transducer 加速度传感器acceptance testing 验收测试accessibility 可及性accumulated error 累积误差AC-DC-AC frequency converter 交-直-交变频器AC (alternating current) electric drive 交流电子传动active attitude stabilization 主动姿态稳定actuator 驱动器,执行机构adaline 线性适应元adaptation layer 适应层adaptive telemeter system 适应遥测系统adjoint operator 伴随算子admissible error 容许误差aggregation matrix 集结矩阵AHP (analytic hierarchy process) 层次分析法amplifying element 放大环节analog-digital conversion 模数转换annunciator 信号器antenna pointing control 天线指向控制anti-integral windup 抗积分饱卷aperiodic decomposition 非周期分解a posteriori estimate 后验估计approximate reasoning 近似推理a priori estimate 先验估计articulated robot 关节型机器人assignment problem 配置问题,分配问题associative memory model 联想记忆模型associatron 联想机asymptotic stability 渐进稳定性attained pose drift 实际位姿漂移attitude acquisition 姿态捕获AOCS (attritude and orbit control system) 姿态轨道控制系统attitude angular velocity 姿态角速度attitude disturbance 姿态扰动attitude maneuver 姿态机动attractor 吸引子augment ability 可扩充性augmented system 增广系统automatic manual station 自动-手动操作器automaton 自动机autonomous system 自治系统backlash characteristics 间隙特性base coordinate system 基座坐标系Bayes classifier 贝叶斯分类器bearing alignment 方位对准bellows pressure gauge 波纹管压力表benefit-cost analysis 收益成本分析bilinear system 双线性系统biocybernetics 生物控制论biological feedback system 生物反馈系统black box testing approach 黑箱测试法blind search 盲目搜索block diagonalization 块对角化Boltzman machine 玻耳兹曼机bottom-up development 自下而上开发boundary value analysis 边界值分析brainstorming method 头脑风暴法breadth-first search 广度优先搜索butterfly valve 蝶阀CAE (computer aided engineering) 计算机辅助工程CAM (computer aided manufacturing) 计算机辅助制造Camflex valve 偏心旋转阀canonical state variable 规范化状态变量capacitive displacement transducer 电容式位移传感器capsule pressure gauge 膜盒压力表CARD 计算机辅助研究开发Cartesian robot 直角坐标型机器人cascade compensation 串联补偿catastrophe theory 突变论centrality 集中性chained aggregation 链式集结chaos 混沌characteristic locus 特征轨迹chemical propulsion 化学推进calrity 清晰性classical information pattern 经典信息模式classifier 分类器clinical control system 临床控制系统closed loop pole 闭环极点closed loop transfer function 闭环传递函数cluster analysis 聚类分析coarse-fine control 粗-精控制cobweb model 蛛网模型coefficient matrix 系数矩阵cognitive science 认知科学cognitron 认知机coherent system 单调关联系统combination decision 组合决策combinatorial explosion 组合爆炸combined pressure and vacuum gauge 压力真空表command pose 指令位姿companion matrix 相伴矩阵compartmental model 房室模型compatibility 相容性,兼容性compensating network 补偿网络compensation 补偿,矫正compliance 柔顺,顺应composite control 组合控制computable general equilibrium model 可计算一般均衡模型conditionally instability 条件不稳定性configuration 组态connectionism 连接机制connectivity 连接性conservative system 守恒系统consistency 一致性constraint condition 约束条件consumption function 消费函数context-free grammar 上下文无关语法continuous discrete event hybrid system simulation 连续离散事件混合系统仿真continuous duty 连续工作制control accuracy 控制精度control cabinet 控制柜controllability index 可控指数controllable canonical form 可控规范型[control] plant 控制对象,被控对象controlling instrument 控制仪表control moment gyro 控制力矩陀螺control panel 控制屏,控制盘control synchro 控制[式]自整角机control system synthesis 控制系统综合control time horizon 控制时程cooperative game 合作对策coordinability condition 可协调条件coordination strategy 协调策略coordinator 协调器corner frequency 转折频率costate variable 共态变量cost-effectiveness analysis 费用效益分析coupling of orbit and attitude 轨道和姿态耦合critical damping 临界阻尼critical stability 临界稳定性cross-over frequency 穿越频率,交越频率current source inverter 电流[源]型逆变器cut-off frequency 截止频率cybernetics 控制论cyclic remote control 循环遥控cylindrical robot 圆柱坐标型机器人damped oscillation 阻尼振荡damper 阻尼器damping ratio 阻尼比data acquisition 数据采集data encryption 数据加密data preprocessing 数据预处理data processor 数据处理器DC generator-motor set drive 直流发电机-电动机组传动D controller 微分控制器decentrality 分散性decentralized stochastic control 分散随机控制decision space 决策空间decision support system 决策支持系统decomposition-aggregation approach 分解集结法decoupling parameter 解耦参数deductive-inductive hybrid modeling method 演绎与归纳混合建模法delayed telemetry 延时遥测derivation tree 导出树derivative feedback 微分反馈describing function 描述函数desired value 希望值despinner 消旋体destination 目的站detector 检出器deterministic automaton 确定性自动机deviation 偏差deviation alarm 偏差报警器DFD 数据流图diagnostic model 诊断模型diagonally dominant matrix 对角主导矩阵diaphragm pressure gauge 膜片压力表difference equation model 差分方程模型differential dynamical system 微分动力学系统differential game 微分对策differential pressure level meter 差压液位计differential pressure transmitter 差压变送器differential transformer displacement transducer 差动变压器式位移传感器differentiation element 微分环节digital filer 数字滤波器digital signal processing 数字信号处理digitization 数字化digitizer 数字化仪dimension transducer 尺度传感器direct coordination 直接协调disaggregation 解裂discoordination 失协调discrete event dynamic system 离散事件动态系统discrete system simulation language 离散系统仿真语言discriminant function 判别函数displacement vibration amplitude transducer 位移振幅传感器dissipative structure 耗散结构distributed parameter control system 分布参数控制系统distrubance 扰动disturbance compensation 扰动补偿diversity 多样性divisibility 可分性domain knowledge 领域知识dominant pole 主导极点dose-response model 剂量反应模型dual modulation telemetering system 双重调制遥测系统dual principle 对偶原理dual spin stabilization 双自旋稳定duty ratio 负载比dynamic braking 能耗制动dynamic characteristics 动态特性dynamic deviation 动态偏差dynamic error coefficient 动态误差系数dynamic exactness 动它吻合性dynamic input-output model 动态投入产出模型econometric model 计量经济模型economic cybernetics 经济控制论economic effectiveness 经济效益economic evaluation 经济评价economic index 经济指数economic indicator 经济指标eddy current thickness meter 电涡流厚度计effectiveness 有效性effectiveness theory 效益理论elasticity of demand 需求弹性electric actuator 电动执行机构electric conductance levelmeter 电导液位计electric drive control gear 电动传动控制设备electric hydraulic converter 电-液转换器electric pneumatic converter 电-气转换器electrohydraulic servo vale 电液伺服阀electromagnetic flow transducer 电磁流量传感器electronic batching scale 电子配料秤electronic belt conveyor scale 电子皮带秤electronic hopper scale 电子料斗秤elevation 仰角emergency stop 异常停止empirical distribution 经验分布endogenous variable 内生变量equilibrium growth 均衡增长equilibrium point 平衡点equivalence partitioning 等价类划分ergonomics 工效学error 误差error-correction parsing 纠错剖析estimate 估计量estimation theory 估计理论evaluation technique 评价技术event chain 事件链evolutionary system 进化系统exogenous variable 外生变量expected characteristics 希望特性external disturbance 外扰fact base 事实failure diagnosis 故障诊断fast mode 快变模态feasibility study 可行性研究feasible coordination 可行协调feasible region 可行域feature detection 特征检测feature extraction 特征抽取feedback compensation 反馈补偿feedforward path 前馈通路field bus 现场总线finite automaton 有限自动机FIP (factory information protocol) 工厂信息协议first order predicate logic 一阶谓词逻辑fixed sequence manipulator 固定顺序机械手fixed set point control 定值控制FMS (flexible manufacturing system) 柔性制造系统flow sensor/transducer 流量传感器flow transmitter 流量变送器fluctuation 涨落forced oscillation 强迫振荡formal language theory 形式语言理论formal neuron 形式神经元forward path 正向通路forward reasoning 正向推理fractal 分形体,分维体frequency converter 变频器frequency domain model reduction method 频域模型降阶法frequency response 频域响应full order observer 全阶观测器functional decomposition 功能分解FES (functional electrical stimulation) 功能电刺激functional simularity 功能相似fuzzy logic 模糊逻辑game tree 对策树gate valve 闸阀general equilibrium theory 一般均衡理论generalized least squares estimation 广义最小二乘估计generation function 生成函数geomagnetic torque 地磁力矩geometric similarity 几何相似gimbaled wheel 框架轮global asymptotic stability 全局渐进稳定性global optimum 全局最优globe valve 球形阀goal coordination method 目标协调法grammatical inference 文法推断graphic search 图搜索gravity gradient torque 重力梯度力矩group technology 成组技术guidance system 制导系统gyro drift rate 陀螺漂移率gyrostat 陀螺体Hall displacement transducer 霍尔式位移传感器hardware-in-the-loop simulation 半实物仿真harmonious deviation 和谐偏差harmonious strategy 和谐策略heuristic inference 启发式推理hidden oscillation 隐蔽振荡hierarchical chart 层次结构图hierarchical planning 递阶规划hierarchical control 递阶控制homeostasis 内稳态homomorphic model 同态系统horizontal decomposition 横向分解hormonal control 内分泌控制hydraulic step motor 液压步进马达hypercycle theory 超循环理论I controller 积分控制器identifiability 可辨识性IDSS (intelligent decision support system) 智能决策支持系统image recognition 图像识别impulse 冲量impulse function 冲击函数,脉冲函数inching 点动incompatibility principle 不相容原理incremental motion control 增量运动控制index of merit 品质因数inductive force transducer 电感式位移传感器inductive modeling method 归纳建模法industrial automation 工业自动化inertial attitude sensor 惯性姿态敏感器inertial coordinate system 惯性坐标系inertial wheel 惯性轮inference engine 推理机infinite dimensional system 无穷维系统information acquisition 信息采集infrared gas analyzer 红外线气体分析器inherent nonlinearity 固有非线性inherent regulation 固有调节initial deviation 初始偏差initiator 发起站injection attitude 入轨姿势input-output model 投入产出模型instability 不稳定性instruction level language 指令级语言integral of absolute value of error criterion 绝对误差积分准则integral of squared error criterion 平方误差积分准则integral performance criterion 积分性能准则integration instrument 积算仪器integrity 整体性intelligent terminal 智能终端interacted system 互联系统,关联系统interactive prediction approach 互联预估法,关联预估法interconnection 互联intermittent duty 断续工作制internal disturbance 内扰ISM (interpretive structure modeling) 解释结构建模法invariant embedding principle 不变嵌入原理inventory theory 库伦论inverse Nyquist diagram 逆奈奎斯特图inverter 逆变器investment decision 投资决策isomorphic model 同构模型iterative coordination 迭代协调jet propulsion 喷气推进job-lot control 分批控制joint 关节Kalman-Bucy filer 卡尔曼-布西滤波器knowledge accomodation 知识顺应knowledge acquisition 知识获取knowledge assimilation 知识同化KBMS (knowledge base management system) 知识库管理系统knowledge representation 知识表达ladder diagram 梯形图lag-lead compensation 滞后超前补偿Lagrange duality 拉格朗日对偶性Laplace transform 拉普拉斯变换large scale system 大系统lateral inhibition network 侧抑制网络least cost input 最小成本投入least squares criterion 最小二乘准则level switch 物位开关libration damping 天平动阻尼limit cycle 极限环linearization technique 线性化方法linear motion electric drive 直线运动电气传动linear motion valve 直行程阀linear programming 线性规划LQR (linear quadratic regulator problem) 线性二次调节器问题load cell 称重传感器local asymptotic stability 局部渐近稳定性local optimum 局部最优log magnitude-phase diagram 对数幅相图long term memory 长期记忆lumped parameter model 集总参数模型Lyapunov theorem of asymptotic stability 李雅普诺夫渐近稳定性定理macro-economic system 宏观经济系统magnetic dumping 磁卸载magnetoelastic weighing cell 磁致弹性称重传感器magnitude-frequency characteristic 幅频特性magnitude margin 幅值裕度magnitude scale factor 幅值比例尺manipulator 机械手man-machine coordination 人机协调manual station 手动操作器MAP (manufacturing automation protocol) 制造自动化协议marginal effectiveness 边际效益Mason's gain formula 梅森增益公式master station 主站matching criterion 匹配准则maximum likelihood estimation 最大似然估计maximum overshoot 最大超调量maximum principle 极大值原理mean-square error criterion 均方误差准则mechanism model 机理模型meta-knowledge 元知识metallurgical automation 冶金自动化minimal realization 最小实现minimum phase system 最小相位系统minimum variance estimation 最小方差估计minor loop 副回路missile-target relative movement simulator 弹体-目标相对运动仿真器modal aggregation 模态集结modal transformation 模态变换MB (model base) 模型库model confidence 模型置信度model fidelity 模型逼真度model reference adaptive control system 模型参考适应控制系统model verification 模型验证modularization 模块化MEC (most economic control) 最经济控制motion space 可动空间MTBF (mean time between failures) 平均故障间隔时间MTTF (mean time to failures) 平均无故障时间multi-attributive utility function 多属性效用函数multicriteria 多重判据multilevel hierarchical structure 多级递阶结构multiloop control 多回路控制multi-objective decision 多目标决策multistate logic 多态逻辑multistratum hierarchical control 多段递阶控制multivariable control system 多变量控制系统myoelectric control 肌电控制Nash optimality 纳什最优性natural language generation 自然语言生成nearest-neighbor 最近邻necessity measure 必然性侧度negative feedback 负反馈neural assembly 神经集合neural network computer 神经网络计算机Nichols chart 尼科尔斯图noetic science 思维科学noncoherent system 非单调关联系统noncooperative game 非合作博弈nonequilibrium state 非平衡态nonlinear element 非线性环节nonmonotonic logic 非单调逻辑nonparametric training 非参数训练nonreversible electric drive 不可逆电气传动nonsingular perturbation 非奇异摄动non-stationary random process 非平稳随机过程nuclear radiation levelmeter 核辐射物位计nutation sensor 章动敏感器Nyquist stability criterion 奈奎斯特稳定判据objective function 目标函数observability index 可观测指数observable canonical form 可观测规范型on-line assistance 在线帮助on-off control 通断控制open loop pole 开环极点operational research model 运筹学模型optic fiber tachometer 光纤式转速表optimal trajectory 最优轨迹optimization technique 最优化技术orbital rendezvous 轨道交会orbit gyrocompass 轨道陀螺罗盘orbit perturbation 轨道摄动order parameter 序参数orientation control 定向控制originator 始发站oscillating period 振荡周期output prediction method 输出预估法oval wheel flowmeter 椭圆齿轮流量计overall design 总体设计overdamping 过阻尼overlapping decomposition 交叠分解Pade approximation 帕德近似Pareto optimality 帕雷托最优性passive attitude stabilization 被动姿态稳定path repeatability 路径可重复性pattern primitive 模式基元PR (pattern recognition) 模式识别P control 比例控制器peak time 峰值时间penalty function method 罚函数法perceptron 感知器periodic duty 周期工作制perturbation theory 摄动理论pessimistic value 悲观值phase locus 相轨迹phase trajectory 相轨迹phase lead 相位超前photoelectric tachometric transducer 光电式转速传感器phrase-structure grammar 短句结构文法physical symbol system 物理符号系统piezoelectric force transducer 压电式力传感器playback robot 示教再现式机器人PLC (programmable logic controller) 可编程序逻辑控制器plug braking 反接制动plug valve 旋塞阀pneumatic actuator 气动执行机构point-to-point control 点位控制polar robot 极坐标型机器人pole assignment 极点配置pole-zero cancellation 零极点相消polynomial input 多项式输入portfolio theory 投资搭配理论pose overshoot 位姿过调量position measuring instrument 位置测量仪posentiometric displacement transducer 电位器式位移传感器positive feedback 正反馈power system automation 电力系统自动化predicate logic 谓词逻辑pressure gauge with electric contact 电接点压力表pressure transmitter 压力变送器price coordination 价格协调primal coordination 主协调primary frequency zone 主频区PCA (principal component analysis) 主成分分析法principle of turnpike 大道原理priority 优先级process-oriented simulation 面向过程的仿真production budget 生产预算production rule 产生式规则profit forecast 利润预测PERT (program evaluation and review technique) 计划评审技术program set station 程序设定操作器proportional control 比例控制proportional plus derivative controller 比例微分控制器protocol engineering 协议工程prototype 原型pseudo random sequence 伪随机序列pseudo-rate-increment control 伪速率增量控制pulse duration 脉冲持续时间pulse frequency modulation control system 脉冲调频控制系统pulse width modulation control system 脉冲调宽控制系统PWM inverter 脉宽调制逆变器pushdown automaton 下推自动机QC (quality control) 质量管理quadratic performance index 二次型性能指标qualitative physical model 定性物理模型quantized noise 量化噪声quasilinear characteristics 准线性特性queuing theory 排队论radio frequency sensor 射频敏感器ramp function 斜坡函数random disturbance 随机扰动random process 随机过程rate integrating gyro 速率积分陀螺ratio station 比值操作器reachability 可达性reaction wheel control 反作用轮控制realizability 可实现性,能实现性real time telemetry 实时遥测receptive field 感受野rectangular robot 直角坐标型机器人rectifier 整流器recursive estimation 递推估计reduced order observer 降阶观测器redundant information 冗余信息reentry control 再入控制regenerative braking 回馈制动,再生制动regional planning model 区域规划模型regulating device 调节装载regulation 调节relational algebra 关系代数relay characteristic 继电器特性remote manipulator 遥控操作器remote regulating 遥调remote set point adjuster 远程设定点调整器rendezvous and docking 交会和对接reproducibility 再现性resistance thermometer sensor 热电阻resolution principle 归结原理resource allocation 资源分配response curve 响应曲线return difference matrix 回差矩阵return ratio matrix 回比矩阵reverberation 回响reversible electric drive 可逆电气传动revolute robot 关节型机器人revolution speed transducer 转速传感器rewriting rule 重写规则rigid spacecraft dynamics 刚性航天动力学risk decision 风险分析robotics 机器人学robot programming language 机器人编程语言robust control 鲁棒控制robustness 鲁棒性roll gap measuring instrument 辊缝测量仪root locus 根轨迹roots flowmeter 腰轮流量计rotameter 浮子流量计,转子流量计rotary eccentric plug valve 偏心旋转阀rotary motion valve 角行程阀rotating transformer 旋转变压器Routh approximation method 劳思近似判据routing problem 路径问题sampled-data control system 采样控制系统sampling control system 采样控制系统saturation characteristics 饱和特性scalar Lyapunov function 标量李雅普诺夫函数SCARA (selective compliance assembly robot arm) 平面关节型机器人scenario analysis method 情景分析法scene analysis 物景分析s-domain s域self-operated controller 自力式控制器self-organizing system 自组织系统self-reproducing system 自繁殖系统self-tuning control 自校正控制semantic network 语义网络semi-physical simulation 半实物仿真sensing element 敏感元件sensitivity analysis 灵敏度分析sensory control 感觉控制sequential decomposition 顺序分解sequential least squares estimation 序贯最小二乘估计servo control 伺服控制,随动控制servomotor 伺服马达settling time 过渡时间sextant 六分仪short term planning 短期计划short time horizon coordination 短时程协调signal detection and estimation 信号检测和估计signal reconstruction 信号重构similarity 相似性simulated interrupt 仿真中断simulation block diagram 仿真框图simulation experiment 仿真实验simulation velocity 仿真速度simulator 仿真器single axle table 单轴转台single degree of freedom gyro 单自由度陀螺single level process 单级过程single value nonlinearity 单值非线性singular attractor 奇异吸引子singular perturbation 奇异摄动sink 汇点slaved system 受役系统slower-than-real-time simulation 欠实时仿真slow subsystem 慢变子系统socio-cybernetics 社会控制论socioeconomic system 社会经济系统software psychology 软件心理学solar array pointing control 太阳帆板指向控制solenoid valve 电磁阀source 源点specific impulse 比冲speed control system 调速系统spin axis 自旋轴spinner 自旋体stability criterion 稳定性判据stability limit 稳定极限stabilization 镇定,稳定Stackelberg decision theory 施塔克尔贝格决策理论state equation model 状态方程模型state space description 状态空间描述static characteristics curve 静态特性曲线station accuracy 定点精度stationary random process 平稳随机过程statistical analysis 统计分析statistic pattern recognition 统计模式识别steady state deviation 稳态偏差steady state error coefficient 稳态误差系数step-by-step control 步进控制step function 阶跃函数stepwise refinement 逐步精化stochastic finite automaton 随机有限自动机strain gauge load cell 应变式称重传感器strategic function 策略函数strongly coupled system 强耦合系统subjective probability 主观频率suboptimality 次优性supervised training 监督学习supervisory computer control system 计算机监控系统sustained oscillation 自持振荡swirlmeter 旋进流量计switching point 切换点symbolic processing 符号处理synaptic plasticity 突触可塑性synergetics 协同学syntactic analysis 句法分析system assessment 系统评价systematology 系统学system homomorphism 系统同态system isomorphism 系统同构system engineering 系统工程tachometer 转速表target flow transmitter 靶式流量变送器task cycle 作业周期teaching programming 示教编程telemechanics 远动学telemetering system of frequency division type 频分遥测系统telemetry 遥测teleological system 目的系统teleology 目的论temperature transducer 温度传感器template base 模版库tensiometer 张力计texture 纹理theorem proving 定理证明therapy model 治疗模型thermocouple 热电偶thermometer 温度计thickness meter 厚度计three-axis attitude stabilization 三轴姿态稳定three state controller 三位控制器thrust vector control system 推力矢量控制系统thruster 推力器time constant 时间常数time-invariant system 定常系统,非时变系统time schedule controller 时序控制器time-sharing control 分时控制time-varying parameter 时变参数top-down testing 自上而下测试topological structure 拓扑结构TQC (total quality control) 全面质量管理tracking error 跟踪误差trade-off analysis 权衡分析transfer function matrix 传递函数矩阵transformation grammar 转换文法transient deviation 瞬态偏差transient process 过渡过程transition diagram 转移图transmissible pressure gauge 电远传压力表transmitter 变送器trend analysis 趋势分析triple modulation telemetering system 三重调制遥测系统turbine flowmeter 涡轮流量计Turing machine 图灵机two-time scale system 双时标系统ultrasonic levelmeter 超声物位计unadjustable speed electric drive 非调速电气传动unbiased estimation 无偏估计underdamping 欠阻尼uniformly asymptotic stability 一致渐近稳定性uninterrupted duty 不间断工作制,长期工作制unit circle 单位圆unit testing 单元测试unsupervised learing 非监督学习upper level problem 上级问题urban planning 城市规划utility function 效用函数value engineering 价值工程variable gain 可变增益,可变放大系数variable structure control system 变结构控制vector Lyapunov function 向量李雅普诺夫函数velocity error coefficient 速度误差系数velocity transducer 速度传感器vertical decomposition 纵向分解vibrating wire force transducer 振弦式力传感器vibrometer 振动计viscous damping 粘性阻尼voltage source inverter 电压源型逆变器vortex precession flowmeter 旋进流量计vortex shedding flowmeter 涡街流量计WB (way base) 方法库weighing cell 称重传感器weighting factor 权因子weighting method 加权法Whittaker-Shannon sampling theorem 惠特克-香农采样定理Wiener filtering 维纳滤波work station for computer aided design 计算机辅助设计工作站w-plane w平面zero-based budget 零基预算zero-input response 零输入响应zero-state response 零状态响应zero sum game model 零和对策模型z-transform z变换。
a r X i v :n l i n /0204022v 1 [n l i n .C D ] 12 A p r 2002The clustering instability of inertial particles spatial distribution in turbulent flowsTov Elperin,1,∗Nathan Kleeorin,1,†Victor S.L’vov,2,‡Igor Rogachevskii,1,§and Dmitry Sokoloff3,¶1The Pearlstone Center for Aeronautical Engineering Studies,Department of Mechanical Engineering,Ben-Gurion University of the Negev,Beer-Sheva 84105,P.O.Box 653,Israel2Department of Chemical Physics,The Weizmann Institute of Science,Rehovot 76100,Israel3Department of Physics,Moscow State University,Moscow 117234,Russia(Dated:Version of February 8,2008)A theory of clustering of inertial particles advected by a turbulent velocity field caused by an instability of their spatial distribution is suggested.The reason for the clustering instability is a combined effect of the particles inertia and a finite correlation time of the velocity field.The crucial parameter for the clustering instability is a size of the particles.The critical size is estimated for a strong clustering (with a finite fraction of particles in clusters)associated with the growth of the mean absolute value of the particles number density and for a weak clustering associated with the growth of the second and higher moments.A new concept of compressibility of the turbulent diffusion tensor caused by a finite correlation time of an incompressible velocity field is introduced.In this model of the velocity field,the field of Lagrangian trajectories is not divergence-free.A mechanism of saturation of the clustering instability associated with the particles collisions in the clusters is suggested.Applications of the analyzed effects to the dynamics of droplets in the turbulent atmosphere are discussed.An estimated nonlinear level of the saturation of the droplets number density in clouds exceeds by the orders of magnitude their mean number density.The critical size of cloud droplets required for clusters formation is more than 20µm.PACS numbers:47.27.Qb,05.40.-aI.INTRODUCTIONFormation and evolution of aerosols and droplets in-homogeneities (clusters)are of fundamental significance in many areas of environmental sciences,physics of the atmosphere and meteorology (e.g.,smog and fog for-mation,rain formation),transport and mixing in in-dustrial turbulent flows (like spray drying,pulverized-coal-fired furnaces,cyclone dust separation,abrasive water-jet cutting)and in turbulent combustion (see,e.g.,[1,2,3,4,5,6,7,8]).The reason is that the direct,hy-drodynamic,diffusional and thermal interactions of par-ticles in dense clusters strongly affect the character of the involved phenomena.Thus,e.g.,enhanced binary colli-sions between cloud droplets in dense clusters can cause fast broadening of droplet size spectrum and rain for-mation (see,e.g.,[8]).Another example is combustion of pulverized coal or sprays whereby reaction rate of a single particle or a droplet differs considerably from a reaction rate of a coal particle or a droplet in a cluster (see,e.g.,[9,10]).Analysis of experimental data shows that spatial distri-butions of droplets in clouds are stronglyinhomogeneous2The main quantitative result of the theory of clustering instability of inertial particles,suggested in this paper,is the existence of this instability under some conditions that we determined.We showed that the inertia of the particles is only one of the necessary conditions for par-ticles clustering in turbulent flow.In the present study we found a second necessary condition for the clustering instability:a finite correlation time of the fluid velocity field which in the suggested theory results in a nonzero di-vergence of the field of Lagrangian trajectories.This time is equalzeroin the above mentioned Kraichnan model (see [22])which was the reason for the disappearance of the instability in this particular model.In this study we used a model of the turbulent veloc-ity field with a finite correlation time which drastically changes the dynamics of inertial particles.In the frame-work of this model of the velocity field we rigorously de-rived the sufficient conditions for the clustering instabil-ity.We demonstrated the existence of the new phenom-ena of strong and weak clustering of inertial particles in a turbulent flow.These two types of the clustering in-stabilities have different physical meaning and different physical consequences in various phenomena.We com-puted also the instability thresholds which are different for the strong and weak clustering instabilities.II.QUALITATIVE ANALYSIS OF STRONGAND WEAK CLUSTERING A.Basic equations in the continuous mediaapproximationIn this study we used the equation for the number den-sity n (t,r )of particles advected by a turbulent velocityfield u (t,r ):∂n (t,r )∂t+[v (t,r )·∇]Θ(t,r )(2.3)=−Θ(t,r )div v (t,r )+D ∆Θ(t,r ).Here we assumed that the mean particles velocity is zero.We also neglected the term ∝¯n div v describing an effect of an external source of fluctuations.This term does not affect the growth rate of the instability.In the present study we investigate only the effect of self-excitation of the clustering instability,and we do not consider an ef-fect of the source term on the dynamics of fluctuations.The source term ∝¯n div v causes another type of fluctu-ations of particle number density which are not related with an instability and are localized in the maximum scale of turbulent motions.A mechanism of these fluctu-ations is related with perturbations of the mean number density of particles by a random divergent velocity field.The magnitude of these fluctuations is much lower than that of fluctuations which are caused by the clustering instability.In our qualitative analysis of the problem we use Eq.(2.3)written in a co-moving with a cluster reference frame.Formally,this may be done using the Belinicher-L’vov (BL)representation (for details,see [23,24]).Let ξL (t 0,r |t )will be Lagrangian trajectory of the reference point [in the particle velocity field v (t,r )]located at r at time t 0and ρL (t 0,r |t )be an increment of the trajectory:ρL (t 0,r |t )=tt 0v [τ,ξL (t 0,r |τ)]d τ,(2.4)ξL (t 0,r |t )≡r +ρL (t 0,r |t ).By definition ρL (t 0,r |t 0)=0and ξ(t 0,r |t 0)=r .Define as r 0a position of a center of a cluster at the “initial”time t 0=0(for the brevity of notations hereafter we skip the label t 0)and consider a “co-moving”reference frame with the position of the origin at ζ0(t )≡ζ(r 0|t ).Then BL velocity field ˜v(r 0|t,r )and BL velocity differ-ence W (r 0|t,r )are defined as˜v(r 0|t,r )≡v [t,r +ρL (r 0|t )],(2.5)W (r 0|t,r )≡˜v(r 0|t,r )−˜v (r 0|t,r 0).(2.6)Actually the BL representation is very similar to the La-grangian description of the velocity field.The difference between the two representations is that in the Lagrangian representation one follows the trajectory of every fluid particle r +ρL (r ,t 0|t )(located at r at time t =t 0),whereas in the BL-representation there is a special ini-tial point r 0(in our case the initial position of the center of the cluster)whose trajectory determines the new co-ordinate system (see [23,24]).With time the BL-field ˜v(r 0|t,r )becomes very different from the Lagrangian velocity field.It must be noted that the simultaneous correlators of both,the Lagrangian and the BL-velocity fields,are identical to the simultaneous correlators of the Eulerian velocity v (r ,t ).The reason is that for station-ary statistics the simultaneous correlators do not depend on t ,and in particular one can assume t =t 0.Similar to Eq.(2.5)let us introduce BL representation for Θ(t,r ):˜Θ(r 0|t,r )≡Θ[t,r +ρL (r 0|t )].(2.7)3In BL variables defined by Eqs.(2.5)-(2.7),Eq.(2.3) reads:∂˜Θ(r0|t,r)ℓcl .(2.11) Here A(t)is time-dependent amplitude of a cluster,ℓcl is the characteristic width of the cluster.Shape func-tionθ(x)may be chosen with a maximum equal one at x=0and unit width.Real shapes of various clusters in the turbulent ensemble are determined by a compe-tition of different terms in the evolution equation(2.8). However,we believe that particular shapes affect only numerical factors in the expression for the growth rate of clusters and do not effect their functional dependence on the parameters of the problem which is considered in this subsection.1.Effect of turbulent diffusionThe advective term in the LHS of Eq.(2.8)results in turbulent diffusion inside the cluster.This effect may be modelled by renormalization of the molecular diffusion coefficient D in the right hand side(RHS)of Eq.(2.8)by the effective turbulent diffusion coefficient D T with a usual estimate of D T:D→D+D T,D T∼ℓcl v cl/3.(2.12) Hereafter v cl is the mean square velocity of particles at the scaleℓcl.Instead of the full Eq.(2.8)consider now a model equation∂˜Θ(r0|t,r)∂t=−˜Θ(r0|t,r)div W(r0|t,r).(2.17) The main contribution to the BL-velocity difference W(r0|t,r)in the RHS of this equation is due to the ed-dies with sizeℓcl,the characteristic size of the cluster. Denote by v cl the characteristic velocity of these eddies and byτv∼ℓcl/v cl the corresponding correlation time. In our qualitative analysis we neglect the r dependence of div W(r0|t,r)inside the cluster and consider the di-vergence in Eq.(2.17)as a random process b(t)with a correlation timeτv:div W(r0|t,r)→b(t).(2.18) Together with the decomposition(2.11)this yields the following equation for the cluster amplitude A(t):∂A(t)4 The solution of Eq.(2.19)reads:A(t)=A0exp[−I(t)],I(t)≡ t0b(τ)dτ.(2.20)Integral I(t)in Eq.(2.20)can be rewritten as a sum ofintegrals I n over small time intervalsτv:I(t)=t/τvn=1I n,I n(t)≡nτv (n−1)τv b(τ)dτ.In our qualitative analysis integrals I n may be considered as independent random ing the central limit theorem we estimate the total integralI(t)∼ Nζ, I2n v= b2 vτ2v, where ... v denotes averaging over turbulent velocity ensemble,ζis a Gaussian random variable with zero mean and unit variance,N=t/τv.Now we calculateM q(t)= Θq P(ζ)dζ,P(ζ)=(1/√2π) exp[−(ζ−qS√N,the parameter q cannot be large.In this approximation the q−momentsM q(t)=M q(0)exp[γin(q)t]withγin(q)being the growth rate of the q-th moment due to particles inertia which is given byγin(q)∼12 τv[div W(r0|t,r)]2 v q2. Clearly,the instability is caused by a nonzero value of τv(div W)2 ,i.e.,by a compressibility of the particle ve-locityfield v(t,r).Compressibility offluid velocity itself u(t,r)(including atmospheric turbulence)is often negligible,i.e.,div u≈0.However,due to the effect of particles inertia their velocity v(t,r)does not coincide with u(t,r)(see,e.g., [25,26,27,28]),and a degree of compressibility,σv,of thefield v(t,r),may be of the order of unity[19,20,29]. Parameterσv is defined asσv≡ [div v]2 / |∇×v|2 .(2.23) Note that parameterσv is independent of the scale of the turbulent velocityfield.It characterizes a compressible part of the velocityfield as a whole.The main contribu-tion to this parameter comes from the scales which are of the order of the Kolmogorov scaleη.For inertial particles div v∼τp∆P/ρ,whereτp is the particle response time,τp=m p/6πρνa=2ρp a2/9ρν,(2.24) where m p andρp are the mass and material density of particles,respectively.Thefluidflow parameters are: pressure P,Reynolds number R e=Lu T/ν,the dissi-pative scale of turbulenceη=L R e−3/4,the maximumscale of turbulent motions L and the turbulent velocity u T in the scale L.Now we can estimateσv asσv≃(ρp/ρ)2(a/η)4≡(a/a∗)4(2.25) (see[20]),where a∗is a characteristic radius of particles. For a>a∗it is plausible to correct this estimate as follows:σv∼a4τη,q cr∼15contain a small fraction of the total number of parti-cles.This does not mean that the instability of higher moments is not important.Thus,e.g.,the rate of bi-nary particles collisions is proportional to the square of their number density n2 =(¯n)2+ |Θ|2 .Therefore, the growth of the2nd moment, |Θ|2 ,(which we de-fine as a weak clustering)results in that binary collisionsoccur mainly between particles inside the cluster.The latter can be important in coagulation of droplets in at-mospheric clouds whereby the collisions between droplets play a crucial role in a rain formation.The growth of the q-th moment, |Θ|q ,results in that q-particles col-lisions occur mainly between particles inside the cluster. The growth of the negative moments of particles num-ber density(possibly associated with formation of voids and cellular structures)was discussed in[30](see also [31,32]).In the above qualitative analysis whereby we consid-ered only one-point correlation functions of the number density of particles,we missed an important effect of an effective drift velocity which decreases a growth rate of the clustering instability.For the one-point correlation functions of the number density of particles the effective drift velocity is zero for homogeneous and isotropic tur-bulence.However,in the equations for two-point and multi-point correlation functions of the number density of particles the effective drift velocity is not zero and as we will see in the next section it increases a threshold for the clustering instability.III.THE CLUSTERING INSTABILITY OF THE2ND MOMENTA.Basic equationsIn the previous section we estimated the growth rates of all moments |Θ|q .Here we present the results of a rigorous analysis of the evolution of the two-point2nd momentΦ(t,R)≡ Θ(t,r)Θ(t,r+R) .(3.1) In this analysis we used stochastic calculus[e.g.,Wiener path integral representation of the solution of the Cauchy problem for Eq.(2.1),Feynman-Kac formula and Cameron-Martin-Girsanov theorem].The compre-hensive description of this approach can be found in [21,33,34,35].We showed that afinite correlation time of a turbulent velocity plays a crucial role for the clustering instability. Notably,an equation for the second momentΦ(t,R)of the number density of inertial particles comprises spatial derivatives of high orders due to a non-local nature of turbulent transport of inertial particles in a random ve-locityfield with afinite correlation time(see Appendix A and[20]).However,we found that equation forΦ(t,R) is a second-order partial differential equation at least for two models of a random velocityfield:M odel I.The random velocity with Gaussian statis-tics of the integrals t0v(t′,ξ)dt′and t0b(t′,ξ)dt′,see Appendix B.M odel II.The Gaussian velocityfield with a small yet finite correlation time,see Appendix C.In both models equation forΦ(t,R)has the same form:∂Φ/∂t=ˆLΦ(t,R),(3.2)ˆL=B(R)+2U(R)·∇+ˆDαβ(R)∇α∇β,but with different expressions for its coefficients.The meaning of the coefficients B(R),U(R)andˆDαβ(R)is as follows:–Function B(R)is determined only by a compress-ibility of the velocityfield and it causes generation of fluctuations of the number density of inertial particles.–The vector U(R)determines a scale-dependent drift velocity which describes a transfer offluctuations of the number density of inertial particles over the spectrum. Note that U(R=0)=0whereas B(R=0)=0.For incompressible velocityfield U(R)=0,B(R)=0.–The scale-dependent tensor of turbulent diffusion ˆDαβ(R)is also affected by the compressibility.In very small scales this tensor is equal to the tensor of the molecular(Brownian)diffusion,while in the vicin-ity of the maximum scale of turbulent motions this ten-sor coincides with the usual tensor of turbulent diffusion. TensorˆDαβ(R)may be written asˆDαβ(R)=2Dδαβ+D Tαβ(R),(3.3)D Tαβ(R)=˜D Tαβ(0)−˜D Tαβ(R).In Appendix B we found that for Model I:B(R)≈2 ∞0 b[0,ξ(r1|0)]b[τ,ξ(r2|τ)] dτ,(3.4) U(R)≈−2 ∞0 v[0,ξ(r1|0)]b[τ,ξ(r2|τ)] dτ,˜D Tαβ(R)≈2 ∞0 vα[0,ξ(r1|0)]vβ[τ,ξ(r2|τ)] dτ.For theδ-correlated in time random Gaussian com-pressible velocityfield the operatorˆL is replaced byˆL0 in the equation for the second momentΦ(t,R),where ˆL0≡B0(R)+2U0(R)·∇+ˆDαβ(R)∇α∇β,B0(R)=∇α∇βˆDαβ(R),(3.5)U0,α(R)=∇βˆDαβ(R)(for details see[20,21]).In theδ-correlated in time ve-locityfield the second momentΦ(t,R)can only decay in spite of the compressibility of the velocityfield.The rea-son is that the differential operatorˆL0≡∇α∇βˆDαβ(R) is adjoint to the operatorˆL†0≡ˆDαβ(R)∇α∇βand their eigenvalues are equal.The damping rate for the equation∂Φ/∂t=ˆL†0Φ(t,R)(3.6)6has been found in Ref.[36]for a compressible isotropic homogeneous turbulence in a dissipative range:γ2=−(3−σT)2∇×D T×∇=∇α∇βD Tαβ(R)(∇×ξ)2 ,(3.10) whereξ(r1|t)is the Lagrangian displacement of a par-ticle trajectory which paths through point r1at t=0. Remarkably that Taylor[37]obtained the coefficient of turbulent diffusion for the meanfield in the form D T(R= 0)=τ−1 ξα(r1|t)ξα(r1|t) .B.Clustering instability in Model ILet us study the clustering instability for the model of the random velocity with Gaussian statistics of the integralst0v(t′,ξ)dt′, t0b(t′,ξ)dt′,see Appendix B.In this model Eq.(3.2)in a non-dimensional form reads:∂Φm(r)+ 1r+BΦ, 1/m(r)≡(C1+C2)r2+2/Sc,(3.11)where U≡U R and Sc=ν/D is the Schmidt number.For small inertial particles advected by airflow Sc≫1. The non-dimensional variables in Eq.(3.11)are r≡R/ηand˜t=t/τη,B and U are measured in the unitsτ−1η. Consider a solution of Eq.(3.11)in two spatial regions.a.Molecular diffusion region of scales.In this region r≪Sc−1/2,and all terms∝r2(with C1,C2and U)may be neglected.Then the solution of Eq.(3.11)is given by Φ(r)=(1−αr2)exp(γ2t),whereα=Sc(B−γ2τη)/12,B>γ2τη.b.Turbulent diffusion region of scales.In this regionSc−1/2≪r≪1,the molecular diffusion term∝1/Sc is negligible.Thus,the solution of(3.11)in this region isΦ(r)=A1r−λexp(γ2t),whereλ=(C1−C2+2U±iC3)/2(C1+C2),C23=4(B−γ2τη)(C1+C2)−(C1−C2+2U)2. Since the total number of particles in a closed volume is conserved:∞0r2Φ(r)dr=0.This implies that C23>0,and thereforeλis a complex number.Since the correlation functionΦ(r)has a global maximum at r=0,C1>C2−2U.The latter condition for very small U yieldsσT≤3.For r≫1the solution forΦ(r)decays sharply with r.The growth rateγ2of the second moment of particles number density can be obtained by matching the correlation functionΦ(r)and itsfirst derivativeΦ′(r)at the boundaries of the above regions,i.e.,at the points r=Sc−1/2and r=1.The matching yields C3/2(C1+C2)≈2π/ln Sc.Thus,γ2=13(1+σU)−(3−σT)2 (1+σT)ln2Sc +20(σB−σU)7FIG.1:The range of parameters(σv,σT)forσB=σU=σvin the case of Sc=103(curve c),Sc=105(curve b)andSc→∞(curve a).The dashed lineσv=σT corresponds totheδ-correlated in time random compressible velocityfield.that even a very small deviations from theδ-correlatedin time random compressible velocityfield results in theinstability of the second moment of the number densityof inertial particles.The minimum value ofσT requiredfor the clustering instability isσT≈0.26and a corre-sponding value ofσv≈0.12(see Fig.1).For smallervalue ofσv the clustering instability can occur,but itrequires larger values ofσT.Notably,in Model II of a random velocityfield(i.e.,the Gaussian velocityfield with a small yetfinite correla-tion time)the clustering instability occurs whenσv>0.2(see Appendix C).Indeed,the growth rateγ2of the sec-ond moment of particles number density is determinedby equation:Γ=˜B(σv)St2−(3−σv)23(1+σv) π4a22−b13(1+σv),a2=2(3σv+1)27(1+σv)2(12−1278σv−3067σ2v),b2=8501+σv 2,b3=1η 2−(3−σv)23π8[30]).The clustering instability is saturated by nonlineareffects.Now let us discuss a mechanism of the nonlinear satu-ration of the clustering instability using on the example of atmospheric turbulence with characteristic parameters:η∼1mm,τη∼(0.1−0.01)s.A momentum coupling of particles and turbulentfluid is essential when m p n cl∼ρ, i.e.,the mass loading parameterφ=m p n cl/ρis of theorder of unity(see,e.g.,[1]).This condition implies that the kinetic energy offluidρ u2 is of the order of the particles kinetic energy m p n cl v2 ,where|u|∼|v|.This yields:n cl∼a−3(ρ/3ρp).(4.1) For water dropletsρp/ρ∼103.Thus,for a=a∗∼30µm we obtain n cl∼104cm−3and the total number of particles in the cluster of sizeη,N cl≃η3n cl∼10. This values may be considered as a lower estimate for the “two-way coupling”when the effect offluid on particles has to be considered together with the feed-back effect of the particles on the carrierfluid.However,it was found in[38]that turbulence modification by particles is governed by the ratio of the particle energy and the total energy of the suspension(rather then the energy of the carrierfluid)and thus by parameterφ(1+φ)(rather then byφitself).Thus we expect that the two-way coupling can only mitigate but not stop the clustering instability. An actual mechanism of the nonlinear saturation of the clustering instability is“four way coupling”when the particle-particle interaction is also important.In this situation the particles collisions result in effective particle pressure which prevents further grows of con-centration.Particles collisions play essential role when during the life-time of a cluster the total number of colli-sions is of the order of number of particles in the cluster. The rate of collisions J∼n cl/τηcan be estimated as J∼4πa2n2cl|v rel|.The relative velocity v rel of collid-ing particles with different but comparable sizes can be estimated as|v rel|∼τp|(u·∇)u|∼τp u2η/η.Thus the collisions in clusters may be essential forn cl∼a−3(η/a)(ρ/3ρp),ℓs∼a(3aρp/ηρ)1/3,(4.2) whereℓs is a mean separation of particles in the cluster. For the above parameters(a=30µm)n cl∼3×105cm−3,ℓs∼5a≈150µm and N cl∼300.Note that the mean number density of droplets in clouds¯n is about102−103cm−3.Therefore the clustering instability of droplets in the clouds increases their concentrations in the clusters by the orders of magnitude.In all our analysis we have neglected the effect of sed-imentation of particles in gravityfield which is essential for particles of the radius a>100µm.Takingℓcl≃ηwe assumed implicitly thatτp<τη.This is valid(for the atmospheric conditions)if a≤60µm.Otherwise the cluster size can be estimated asℓcl≃η(τp/τη)3/2.Our estimates support the conjecture that the clus-tering instability serves as a preliminary stage for a co-agulation of water droplets in clouds leading to a rain formation.V.DISCUSSIONIn this study we investigated the clustering instability of the spatial distribution of inertial particles advected by a turbulent velocityfield.The instability results in formation of clusters,i.e.,small-scale inhomogeneities of aerosols and droplets.The clustering instability is caused by a combined effect of the particle inertia andfinite cor-relation time of the velocityfield.Thefinite correlation time of the turbulent velocityfield causes the compress-ibility of thefield of Lagrangian trajectories.The latter implies that the number of particlesflowing into a small control volume in a Lagrangian frame does not equal to the number of particlesflowing out from this control vol-ume during a correlation time.This can result in the depletion of turbulent diffusion.The role of the compressibility of the velocityfield is as follows.Divergence of the velocityfield of the inertial particles div v=τp∆P/ρ.The inertia of particles results in that particles inside the turbulent eddies are carried out to the boundary regions between the eddies by iner-tial forces(i.e.,regions with low vorticity and high strain rate).For a small molecular diffusivity div v∝−dn/dt [see Eq.(2.1)].Therefore,dn/dt∝−τp∆P/ρ.Thus there is accumulation of inertial particles(i.e.,dn/dt>0) in regions with∆P<0.Similarly,there is an out-flow of inertial particles from the regions with∆P>0. This mechanism acts in a wide range of scales of a tur-bulentfluidflow.Turbulent diffusion results in relax-ation offluctuations of particles concentration in large scales.However,in small scales where turbulent diffu-sion is small,the relaxation offluctuations of particle concentration is very weak.Therefore thefluctuations of particle concentration are localized in the small scales. This phenomenon is considered for the case when den-sity offluid is much less than the material densityρp of particles(ρ≪ρp).Whenρ≥ρp the results coincide with those obtained for the caseρ≪ρp except for the transformationτp→β∗τp,whereβ∗=2 1+ρ2ρp+ρ .Forρ≥ρp the value dn/dt∝−β∗τp∆P/ρ.Thus there is accumulation of inertial particles(i.e.,dn/dt>0)in regions with the minimum pressure of a turbulentfluid sinceβ∗<0.In the caseρ≥ρp we used the equation of motion of particles influidflow which takes into account contributions due to the pressure gradient in thefluid surrounding the particle(caused by acceleration of the fluid)and the virtual(”added”)mass of the particles relative to the ambientfluid[39].The exponential growth of the second moment of a number density of inertial particles due to the small-9scale instability can be saturated by the nonlinear ef-fects (see Section IV).The excitation of the second mo-ment of a number density of particles requires two kinds of compressibilities:compressibility of the velocity field and compressibility of the field of Lagrangian trajecto-ries,which is caused by a finite correlation time of a ran-dom velocity field.Remarkably,the compressibility of the field of Lagrangian trajectories determines the coeffi-cient of turbulent diffusion (i.e.the coefficient D αβnear the second-order spatial derivative of the second moment of a number density ofinertialparticles in Eq.(3.2).The compressibility of the field of Lagrangian trajecto-ries causes depletion of turbulent diffusion in small scales even for σv =0.On the other hand,the compressibility of the velocity field determines a coefficient B(r)near the second moment of a number density of inertial particles in Eq.(3.2).This term is responsible for the exponen-tial growth of the second moment of a number density of particles.Summary :•We showed that the physical reason for the cluster-ing instability in spatial distribution of particles in tur-bulent flows is a combined effect of the inertia of particles leading to a compressibility of the particle velocity field v (t,r )and a finite velocity correlation time.•The clustering instability can result in a strong clus-tering whereby a finite fraction of particles is accumu-lated in the clusters and a weak clustering when a finite fraction of particle collisions occurs in the clusters.•The crucial parameter for the clustering instability is a radius of the particles a .The instability criterion is a >a cr ≈a ∗for which (div v )2 = |rot v |2 .For the droplets in the atmosphere a ∗≃30µm.The growth rateof the clustering instability γcl ∼τ−1η(a/a ∗)4,where τηis the turnover time in the viscous scales of turbulence.•We introduced a new concept of compressibility of the turbulent diffusion tensor caused by a finite corre-lation time of an incompressible velocity field.For this model of the velocity field,the field of Lagrangian tra-jectories is not divergence-free.•We suggested a mechanism of saturation of the clus-tering instability -particle collisions in the clusters .An evaluated nonlinear level of the saturation of the droplets number density in clouds exceeds by the orders of mag-nitude their mean number density.AcknowledgmentsWe have benefited from discussions with I.Procaccia.This work was partially supported by the German-Israeli Project Cooperation (DIP)administrated by the Federal Ministry of Education and Research (BMBF),by the Is-rael Science Foundation and by INTAS (Grant 00-0309).DS is grateful to a special fund for visiting senior sci-entists of the Faculty of Engineering of the Ben-Gurion University of the Negev and to the Russian Foundationfor Basic Research (RFBR)for financial support under grant 01-02-16158.APPENDIX A:Basic equations in the model with arandom renewal timeIn this Appendix we derive Eq.(A20)for the simul-taneous second-order correlation function Φ(t,r )which serves as a basis for further analysis in Appendixes B and C under some simplifying model assumptions about the statistics of the velocity field.1.Exact solution of dynamical equations for agiven velocity field a.Simple case:no molecular diffusionConsider first Eq.(2.1)for the number density of par-ticles n (t,r )in the case D =0:∂n (t,r )2![−˜ρL ·∇]2−...(A6)which acts as follows:exp[−˜ρL ·∇]n (t,r )=n (t,r −˜ρL ).(A7)。