Determining the parametric structure of models
Abstract: This work was motivated by a production-line upgrade project for bearing inner and outer rings at the Wuxi Dior Machinery Plant. Manual loading of blanks exposes workers to danger and limits throughput, so a dedicated feeding mechanism is designed for the special-purpose machine tool that machines bearing inner and outer rings; it replaces manual loading, keeps operators safe, and raises production efficiency.
The feeding mechanism is laid out around the characteristics of bearing inner and outer rings. It provides automatic positioning and clamping of the blank and return of the finished workpiece, and all of these motions are driven by pneumatic cylinders.
The design work covers the feeding mechanism itself, the feed chute, the clamping mechanism, and the drive system. Once the dimensions of the mechanism are fixed, its parts are modeled parametrically in UG (NX), the complete structure is assembled virtually, and the assembly is imported into the UG motion-simulation module for kinematic and dynamic simulation. The simulation results are analyzed and the corresponding conclusions drawn.
Finally, the feeding mechanism is optimized to improve its stability and reliability, so that the design can actually be used in daily production and contribute to the bearing plant's production-line upgrade.
Keywords: feeding mechanism; parametric modeling; virtual assembly; motion simulation

Contents
Abstract
1 Introduction
1.1 Research content and significance of the feeding-mechanism design
1.2 Development of feeding mechanisms at home and abroad
1.3 Requirements of this project
2 Overall scheme design
2.1 Types of feeding mechanisms
2.2 Design scheme
2.2.1 Design requirements
2.2.2 Overall structural design scheme
2.2.3 Design and calculation of each part of the feeding mechanism
2.3 Chapter summary
3 3D modeling and virtual assembly of the feeding mechanism in UG
3.1 Introduction to the UG software
3.2 3D modeling
3.2.1 Main components and their relationships
3.2.2 3D modeling of the main parts
3.3 Virtual assembly of the feeding mechanism
3.3.1 Advanced assembly functions of UG
3.3.2 Virtual assembly of the feeding mechanism
3.4 Chapter summary
4 Motion simulation in UG
4.1 The motion-simulation working interface
4.1.1 UG interface issues
4.1.2 Opening the motion-simulation main interface
4.1.3 Overview of the motion-simulation working interface
4.2 Motion simulation of the feeding mechanism
4.2.1 Defining the links
4.2.2 Defining the joints
4.2.3 Applying motion drivers
4.2.4 Analysis and verification
4.3 Chapter summary
5 Conclusions and outlook
5.1 Conclusions
5.2 Limitations and outlook
Acknowledgements
References

1 Introduction
1.1 Research content and significance of the feeding-mechanism design
This project originates from a production-line upgrade project at the Wuxi Dior Machinery Plant.
Parametric Modeling Process with NX

1. The initial step in the parametric modeling process with NX is to define the design intent. This involves determining the overall configuration and dimensions of the model, as well as any constraints that will guide the modeling process.
2. Once the design intent is established, the next step is to create the initial geometry. This can be done with a variety of tools, such as sketches, extrusions, and revolves.
3. After the initial geometry is created, the next step is to apply constraints to the model. Constraints define relationships between different parts of the model, such as distances, angles, and positions.
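To make these three steps concrete, here is a minimal, tool-agnostic sketch in Python of what "design intent → geometry → constraints" can look like when a part is driven by named parameters. It does not use the NX/NXOpen API; the part, parameter names, and constraint rules are invented purely for illustration.

```python
from dataclasses import dataclass

@dataclass
class PlateParams:
    """Design intent captured as named, editable parameters (step 1)."""
    length: float = 120.0        # mm
    width: float = 80.0          # mm
    thickness: float = 10.0      # mm
    hole_diameter: float = 20.0  # mm
    edge_margin: float = 15.0    # mm, minimum material around the hole

def build_geometry(p: PlateParams) -> dict:
    """Derive the geometry from the parameters (step 2).
    In a CAD tool this would be a sketch plus an extrude and a hole feature."""
    return {
        "outline": [(0, 0), (p.length, 0), (p.length, p.width), (0, p.width)],
        "hole_center": (p.length / 2, p.width / 2),
        "hole_radius": p.hole_diameter / 2,
        "extrude_depth": p.thickness,
    }

def check_constraints(p: PlateParams) -> list[str]:
    """Constraints that relate parameters to one another (step 3)."""
    problems = []
    if p.hole_diameter + 2 * p.edge_margin > min(p.length, p.width):
        problems.append("hole too large for the requested edge margin")
    if p.thickness <= 0 or p.hole_diameter <= 0:
        problems.append("dimensions must be positive")
    return problems

if __name__ == "__main__":
    params = PlateParams(hole_diameter=30.0)   # edit a parameter ...
    geometry = build_geometry(params)          # ... and the geometry updates
    print(geometry["hole_radius"], check_constraints(params))
```

Editing a parameter and regenerating the geometry, then re-checking the constraints, mirrors the update cycle that a parametric modeler automates.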
Cost Estimating Scheme

Introduction

Cost estimating is an essential aspect of project management that involves predicting and calculating the expenses associated with a particular project or task. It supports efficient allocation of available resources and informed decisions about the feasibility and profitability of the project. This document outlines a comprehensive cost estimating scheme, the factors to consider, and methods for producing accurate estimates.

Factors Influencing Cost Estimates

1. Labor costs: one of the most significant factors in the overall project cost, covering wages, salaries, benefits, and any additional expenses related to the workforce involved in the project.
2. Material costs: raw materials, equipment, machinery, and other supplies essential for project completion.
3. Subcontractor costs: if subcontractors are hired for specialized tasks, the costs of hiring, managing, and supervising them should be considered separately.
4. Overhead costs: utilities, rent, insurance, office supplies, and similar expenses that are not tied to a specific project but contribute to overall spending.
5. Inflation and currency fluctuations: these can affect cost estimates significantly and should be considered, especially for projects that span a long period or involve international transactions.
6. Market conditions: supply and demand influence the cost of materials and labor, so staying up to date with market trends is crucial for accurate estimating.
7. Contingency budget: a reserve for unforeseen events or risks, so that unexpected costs can be managed without adversely affecting the project.
8. Timeframe: longer project durations may incur additional expenses such as increased labor costs or inflation.

Methods for Cost Estimation

1. Analogous estimating: use historical data from similar past projects to estimate the cost of the current project. This is quick and straightforward but may be inaccurate if the projects differ significantly.
2. Parametric estimating: use mathematical models that relate cost to specific parameters such as size, complexity, or volume. It provides more accuracy than analogous estimating and is commonly used for repetitive tasks.
3. Bottom-up estimating: estimate the cost of individual project components and combine them to determine the overall project cost. This is time-consuming but provides a detailed and accurate estimate.
4. Three-point estimating: combine three estimates, the most optimistic, the most pessimistic, and the most likely; their average is used as the cost estimate. This method is useful for projects with a high level of uncertainty.
5. Vendor quotes and bids: obtaining quotes and bids from vendors and suppliers can provide accurate estimates for material and equipment costs, and is particularly useful when dealing with external parties.
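As an illustration of the parametric and three-point methods above, here is a small Python sketch. The cost driver (floor area), the unit rate, and the three-point figures are made-up numbers used only to show the arithmetic; the PERT-style weighting (O + 4M + P) / 6 is shown as a common variant alongside the plain average.

```python
def parametric_estimate(quantity: float, unit_rate: float) -> float:
    """Parametric estimating: cost modeled as a function of a measurable driver."""
    return quantity * unit_rate

def three_point_estimate(optimistic: float, most_likely: float, pessimistic: float,
                         pert: bool = False) -> float:
    """Three-point estimating: plain average, or the PERT-weighted variant."""
    if pert:
        return (optimistic + 4 * most_likely + pessimistic) / 6
    return (optimistic + most_likely + pessimistic) / 3

if __name__ == "__main__":
    # Hypothetical figures: 1,200 m^2 of floor area at 950 currency units per m^2.
    print(parametric_estimate(1200, 950))                             # 1140000
    # Hypothetical labour package: optimistic 80k, most likely 100k, pessimistic 150k.
    print(three_point_estimate(80_000, 100_000, 150_000))             # 110000.0
    print(three_point_estimate(80_000, 100_000, 150_000, pert=True))  # 105000.0
```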
Documentation and Reporting

Documenting and reporting cost estimates is crucial for effective project management. The following points should be considered:

1. Cost breakdown structure: a detailed breakdown of all cost elements in the project, which clarifies the sources of costs and facilitates better cost control.
2. Cost estimating worksheet: a worksheet documenting the estimated cost of each project component, including labor, material, subcontractor, and overhead costs.
3. Assumptions and constraints: clearly stating the assumptions and constraints behind the estimates is essential for transparency and for avoiding misunderstandings, so that everyone involved in the project is aware of them.
4. Risk assessment: identifying and assessing potential risks and their impact on the cost estimates should be documented; this supports proactive risk management and better decision-making.
5. Cost variance tracking: track the actual costs incurred during project execution, compare them with the estimates, and document and analyze any deviations to identify their causes.
6. Regular reporting: report estimated and actual costs to the project stakeholders regularly to maintain transparency and enable informed decision-making.

Conclusion

A well-defined and accurate cost estimating scheme is essential for successful project management. Considering the factors that influence cost estimates, using appropriate estimation methods, and effectively documenting and reporting cost estimates contribute to better resource allocation, cost control, and decision-making. By following these guidelines, project managers can improve their ability to estimate costs accurately and improve project outcomes.
FTIR analysis method

Two. Group frequencies and characteristic absorption peaks

Each group constituting a molecule has its own specific infrared absorption region. According to the nature of the chemical bond, the spectrum can be divided into four regions: 4000–2500 cm⁻¹, the X–H (hydrogen-containing single bond) stretching region; 2500–2000 cm⁻¹, the triple-bond and cumulated double-bond region; 2000–1500 cm⁻¹, the double-bond region; and 1500–1000 cm⁻¹, the single-bond region. According to the absorption characteristics, it can also be divided into the functional-group region (4000–1300 cm⁻¹) and the fingerprint region (1300–600 cm⁻¹). Characteristic group absorptions usually appear in the functional-group region, where the peaks are sparse; this is helpful for analysis, and it is the most useful region for identifying groups. The fingerprint region mainly contains bands produced by deformation vibrations. When molecular structures differ only slightly, the absorption peaks in this region show subtle differences, making it as specific as a human fingerprint; it is therefore important for distinguishing compounds with similar structures.

Three. Analysis of the infrared absorption spectrum

To facilitate analysis, a typical infrared spectrum is usually divided into eight sections; from the corresponding absorption peaks, the possible structure of the compound can be deduced.

(A) O–H and N–H stretching region (3750–3000 cm⁻¹)
For alcohols, phenols, and acids in non-polar solvents at low concentration (dilute solution), a sharp, strong absorption appears; at higher concentration, association broadens the peak. The peaks of amino compounds are weaker than those of associated O–H and are sharper. The number of absorption peaks depends on the number of substituents on the N atom: primary amines and primary amides give a doublet of roughly equal intensity.

(B) C–H stretching region (3300–2700 cm⁻¹)
The C–H stretching vibrations of different classes of compounds fall at different positions in this region. The ≡C–H, =C–H, and Ar–H absorptions lie above 3000 cm⁻¹, the alkyne band being narrow and relatively strong. Aromatic C–H absorbs near 3030 cm⁻¹, olefinic C–H at 3010–3040 cm⁻¹, and terminal olefins absorb near 3028 cm⁻¹. The absorptions of saturated hydrocarbons and aldehydes fall below 3000 cm⁻¹, which allows saturated and unsaturated hydrocarbons to be distinguished.

(C) Triple-bond and cumulated double-bond region (2400–2100 cm⁻¹)
This region contains relatively few bands; the main absorptions are those of triple bonds and cumulated double bonds.

(D) Carbonyl stretching region (1900–1650 cm⁻¹)
The carbonyl band most commonly appears at 1755–1670 cm⁻¹ and is often the strongest peak in the infrared spectrum. The ν(C=O) absorption is the main evidence for deciding whether a compound contains a carbonyl group. Because of differences in neighbouring groups, the exact peak positions of different carbonyl compounds also differ.

(E) Double-bond stretching region (1690–1500 cm⁻¹)
This region mainly covers the stretching vibrations of C=C, C=N, N=N, and N=O, and the skeletal vibrations of the benzene ring. The olefinic C=C band is generally weak, and as the C=C bond moves toward the centre of the molecule its intensity decreases or vanishes; it therefore cannot serve as the sole basis for judgment. The skeletal stretching vibrations of benzene rings, pyridine rings, and other aromatic rings fall in the range 1600–1450 cm⁻¹, with 3–4 bands near 1600, 1580, 1500, and 1450 cm⁻¹. The 2–3 bands usually seen in this range are commonly used to confirm the presence of aromatic and heteroaromatic rings.

(F) X–H in-plane bending and X–Y stretching region (1475–1000 cm⁻¹)
This region mainly includes C–H in-plane bending, C–O and C–X stretching, and C–C skeletal vibrations, and forms part of the fingerprint region. The fine structure of alkyl groups, alcohols, and ethers can be analysed here. Methyl and methylene groups have a characteristic absorption near 1460 cm⁻¹; in addition, the single peak of an isolated methyl group at 1380 cm⁻¹ is the basis for deciding whether a methyl group is present in the molecule. When two or three methyl groups are attached to the same carbon atom, the 1380 cm⁻¹ peak splits into a doublet because of coupling between the bending vibrations of the methyl groups; this allows an isopropyl group to be identified.

(G) C–H out-of-plane bending region (1000–650 cm⁻¹)
The most important feature of this region is that the out-of-plane C–H bending vibrations of olefins and aromatics are very sensitive to structure, so the positions of substituents on olefins and aromatic rings can be determined from these peaks. Compounds of the type RCH=CH2 show two peaks, at 995 and 910 cm⁻¹. For the R1R2C=CH2 type, where R is alkyl, a strong peak appears at 890 cm⁻¹. Substituents influence the band of cis-disubstituted olefins more strongly than that of the trans isomer; trans olefins absorb at 970 cm⁻¹, which is used to distinguish cis and trans isomers. Identification of the substitution pattern of aromatic compounds from IR spectra relies mainly on two parts of the spectrum: (1) the strong peaks at 900–650 cm⁻¹ and (2) the weak peaks at 2000–1660 cm⁻¹.

The steps for interpreting an infrared spectrum are as follows:
(1) Understand the source and purity of the sample. A purity above 98% is generally required; if the purity is insufficient, the sample must first be separated and purified.
(2) Calculate the degree of unsaturation of the compound.
(3) Determine the groups and their structure from the IR spectrum: use the absorption peaks in the high-wavenumber region to identify the groups and structure, and then confirm them from the fingerprint region. After the structure has been deduced, it can be confirmed by consulting a standard spectral library.

Notes on interpreting spectra:
(1) Enantiomers have identical infrared spectra and cannot be distinguished by IR.
(2) If a characteristic absorption peak is absent, the corresponding group can be taken to be absent. The presence of a peak, however, does not guarantee that the group is present; it may be interference from impurities.
(3) Not every peak in a spectrum can be assigned, because some peaks are overtones or combination bands of other peaks, and some are superpositions of the vibrations of several groups.
(4) A spectrum with only a few broad absorption peaks above 650 cm⁻¹ is usually an inorganic compound.
(5) Peaks at 3350 and 1640 cm⁻¹ may come from water absorbed in the sample.
(6) The infrared spectra of polymers are independent of molecular weight.
(7) Attention should be paid first to the strong peaks, but some weak peaks and shoulders can also provide clues to the structure.
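Step (2) of the interpretation procedure, calculating the degree of unsaturation, is easy to automate. The Python sketch below uses the standard rings-plus-double-bonds formula DBE = C − (H + X)/2 + N/2 + 1; the example formulas are arbitrary and serve only to illustrate the arithmetic.

```python
def degree_of_unsaturation(c: int, h: int = 0, n: int = 0, x: int = 0, o: int = 0) -> float:
    """Rings plus pi bonds for a formula C_c H_h N_n O_o X_x (X = halogens).
    Oxygen and other divalent atoms do not change the result."""
    return c - (h + x) / 2 + n / 2 + 1

if __name__ == "__main__":
    print(degree_of_unsaturation(c=6, h=6))             # benzene C6H6 -> 4.0
    print(degree_of_unsaturation(c=7, h=5, n=1, o=1))   # e.g. C7H5NO -> 6.0
    print(degree_of_unsaturation(c=2, h=6, o=1))        # ethanol C2H6O -> 0.0
```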
Engineering project cost management is a crucial aspect of the construction industry. It involves planning, estimating, budgeting, and controlling the costs of constructing a project. Effective cost management ensures that projects are completed within the allocated budget and on schedule. This article provides an overview of the key aspects of engineering project cost management.

1. Project cost estimation
The first step in cost management is to estimate the total cost of the project. This involves analysing the project requirements, identifying the necessary resources, and determining their cost. Cost estimation can be done using various methods, such as:
- Unit cost method: estimating the cost per unit of work and multiplying it by the total quantity of work.
- Analogy method: comparing the project with similar completed projects to determine the cost.
- Parametric estimation: using mathematical models to estimate the cost based on project parameters.

2. Project cost budgeting
Once the cost estimation is completed, the next step is to create a cost budget. The cost budget provides a detailed breakdown of the expected costs for each stage of the project, including direct costs (labor, materials, equipment) and indirect costs (overhead, contingency, profit). Budgeting helps identify potential cost overruns and supports measures to prevent them.

3. Cost control
Cost control is the process of monitoring and managing the actual costs incurred during project execution. This involves comparing the actual costs with the budgeted costs and identifying any variances. Cost control measures include:
- Cost tracking: keeping a record of all costs incurred during the project.
- Variance analysis: analysing the reasons for cost variances and taking corrective actions (see the sketch below).
- Change management: managing any changes in project scope, schedule, or resources that may affect cost.

4. Cost reduction strategies
Cost reduction is an essential aspect of cost management. Some strategies to reduce costs include:
- Value engineering: identifying and eliminating unnecessary features or components from the project design.
- Efficient scheduling: optimizing the project schedule to minimize idle time and reduce labor and equipment costs.
- Vendor management: selecting the most cost-effective suppliers and negotiating favorable terms.

5. Role of technology
Technology plays a significant role in engineering project cost management. Some technological tools and software used for cost management include:
- Cost estimation software: tools that help estimate project costs based on historical data and industry benchmarks.
- Project management software: software for tracking project progress, managing resources, and controlling costs.
- BIM (Building Information Modeling): a digital representation of the project that helps visualize it and identify cost-saving opportunities.

6. Conclusion
Engineering project cost management is a comprehensive process that involves cost estimation, budgeting, control, and reduction strategies. Effective cost management ensures that projects are completed within the allocated budget and on schedule, leading to successful project delivery. By leveraging technology and adopting best practices, organizations can enhance their cost management capabilities and achieve greater profitability in the construction industry.
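To make the variance-analysis step concrete, here is a small Python sketch comparing budgeted and actual costs per work package. The package names and figures are invented; cost variance is computed simply as budget minus actual, and overruns are flagged.

```python
# Hypothetical work packages: (budgeted cost, actual cost to date), in currency units.
packages = {
    "earthworks":  (150_000, 162_500),
    "foundations": (320_000, 301_000),
    "steel frame": (540_000, 551_200),
}

def variance_report(pkgs: dict[str, tuple[float, float]]) -> None:
    """Print budget vs. actual, the cost variance, and flag overruns."""
    total_budget = total_actual = 0.0
    for name, (budget, actual) in pkgs.items():
        variance = budget - actual          # negative => over budget
        flag = "OVER BUDGET" if variance < 0 else "ok"
        print(f"{name:12s} budget={budget:>10,.0f} actual={actual:>10,.0f} "
              f"variance={variance:>+10,.0f}  {flag}")
        total_budget += budget
        total_actual += actual
    print(f"{'TOTAL':12s} budget={total_budget:>10,.0f} actual={total_actual:>10,.0f} "
          f"variance={total_budget - total_actual:>+10,.0f}")

if __name__ == "__main__":
    variance_report(packages)
```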
THE STATA NEWS
Volume 18, Number 3, July/August/September 2003

Inside this issue: FDA file format now supported · Time-series graphs in Stata 8 · Multiple-language dataset support · New books from Stata Press · From the Stata Bookstore · Latest NetCourse schedule

FDA file format now supported

Stata 8 can now read and write files in the format required for submissions to the U.S. Food and Drug Administration (FDA): SAS XPORT format. Two new commands provide this ability: fdasave and fdause. The primary intent of these commands is to assist people in making submissions to the FDA, but the commands are general enough to use in transferring data between SAS and Stata.

These new commands are included in the latest free update to Stata 8, which you obtain by launching Stata, typing update, and following the instructions. You can access online documentation after updating by typing whelp fdasave.

fdasave saves the data in the FDA's official format for submitting and archiving new drug and device applications (NDAs). To format your Stata dataset for submission, type

. fdasave filename

Stata creates an FDA format file filename.xpt containing the data. If the data include value labels, Stata automatically creates an additional FDA format file, formats.xpf, containing the value-label definitions. These files are in exactly the format expected by the FDA and can easily be read into Stata or into SAS Statistical Software (SAS Institute Inc., Cary, North Carolina).

fdause reads FDA format files into Stata. You type

. fdause filename

Stata reads the FDA format file filename.xpt, and if the file formats.xpf also exists, Stata also reads the value-label definitions. For more information about FDA NDA submissions, see /cder/guidance/2867fnl.pdf.

Time-series graphs in Stata 8

Stata 8 has new features for the graphical display of time-series data. Stata's graphics engine has an enhanced time axis and associated options that allow you to specify date strings instead of numeric values. The latest update also contains new time-series plottypes, tsline and tsrline, and a graph command for plotting panel data, xtline.

When you tsset your data and identify the scale of the time variable (e.g., daily, weekly, monthly, …), Stata creates a time axis when plotting variables over time. You can now specify date strings when adding tick marks (both major and minor, labeled and unlabeled), vertical lines, and text boxes.

tsline and tsrline automatically identify the time variable set by tsset. tsline generates line plots of one or more time series, and tsrline generates range plots with lines. These two plottypes can be combined; for example, you can plot a forecast with error bounds.

. tsset time, daily
. tsline ypred y || tsrline ll ul || in -60/l, ytitle("") tlabel(minmax 15apr2003) tline(15apr2003)

[example time-series graph omitted]

xtline automatically identifies the time and panel variables set by tsset. By default, xtline plots several variables versus time for each panel separately. Optionally, xtline can also produce a single graph with line plots overlaid by panel. You can obtain online documentation after updating by typing whelp tsline, whelp xtline, and whelp twoway_options.

Multiple-language dataset support

Just added to Stata 8 is label language, which allows you to create datasets that contain data, variable, and value labels in different languages.
A dataset might contain one set of labels in English, another in German, and a third in Spanish. Or, a dataset might contain labels all in the same language, one set of long labels and another set of shorter ones. A dataset may contain up to 100 sets of labels.This new capability is available for free to all Stata 8 users–just launch Stata, type update , and follow the instructions.Regression Models for Categorical DependentVariables Using Stata, Revised EditionWhile regression models ubiquitous, a discussion of how to interpret these models has been sorely lacking. Regression Models for Categorical Dependent Variables Using Stata, Revised Edition void. This book discusses how to fit and interpret regression models for categorical data with Stata and includes some commands written by the authors. Hypothesis testing and goodness-of-fit statistics are also discussed.Copyright: An Introduction Stata, Revised EditionAn Introduction to Survival Analysis Using Stata, Revised Edition is the ideal tutorial for professional data analysts who want to learn survival analysis for the first time or who are well versed in survival analysis but not as dexterous in using Stata to analyze survival data. This text also serves as a valuable reference to those who already have experience using Stata’s survival analysis routines.Survival analysis is a field of its own requiring specialized data man-Common Errors in StatisticsCommon Errors in Statistics contains common-sense, minimally technical advice on how to improve experimental design, analysis of data, and presentation of results. It provides a guide for experienced scientists as well as students learning to design and complete experiments and statistical analysis.The text begins with a section on foundations that covers sources of error, hypotheses, and data collection. The second section, on hypothesis testing and parameter estimation, takes a harder look at statistical evaluation of the data, including strengths and limitations of various statistical procedures, and guidelines for reporting results from what information to include to how to create an informative Essential Medical Statistics, 2d edEssential Medical Statistics gives an excellent overview of the basics of medical statistics and achieves a good balance contingency tables, and regression. As personal computers become more powerful, the use of complex models in leading medical journals is becoming more prevalent, and the second edition of this text follows this shift by focusing more on these models. A very nice section on basic meta-analysis is also included.Covered in the text are the basics of normal-based tests and inference, including linear regression, contingency tables and logistic regression, Copyright: Survival Analysis: Techniques for Censored and Truncated Data, 2d edSurvival Analysis: Techniques for Censored and Truncated Data John Klein and Melvin Moeschberger is an essential reference for any researcher using techniques of survival analysis. Ideal for self-study or for a two-term graduate sequence in survival analysis, this book maintains a technical level suited for both the medical researcher and the professional statistician. 
Most impressive are the number and scope of real-data examples that are presented in the first chapter and used subsequently throughout the text.Copyright:4HOW TO REACH USStata Corporation PHONE 979-696-46004905 Lakeway Drive FAX 979-696-4601College Station TX 77845 EMAIL stata@ USA WEB Please include your Stata serial number with all correspondence.THE STATA NEWS is published four times a year and is free to all registered users of Stata.I N T E R N A T I O N A L D I S T R I B U T O R SChips ElectronicsServing Brunei, Indonesia, Malaysia, Singapore tel: 62 - 21 - 452 17 61 email: puyuh23@.id Cosinus Computing BV Serving The Netherlands tel: +31 416-378 125 email: info@cosinus.nl Dittrich & Partner Consulting Serving Austria, Czech Republic, Germany, Hungary, Poland tel: +49 2 12 / 26 066 - 0 email: sales@dpc.de Ixon Technology Company Ltd Serving Taiwan tel: +886-(0)2-27045535 email: hank@ MercoStat Consultores Serving Argentina, Brazil, Paraguay, Uruguay tel: 598-2-613-7905email: mercost@.uy Metrika ConsultingServing the Baltic States, Denmark, Finland, Iceland, Norway, Sweden tel: +46-708-163128 email: sales@metrika.se MultiON Consulting S.A. de C.V. Serving Belize, Costa Rica, El Salvador, Guatemala, Honduras, Mexico, Nicaragua, Panama tel: 52 (55) 55 59 40 50 email: info@.mxRitme InformatiqueServing Belgium, France, Luxembourg tel: +33 (0)1 42 46 00 42 email: info@ Scientific Solutions S.A. Serving Switzerland tel: 41 (0)21 711 15 20email: info@scientific-solutions.ch Survey Design & Analysis Services Serving Australia, New Zealand tel: +61 (0)3 9878 7373 email: sales@.au Timberlake Consultants Serving Eire, U.K. tel: +44 (0)208 697 3377 email: info@ Timberlake Consulting S.L. Serving Spaintel: +34 (9) 5 560 14 30 email:timberlake@ Timberlake Consultores, Lda. Serving Portugal tel: +351 214 702 869 email:timberlake.co@mail.telepac.pt TStat S.r.l.Serving Italytel: +39 0864 210101email: tstat@tstat.it Vishvas Marketing-Mix Services Serving India tel: 91-22-25892639email: bandya@I N T E R N A T I O N A L R E S E L L E R SCreActive snc Serving Italy tel: +39 0575 333297 email: staff@ IEMServing Botswana, Lesotho,Mozambique, Namibia,South Africa, Swaziland, Zimbabwe tel: +27-11-8286169email: instruments@mweb.co.za Informatique Inc Serving Japan tel: +81-3-3505-1250email: sales@informatiq.co.jpSOFTWARE shop IncServing Bolivia, Chile, Colombia,Ecuador, Peru, Venezuela tel: 425-651-4090email: ventas@ Timberlake Consultores Brasil Serving Braziltel: +55-11-3263-1287 email: info@.br Timberlake Consultants Polska Serving Poland tel: +48 600 370 435 email: info@timberlake.plNC-101 is designed to take smart, knowledgeable people and turn theminto proficient interactive users of Stata. The course covers not just the obvious, such as getting data into Stata, but also covers detailed techniques and tricks to make you a powerful Stata user. From web update features and match-merging to using by groups and explicit subscripting, many of Stata’s key concepts are explored.NC-151 is intended for all Stata users. Through a combination of lectures, example applications, and carefully chosen exercises, the course addresses the full range of methods and techniques you will need to be most productive in the Stata environment. Beginning with effective ways to organize both simple and complicated analyses in Stata, NetCourse 151 moves into programming elements that can be used to work more efficiently. 
Key programming topics include macro processing, program flow of control, using do-files, programming ado-files, Monte Carlo simulations, and bootstrapped standard errors.From the Stata Bookstore Introduction to StataPrerequisites Stata 8Dates offered October 17 – November 28Course LeadersJames Hassell, Allen McDowell, and Derek WagnerEnrollment DeadlinesOctober 13Price $95From the Stata Bookstore Introduction to Stata programmingPrerequisitesStata 8; basic knowledge of using Stata interactivelyDates offered October 17 – November 28Course LeadersAllen McDowell and Kevin CrowEnrollment DeadlinesOctober 13Price $125。
representing the rate constraint Rate constraint is a fundamental concept in various fields, including computer science, engineering, and economics. It refers to the maximum rate at which a certain process can occur or be performed. Representing the rate constraint accurately is crucial for optimizing systems and ensuring their efficient functioning. In this article, we will explore different methods of representing the rate constraint and their applications.One common way to represent the rate constraint is through mathematical equations. For example, in computer science, we may use queuing theory to model the rate at which tasks are processed by a system. The rate constraint can be represented by a formula that takes into account the number of resources available and the time required to complete each task. This allows us to calculate the maximum throughput of the system and optimize its performance.In engineering, rate constraints are often represented using flow diagrams or phasor diagrams. These diagrams show the relationship between different variables, such as voltage, current, and power, and help engineers design systems that operate within specified rate limits. For instance, in electrical circuits, we may use Kirchhoff's laws to deriveequations that represent the rate constraint for current flow through resistors and capacitors.In economics, rate constraints can be represented using supply and demand curves. These curves show the relationship between the price of a good or service and the quantity that consumers are willing to buy or producers are willing to sell. By analyzing these curves, economists can determine the optimal production and consumption levels that satisfy rate constraints while maximizing social welfare.Another approach to representing rate constraints is through simulation models. These models allow us to test different scenarios and observe how changes in input parameters affect system performance. For example, in traffic engineering, we may use simulation software to model the flow of vehicles on a road network and analyze how changes in traffic light timings or lane configurations affect the overall rate of traffic flow.In conclusion, representing the rate constraint is essential for optimizing systems and ensuring their efficient operation. Different methods, such as mathematical equations, flow diagrams, supply and demand curves, and simulation models, can be used depending on the field of application. By accuratelyrepresenting rate constraints, we can make informed decisions and design systems that meet our needs while minimizing resource usage and maximizing efficiency.。
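To ground the queuing-theory example above, here is a short Python sketch of an M/M/1 model, a standard single-server queue often used to reason about rate constraints: the service rate μ caps the sustainable arrival rate λ, and the mean time in the system is 1/(μ − λ) when λ < μ. The arrival and service rates below are arbitrary illustrative values.

```python
def mm1_metrics(arrival_rate: float, service_rate: float) -> dict:
    """Steady-state metrics of an M/M/1 queue; the rate constraint is arrival_rate < service_rate."""
    if arrival_rate >= service_rate:
        raise ValueError("rate constraint violated: the queue is unstable (lambda >= mu)")
    rho = arrival_rate / service_rate                        # server utilization
    time_in_system = 1.0 / (service_rate - arrival_rate)     # mean sojourn time W
    jobs_in_system = arrival_rate * time_in_system           # Little's law: L = lambda * W
    return {"utilization": rho,
            "mean_time_in_system": time_in_system,
            "mean_jobs_in_system": jobs_in_system}

if __name__ == "__main__":
    # Example: tasks arrive at 8 per second, the server completes 10 per second.
    print(mm1_metrics(arrival_rate=8.0, service_rate=10.0))
    # {'utilization': 0.8, 'mean_time_in_system': 0.5, 'mean_jobs_in_system': 4.0}
```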
INFORMATION CRITERION AND CHANGE POINT PROBLEM FOR REGULAR MODELS

Information criteria are commonly used for selecting among competing statistical models. They do not favor the model that gives the best fit to the data, which may simply interpolate the data and have little interpretive value, but simpler models with good fit. Thus, model complexity is an important factor in information criteria for model selection. Existing results often equate the model complexity to the dimension of the parameter space. Although this notion is well founded in regular parametric models, it lacks some desirable properties when applied to irregular statistical models. We refine the notion of model complexity in the context of change point problems and modify the existing information criteria. The modified criterion is found consistent in selecting the correct model and has simple limiting behavior. The resulting estimator of the location of the change point achieves the best convergence rate Op(1), and its limiting distribution is obtained. Simulation results indicate that the modified criterion has better power in detecting changes compared to other methods.

Introduction

Out of several competing statistical models, we do not always use the one with the best fit to the data. Such models may simply interpolate the data and have little interpretive value. Information criteria, such as the Akaike information criterion and the Schwarz information criterion, are designed to select models with simple structure and good interpretive value; see Akaike (1973) and Schwarz (1978). The model complexity is often measured in terms of the dimensionality of the parameter space.

Consider the problem of making inference on whether a process has undergone some change. In the context of model selection, we want to choose between a model with a single set of parameters and a model with two sets of parameters plus the location of the change. The Akaike and the Schwarz information criteria can be readily adapted to this kind of change point problem. There has been much fruitful research in this respect, such as Hirotsu, Kuriki and Hayter (1992) and Chen and Gupta (1997), to name a few.

Compared to usual model selection problems, the change point problem contains a special parameter: the location of the change. When it approaches the beginning or the end of the process, one of the two sets of parameters becomes completely redundant, and the model is unnecessarily complex. This observation motivates the notion that the model complexity also depends on the location of the change point. Consequently, we propose to generalize the Akaike and Schwarz information criteria by making the model complexity a function of the location of the change point as well. The new method is shown to have a simple limiting behavior and favourable power properties in many situations via simulation.

The change point problem has been extensively discussed in the literature in recent years. Its study dates back to Page (1954, 1955, 1957), who tested the existence of a change point. Parametric approaches to this problem have been studied by a number of researchers; see Chernoff and Zacks (1964), Hinkley (1970), Hinkley et al. (1980), Siegmund (1986) and Worsley (1979, 1986). Nonparametric tests and estimators have also been proposed (Brodsky and Darkhovsky, 1993; Lombard, 1987; Gombay and Huskova, 1998).
Extensive discussions of the large-sample behavior of likelihood ratio test statistics can be found in Gombay and Horvath (1996) and Csorgo and Horvath (1997). Further details appear in survey literature such as Bhattacharya (1994), Basseville and Nikiforov (1993), Zacks (1983), and Lai (1985). The present study deviates from other studies by refining the traditional measure of model complexity, and by determining the limiting distribution of the resulting test statistic under very general parametric model settings.

In Section 2, we define and motivate the new information criterion in detail. In Section 3, we give the conditions under which the resulting test statistic has a chi-square limiting distribution and the estimator $\hat\tau$ of the change point attains the best convergence rate. An application example and some simulation results are given in Section 4, where the new method is compared to three existing methods and found to have good finite-sample properties. The proofs are given in the Appendix.

Main Results

Let $X_1, X_2, \ldots, X_n$ be a sequence of independent random variables. It is suspected that $X_i$ has density function $f(x,\theta_1)$ when $i \le k$ and density $f(x,\theta_2)$ for $i > k$. We assume that $f(x,\theta_1)$ and $f(x,\theta_2)$ belong to the same parametric distribution family $\{f(x,\theta): \theta \in R^d\}$. The problem is to test whether this change has indeed occurred and, if so, to find the location of the change $k$. The null hypothesis is $H_0: \theta_1 = \theta_2$, and the alternative is $H_1: \theta_1 \ne \theta_2$ and $1 < k < n$. Equivalently, we are asked to choose for the data a model from $H_0$ or a model from $H_1$.

For regular parametric (not change point) models with log-likelihood function $l_n(\theta)$, the Akaike and Schwarz information criteria are defined as
$$\mathrm{AIC} = -2\,l_n(\hat\theta) + 2\dim(\theta), \qquad \mathrm{SIC} = -2\,l_n(\hat\theta) + \dim(\theta)\log(n),$$
where $\hat\theta$ is the maximum point of $l_n(\theta)$. The best model according to these criteria is the one which minimizes AIC or SIC. The Schwarz information criterion is asymptotically optimal according to a certain Bayes formulation.

The log-likelihood function for the change point problem has the form
$$l_n(\theta_1,\theta_2;k) = \sum_{i=1}^{k}\log f(x_i,\theta_1) + \sum_{i=k+1}^{n}\log f(x_i,\theta_2).$$
The Schwarz information criterion for the change point problem becomes
$$\mathrm{SIC}(k) = -2\,l_n(\hat\theta_1,\hat\theta_2;k) + [\,2\dim(\theta)+1\,]\log(n),$$
and similarly for the Akaike information criterion, where $\hat\theta_1,\hat\theta_2$ maximize $l_n(\theta_1,\theta_2;k)$ for given $k$; see, for example, Chen and Gupta (1997). When the model complexity is the focus, we may also write it as
$$\mathrm{SIC}(k) = -2\,l_n(\hat\theta_1,\hat\theta_2;k) + \mathrm{complexity}(k;\theta_1,\theta_2)\log(n).$$

We suggest that the notion $\mathrm{complexity}(k;\theta_1,\theta_2) = 2\dim(\theta)+1$ needs re-examination in the context of the change point problem. When $k$ takes values in the middle of $1$ and $n$, both $\theta_1$ and $\theta_2$ are effective parameters. When $k$ is near $1$ or $n$, either $\theta_1$ or $\theta_2$ becomes redundant. Hence, $k$ is an increasingly undesirable parameter as it gets close to $1$ or $n$. We therefore propose a modified information criterion with
$$\mathrm{complexity}(k;\theta_1,\theta_2) = 2\dim(\theta) + \Big(\frac{2k}{n}-1\Big)^2 + \mathrm{constant}. \qquad (2)$$
For $1 < k < n$, let
$$\mathrm{MIC}(k) = -2\,l_n(\hat\theta_1,\hat\theta_2;k) + \Big[\,2\dim(\theta) + \Big(\frac{2k}{n}-1\Big)^2\Big]\log(n). \qquad (3)$$
Under the null model, we define
$$\mathrm{MIC}(n) = -2\,l_n(\hat\theta;n) + \dim(\theta)\log(n). \qquad (4)$$
If $\mathrm{MIC}(n) > \min_{1<k<n}\mathrm{MIC}(k)$, we select the model with a change point and estimate the change point by $\hat\tau$ such that
$$\mathrm{MIC}(\hat\tau) = \min_{1<k<n}\mathrm{MIC}(k).$$
Clearly, this procedure can be repeated when a second change point is suspected.

The size of the model complexity can be motivated as follows. If the change point is at $k$, the variance of $\hat\theta_1$ would be proportional to $k^{-1}$ and the variance of $\hat\theta_2$ would be proportional to $(n-k)^{-1}$. Thus, the total variance is
$$\frac{1}{k} + \frac{1}{n-k} = \Big[\frac{n}{4} - n\Big(\frac{k}{n}-\frac{1}{2}\Big)^2\Big]^{-1}.$$
The specific form in (2) reflects this important fact. Thus, if a change at an early stage is suspected, relatively stronger evidence is needed to justify the change; hence we should place a larger penalty when $k$ is near $1$ or $n$. This notion is shared by many researchers: the method in Inclan and Tiao (1994) scales the statistic down more heavily when the suspected change point is near $1$ or $n$, and the U-statistic method in Gombay and Horvath (1995) is scaled down by multiplying by the factor $k(n-k)$.

To assess the error rates of the method, we can simulate the finite-sample distribution, or find the asymptotic distribution of the related statistics. For the Schwarz information criterion, the related statistic is found to have a type I extreme value distribution asymptotically (Chen and Gupta, 1997; Csorgo and Horvath, 1997). We show that the MIC statistic has a chi-square limiting distribution for any regular distribution family, and that the estimator $\hat\tau$ achieves the best convergence rate $O_p(1)$ and has a limiting distribution expressed via a random walk.

Our asymptotic results under the alternative model are obtained under an assumption on the location of the change point $k$; thus $\{X_{in}: 1\le i\le n\}$, $n\ge 2$, form a triangular array. The classical results on almost sure convergence for independent and identically distributed (iid) random variables cannot be directly applied. However, the conclusions on weak convergence are not affected, since the related probability statements do not depend on how one sequence is related to the other. Precautions will be taken on this issue, but details will be omitted.

Let
$$S_n = \mathrm{MIC}(n) - \min_{1<k<n}\mathrm{MIC}(k) + \dim(\theta)\log(n),$$
where $\mathrm{MIC}(k)$ and $\mathrm{MIC}(n)$ are defined in (3) and (4). Note that this standardization removes the constant term $\dim(\theta)\log(n)$ in the difference of $\mathrm{MIC}(k)$ and $\mathrm{MIC}(n)$.
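As a concrete illustration of the criterion, the sketch below applies MIC(k) to a mean change in independent normal observations with known unit variance (so dim(θ) = 1). The simulated data and the single-change setting are illustrative assumptions, not part of the paper; the code simply evaluates equations (3) and (4) and picks the minimizing k.

```python
import numpy as np

def normal_loglik(x: np.ndarray, mu: float) -> float:
    """Log-likelihood of N(mu, 1) observations."""
    return -0.5 * len(x) * np.log(2 * np.pi) - 0.5 * np.sum((x - mu) ** 2)

def mic_change_point(x: np.ndarray, dim_theta: int = 1):
    """Return (MIC(n), min_k MIC(k), tau_hat) for a single mean change, known variance."""
    n = len(x)
    mic_null = -2 * normal_loglik(x, x.mean()) + dim_theta * np.log(n)   # eq. (4)
    best_k, best_mic = None, np.inf
    for k in range(2, n):                                                # 1 < k < n
        loglik = normal_loglik(x[:k], x[:k].mean()) + normal_loglik(x[k:], x[k:].mean())
        penalty = (2 * dim_theta + (2 * k / n - 1) ** 2) * np.log(n)     # eq. (3)
        mic_k = -2 * loglik + penalty
        if mic_k < best_mic:
            best_k, best_mic = k, mic_k
    return mic_null, best_mic, best_k

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = np.concatenate([rng.normal(0.0, 1.0, 60), rng.normal(1.0, 1.0, 40)])  # change at k = 60
    mic_n, mic_min, tau = mic_change_point(x)
    if mic_n > mic_min:
        print(f"change detected, estimated location tau_hat = {tau}")
    else:
        print("no change selected")
```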
A new approach for multimodel identification of complex systems based on both neural and fuzzy clustering algorithmsNesrine Elfelly a,c,Jean-Yves Dieulot b,Mohamed Benrejeb c,Pierre Borne a,Ãa Ecole Centrale de Lille,LAGIS UMR CNRS8146,Cite´Scientifique,59650Villeneuve d’Ascq,Franceb Ecole Polytechnique de Lille,LAGIS UMR CNRS8146,Cite´Scientifique,59650Villeneuve d’Ascq,Francec Ecole Nationale d’Inge´nieurs de Tunis,UR LARA Automatique,BP371002Tunis Le Belve´de re,Tunisiaa r t i c l e i n f oArticle history:Received13May2009Received in revised form22January2010Accepted11June2010Keywords:Complex systemsMultimodelIdentificationRival Penalized Competitive LearningK-meansFuzzy K-meansa b s t r a c tThe multimodel approach was recently developed to deal with the issues of complex systems modeling andcontrol.Despite its success in differentfields,it is still faced with several design problems,in particular thedetermination of the number and parameters of the different models representative of the system as well asthe choice of the adequate method of validities computation used for multimodel output deduction.In this paper,a new approach for complex systems modeling based on both neural and fuzzy clusteringalgorithms is proposed,which aims to derive different models describing the system in the whole operatingdomain.The implementation of this approach requires two main steps.Thefirst step consists indetermining the structure of the model-base.For this,the number of models must befirstly worked out byusing a neural network and a Rival Penalized Competitive Learning(RPCL).The different operating clustersare then selected referring to two different clustering algorithms(K-means and fuzzy K-means).The secondstep is a parametric identification of the different models in the base by using the clustering results formodel orders and parameters estimation.This step is ended in a validation procedure which aims toconfirm the efficiency of the proposed modeling by using the adequate method of validity computation.Theproposed approach is implemented and tested with two nonlinear systems.The obtained results turn out tobe satisfactory and show a good precision,which is strongly related to the dispersion of the data and therelated clustering method.&2010Elsevier Ltd.All rights reserved.1.IntroductionIn process industries,many operations include set-pointchanges and/or the co-existence of multiple operating modes.As an example,one can consider greenhouse control for whichnight or daylight workout is quite different,while otherconsiderations such as crop ripeness,outdoor temperature/humidity strongly influence the dynamical evolution of the plant(Salgado and Cunha,2005).While it is possible to represent thewhole nonlinear system,it is often preferred to simplify themodel considering a set of well-known operating points,forwhich the subsystems are linear or not.The main difficultyappears to be the monitoring of the plant between two differentoperating modes,which needs the blending of models and/or thecorresponding controls across different operating domains,known as multimodel systems/control(Delmotte et al.,1996;Johansen and Foss,1999).The blending of a priori known linear models and control hasbeen widely covered in the literature,e.g.by fuzzy Takagi–Sugenomodels(Takagi and Sugeno,1985;Angelov and Filev,2004),oftenusing linearization methods.However,the multimodel represen-tation is more difficult to obtain when the subsystems arenonlinear and/or should be determined from raw input–outputdata.In the latter 
case,thefirst stage consists infinding theappropriate size for the model-base,then clustering the data andestimating the models st,the blending functionsbetween several models should be estimated in the common casewhere the operating domains overlap.Several neural and fuzzy clustering algorithms have shown tobe relevant to handle a set of dynamical models(e.g.Gegu´ndezet al.,2008).For example,neural networks have been able torepresent and control such systems as a non-linear output-to-state mapping embedding a linear system(Baruch et al.,2008),aneural network using a direct representation of the nonlineardynamics into the neurons(Manioudakis et al.,2001;Vieira et al.,2004),or multiple neural networks(e.g.Ronen et al.,2002;Yu,2006;Fu and Chai,2007).They have been also used for building amultimodel representation and control from data using forexample the Kohonen Self Organizing Map as a clusteringtechnique(Cho et al.,2006,2007).On the other hand,thanks toContents lists available at ScienceDirectjournal homepage:/locate/engappaiEngineering Applications of Artificial Intelligence0952-1976/$-see front matter&2010Elsevier Ltd.All rights reserved.doi:10.1016/j.engappai.2010.06.004ÃCorresponding author.E-mail addresses:nesrine.elfelly@ed.univ-lille1.fr(N.Elfelly),jean-yves.dieulot@polytech-lille.fr(J.-Y.Dieulot),mohamed.benrejeb@enit.rnu.tn(M.Benrejeb),pierre.borne@ec-lille.fr(P.Borne).Engineering Applications of Artificial Intelligence23(2010)1064–1071their ability to classify data and their simplicity,K-means algorithms have been shown to be efficient for data clustering (e.g.Cheung,2003;Dembe´le´and Kastner,2003;Hore et al.,2008; Kanzawa et al.,2008).In short,whereas many architectures using multiple models and neural networks have been proposed,there has not been much work on clustering techniques,based on neural networks and K-means algorithms(Xue and Li,2006),applied to traditional multimodel representation using only input/output data.In this context,most of the proposed studies use clustering algorithms for the identification of the Takagi–Sugeno models(e.g.Kukolj and Levi,2004;Vernieuwe et al.,2006;Li et al.,in press).The most tedious issues are related to the size of the model-base and the data clustering procedure which aims to the determination of the operating domains of the process.This paper thus studies the feasibility of designing a clustering algorithms-based approach ofa multimodel representation devoted to the modeling of a class of complex systems,which has not been commonly addressed by prior studies.It extends and completes preliminary results given in Elfelly et al.(2008).In fact,instead of using a heuristic method for clusters’number determination like in Elfelly et al.(2008),an appropriate clustering method called Rival Penalized Competitive Learning(RPCL)(Du,2010;Tambe et al.,1996;Xu et al.,1993)is proposed to handle this classical issue.Besides,the clustering procedure,which aims to build the model base from input–output data,is improved here.A quite rough clustering method based on Kohonen maps(Elfelly et al.,2008)is replaced by the introduction of the K-means and fuzzy K-means algorithms.This allows to make some interpretations about the selection of the appropriate clustering method and to distinguish two kinds of models,those for which operating domains slightly overlap,and those for which these domains strongly overlap.Simulation examples will confirm the relevance of the suggested approach.2.Multimodel structure determination using clustering algorithmsThe 
multimodel structure was introduced as a global approach based on multiple local LTI models(linear or affine).Conse-quently,it assumes that it is possible to replace a unique nonlinear representation by a combination of simpler models thus building a so-called model-base.Each model of this base describes the behavior of the considered process at a specific operating point.The interaction between the different models of the base through normalized activation functions allows the modeling of the global non-linear and complex system.Therefore, the multimodel approach aims at lowering the system complexity by studying its behavior under specific conditions.The multi-model principle is given in Fig.1.The different models of the base could be of different structures and orders but no model can represent the system in its whole operating domain.The decision unit allows the estimation of the weight of each model and thus the selection of the most relevant models at each time.As for the output unit, controlled by the decision unit,it allows the computation of the multimodel output which is obtained by the contribution of the different models’outputs.2.1.Model-base construction procedureThe proposed approach allows the construction of the model-base by using some clustering algorithms(Du,2010).The application of this approach requiresfirstly the determination of the number of models:this will be handled by using a two-layers competitive neural network and a Rival Penalized Competitive Learning(RPCL)(Xu et al.,1992).The clustering result given by the RPCL will influence the choice of the appropriate K-means algorithm(K-means,fuzzy K-means)for data clustering.Once the operating domains generated,a structural and parametric identi-fication of different base-models is carried out by exploiting the clustering results.Thefinal step consists in validating the modeling strategy using an adequate method for validity computation allowing the generation of the multimodel output which will be compared to the real system output for a different set of inputs.2.2.Determination of the number of models with a neural network and a Rival Penalized Competitive Learning(RPCL)algorithm Thisfirst step requires experimental data which are obtained when the considered system is excited by applying an appropriate input sequence.In order to generate the different operating domains of the process,the measurements must be merged into a set of clusters.The idea consists in applying a clustering algorithm with unsupervised learning.Most existing clustering algorithms (Jain and Dubes,1988;Mirkin,1996)do not handle the selection of the appropriate number of clusters,which is,however, essential to estimation and clustering performance in the multi-model approach when no information is available about the operating domains and their number.However,many experi-mental results have shown that the RPCL algorithm automatically allocates an appropriate number of units for an input data set when they are used for clustering.2.2.1.System excitation and data collectingThe excitation of the system consists in applying an input signal and then collecting the useful measurements(output or input/output).The number of measured variables depends on the system complexity.The excitation signal must be rich enough and persistently exciting with well-chosen parameters in order to allow a full excitation of the operating dynamics,and to take into consideration the non-linear aspect of the considered process. 
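The data-collection step just described can be sketched as follows: a toy first-order nonlinear system is excited with a piecewise-constant signal that visits several amplitude levels, and the input/output pairs are stored for the later clustering steps. The system, signal levels, and horizon are illustrative assumptions, not the processes used by the authors.

```python
import numpy as np

def excitation_signal(n_steps: int, hold: int = 10,
                      levels=(-1.0, -0.3, 0.4, 1.0), seed: int = 0) -> np.ndarray:
    """Piecewise-constant excitation: a random level held for `hold` samples,
    so that several amplitude ranges (operating zones) are visited."""
    rng = np.random.default_rng(seed)
    return np.repeat(rng.choice(levels, size=n_steps // hold + 1), hold)[:n_steps]

def toy_nonlinear_system(u: np.ndarray) -> np.ndarray:
    """First-order discrete-time system with a saturating nonlinearity (illustrative only)."""
    y = np.zeros_like(u)
    for k in range(1, len(u)):
        y[k] = 0.8 * y[k - 1] + 0.5 * np.tanh(2.0 * u[k - 1])
    return y

if __name__ == "__main__":
    u = excitation_signal(500)
    y = toy_nonlinear_system(u)
    data = np.column_stack([u, y])   # input/output observations fed to the clustering step
    print(data.shape)                # (500, 2)
```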
2.2.2.Selection of the number of models via RPCLFor tackling the issue of determination of the number of models in the multimodel representation,via input–output data, we propose to use a neural network and to apply the learning algorithm called RPCL which allows the selection of the adequate number of operating clusters for an input data set,where the extra units are gradually driven far away from the distribution of the data set when the number of units is larger than the real number of clusters in the input data set.RPCL is an unsupervised learning strategy(proposed by Xu et al.,1993and renewed by Tambe et al.,1996),that auto-matically determines the optimal number of nodes.TheprincipleFig.1.Multimodel approach.N.Elfelly et al./Engineering Applications of Artificial Intelligence23(2010)1064–10711065underlying RPCL can be considered as an extension of the competitive learning based on Kohonen (1990)rule.The specifi-city of the RPCL lies in the modification,for each input vector,not only of the winner weights,but also of the weights of its rival (called second winner)so that the rival will be moved or penalized.The rate at which the rival is penalized is much smaller than the learning rate (Borne et al.,2007).Given a competitive learning neural network (Fig.2),i.e.a layer of units with the output u i of each unit and its weight vector w i for i ¼1,y ,K (K is the number of clusters),the RPCL algorithm can be described by the following steps.1.Initialize weight vectors w i randomly.2.Take a sample x from a data set D ,and for i ¼1,y ,K ,letu i ¼1if i ¼c ,À1if i ¼r ,0otherwise ,8><>:ð1Þsuch thatg c J x Àw c J 2¼min jg j J x Àw j J 2ð2a Þg r J x Àw r J 2¼min j a cg j J x Àw j J 2,ð2b ÞJ ÃJ :Euclidean distance;c :index of the unit which wins the competition (winner);w c :weight vector of the winner;r :second winner (rival)index;w r :weight vector of the rival;g j :conscience factor (relative winning frequency)used to reduce the winning rate of the frequent winners.It is so-called because a processing term that wins too often begins to ‘‘feel guilty’’and avoids itself from winning excessively.It is useful to develop a set of equiprobabil-istic features or prototypes representing the input data.g j is calculated as follows (Nair et al.,2003):g j ¼n jP i ¼1n i,ð3Þwhere n j refers to the cumulative number of occurrences the node j has won the competition (u j ¼1).3.Update the weight vectors as follows:w j ðt þ1Þ¼w j ðt ÞþD w j ,ð4ÞwhereD w j ¼a c ðt Þðx Àw j ðt ÞÞif u j ¼1,Àa r ðt Þðx Àw j ðt ÞÞif u j ¼À1,0otherwise ,8><>:ð5Þ0r a c ðt Þand 0r a r ðt Þr 1are,respectively,the winner learningrate and the rival de-learning rate.In practice,the rates are fixed small numbers or depend on time (starting from not so small initial values and are then reduced to zero in some way)with a c ðt Þb a r ðt Þat each step.Several empirical functions have been proposed for the update of the learning and de-learning rates (King et al.,1998;Nair et al.,2003).4.Repeat steps 2and 3until the whole learning process has converged.Referring to the learning results,the number of clusters is determined manually.In fact,only the units enclosed by the data set must be considered and the number of clusters could be so deduced as equal to the number of the retained units.This step could be,however,automated by considering the position of the units to the convex envelope of the data set.Besides,this is an improvement with respect to heuristic methods as considered in Elfelly et al.(2008),which could not provide a good estimation 
of clusters’number.2.3.Determination of the operating clusters using K-means algorithms (K-means and fuzzy K-means)After the selection of the appropriate number of clusters,the next step consists in splitting up the measurements collected on the system into clusters in order to generate the different operating domains from which the base-models will be identified.For this,K-means and fuzzy K-means clustering algorithms have been chosen according to their easy working out and their efficiency.2.3.1.K-means algorithmK-means (Forgy,1965;MacQueen,1967)is one of the simplest unsupervised learning algorithms that solve the well known clustering problem.The procedure follows a simple and easy way to classify a given data set through a certain number of clusters fixed a priori.It can be shown that this algorithm aims at minimizing an objective function,in this case a squared error function,given by J ¼X K j ¼1X N i ¼1u ij J x i Àc j J 2,ð6Þwhere:J ÃJ :a chosen distance measure between a data point and acluster center;x i :i th data point;c j :center vector of the cluster j ;u ij :degree of membership of x i to cluster j such as u ij A f 0,1g andu ij ¼1if x i belongs to the cluster j ,0otherwise ,(ð7ÞN :number of data points;K :number of clusters ð2r K o N Þ.The algorithm is composed of the following steps.1.Locate K points into the space represented by the objects that are being clustered.These points represent initial group centroids.2.Assign each object to the group that has the closest centroid.3.When all objects have been assigned,recalculate the positions of the K centroids.4.Repeat steps 2and 3until the centroids no longer move.This produces a separation of the objects into groups from which the metric to be minimized can becalculated.petitive learning neural network.N.Elfelly et al./Engineering Applications of Artificial Intelligence 23(2010)1064–10711066Although it can be proved that the procedure will always terminate,the K-means algorithm does not necessarilyfind the optimal configuration,corresponding to the global objective function minimum.The algorithm is also significantly sensitive to the initial randomly selected cluster centers.The K-means algorithm can be run multiple times to reduce this effect.2.3.2.Fuzzy K-means algorithmFuzzy K-means algorithm(developed by Dunn,1973and improved by Bezdek,1981)is a data clustering technique which allows each data point to belong to more than one cluster with different membership degrees(between0and1)and vague or fuzzy boundaries between clusters.The aim of this method is to find an optimal fuzzy K-partition and corresponding prototypes minimizing the following objective function:J m¼X Kj¼1X Ni¼1u mij J x iÀc j J2,1r m o1;ð8Þwhere:JÃJ:any norm expressing the similarity between any measured data and a cluster centre;x i:i th data point;c j:center vector of the cluster j;u ij:degree of membership of x i to cluster j such as u ij A½0,1 ,P Kj¼1u ij¼1and0oP Ni¼1u ij o N;m:weighting exponent(real number greater than1)which is a constant that influences the membership values;N:number of observations;K:number of clustersð2r K o NÞ.Fuzzy partitioning is carried out through an iterative optimization of the objective function shown above,with the update of membership u ij and the cluster centers c j(Nascimento et al., 2000).This procedure will stop when max i,j fj u ijðkþ1ÞÀu ijðkÞjg o e where e is a termination criterion belonging to[0,1]and k are the iteration steps.This procedure converges to a local minimum of J m.The algorithm is composed of the 
2.3.2. Fuzzy K-means algorithm

The fuzzy K-means algorithm (developed by Dunn, 1973 and improved by Bezdek, 1981) is a data clustering technique which allows each data point to belong to more than one cluster, with different membership degrees (between 0 and 1) and vague or fuzzy boundaries between clusters. The aim of this method is to find an optimal fuzzy K-partition and the corresponding prototypes minimizing the following objective function:

J_m = Σ_{j=1}^{K} Σ_{i=1}^{N} u_{ij}^m ||x_i − c_j||², 1 ≤ m < ∞,    (8)

where:
||·||: any norm expressing the similarity between a measured data point and a cluster center;
x_i: i-th data point;
c_j: center vector of cluster j;
u_{ij}: degree of membership of x_i to cluster j, with u_{ij} ∈ [0, 1], Σ_{j=1}^{K} u_{ij} = 1 and 0 < Σ_{i=1}^{N} u_{ij} < N;
m: weighting exponent (real number greater than 1), a constant that influences the membership values;
N: number of observations;
K: number of clusters (2 ≤ K < N).

Fuzzy partitioning is carried out through an iterative optimization of the objective function above, with updates of the memberships u_{ij} and of the cluster centers c_j (Nascimento et al., 2000). The procedure stops when max_{i,j} |u_{ij}(k+1) − u_{ij}(k)| < ε, where ε is a termination criterion belonging to [0, 1] and k is the iteration index; it converges to a local minimum of J_m. The algorithm is composed of the following steps.

1. Initialize the membership matrix U = [u_{ij}] as U(0).

2. At step k, calculate the center vectors C(k) = [c_j]:

c_j = Σ_{i=1}^{N} u_{ij}^m x_i / Σ_{i=1}^{N} u_{ij}^m.    (9)

3. Update U(k) to U(k+1):

u_{ij} = [ Σ_{r=1}^{K} ( ||x_i − c_j|| / ||x_i − c_r|| )^{2/(m−1)} ]^{−1}.    (10)

4. If ||U(k+1) − U(k)|| < ε then STOP; otherwise return to step 2.

In our study, the final clustering result is obtained by considering that a given data point belongs to the cluster for which it has the greatest membership degree.
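Again purely as an illustration of updates (9) and (10) (not the authors' code; the random initialisation, the exponent m = 2 and the small guard against division by zero are assumptions), a NumPy sketch is:

```python
import numpy as np

def fuzzy_kmeans(X, K, m=2.0, eps=1e-5, max_iter=200, seed=0):
    """Fuzzy K-means following updates (9) and (10)."""
    rng = np.random.default_rng(seed)
    N = len(X)
    U = rng.random((N, K))
    U /= U.sum(axis=1, keepdims=True)        # memberships sum to 1 per point
    for _ in range(max_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]            # Eq. (9)
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)       # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)              # Eq. (10)
        if np.abs(U_new - U).max() < eps:    # termination criterion
            U = U_new
            break
        U = U_new
    labels = U.argmax(axis=1)  # hard assignment by greatest membership degree
    return labels, centers, U
```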
2.3.3. Hypothesis

The implementation of the two clustering algorithms described above allows the assignment of the considered data to a certain number of clusters. We assume, in the remainder, that the choice of the appropriate clustering algorithm depends on the way in which the data are dispersed in the observation space. In fact, referring to the clustering results obtained by applying the RPCL algorithm, if the clusters overlap without marked boundaries, the fuzzy K-means algorithm is preferred; otherwise, the K-means algorithm is carried out. In this paper, we try to validate this hypothesis through a comparison between the results given by the two clustering algorithms applied to the modeling of two different systems.

3. Multimodel parameters estimation

3.1. Model orders and parameters estimation

The application of the appropriate clustering algorithm (K-means or fuzzy K-means) results in a repartition of the data set. Each cluster is represented by a set of input/output measurements which are exploited for the identification of the corresponding model in the base. For this, we first estimate the model orders by using the so-called instrumental determinants' ratio test. This method is mainly based on conditions concerning a matrix called the "information matrix", which contains the input/output measurements (Abdennour et al., 2001). This matrix is described as follows:

Q_m = (1/N_H) Σ_{k=1}^{N_H} [u(k), u(k+1), ..., u(k−m+1), u(k+m)]ᵀ [y(k+1), u(k+1), ···, y(k+m), u(k+m)],    (11)

where N_H is the number of observations. The instrumental determinants' ratio (RDI) is given by

RDI(m) = det(Q_m) / det(Q_{m+1}).    (12)

For every value of m, the order-determination procedure consists in building the matrices Q_m and Q_{m+1} and evaluating the ratio RDI(m); the retained order m is the value for which the ratio RDI(m) increases sharply for the first time.

Given the order of the model, the parametric identification issue consists in calculating the values of the parameters of the corresponding model equation from experimental measurements which describe the dynamic behavior of the model. For this, the Recursive Least-Squares (RLS) method (Abdennour et al., 2001) is applied to estimate the parameters.

3.2. Validity computation

The steps described in the previous paragraphs allow the construction of the model-base. The purpose is now to test the efficiency and the precision of the proposed modeling structure. For this, a validation step is worked out in which inputs different from those used for clustering are fed into the system. The multimodel output is then computed and compared to the real output of the process. The multimodel output y_mm is obtained through a fusion of the model outputs y_i weighted by their respective validity indexes v_i, as illustrated by system (13) and Fig. 3 (K is the number of models in the base):

y_mm(k) = Σ_{i=1}^{K} y_i(k) v_i(k),    (13a)
Σ_{i=1}^{K} v_i(k) = 1.    (13b)

3.2.1. Computation of validity indexes

The validity is a real number belonging to the interval [0, 1]. It represents the relevance degree of each model, computed at each instant. In the literature, several methods have been proposed to deal with the validity issue. In our study, the residues' approach was adopted for the calculation of validities. This method is based on a distance measurement between the process and the considered model. For example, the residue can be given by the following expression:

r_i = |y − y_i|, i = 1, ..., K,    (14)

where:
K: number of models in the base;
y: process output;
y_i: output of model M_i.

If the residue is equal to zero, the corresponding model M_i represents the process perfectly at that time. On the contrary, a non-zero value means that the model M_i represents the process only partially. The normalized residues are given by

r'_i = r_i / Σ_{j=1}^{K} r_j.    (15)

Within the context of the residues' approach, several methods have been proposed for the calculation of validities (Delmotte et al., 1996; Borne et al., 1998; Leith and Leithead, 1999). Only two methods are considered here: the simple and the reinforced validities. In general, the expression of the validities is given by

v_i = 1 − r'_i.    (16)

Using the previous expression, the simple and the reinforced validities are defined as follows.

Simple validities: the normalized simple validities are defined so that their sum equals 1 at each time:

v_i^simp = v_i / (K − 1).    (17)

Reinforced validities: for this type of validities, a reinforcement expression is introduced as

v'_i^renf = v_i ∏_{j=1, j≠i}^{K} (1 − v_j).    (18)

The normalized reinforced validities can then be written as

v_i^renf = v'_i^renf / Σ_{j=1}^{K} v'_j^renf.    (19)

Concerning the selection of the appropriate method of validity computation (simple or reinforced), a comparative study between the two methods was carried out in Elfelly et al. (2008) for different systems. This study leads to the conclusion that the selection of the suitable method depends on the clustering results, i.e. the cluster structure and repartition. In fact, when there are important variations within the same cluster and when clusters overlap, it is preferable to use the simple validities' method, since it takes account of the different model outputs through expression (17); in this case, no single model can represent the process ideally at any time. But when the clusters present very few variations and are well separated, the reinforced validities' method is better adapted: thanks to the reinforcement expression (19), it promotes the contribution of the most dominant model, which best represents the process behavior.

3.2.2. Validation of the proposed modeling scheme

Once the appropriate method of validity computation is selected, the validation of the global modeling scheme is carried out through a comparison between the real and the multimodel outputs for different input sequences.
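A minimal sketch of the residue-based fusion at a single instant is given below, assuming the base-model outputs and the measured process output are already available; the function name, the eps guard and the example values are illustrative only.

```python
import numpy as np

def multimodel_output(y_models, y_process, mode="simple", eps=1e-12):
    """Fuse base-model outputs with residue-based validities.

    y_models: (K,) model outputs at the current instant.
    y_process: measured process output at the same instant.
    Returns the fused output (13a) and the validity indexes.
    """
    r = np.abs(y_process - y_models)            # residues, Eq. (14)
    r_norm = r / max(r.sum(), eps)              # normalized residues, Eq. (15)
    v = 1.0 - r_norm                            # validities, Eq. (16)
    K = len(y_models)
    if mode == "simple":
        val = v / (K - 1)                       # simple validities, Eq. (17)
    else:
        prod = np.array([np.prod(np.delete(1.0 - v, i)) for i in range(K)])
        v_renf = v * prod                       # reinforcement, Eq. (18)
        val = v_renf / max(v_renf.sum(), eps)   # normalization, Eq. (19)
    return float(val @ y_models), val           # fused output, Eq. (13a)

# Example: three base-models, the second one closest to the process output.
print(multimodel_output(np.array([1.2, 0.9, 1.8]), 1.0, mode="reinforced"))
```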
4. Simulation examples

In order to underline the interest and the performance of the proposed modeling approach, some simulation examples have been considered.

4.1. Example 1: second-order discrete system with time-dependent parameters

This first example is a complex discrete system whose evolution is described by the following equation:

y(k) = −a_1(k) y(k−1) − a_2(k) y(k−2) + b_1(k) u(k−1) + b_2(k) u(k−2).    (20)

The variation laws of the different parameters of the process are given by

a_1(k) = 0.04 sin(0.035k) − 0.8,    (21a)
a_2(k) = 0.005 sin(0.03k) + 0.1,    (21b)
b_1(k) = 0.02 sin(0.03k) + 0.5,    (21c)
b_2(k) = 0.01 sin(0.035k) + 0.2.    (21d)

First, the system is excited by a uniform random signal u(k). Then the measurements y(k) and y(k−1) are collected at different instants. These numerical data are used for the determination of the appropriate number of operating clusters by means of a neural network and the RPCL algorithm. Fig. 4 gives the results of this learning: with five neurons in the output layer, two centers move away from the observation space, which allows us to conclude that the adequate number of clusters is equal to three.

Fig. 3. Fusion principle.
Fig. 4. Determination of the number of clusters (RPCL: K = 5); axes y(k−1) versus y(k).

Then, the two clustering algorithms (K-means and fuzzy K-means) are carried out in order to select the different operating clusters. The difference between these two methods is not clearly visible from the clustering results. The results obtained by application of the K-means algorithm are given in Fig. 5.
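A possible simulation of system (20)–(21) is sketched below; the amplitude range of the uniform excitation, the number of samples and the zero initial conditions are assumptions not stated in the text.

```python
import numpy as np

def simulate_example1(N=2000, seed=0):
    """Simulate the time-varying system (20)-(21) under a uniform random input."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(0.0, 1.0, N)     # excitation signal (range chosen arbitrarily)
    y = np.zeros(N)
    for k in range(2, N):
        a1 = 0.04 * np.sin(0.035 * k) - 0.8          # Eq. (21a)
        a2 = 0.005 * np.sin(0.03 * k) + 0.1          # Eq. (21b)
        b1 = 0.02 * np.sin(0.03 * k) + 0.5           # Eq. (21c)
        b2 = 0.01 * np.sin(0.035 * k) + 0.2          # Eq. (21d)
        y[k] = -a1 * y[k-1] - a2 * y[k-2] + b1 * u[k-1] + b2 * u[k-2]  # Eq. (20)
    # Observation vectors (y(k-1), y(k)) used for RPCL and clustering.
    data = np.column_stack([y[1:-1], y[2:]])
    return u, y, data
```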
Determining the parametric structure of models

D.J. Cole a, B.J.T. Morgan a, D.M. Titterington b
a School of Mathematics, Statistics and Actuarial Science, University of Kent, Canterbury CT2 7NF, UK
b Department of Statistics, University of Glasgow, Glasgow G12 8QQ, UK

Article history: Received 31 March 2010; Received in revised form 9 August 2010; Accepted 21 August 2010; Available online 25 August 2010.

Keywords: Derivative matrix; Exhaustive summaries; Global identifiability; Jacobian matrix; Local identifiability; Parameter redundancy.

Abstract

In this paper we develop a comprehensive approach to determining the parametric structure of models. This involves considering whether a model is parameter redundant or not and investigating model identifiability. The approach adopted makes use of exhaustive summaries, quantities that uniquely define the model. We review and generalise previous work on evaluating the symbolic rank of an appropriate derivative matrix to detect parameter redundancy, and then develop further tools for use within this framework, based on a matrix decomposition. Complex models, where the symbolic rank is difficult to calculate, may be simplified structurally using reparameterisation and by finding a reduced-form exhaustive summary. The approach of the paper is illustrated using examples from ecology, compartment modelling and Bayes networks. This work is topical as models in the biosciences and elsewhere are becoming increasingly complex.

1. Introduction

1.1. Aims and outline of the paper

Statistical inference may fail due to an inability to estimate, or estimate well, all of the parameters of a model. This may be because of a lack of data. However, for some models it will never be possible to estimate all of the parameters by methods of classical inference. For example, this can occur if two parameters are confounded and only ever appear as a product. In such instances a model is termed non-identifiable or parameter redundant.

For linear models we can use a constant design matrix to determine parameter redundancy and impose appropriate constraints. For non-linear models, the subject of this paper, it can often be difficult to determine if a model is parameter redundant. One approach is to use a symbolic algebra computer package, which involves forming a suitable derivative matrix and then calculating its symbolic rank. The idea of determining parameter redundancy through the rank of a derivative matrix, which corresponds to the use of the design matrix for linear models, appears in [1–5] and others. However, the early work predated symbolic algebra packages. A more detailed discussion of previous work is given in Section 1.3.

The early papers differed in what was differentiated with respect to the parameters to form the derivative matrix. Here we provide a unifying framework by first defining the quantity that is differentiated as an exhaustive summary. This is a term borrowed from compartment modelling [6], and is a vector of parameter combinations that uniquely defines the structure of the model. A full discussion of exhaustive summaries and how they can be used to determine the parametric structure of models is given in Section 2. The symbolic approach is explained in Section 2.1. However, for structurally complex models, computer algebra packages may not be able to calculate the symbolic rank of the derivative matrices, due to computer memory limitations. In such cases, numerical methods have been used. To overcome this difficulty we need to use an exhaustive summary that is structurally simple.
Here we present a collection of tools that can be used to create such exhaustive summaries. The most powerful tool is reparameterisation of the model in order to find a new, simpler exhaustive summary based on the reparameterisation, and this is the subject of Section 3. We also employ the extension theorems of [5,7], which allow conclusions regarding specific cases of a model to be extended to more general forms of the same model structure (Section 2.2), although this is not always straightforward.

We present the results of the paper through a formal presentation of theorems, remarks and examples. The tools of this paper provide a framework that can be used to determine the parametric structure of models. If a model is parameter redundant, exhaustive summaries can be used to determine what combinations of parameters can be estimated. If the model is not parameter redundant it is termed full rank, and exhaustive summaries can be used to show if the model is always full rank, if there are points in the parameter space where the model is parameter redundant (Section 2.3), or if a model is only full rank in a region of the parameter space (Section 3.1). Several simple examples are given alongside the development of the theory, and extended examples are presented in Section 4. Note that the first 3 examples are simple examples where existing theory is sufficient to examine whether the model is parameter redundant; these examples are used to illustrate clearly the existing and new methodology. The later examples show how the approach of the paper provides a general method that may be used in many different areas, and is often a viable alternative to developing a specific method for a particular class of problems. Implications for both classical and Bayesian inference are discussed in Section 5. The Maple code for the examples can be found at /ims/personal/djc24/maplecode.htm.

Partly as a result of fast computers, we are seeing an increase in the complexity of models being used throughout the biosciences.
There is therefore a current need for the developments in this paper. For instance, [8,9] provide examples from capture–recapture and latent-class modelling where the methods of this paper provide definitive answers to essential questions of parameter redundancy which were not known previously and were only examined through time-consuming numerical investigations.

1.2. Definitions of identifiability and parameter redundancy

A model is not identifiable if different sets of parameter values result in the same model [10]. More formally, let M(θ) be the function that defines a model, which has unknown parameters θ ∈ Ω, where Ω is a dim(θ)-dimensional vector space and dim(θ) denotes the number of terms in a general vector θ. For example, M(θ) could be a suitable probability distribution.

Definition 1. A model is globally identifiable if M(θ₁) = M(θ₂) implies that θ₁ = θ₂. A model is locally identifiable if there exists an open neighbourhood of any θ such that this is true. Otherwise a model is non-identifiable.

Non-identifiability occurs if a model has too many parameters, which is termed parameter redundancy. A parameter redundant model can be written in terms of a smaller set of parameters [5].

Definition 2. A model is parameter redundant if we can write M(θ) as a function just of β, where β = f(θ) ∈ Ω_β, in which Ω_β has dimension dim(β) < dim(θ). Models which are not parameter redundant are described as full rank.

Definition 3. An essentially full rank model is full rank for all θ. A conditionally full rank model is full rank for some but not all θ. This distinction depends upon the specification of the parameter space Ω.

1.3. Previous use of symbolic methods

1.3.1. Symbolic rank of a derivative matrix

Models where M(θ) is an exponential-family probability density function were considered in [5], who showed that whether or not a model defined by M(θ) is parameter redundant can be determined by checking the symbolic rank of the derivative matrix D = [∂μ_k/∂θ_i], where μ_k is the expectation of the k-th observation y_k and θ_i is the i-th model parameter. If the symbolic rank of D is equal to the number of parameters p, the number of rows of D, then the model is full rank. If the symbolic rank of D is less than p, the model is parameter redundant and not identifiable. The symbolic rank of D can in principle be obtained by using a symbolic algebra computer package such as Maple; see [11]. In earlier work on exponential-family models, [12] adopted the equivalent criterion of differentiating the canonical parameter. We shall use derivative-matrix terminology, whereas others sometimes refer to such matrices as Jacobians. We use the shorthand notation x̄ to refer to 1 − x.

Example 1 – The Cormack–Jolly–Seber model for capture–recapture data.

The derivative matrix method can be illustrated using the Cormack–Jolly–Seber (CJS) model [13–15]. The CJS model is used for capture–recapture data, where animals are marked in a given year and then recaptured in a subsequent year. Table 1 shows the form of the data collected from a capture–recapture experiment, where N_i animals are marked in year i and m_{i,j} denotes the number of individuals released in year i and next recaptured in year j, for r years of marking and c years of recovery. Note that m_{i,j} = 0 for j < i, as it is not possible to recapture an animal before it is marked; see for example Section 2 of [16]. In the model examined here all model parameters are time-dependent.
The probability that an animal marked in year i is first seen in year j is

π_{i,j} = (∏_{k=i}^{j} φ_k)(∏_{k=i+1}^{j} p̄_k) p_{j+1}, for 1 ≤ i ≤ r, i ≤ j ≤ c,

where 0 ≤ φ_k ≤ 1 is the probability that an individual alive at time k survives until year k+1, for k = 1, ..., c−1, and 0 ≤ p_k ≤ 1 is the probability of recapture in year k, for k = 2, ..., c. The probability that an animal is never seen again is 1 − Σ_{j=i}^{c} π_{i,j}, for 1 ≤ i ≤ r. The mean μ of this product-multinomial model is then made up of terms of the form N_i π_{i,j} and N_i (1 − Σ_{j=i}^{c} π_{i,j}).

Consider the model with 3 years of marking and 3 years of recapture (r = 3, c = 4); the mean is then

μ = [N₁φ₁p₂, N₁φ₁φ₂p̄₂p₃, N₁φ₁φ₂φ₃p̄₂p̄₃p₄, N₁(1 − φ₁p₂ − φ₁φ₂p̄₂p₃ − φ₁φ₂φ₃p̄₂p̄₃p₄), N₂φ₂p₃, N₂φ₂φ₃p̄₃p₄, N₂(1 − φ₂p₃ − φ₂φ₃p̄₃p₄), N₃φ₃p₄, N₃(1 − φ₃p₄)]ᵀ,    (1)

and the parameters are θ = [φ₁, φ₂, φ₃, p₂, p₃, p₄]. As a result of differentiating the elements of μ with respect to the elements of θ, a derivative matrix is formed as given in Table 2. It is also shown in [5] that it is sufficient to consider a derivative matrix formed from differentiating the non-zero π_{i,j} or ln(π_{i,j}) with respect to θ. Derivative matrices can be simplified further by multiplying by any elementary matrix, as this does not change the rank of the derivative matrix. For example, [7] suggest the scaled derivative matrix diag(θ)D. Some of these alternative derivative matrices are also given in Table 2. All of the derivative matrices can be seen to have symbolic rank 5, but they differ in complexity. The conclusion is that, as there are six parameters, the model is parameter redundant. Here this is obvious because the parameters φ₃ and p₄ only ever appear as the product φ₃p₄, and are therefore confounded.

Table 1. Form of capture–recapture data modeled by the CJS model.
Year of marking | Number released | Year of recapture 2, 3, ..., c
1 | N₁ | m_{1,1}, m_{1,2}, ..., m_{1,c}
2 | N₂ | m_{2,2}, ..., m_{2,c}
⋮ | ⋮ | ⋮
r | N_r | m_{r,c}

The idea of using a derivative matrix and its rank to test for the identifiability of a model, rather than parameter redundancy, was considered for a general econometric model by Rothenberg [1]. He showed that if the expected information matrix, I = −E[(∂² log f)/(∂θ_i ∂θ_j)], for probability density function f, is non-singular then the model is locally identifiable. In [5,12] this test is shown to be equivalent to finding the rank of the derivative matrix for exponential-family models. Ref. [1] also shows how 'reduced-form' parameters can be used to determine identifiability. Reduced-form parameters, h_i(θ), are functions of the original parameters and, if the model is rewritten in terms of the h_i, then all the h_i are identifiable. If the derivative matrix [∂h_i/∂θ_j] has rank p, then θ was shown to be locally identifiable. Further, if the functions h_i are linear, then conditions are given for when θ is globally identifiable.

Other users of derivative matrices and their ranks to determine model parametric structure include [2] for latent-class models, [4] for non-linear regression models, and a range of authors within the area of compartment modelling and dynamic systems, which are considered below. For any linear compartment model, M(θ) is the output function

y(t; θ) = C x(t; θ), with ∂x(t; θ)/∂t = A(θ) x(t; θ) + B(θ) u(t),

for suitable matrices A, B and C and input function u. The transfer-function approach to determining parametric structure in compartment models was introduced by Bellman and Åström [17]. This involves taking Laplace transforms to form a transfer function Q(s) = ỹ(s)/ũ(s), where ỹ(s) is the Laplace transform of y(t), etc. The numerator and denominator of Q(s) are both polynomials in s, and the non-constant coefficients of the powers of s are set equal to constants κ_i; these equations are called the moment invariants.
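The paper's own symbolic computations are done in Maple; purely as an illustration, the rank check for this CJS example can be reproduced with the open-source SymPy library. Following the remark above that it is sufficient to differentiate the non-zero π_{i,j}, the sketch below uses those six terms as the exhaustive summary; the variable names are of course not from the paper, and the rank is unaffected by the fact that SymPy's Jacobian is the transpose of the derivative matrix as laid out in the paper.

```python
import sympy as sp

# Parameters of the CJS model with r = 3, c = 4 (Example 1).
phi1, phi2, phi3, p2, p3, p4 = sp.symbols('phi1 phi2 phi3 p2 p3 p4', positive=True)
theta = [phi1, phi2, phi3, p2, p3, p4]

# Exhaustive summary: the non-zero recapture probabilities pi_{i,j}.
kappa = sp.Matrix([
    phi1*p2,
    phi1*phi2*(1 - p2)*p3,
    phi1*phi2*phi3*(1 - p2)*(1 - p3)*p4,
    phi2*p3,
    phi2*phi3*(1 - p3)*p4,
    phi3*p4,
])

D = kappa.jacobian(theta)   # entries d kappa_j / d theta_i
print(D.rank())             # symbolic rank 5, so deficiency is 6 - 5 = 1
```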
This results in a set of equations that can be solved to find θ in terms of κ. If there is only one solution then the model is globally identifiable; if there is a countable number of solutions, the model is locally identifiable; if there is an infinite number of solutions, the model is not identifiable [18]. In [18] it is proved that, if the rank of the derivative matrix formed from differentiating the κ_i with respect to the parameters is equal to the number of parameters, then the system is at least locally identifiable. There are several other methods that are used to determine if a compartment model is identifiable or not. For example, [19] present a Markov parameter matrix approach involving a derivative matrix and a rank test. Another method for determining identifiability in compartment models is the Taylor series approach of [20], which is also applicable to non-linear compartment models, and for which the rank Jacobian test was introduced in [3]. There is also a 'similarity transform' approach [6,21], extended to non-linear compartment models in [22]. The use of a Jacobian matrix and a rank test for this similarity transform approach was developed in [23].

Example 2: Simple linear compartment model.

We consider a simple compartment model, first considered in [17] and used subsequently by many other authors. In this compartment model,

dx₁/dt = −(θ₁ + θ₂)x₁ + θ₃x₂ + u, dx₂/dt = θ₂x₁ − (θ₃ + θ₄)x₂, and y = x₁,

with x(0) = 0, u(0) = 1 and u(t) = 0 otherwise. The model parameters are θ = [θ₁, θ₂, θ₃, θ₄], which are here all assumed to be positive for simplicity. This model has a zero initial state; this initial condition can be relaxed, but that is not considered here, again for simplicity. The transfer function is

Q(s) = (s + θ₃ + θ₄) / (s² + s(θ₁ + θ₂ + θ₃ + θ₄) + θ₁θ₃ + θ₁θ₄ + θ₂θ₄)

with non-constant coefficients

κ₁ = θ₃ + θ₄, κ₂ = θ₁ + θ₂ + θ₃ + θ₄, κ₃ = θ₁θ₃ + θ₁θ₄ + θ₂θ₄.    (2)

As there are three equations and four unknowns, there are obviously an infinite number of solutions to Eq. (2).

Table 2. Various derivative matrices for the CJS model, all of rank 5. κ is what is differentiated with respect to the parameters to form the derivative matrix; the alternatives shown are κ = μ, κ = (π_{ij}), κ = (ln π_{ij}), and the scaled derivative matrix diag(θ)D suggested in [7].

We can check this result formally by forming the rank of appropriate derivative matrices. One derivative matrix can be formed by differentiating Eq. (2) with respect to the parameters. This derivative matrix is given in Table 3.
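Continuing the illustrative SymPy sketch (the paper itself uses Maple), the rank of the derivative matrix of the transfer-function coefficients (2) can be checked as follows.

```python
import sympy as sp

theta1, theta2, theta3, theta4 = sp.symbols('theta1:5', positive=True)
theta = [theta1, theta2, theta3, theta4]

# Transfer-function coefficients of Example 2, Eq. (2).
kappa = sp.Matrix([
    theta3 + theta4,
    theta1 + theta2 + theta3 + theta4,
    theta1*theta3 + theta1*theta4 + theta2*theta4,
])

D = kappa.jacobian(theta)
print(D.rank())   # 3 < 4 parameters, so the model is parameter redundant
```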
An alternative derivative matrix can be formed from a Taylor series expansion of y(t), the first 3 non-constant coefficients of which are

y⁽²⁾(0) = −θ₁ − θ₂,
y⁽³⁾(0) = (−θ₁ − θ₂)² + θ₃θ₂,
y⁽⁴⁾(0) = (−θ₁ − θ₂){(−θ₁ − θ₂)² + θ₃θ₂} + θ₃θ₂(−θ₁ − θ₂ − θ₃ − θ₄).    (3)

The Taylor series expansion is infinite. However, for a linear system of compartment models with n compartments, only the first (2n − 1) derivatives y⁽ᵏ⁾(0) (k ≥ 2) are required [24,25]. As in this example there are two compartments, we only need the y⁽ᵏ⁾(0) given by Eq. (3). The resulting derivative matrix is also given in Table 3, and again has rank 3. In this instance the Markov parameter matrix approach [19] results in the same derivative matrix as the Taylor series approach.

Mention of [24] prompts us to acknowledge substantial related work in the literature of dynamic systems; here we allude to just a sample of the more closely related papers. Thowsen [24] attributes the link between structural identifiability and the rank of the Taylor-series-related derivative matrix to pp. 163–164 of [26]. Delforge [27] considers linear time-invariant systems like compartment models, with the square matrix A unknown and B and C known, as is the case with Example 2. It is assumed that the eigenvalues of A are distinct and 'known from experiment', so identifiability of A is equivalent to that of the (non-singular) matrix of eigenvectors; an approach based on a Jacobian matrix is developed. For the same sort of model, [28] compares four methods, all using matrices whose rank properties are the same as those of Jacobian matrices but which 'make the calculation of the determinant easier'. Again B and C are assumed known. Approach 1 is the Laplace transform approach of [17]; Approach 2 is the Markov matrix approach of [19], which is noted to be equivalent to the Taylor series approach; Approach 3 is the approach in [27]; and Approach 4 is that developed in [29], which we have already cited. It is shown in [28] that the approaches are in theory equivalent, and the paper investigates relationships between the determinants, which are zero in identical conditions but whose calculation can vary considerably in complexity.

The above paragraph restricts attention to linear systems. Wynn and Parkin [30] consider continuous-time non-linear dynamic models and define sensitivities as derivatives of states (x) or outputs (y) with respect to parameters; sensitivity matrices play the part of our derivative matrices. They then assume that the outputs follow a non-linear regression model on t, and relate identifiability to non-singularity of the information matrix. They calculate the sensitivity matrix of the set of derivatives of the output, of orders up to a particular level, and they relate investigation of its non-singularity to the Taylor series method for determining system identifiability. Donzé et al. [31] consider solutions of sets of parametric non-linear differential equations (ẋ = f(x, θ, t)) and define the sensitivity matrix as the matrix of first derivatives of elements of x with respect to elements of θ. The sensitivity matrix consequently satisfies a system of ordinary differential equations, as it did in [30]. In some dynamic systems problems, the direct evaluation of the derivatives that represent sensitivities turns out to be computationally expensive, because of the temporal dependence implied by the model. However, the duality-based adjoint method can obviate this difficulty dramatically; see [32] and especially Section 5 of [33] for an introduction to the adjoint method. In spite of what is a large dynamic-systems literature, we found no discussion of reparameterisation or the other methodological aspects we consider in the rest of this paper.

1.3.2. Determining estimable parameters for parameter redundant models
The rank of a derivative matrix gives more information than just whether or not a model is parameter redundant. The rank is equal to the number of independent parameter combinations that can be estimated; this number is referred to as the number of estimable parameters. The derivative matrix can be used to determine exactly what is estimable, using results from [34]. If a model is parameter redundant then the rank of D is equal to the number of estimable parameters and the model is said to have deficiency d = p − rank(D). It is possible to tell which, if any, of the original parameters are estimable by solving α(θ)ᵀD(θ) = 0. In this case there are d solutions to α(θ)ᵀD(θ) = 0, labelled α_j(θ), for j = 1, ..., d, with individual entries α_{ij}(θ). Any α_{ij}(θ) which are zero for all j correspond to a parameter which is estimable [34]. In order to find other parameter combinations which are also estimable, we need to solve the system of linear first-order partial differential equations Σ_{i=1}^{p} α_{ij} ∂f/∂θ_i = 0, j = 1, ..., d [34]. Also known as Lagrange equations, these are familiar from the analysis of linear stochastic models; see for example p. 158 of [35]. A similar method for compartment models has also been developed in [23,36].

Example 1 – CJS model continued.

Recall that in this case the rank of the derivative matrix was 5 and that there were 6 parameters in the model. Therefore there are 5 estimable parameters and the model has deficiency 1. The single solution, up to an arbitrary scaling, of αᵀD = 0 is αᵀ = [0, 0, −φ₃/p₄, 0, 0, 1]. From the positions of the zeros relative to the order of differentiation in the derivative matrix, we can see that φ₁, φ₂, p₂ and p₃ are estimable, but φ₃ and p₄ are not. The remaining estimable term then results from solving the partial differential equation

−(∂f/∂φ₃)(φ₃/p₄) + ∂f/∂p₄ = 0,

the solution of which shows we can estimate φ₃p₄, as already observed.

For recent research in stochastic models for carcinogenesis see [37,38].

Table 3. Derivative matrices for the simple compartment model. κ is what is differentiated with respect to the parameters to form the derivative matrix. In both cases the derivative matrix is of rank 3.
From Eq. (2): D = [[0, 1, θ₃+θ₄], [0, 1, θ₄], [1, 1, θ₁], [1, 1, θ₁+θ₂]].
From Eq. (3): D = [[−1, 2θ₁+2θ₂, −3θ₁²−6θ₁θ₂−3θ₂²−2θ₂θ₃], [−1, 2θ₁+2θ₂+θ₃, −3θ₁²−6θ₁θ₂−3θ₂²−2θ₁θ₃−4θ₂θ₃−θ₃²−θ₃θ₄], [0, θ₂, −2θ₁θ₂−2θ₂²−2θ₂θ₃−θ₂θ₄], [0, 0, −θ₂θ₃]].
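As an illustrative SymPy sketch (not the authors' Maple code), the vector α(θ) satisfying αᵀD = 0 for Example 2 can be found from a null space computation. Since SymPy's Jacobian stores ∂κ_k/∂θ_i with the summary terms indexing the rows, i.e. the transpose of the derivative matrix as laid out in the paper, the condition αᵀD = 0 becomes Jα = 0 here.

```python
import sympy as sp

theta1, theta2, theta3, theta4 = sp.symbols('theta1:5', positive=True)
theta = [theta1, theta2, theta3, theta4]
kappa = sp.Matrix([theta3 + theta4,
                   theta1 + theta2 + theta3 + theta4,
                   theta1*theta3 + theta1*theta4 + theta2*theta4])

J = kappa.jacobian(theta)        # transpose of the paper's D
alpha = J.nullspace()[0]         # single solution since the deficiency is 1
alpha = (alpha / alpha[3]).applyfunc(sp.simplify)   # scale so the last entry is 1
print(alpha.T)   # proportional to [-theta2/theta3, theta2/theta3, -1, 1]
```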
What is differentiated is the key to determining the parametric structure of models.In order to provide a general framework we call this quantity an exhaustive summary.Exhaustive summaries for compartment models are defined in[6,21]and this can be ex-tended to any parametric model.An exhaustive summary is a set of parameter combinations that uniquely defines the model,and a formal definition is given below,adapted from[6].Definition4.A parameter vector j(h)is an exhaustive summary if knowledge of j(h)uniquely determines M(h).We have already seen several examples of exhaustive summa-ries.In Example1three alternative exhaustive summaries are used.Thefirst is j1(h)=l,where l is given by Eq.(1),and the other two are j2ðhÞ¼½/1p2;/1/2 p2p3,/1/2/3 p2 p3p4;/2p3;/2/3 p3p4; /3p4T and j3(h)=ln{j2(h)}.In Example2two different exhaustive summaries are used,given by Eqs.(4)and(5),j1ðhÞ¼½h3þh4;h1þh2þh3þh4;h1h3þh1h4þh2h4 T;ð4Þj2ðhÞ¼Àðh1þh2Þ;ðh1þh2Þ2þh3h2;Àðh1þh2Þ3hÀh3h2ð2h1þ2h2þh3þh4Þ;...i T:ð5ÞExhaustive summaries are useful because they specify a partic-ular aspect of the model,M(h),which can be used to determine model parametric structure.Theorem1.A model is globally(locally)identifiable if(there exists a region such that)j(h)=j(h0))h=h0.The proof of Theorem1follows from Definitions1and4.We show how an exhaustive summary can be used to determine para-metric structure in Section2.1.2.1.Detecting parameter redundancy using exhaustive summariesIn a similar way to[5,23],and other references mentioned above,the basis of determining parametric structure is to form an appropriate derivative matrix,D=[@j j/@h i],where j j is the j th element of the exhaustive summary,and h i is the i th of p parame-ters.The derivative matrix can be used to examine parameter redundancy or identifiability as laid out in the following theorem.Theorem2.Testing parameter redundancy.(a)(i)If D has rank equal to p then the model is full rank.(ii)If the rank of D is equal to q<p,then the model is parameter redundant.There are q estimable parameters and the modelhas deficiency d=pÀq.(b)If the model is parameter redundant the estimable parameterscan be determined by solving a(h)T D(h)=0,which has d solu-tions,labelled a j(h)for j=1,...,d,with individual entries a i j(h).Any a ij(h)which are zero for all d solutions correspond to a parameter,h i,which is estimable.The solutions of the system of linearfirst-order partial differential equations(PDEs),X p i¼1a ij@f@h i¼0;j¼1;...;r;ð6Þform the set of estimable parameters.Parameterised in terms of the estimable parameters,the model is full rank.Theorem2is a natural extension of similar theorems given in [5,34].Proofs follow the same lines as in those papers.Example1:CJS model continued.Theorem2has already been demonstrated for Example1in Section1.3,where it was shown that the CJS model is parameter redundant with deficiency1and that the set of estimable parame-ters is{/1,/2,p2,p3,/3p4}.The exhaustive summaries and their derivative matrices are given in Table2.The simplest exhaustive summary to use is j3(h)=ln{j2(h)}.Example2:simple linear compartment model continued.Part(b)of Theorem2has also already been demonstrated for Example2in Section1.3,where it was shown that the model was parameter redundant.The simplest exhaustive summary to use is that given by Eq.(4)and its derivative matrix is presented in Table3.As the rank of the derivative matrix was3and there were4parameters in this model the deficiency is1.We canfind exactly what is estimable by solving a T D=0.This gives a T=[Àh2/ 
Example 2: simple linear compartment model continued.

Part (b) of Theorem 2 has also already been demonstrated for Example 2 in Section 1.3, where it was shown that the model is parameter redundant. The simplest exhaustive summary to use is that given by Eq. (4), and its derivative matrix is presented in Table 3. As the rank of the derivative matrix was 3 and there were 4 parameters in this model, the deficiency is 1. We can find exactly what is estimable by solving αᵀD = 0. This gives αᵀ = [−θ₂/θ₃, θ₂/θ₃, −1, 1]. Solving the PDE

−(∂f/∂θ₁)(θ₂/θ₃) + (∂f/∂θ₂)(θ₂/θ₃) − ∂f/∂θ₃ + ∂f/∂θ₄ = 0

gives the estimable parameter combinations θ₁+θ₂, θ₂θ₃ and θ₃+θ₄. This can be readily appreciated from Eq. (2), where we can see that κ is a function of these three parameter combinations.

2.2. Extension theorem

The method of Theorem 2 requires calculating the symbolic rank of the derivative matrix. If the derivative matrix is large and/or contains complicated algebraic expressions, then it may not be possible to find the symbolic rank. One solution is to simplify the exhaustive summary, so that the derivative matrix is structurally simpler; we consider this in Section 3. Another method of simplifying the calculation is to consider the smallest structural version of a model and then extend the size of that model whilst maintaining the model structure. This idea is advanced in [5] for full rank product-multinomial models and extended in [7] to parameter redundant product-multinomial models. It is generalised in Theorem 3.

Consider a model whose parametric structure is being examined using exhaustive summary κ₁(θ₁), with parameters θ₁. The derivative matrix is D₁(θ₁) = [∂κ₁/∂θ₁]. This model is then extended, adding extra parameters θ₂, and the exhaustive summary is extended to κ(θ₀) = [κ₁(θ₁), κ₂(θ₀)], with θ₀ = [θ₁, θ₂]. The derivative matrix of the extended model is

D = [ D₁(θ₁)  D₂,₁ ; 0  D₂,₂ ],  with D₂,₁ = [∂κ₂/∂θ₁] and D₂,₂ = [∂κ₂/∂θ₂].

Theorem 3. If the original model is full rank (i.e. D₁ is full rank) and D₂,₂ is full rank, then the extended model is full rank also.

This approach can often be generalised to all models of the same type using induction. The proof of Theorem 3 follows from the fact that, as D₁ and D₂,₂ are full rank, D is also full rank, as shown in [5].

Remark 1. If the original model is not full rank, we first need to find a reparameterisation of the model that is full rank. Then Theorem 3 can be applied to the reparameterised model, so that the deficiency of the general model can be deduced. Extension of parameter redundant models is considered explicitly in [7].
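The kind of full-rank reparameterisation that Remark 1 asks for can be checked symbolically. As a sketch for Example 2 (my own illustration, not taken from the paper), writing β = (θ₁+θ₂, θ₂θ₃, θ₃+θ₄) gives, by direct algebra, κ₁ = β₃, κ₂ = β₁ + β₃ and κ₃ = β₁β₃ − β₂, and the exhaustive summary (4) expressed in β is full rank.

```python
import sympy as sp

beta1, beta2, beta3 = sp.symbols('beta1 beta2 beta3', positive=True)

# Exhaustive summary (4) of Example 2 rewritten in the estimable combinations
# beta = (theta1 + theta2, theta2*theta3, theta3 + theta4):
#   kappa1 = beta3, kappa2 = beta1 + beta3, kappa3 = beta1*beta3 - beta2.
kappa_beta = sp.Matrix([beta3, beta1 + beta3, beta1*beta3 - beta2])

print(kappa_beta.jacobian([beta1, beta2, beta3]).rank())   # 3: full rank in beta
```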