A Partial Join Approach for Mining Co-location Patterns
A comparative study of the effect of budigfu and budesonide formoterol in the treatment of chronic obstructive pulmonary disease in the stable stage
HE Liming (Department of Respiratory and Critical Care Medicine, General Hospital of Pingxiang Mining Group Co., Ltd., Pingxiang, Jiangxi 337000, China)
ABSTRACT Objective: To compare the clinical effects of budigfu and budesonide formoterol in the treatment of chronic obstructive pulmonary disease (COPD) in the stable stage.
Methods: Sixty patients with stable-stage COPD who received drug treatment between March 2021 and March 2022 were enrolled and divided by the random number table method into an observation group and a control group of 30 cases each.
The observation group received budigfu inhalation aerosol, 2 actuations per dose, twice a day, for 10 days.
The control group received budesonide formoterol inhalation powder, 1 inhalation per dose, twice a day, for 10 days.
Blood gas analysis indices, pulmonary function indices and St. George's Respiratory Questionnaire (SGRQ) scores before and after treatment, together with the occurrence of drug-related adverse reactions during treatment, were compared between the two groups.
Results: After treatment, blood pH and arterial partial pressure of oxygen (PaO2) were higher in the observation group than in the control group, and arterial partial pressure of carbon dioxide (PaCO2) was lower (P<0.05).
After treatment, FEV1% and FEV1/FVC were higher in the observation group than in the control group (P<0.05); the symptom score, daily activity score, mental state score and SGRQ total score were lower (P<0.05).
Two patients in each group experienced drug-related adverse reactions during treatment (P>0.05).
Conclusion: Both budigfu and budesonide formoterol show significant efficacy and good safety in the treatment of stable-stage COPD, but budigfu is more effective and can be used as the preferred drug.
KEY WORDS chronic obstructive pulmonary disease; budigfu; budesonide formoterol; blood gas analysis; pulmonary function
CLC number: R974. Document code: A. Article ID: 1006-1533(2023)24-0037-04
Cite this article: HE Liming. A comparative study of the effect of budigfu and budesonide formoterol in the treatment of chronic obstructive pulmonary disease in the stable stage [J]. Shanghai Medical & Pharmaceutical Journal (上海医药), 2023, 44(24): 37-40.
Corresponding author: HE Liming.
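Senior Three English: Continuous Exploration of Innovation in Academic Research Methods, 30 Single-Choice Questions
1. In academic research, a hypothesis is a ______ that is tested through experiments and observations.
A. prediction B. conclusion C. theory D. assumption
Answer: D.
This question tests the basic concept of a "hypothesis" in academic research.
Option A, "prediction", is usually an estimate of the future based on existing information; option B, "conclusion", is the final judgment reached after the research; option C, "theory", is a system formed through extensive research and verification; option D, "assumption", means a supposition, which best fits the meaning of "hypothesis": an idea put forward at the start of research that has not yet been fully verified.
2. The main purpose of conducting academic research is to ______ new knowledge and understanding.
A. discover B. create C. invent D. produce
Answer: A.
This question tests vocabulary related to the purpose of academic research.
Option A, "discover", emphasizes finding something that already exists but was not yet known; option B, "create", stresses making something new from nothing; option C, "invent", usually refers to devising new tools, equipment and the like; option D, "produce", means to make or generate and is rather broad.
Academic research is mainly about "discovering" new knowledge and understanding, so the answer is A.
3. A reliable academic research should be based on ______ data and methods.
A. accurate B. precise C. correct D. valid
Answer: D.
This question concerns the basis of reliable academic research.
Option A, "accurate", stresses being free of error and fully consistent with the facts; option B, "precise", stresses clarity and exactness of detail; option C, "correct", means right; option D, "valid", means well founded, stressing that the data and methods are sound and reliable.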
Combining classifiers of pesticides toxicity through a neuro-fuzzy approach

Emilio Benfenati¹, Paolo Mazzatorta¹, Daniel Neagu², and Giuseppina Gini²

¹ Istituto di Ricerche Farmacologiche "Mario Negri" Milano, Via Eritrea 62, 20157 Milano, Italy
{Benfenati, Mazzatorta}@marionegri.it
² Dipartimento di Elettronica e Informazione, Politecnico di Milano, Piazza L. da Vinci 32, 20133 Milano, Italy
Neagu@fusberta.elet.polimi.it, Gini@elet.polimi.it
http://airlab.elet.polimi.it/imagetox

Abstract. The increasing amount and complexity of data in toxicity prediction calls for new approaches based on hybrid intelligent methods for mining the data. This focus is required even more given the increasing number of different classifiers applied in toxicity prediction. Consequently, there is a need to develop tools that integrate the various approaches. The goal of this research is to apply neuro-fuzzy networks to improve the combination of the results of five classifiers applied to the toxicity of pesticides. Moreover, fuzzy rules extracted from the trained networks can be used to perform useful comparisons between the performances of the classifiers involved. Our results suggest that the neuro-fuzzy approach to combining classifiers has the potential to significantly improve common classification methods for use in pesticide toxicity characterization and knowledge discovery.

1 Introduction

Quantitative structure–activity relationships (QSARs) correlate chemical structure to a wide variety of physical, chemical, biological (including biomedical, toxicological, ecotoxicological) and technological (glass transition temperatures of polymers, critical micelle concentrations of surfactants, rubber vulcanization rates) properties.
Suitable correlations, once established and validated, can be used to predict properties for compounds as yet unmeasured or even unknown.

Classification systems are quite usual in QSAR studies of carcinogenicity [9], because in this case carcinogenicity classes are defined by regulatory bodies such as IARC and EPA. For ecotoxicity, most QSAR models are regressions, referring to the dose giving the toxic effect in 50% of the animals (for instance LC50: lethal concentration for 50% of the test animals). This dose is a continuous value, and regression seems the most appropriate algorithm. However, classification affords some advantages: i) the regulatory values are indicated as toxicity classes, and ii) classification can allow better management of noisy data. For this reason we investigated classification in the past [7], [8], [9] and also in this study. No general rule exists to define an approach suitable for solving a specific classification problem. In several cases, a selection of descriptors is the only essential condition for developing a general system. The next step consists in defining the best computational method to develop robust structure–activity models.

Artificial neural networks (ANNs) are an excellent tool that has been used to develop a wide range of real-world applications, especially when traditional solving methods fail [3]. They exhibit advantages such as the ability to learn from data, classification and generalization capabilities, fast computation once trained thanks to parallel processing, and noise tolerance. The major shortcoming of neural networks is their low degree of human comprehensibility. More transparency is offered by fuzzy neural networks (FNN) [14], [16], [18], a paradigm that combines the comprehensibility of fuzzy reasoning and its capability to handle uncertainty with the capability to learn from examples.

The paper is organized as follows.
Section 2 briefly presents the aspects of data preparation, based on chemical descriptors, and some of the most common classification techniques, and shows how they behave for toxicology modeling, with an emphasis on the pesticides task. Section 3 proposes the neuro-fuzzy approach to manage the integration of all the studied classifiers, based on the structure developed as the FNN Implicit Knowledge Module (IKM) of the hybrid intelligent system NIKE (Neural explicit&Implicit Knowledge inference system [17]). Preliminary results indicate that a combination of several classifiers may lead to improved performance [5], [11], [12]. The extracted fuzzy rules give new insights into the applicability domain of the classifiers involved. Conclusions of the paper are summarized in the last section.

2 Materials and Methods

2.1 Data set

For this paper a data set of 57 common organophosphorus compounds has been investigated. The main objective is to propose a good benchmark for the classification studies developed in this area. The toxicity values are the result of a wide bibliographic search, mainly from "the Pesticide Manual", the ECOTOX database system, RTECS and HSDB [1]. An important problem that we faced is connected with the variability of the toxicity data [2]. Indeed, it is possible to find different sources reporting, for the same compound and the same end-point, LC50 values that differ by about two orders of magnitude. Such variability is due to different factors, such as the different individual reactions of the organisms tested, different laboratory procedures, different experimental conditions, or accidental errors.

The toxicity value was expressed in the form Log10(1/LC50). Then the values were scaled into the interval [-1..1].
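The transformation just described can be illustrated with a short sketch. The text does not state how the scaling into [-1..1] was performed, so linear min-max scaling is assumed here; the function names and the sample LC50 values are hypothetical and not taken from the data set.

```python
import math


def log_inverse_lc50(lc50):
    """Return Log10(1/LC50), the toxicity form used in the text."""
    return math.log10(1.0 / lc50)


def scale_to_interval(values, low=-1.0, high=1.0):
    """Linearly rescale values into [low, high] (min-max scaling, an assumption)."""
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    return [low + (high - low) * (v - v_min) / span for v in values]


# Hypothetical LC50 values, for illustration only.
sample_lc50 = [0.01, 0.1, 1.0, 10.0, 100.0]
toxicity = [log_inverse_lc50(x) for x in sample_lc50]
scaled = scale_to_interval(toxicity)
print(scaled)
```

A lower LC50 means a more toxic compound, so after the transformation the most toxic compound in the sample maps to the upper end of the interval.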
Four classes were defined: Class 1 [-1..-0.5), Class 2 [-0.5..0), Class 3 [0..0.5), Class 4 [0.5..1] (Table 2).

2.2 Descriptors

A set of about 150 descriptors was calculated with different software packages: Hyperchem 5.0¹, CODESSA 2.2.1², Pallas 2.1³. They are split into six categories: Constitutional (34 descriptors), Geometrical (14), Topological (38), Electrostatic (57), Quantum-chemical (6), and Physico-chemical (4). To obtain a good model, it is necessary to select the variables that best describe the molecules. There is a risk that some descriptors do not add information and only increase the noise, making the analysis of the results more complex. Furthermore, using a relatively low number of variables reduces the risk of overfitting. The descriptor selection (Table 1) was obtained by Principal Components Analysis (PCA), using SCAN⁴.

Table 1. Names of the chemical descriptors involved in the classification task.

Descriptor                                               Cat.  Cod.
Moment of inertia A                                      G     D1
Relative number of N atoms                               C     D2
Binding energy (Kcal/mol)                                Q     D3
DPSA-3 Difference in CPSAs (PPSA3-PNSA3) [Zefirov's PC]  E     D4
Max partial charge (Qmax) [Zefirov's PC]                 E     D5
ZX Shadow / ZX Rectangle                                 G     D6
Number of atoms                                          C     D7
Moment of inertia C                                      G     D8
PNSA-3 Atomic charge weighted PNSA [Zefirov's PC]        E     D9
HOMO (eV)                                                E     D10
LUMO (eV)                                                Q     D11
Kier&Hall index (order 3)                                T     D12

2.3 Classification algorithms

Five classification algorithms were used in this work: LDA (Linear Discriminant Analysis), RDA (Regularized Discriminant Analysis), SIMCA (Soft Independent Modeling of Class Analogy), KNN (K Nearest Neighbors classification), and CART (Classification And Regression Tree). LDA and RDA are parametric statistical methods based on Fisher's discriminant analysis, SIMCA and KNN are non-parametric statistical methods, and CART is a classification tree.

LDA: Fisher's linear discrimination is an empirical method based on p-dimensional vectors of attributes.
Thus the separation between classes is made by a hyperplane, which divides the p-dimensional space of attributes.

RDA: The variations introduced in this model aim to obviate the principal problems that afflict both linear and quadratic discrimination. The most efficient regularization was carried out by Friedman, who proposed a compromise between the two previous techniques using a two-parameter method for the estimation (λ and γ).

¹ Hypercube Inc., Gainesville, Florida, USA
² SemiChem Inc., Shawnee, Kansas, USA
³ CompuDrug, Budapest, Hungary
⁴ SCAN (Software for Chemometric Analysis) v.1.1, from Minitab

SIMCA: the model is one of the first used in chemometrics for modeling classes and, unlike the techniques described above, is not parametric. The idea is to consider each class separately and to look for a representation using the principal components. An object is assigned to a class on the basis of the residual distance, $rsd^{2}$, that it has from the model which represents the class itself:

\[
r_{igj}^{2} = \left(x_{igj} - \hat{x}_{igj}\right)^{2}, \qquad rsd^{2} = \frac{\sum r_{igj}^{2}}{p - M_{j}} \tag{1}
\]

where $\hat{x}_{igj}$ = coordinates of the object's projections on the inner space of the mathematical model for the class, $x_{igj}$ = the object's coordinates, $p$ = number of variables, $M_{j}$ = number of significant principal components for class $j$.

KNN: this technique classifies each record in a data set based on a combination of the classes of the k record(s) most similar to it in a historical data set (here k = 1).

CART is a tree-shaped structure that represents sets of decisions. These decisions generate rules for the classification of a data set. CART provides a set of rules that can be applied to a new (unclassified) data set to predict which records will have a given outcome.
It segments a data set by creating two-way splits.

The classification obtained using these algorithms is shown in Table 2.

2.4 Validation

The most common methods for validation are: i) Leave-one-out (LOO); ii) Leave-more-out (LMO); iii) Train & Test; iv) Bootstrap. We used LOO, since it is considered the best method for small data sets [10]. According to LOO, given n objects, n models are computed. For each model, the training set consists of n-1 objects and the evaluation set consists of the object left out. To estimate the predictive ability, we considered the gap between the experimental (fitting) and the predicted (cross-validation) value for the n objects left out, one by one, from the model.

Table 2. True class and class assigned by the algorithms for each compound⁵.

Compound                      True Class  CART  LDA  KNN  SIMCA  RDA
Anilofos                      2           2     2    1    2      2
Chlorpyrifos                  1           2     2    1    2      2
Chlorpyrifos-methyl           2           2     2    1    2      2
Isazofos                      1           1     1    2    1      1
Phosalone                     2           2     2    2    2      2
Profenofos                    1           2     2    1    2      2
Prothiofos                    2           2     2    2    2      2
Azamethiphos                  2           2     2    1    4      2
Azinphos methyl               1           1     1    2    1      1
Diazinon                      3           3     1    1    4      1
Phosmet                       2           2     2    1    2      2
Pirimiphos ethyl              1           1     1    1    1      1
Pirimiphos methyl             2           3     1    2    1      1
Pyrazophos                    2           2     1    4    2      1
Quinalphos                    1           1     1    2    1      1
Azinphos-ethyl                1           1     1    1    2      1
Etrimfos                      1           1     1    3    3      1
Fosthiazate                   4           2     2    2    4      2
Methidathion                  1           1     1    1    1      1
Piperophos                    3           3     3    2    2      3
Tebupirimfos                  4           1     1    3    4      1
Triazophos                    1           1     1    2    1      1
Dichlorvos                    2           4     2    2    2      2
Disulfoton                    3           3     3    1    3      3
Ethephon                      4           4     4    4    4      4
Fenamiphos                    1           1     3    2    1      1
Fenthion                      2           2     3    2    2      3
Fonofos                       1           1     3    2    1      3
Glyphosate                    4           4     4    4    4      4
Isofenphos (isophenphos)      3           3     3    1    3      3
Methamidophos                 4           4     4    3    4      4
Omethoate                     3           3     3    3    3      3
Oxydemeton-methyl             3           3     3    3    3      3
Parathion ethyl (parathion)   2           2     2    3    1      3
Parathion methyl              3           3     3    3    3      3
Phoxim                        2           2     1    1    1      1
Sulfotep                      1           1     3    2    2      2
Tribufos                      2           2     2    2    2      2
Trichlorfon                   2           2     2    1    2      4
Acephate                      4           4     1    3    4      4
Cadusafos                     2           2     3    3    2      2
Chlorethoxyfos                2           2     2    3    2      2
Demeton-S-methyl              3           3     3    3    3      3
Dimethoate                    3           3     1    1    3      3
Edifenphos                    2           2     3    1    2      2
EPN                           2           2     2    2    2      2
Ethion                        2           2     2    2    2      2
Ethoprophos                   3           3     3    2    2      3
Fenitrothion                  3           2     3    3    3      3
Formothion                    3           3     2    3    3      3
Methacrifos                   2           2     2    2    2      3
Phorate                       1           1     3    2    1      3
Propetamphos                  3           3     3    4    2      3
Sulprofos                     3           3     3    2    3      3
Temephos                      3           3     2    1    3      2
Terbufos                      1           1     3    2    3      3
Thiometon                     3           3     3    3    3      3

⁵ The 40 molecules with a blank background were used to train the neuro-fuzzy classifier.

3 The neuro-fuzzy combination of the classifiers

3.1 Motivations and architecture

Combining multiple classifiers can be considered a direction for the development of highly reliable pattern recognition systems, coming from the hybrid intelligent systems approach. A combination of several classifiers may result in improved performance [4], [5]. The need to combine multiple classifiers arises from the demand for higher quality and reliability of the final models. Different classification algorithms exist in almost all current pattern recognition application areas, each having a certain degree of success, but none of them being as good as expected in applications. The combination technique we propose for toxicity classification is a neuro-fuzzy gating of the classifiers involved, trained against the correct data. This approach allows multiple classifiers to work together.

For this task, the hybrid intelligent system NIKE was used to automate the processes involved, from the data representation for toxicity measurements to the prediction of toxicity for a given new input. When required, it also indicates how the fuzzy inference produced the result [17], based on the effect measure method, which combines the weights between the layers of the network in order to select the strongest input-output dependencies [6]. Consequently, for NIKE, we defined the implicit knowledge as the knowledge acquired by neural/neuro-fuzzy nets.

Fig. 1. Implicit Knowledge Module implemented as FNN2.

The IKM-FNN is implemented as a multilayered neural structure with an input layer, which establishes the inputs used to compute the membership degrees of the current values, a fully connected three-layered FNN2 [16], and a defuzzification layer [17] (fig. 1). The weights of the connections between layer 1 and layer 2 are set to one. A linguistic variable $X_i$ is described by $m_i$ fuzzy sets $A_{ij}$, whose degrees of membership are given by the functions $\mu_{ij}(x_i)$, $j = 1, 2, \ldots, m_i$, $i = 1, 2, \ldots, p$ (in our case, $p = 5$ and all $m_i = 4$, defined on the classes predicted by the classifiers, as inputs, and on the classes of the toxicity values, as the output $y_{defuz}$). Layers 1 and 5 are used in the fuzzification process in the training and prediction steps, and layers 2-4 are organized as a feedforward network that represents the implicit rules through FNN training [15], [19].

3.2 Results

Since NIKE modules process only data scaled into the interval [0..1], every class was represented by the centroid of each of the four classes into which the available domain was split: 0.135 (class 1), 0.375 (class 2), 0.625 (class 3), and 0.875 (class 4). The inputs and the output followed a trapezoidal (de)fuzzification (fig. 2): VeryLow (0-0.25), Low (0.25-0.5), Medium (0.5-0.75), High (0.75-1).

The neuro-fuzzy network was trained on a training set of 40 objects (70% of the entire set, as depicted in Table 2). The training set was used to adjust the connections of the neural and neuro-fuzzy networks with the backpropagation (traingdx) algorithm; traingdx is a network training function that updates weight and bias values according to gradient descent with momentum and an adaptive learning rate. The neuro-fuzzy network was a multi-layered structure with the 5x4 fuzzy inputs described above and 4 fuzzy output neurons, corresponding to the toxicity class linguistic variable (fig. 2.a). The number of hidden neurons parameterized the FNN.
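The class encoding and trapezoidal fuzzification described in Section 3.2 can be sketched as follows. This is a minimal illustration, not the NIKE implementation: the text gives only the interval of each fuzzy term, so the width of the sloping shoulders (`shoulder=0.05`) and all function names are assumptions. The centroids are the values reported in the text.

```python
# Class centroids as reported in the text (domain scaled to [0..1]).
CLASS_CENTROIDS = {1: 0.135, 2: 0.375, 3: 0.625, 4: 0.875}

# Interval of each fuzzy term, as reported in the text.
TERMS = {
    "VeryLow": (0.0, 0.25),
    "Low": (0.25, 0.5),
    "Medium": (0.5, 0.75),
    "High": (0.75, 1.0),
}


def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 outside [a, d], 1 on [b, c], linear in between."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


def fuzzify(x, shoulder=0.05):
    """Membership degree of x in each term; the shoulder width is an assumption."""
    return {
        name: trapezoid(x, lo - shoulder, lo + shoulder, hi - shoulder, hi + shoulder)
        for name, (lo, hi) in TERMS.items()
    }


def encode_predictions(predicted_classes):
    """Map the five classifier outputs (classes 1-4) to the 5x4 fuzzy input vector."""
    vector = []
    for cls in predicted_classes:
        degrees = fuzzify(CLASS_CENTROIDS[cls])
        vector.extend(degrees[name] for name in TERMS)
    return vector


# Anilofos row of Table 2: CART=2, LDA=2, KNN=1, SIMCA=2, RDA=2.
inputs = encode_predictions([2, 2, 1, 2, 2])
print(len(inputs))
```

With these assumed shoulders every class centroid falls in the flat part of exactly one term, so each classifier contributes a block of four degrees with a single 1.0 in it.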
After testing different models (5 to 50 hidden units), a medium number of hidden units proved desirable; the best, and equal, results were obtained by IKM-FNN with 10, 12 and 19 hidden neurons (fig. 3).

Fig. 2. NIKE: (a) the fuzzy terms of the generic linguistic variable Class; (b) the FNN model.

Table 3. Performances of the classification algorithms computed.

        NER% fitting  NER% validation  Descriptors
LDA     64.91         61.40            D1, D2, D3, D4
RDA     84.21         71.93            D1, D2, D3, D4, D6, D7, D8, D11, D12, D13
SIMCA   92.98         77.19            D1, D2, D3, D4, D5, D6, D7, D8, D10, D11, D12
KNN     -             61.40            D1, D12
CART    85.96         77.19            D1, D2, D3, D4, D5, D9

Table 4. Confusion matrix of the neuro-fuzzy combination of classifiers.

              Assigned Class        N° of objects
True Class    1     2     3     4
1             13    2                15
2                   20               20
3                   1     15         16
4                               6    6

Table 5. True class and class assigned by all the classifiers for each compound wrongly predicted by the neuro-fuzzy combination of classifiers.

              True Class  CART  LDA  KNN  SIMCA  RDA  FNN
Chlorpyrifos  1           2     2    1    2      2    2
Profenofos    1           2     2    1    2      2    2
Fenitrothion  3           2     3    3    3      3    2

Fig. 3. The results of training FNNs: (a) 3-5 errors, the best are FNN10H, FNN12H and FNN19H; (b) the chosen model, FNN10H, against the SIMCA results and the real ones; (c) the bad fuzzy inference prediction for 2 cases in class 1 (Chlorpyrifos and Profenofos); (d) the bad fuzzy inference prediction for the case in class 3 (Fenitrothion); two samples of good prediction for test cases: (e) a class 1 sample (Phorate); (f) a class 2 sample (Edifenphos).

A momentum term of 0.95 was used (to prevent too many oscillations of the error function). The nets were trained for up to 5000 epochs, giving an error of about 0.015. The recognition error for the above models is 5.26% (tables 4 and 5, fig. 3).

The confusion matrix shows the predictive ability of our approach. Looking at Table 3, we notice that the best performance was obtained by SIMCA, which could correctly classify almost 93% of the molecules. This encouraging result was obtained with the whole data set involved in developing the model.
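As a quick check of the figures above, the recognition error can be recomputed from the confusion matrix in Table 4. The sketch below assumes, following Table 5, that the two misclassified class 1 compounds and the one misclassified class 3 compound were all assigned to class 2; the function name is illustrative.

```python
# Rows: true class 1-4; columns: assigned class 1-4 (from Tables 4 and 5).
CONFUSION = [
    [13, 2, 0, 0],
    [0, 20, 0, 0],
    [0, 1, 15, 0],
    [0, 0, 0, 6],
]


def recognition_error(matrix):
    """Fraction of objects that fall off the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return (total - correct) / total


print(f"{100 * recognition_error(CONFUSION):.2f}%")  # prints 5.26%
```

Three errors out of 57 compounds give 3/57, about 5.26%, which matches the recognition error reported in the text.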
If we look at the NER% validated with LOO, we notice that much of the reliability of the models is lost when we predict the toxicity of an external object. Such behavior proves the modeling ability of these algorithms, but also shows their inability to generalize. The neuro-fuzzy approach seems to overcome this problem, succeeding in voting for the best opinion while drawing on all the considered classification algorithms (fig. 3).

3.3 Interpreting the results of the neuro-fuzzy combination of the classifiers

The most relevant fuzzy rules were extracted from the IKM-FNN structures using the Effect Measure Method [6], [13]. Finally, after deleting the contradictory rules, the following list of the most trustworthy fuzzy rules was considered for the chosen net IKM-FNN10H:

IF CarFit1 is:VeryLow THEN class is:High (39.22%)
IF CarFit1 is:Low THEN class is:High (82.30%)
IF CarFit1 is:Medium THEN class is:High (48.74%)
IF CarFit1 is:High THEN class is:High (39.04%)
IF SimFit1 is:VeryLow THEN class is:Medium (61.25%)
IF SimFit1 is:Low THEN class is:Medium (36.04%)
IF SimFit1 is:High THEN class is:Medium (43.72%)
IF RdaFit1 is:VeryLow THEN class is:Low (75.65%)
IF RdaFit1 is:Low THEN class is:Low (100.00%)
IF RdaFit1 is:High THEN class is:High (76.39%)

Three types of fuzzy rules were obtained: rules that can be grouped by the same output, rules having the same fuzzy term in the premise and the conclusion, and rules with mixed terms in the premise and conclusion parts. From the first two groups of fuzzy rules (the CarFit1 and SimFit1 rules), we can conclude that the opinion of the input classifier is not important for the given output. More precisely, the CART prediction for High values of toxicity (class 4) is better not taken into consideration:

IF (CarFit1 is:VeryLow) OR (CarFit1 is:Low) OR (CarFit1 is:Medium) OR (CarFit1 is:High) THEN class is:High

Similarly, SIMCA outputs are not so important for predicting class 3 (Medium toxicity: the second group of fuzzy rules).
From the last group of rules (the RdaFit1 rules), we can find which is the best classifier among the systems involved. In our case, in order to predict class 2 (Low toxicity) it is better to consider the opinion coming from RDA. The same opinion is also very important for predicting the class 4 (High toxicity) cases.

Conclusions

Classification of toxicity requires a high degree of experience from computational chemistry experts. Several approaches were described to generate suitable computer-based classifiers for the considered patterns. We investigated five different classifiers and a neuro-fuzzy combination of them to organize and classify toxicity data sets. Our approach showed improved behaviour as a combination of classifiers. Some results on fuzzy rule extraction, as well as the possibility of interpreting particular inferences, suggest that the neuro-fuzzy approach has the potential to significantly improve common classification methods for use in toxicity characterization.

Acknowledgment

This work is partially funded by the E.U. under the contract HPRN-CT-1999-00015.

References

1. Benfenati, E., Pelagatti, S., Grasso, P., Gini, G.: COMET: the approach of a project in evaluating toxicity. In: Gini, G.C., Katritzky, A.R. (eds.): Predictive Toxicology of Chemicals: Experiences and Impact of AI Tools. AAAI 1999 Spring Symposium Series. AAAI Press, Menlo Park, CA (1999) 40-43
2. Benfenati, E., Piclin, N., Roncaglioni, A., Varì, M.R.: Factors Influencing Predictive Models For Toxicology. SAR and QSAR in Environmental Research, 12 (2001) 593-603
3. Bishop, C.M.: Neural Networks for Pattern Recognition. Clarendon Press, Oxford (1995)
4. Chen, K., Chi, H.: A method of combining multiple probabilistic classifiers through soft competition on different feature sets. Neurocomputing 20 (1998) 227-252
5. Duin, R.P.W., Tax, D.M.J.: Experiments with Classifier Combining Rules. Lecture Notes in Computer Science, 1857, Springer-Verlag, Berlin (2000) 16-29
6. Enbutsu, I., Baba, K., Hara, N.: Fuzzy Rule Extraction from a Multilayered Network. In: Procs. of IJCNN'91, Seattle (1991) 461-465
7. Gini, G., Benfenati, E., Boley, D.: Clustering and Classification Techniques to Assess Aquatic Toxicity. Procs. of the Fourth Int'l Conf. KES2000, Brighton, UK, Vol. 1 (2000) 166-172
8. Gini, G., Giumelli, M., Benfenati, E., Lorenzini, P., Boger, Z.: Classification methods to predict activity of toxic compounds. V Seminar on Molecular Similarity, Girona, Spain (2001 forthcoming)
9. Gini, G., Lorenzini, M., Benfenati, E., Brambilla, R., Malvé, L.: Mixing a Symbolic and a Subsymbolic Expert to Improve Carcinogenicity Prediction of Aromatic Compounds. In: Kittler, J., Roli, F. (eds.): Multiple Classifier Systems, Springer-Verlag, Berlin (2001) 126-135
10. Helma, C., Gottmann, E., Kramer, S.: Knowledge discovery and data mining in toxicology. Statistical Methods in Medical Research, 9 (2000) 131-135
11. Ho, T., Hull, J., Srihari, S.: Decision combination in multiple classifier systems. IEEE Trans. Pattern Anal. Mach. Intell. 16/1 (1994) 66-75
12. Jacobs, R.A.: Methods for combining experts' probability assessments. Neur. Comp. 7/5 (1995) 867-888
13. Jagielska, I., Matthews, C., Whitfort, T.: An investigation into the application of ANN, FL, GA, and rough sets to automated knowledge acquisition for classification problems. Neurocomp. 24 (1999) 37-54
14. Kosko, B.: Neural Networks and Fuzzy Systems. Prentice-Hall, Englewood Cliffs (1992)
15. Lin, C.T., George Lee, C.S.: Neural-Network-Based Fuzzy Logic Control and Decision System. IEEE Transactions on Computers, 40/12 (1991) 1320-1336
16. Nauck, D., Kruse, R.: NEFCLASS-X: A Neuro-Fuzzy Tool to Build Readable Fuzzy Classifiers. BT Tech. J. 16/3 (1998) 180-192
17. Neagu, C.-D., Avouris, N.M., Kalapanidas, E., Palade, V.: Neural and Neuro-fuzzy Integration in a Knowledge-based System for Air Quality Prediction. App. Intell. J. (2001 accepted)
18. Palade, V., Neagu, C.-D., Patton, R.J.: Interpretation of Trained Neural Networks by Rule Extraction. Procs. of Int'l Conf. 7th Fuzzy Days in Dortmund (2001) 152-161
19. Rumelhart, D.E., McClelland, J.L.: Parallel Distributed Processing, Explorations in the Microstructure of Cognition. MIT Press (1986)
Supply chain disruption management: Global convergence vs national
Sustainable supply chain management practices and dynamic capabilities in the food industry: A critical analysis of the literature
5 Mesoscopic supply chain simulation
6 Firm size and sustainable performance in food supply chains: Insights from Greek SMEs
7 An analytical method for cost analysis in multi-stage supply chains: A stochastic network model approach
8 A Roadmap to Green Supply Chain System through Enterprise Resource Planning (ERP) Implementation
9 Unidirectional transshipment policies in a dual-channel supply chain
10 Decentralized and centralized model predictive control to reduce the bullwhip effect in supply chain management
11 An agent-based distributed computational experiment framework for virtual supply chain network development
12 Biomass-to-bioenergy and biofuel supply chain optimization: Overview, key issues and challenges
13 The benefits of supply chain visibility: A value assessment model
14 An Institutional Theory perspective on sustainable practices across the dairy supply chain
15 Two-stage stochastic programming supply chain model for biodiesel production via wastewater treatment
16 Technology scale and supply chains in a secure, affordable and low carbon energy transition
17 Multi-period design and planning of closed-loop supply chains with uncertain supply and demand
18 Quality control in food supply chain management: An analytical model and case study of the adulterated milk incident in China
19 Supply chain information capabilities and performance outcomes: An empirical study of Korean steel suppliers
20 A game-based approach towards facilitating decision making for perishable products: An example of blood supply chain
21 Supply chain design under quality disruptions and tainted materials delivery
22 A two-level replenishment frequency model for TOC supply chain replenishment systems under capacity constraint
23 Supply chain dynamics and the "cross-border effect": The U.S.–Mexican border's case
24 Designing a new supply chain for competition against an existing supply chain
25 Universal supplier selection via multi-dimensional auction mechanisms for two-way competition in oligopoly market of supply chain
26 Using TODIM to evaluate green supply chain practices under uncertainty
27 Supply chain downsizing under bankruptcy: A robust optimization approach
28 Coordination mechanism for a deteriorating item in a two-level supply chain system
29 An accelerated Benders decomposition algorithm for sustainable supply chain network design under uncertainty: A case study of medical needle and syringe supply chain
30 Bullwhip Effect Study in a Constrained Supply Chain
31 Two-echelon multiple-vehicle location–routing problem with time windows for optimization of sustainable supply chain network of perishable food
32 Research on pricing and coordination strategy of green supply chain under hybrid production mode
33 Agent-system co-development in supply chain research: Propositions and demonstrative findings
34 Tactical management for coordinated supply chains
35 Photovoltaic supply chain coordination with strategic consumers in China
36 Coordinating supplier's reorder point: A coordination mechanism for supply chains with long supplier lead time
37 Assessment and optimization of forest biomass supply chains from economic, social and environmental perspectives – A review of literature
38 The effects of a trust mechanism on a dynamic supply chain network
39 Economic and environmental assessment of reusable plastic containers: A food catering supply chain case study
40 Competitive pricing and ordering decisions in a multiple-channel supply chain
41 Pricing in a supply chain for auction bidding under information asymmetry
42 Dynamic analysis of feasibility in ethanol supply chain for biofuel production in Mexico
43 The impact of partial information sharing in a two-echelon supply chain
44 Choice of supply chain governance: Self-managing or outsourcing?
45 Joint production and delivery lot sizing for a make-to-order producer–buyer supply chain with transportation cost
46 Hybrid algorithm for a vendor managed inventory system in a two-echelon supply chain
47 Traceability in a food supply chain: Safety and quality perspectives
48 Transferring and sharing exchange-rate risk in a risk-averse supply chain of a multinational firm
49 Analyzing the impacts of carbon regulatory mechanisms on supplier and mode selection decisions: An application to a biofuel supply chain
50 Product quality and return policy in a supply chain under risk aversion of a supplier
51 Mining logistics data to assure the quality in a sustainable food supply chain: A case in the red wine industry
52 Biomass supply chain optimisation for Organosolv-based biorefineries
53 Exact solutions to the supply chain equations for arbitrary, time-dependent demands
54 Designing a sustainable closed-loop supply chain network based on triple bottom line approach: A comparison of metaheuristics hybridization techniques
55 A study of the LCA based biofuel supply chain multi-objective optimization model with multi-conversion paths in China
56 A hybrid two-stock inventory control model for a reverse supply chain
57 Dynamics of judicial service supply chains
58 Optimizing an integrated vendor-managed inventory system for a single-vendor two-buyer supply chain with determining weighting factor for vendor's ordering
59 Measuring Supply Chain Resilience Using a Deterministic Modeling Approach
60 A LCA Based Biofuel Supply Chain Analysis Framework
61 A neo-institutional perspective of supply chains and energy security: Bioenergy in the UK
62 Modified penalty function method for optimal social welfare of electric power supply chain with transmission constraints
63 Optimization of blood supply chain with shortened shelf lives and ABO compatibility
64 Diversified firms on dynamical supply chain cope with financial crisis better
65 Securitization of energy supply chains in China
66 Optimal design of the auto parts supply chain for JIT operations: Sequential bifurcation factor screening and multi-response surface methodology
67 Achieving sustainable supply chains through energy justice
68 Supply chain agility: Securing performance for Chinese manufacturers
69 Energy price risk and the sustainability of demand side supply chains
70 Strategic and tactical mathematical programming models within the crude oil supply chain context - A review
71 An analysis of the structural complexity of supply chain networks
72 Business process re-design methodology to support supply chain integration
73 Could supply chain technology improve food operators' innovativeness? A developing country's perspective
74 RFID-enabled process reengineering of closed-loop supply chains in the healthcare industry of Singapore
75 Order-Up-To policies in Information Exchange supply chains
76 Robust design and operations of hydrocarbon biofuel supply chain integrating with existing petroleum refineries considering unit cost objective
77 Trade-offs in supply chain transparency: the case of Nudie Jeans
78 Healthcare supply chain operations: Why are doctors reluctant to consolidate?
79 Impact on the optimal design of bioethanol supply chains by a new European Commission proposal
80 Managerial research on the pharmaceutical supply chain – A critical review and some insights for future directions
81 Supply chain performance evaluation with data envelopment analysis and balanced scorecard approach
82 Integrated supply chain design for commodity chemicals production via woody biomass fast pyrolysis and upgrading
83 Governance of sustainable supply chains in the fast fashion industry
84 Temperature management for the quality assurance of a perishable food supply chain
85 Modeling of biomass-to-energy supply chain operations: Applications, challenges and research directions
86 Assessing Risk Factors in Collaborative Supply Chain with the Analytic Hierarchy Process (AHP)
87 Random network models and sensitivity algorithms for the analysis of ordering time and inventory state in multi-stage supply chains
88 Information sharing and collaborative behaviors in enabling supply chain performance: A social exchange perspective
89 The coordinating contracts for a fuzzy supply chain with effort and price dependent demand
90 Criticality analysis and the supply chain: Leveraging representational assurance
91 Economic model predictive control for inventory management in supply chains
92 Supply chain management ontology from an ontology engineering perspective
93 Surplus division and investment incentives in supply chains: A biform-game analysis
94 Biofuels for road transport: Analysing evolving supply chains in Sweden from an energy security perspective
95 Supply chain management executives in corporate upper echelons
96 Sustainable supply chain management in the fast fashion industry: An analysis of corporate reports
97 An improved method for managing catastrophic supply chain disruptions
98 The equilibrium of closed-loop supply chain supernetwork with time-dependent parameters
99 A bi-objective stochastic programming model for a centralized green supply chain with deteriorating products
100 Simultaneous control of vehicle routing and inventory for dynamic inbound supply chain
101 Environmental impacts of roundwood supply chain options in Michigan: life-cycle assessment of harvest and transport stages
102 A recovery mechanism for a two echelon supply chain system under supply disruption
103 Challenges and Competitiveness Indicators for the Sustainable Development of the Supply Chain in Food Industry
104 Is doing more doing better? The relationship between responsible supply chain management and corporate reputation
105 Connecting product design, process and supply chain decisions to strengthen global supply chain capabilities
106 A computational study for common network design in multi-commodity supply chains
107 Optimal production and procurement decisions in a supply chain with an option contract and partial backordering under uncertainties
108 Methods to optimise the design and management of biomass-for-bioenergy supply chains: A review
109 Reverse supply chain coordination by revenue sharing contract: A case for the personal computers industry
110 SCOlog: A logic-based approach to analysing supply chain operation dynamics
111 Removing the blinders: A literature review on the potential of nanoscale technologies for the management of supply chains
112 Transition inertia due to competition in supply chains with remanufacturing and recycling: A systems dynamics model
113 Optimal design of advanced drop-in hydrocarbon biofuel supply chain integrating with existing petroleum refineries under uncertainty
114 Revenue-sharing contracts across an extended supply chain
115 An integrated revenue sharing and quantity discounts contract for coordinating a supply chain dealing with short life-cycle products
116 Total JIT (T-JIT) and its impact on supply chain competency and organizational performance
117 Logistical supply chain design for bioeconomy applications
118 A note on "Quality investment and inspection policy in a supplier-manufacturer supply chain"
119 Developing a Resilient Supply Chain
120 Cyber supply chain risk management: Revolutionizing the strategic control of
critical IT systems121 Defining value chain architectures: Linking strategic value creation to operational - design122 Aligning the sustainable - to green marketing needs: A case study123 Decision support and intelligent systems in the textile and apparel -: An academic review of research articles124 -,capability of small and medium sized family businesses in India: A multiple case study approach125 - collaboration: Impact of success in long-term partnerships126 Collaboration capacity for sustainable -,: small and medium-sized enterprises in Mexico127 Advanced traceability system in aquaculture -128 - information systems strategy: Impacts on - performance and firm performance129 Performance of - collaboration – A simulation study130 Coordinating a three-level - with delay in payments and a discounted interest rate131 An integrated framework for agent basedinventory–production–transportation modeling and distributed simulation of -s132 Optimal - design and ,over a multi-period horizon under demand uncertainty. 
Part I: MINLP and MILP models133 The impact of knowledge transfer and complexity on - flexibility: A knowledge-based view134 An innovative - performance measurement system incorporating Research and Development (R&D) and marketing policy135 Robust decision making for hybrid process - systems via model predictive control136 Combined pricing and - operations under price-dependent stochastic demand137 Balancing - competitiveness and robustness through ―virtual dual sourcing‖: Lessons from the Great East Japan Earthquake138 Solving a tri-objective - problem with modified NSGA-II algorithm 139 Sustaining long-term - partnerships using price-only contracts 140 On the impact of advertising initiatives in -s141 A typology of the situations of cooperation in -s142 A structured analysis of operations and -,research in healthcare (1982–2011143 - practice and information quality: A - strategy study144 Manufacturer's pricing strategy in a two-level - with competing retailers and advertising cost dependent demand145 Closed-loop -/ design under a fuzzy environment146 Timing and eco(nomic) efficiency of climate-friendly investments in -s147 Post-seismic - risk ,: A system dynamics disruption analysis approach for inventory and logistics planning148 The relationship between legitimacy, reputation, sustainability and branding for companies and their -s149 Linking - configuration to - perfrmance: A discrete event simulation model150 An integrated multi-objective model for allocating the limited sources in a multiple multi-stage lean -151 Price and leadtime competition, and coordination for make-to-order -s152 A model of resilient -/ design: A two-stage programming with fuzzy shortest path153 Lead time variation control using reliable shipment equipment: An incentive scheme for - coordination154 Interpreting - dynamics: A quasi-chaos perspective155 A production-inventory model for a two-echelon - when demand is dependent on sales teams׳ initiatives156 Coordinating a dual-channel - 
with risk-averse under a two-way revenue sharing contract157 Energy supply planning and - optimization under uncertainty158 A hierarchical model of the impact of RFID practices on retail - performance159 An optimal solution to a three echelon -/ with multi-product and multi-period160 A multi-echelon - model for municipal solid waste ,system 161 A multi-objective approach to - visibility and risk162 An integrated - model with errors in quality inspection and learning in production163 A fuzzy AHP-TOPSIS framework for ranking the solutions of Knowledge ,adoption in - to overcome its barriers164 A relational study of - agility, competitiveness and business performance in the oil and gas industry165 Cyber - security practices DNA – Filling in the puzzle using a diverse set of disciplines166 A three layer - model with multiple suppliers, manufacturers and retailers for multiple items167 Innovations in low input and organic dairy -s—What is acceptable in Europe168 Risk Variables in Wind Power -169 An analysis of - strategies in the regenerative medicine industry—Implications for future development170 A note on - coordination for joint determination of order quantity and reorder point using a credit option171 Implementation of a responsive - strategy in global complexity: The case of manufacturing firms172 - scheduling at the manufacturer to minimize inventory holding and delivery costs173 GBOM-oriented ,of production disruption risk and optimization of - construction175 Alliance or no alliance—Bargaining power in competing reverse -s174 Climate change risks and adaptation options across Australian seafood -s – A preliminary assessment176 Designing contracts for a closed-loop - under information asymmetry 177 Chemical - modeling for analysis of homeland security178 Chain liability in multitier -s? 
Responsibility attributions for unsustainable supplier behavior179 Quantifying the efficiency of price-only contracts in push -s over demand distributions of known supports180 Closed-loop -/ design: A financial approach181 An integrated -/ design problem for bidirectional flows182 Integrating multimodal transport into cellulosic biofuel- design under feedstock seasonality with a case study based on California183 - dynamic configuration as a result of new product development184 A genetic algorithm for optimizing defective goods - costs using JIT logistics and each-cycle lengths185 A -/ design model for biomass co-firing in coal-fired power plants 186 Finance sourcing in a -187 Data quality for data science, predictive analytics, and big data in -,: An introduction to the problem and suggestions for research and applications188 Consumer returns in a decentralized -189 Cost-based pricing model with value-added tax and corporate income tax for a -/190 A hard nut to crack! Implementing - sustainability in an emerging economy191 Optimal location of spelling yards for the northern Australian beef -192 Coordination of a socially responsible - using revenue sharing contract193 Multi-criteria decision making based on trust and reputation in -194 Hydrogen - architecture for bottom-up energy systems models. Part 1: Developing pathways195 Financialization across the Pacific: Manufacturing cost ratios, -s and power196 Integrating deterioration and lifetime constraints in production and - planning: A survey197 Joint economic lot sizing problem for a three—Layer - with stochastic demand198 Mean-risk analysis of radio frequency identification technology in - with inventory misplacement: Risk-sharing and coordination199 Dynamic impact on global -s performance of disruptions propagation produced by terrorist acts。
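Senior Three English: 40 Multiple-Choice Questions on Big Data Analysis in English Learning

1. In the era of big data, we need to analyze large amounts of information _____.
A. thoroughly  B. approximately  C. randomly  D. occasionally
Answer: A.
"Thoroughly" means "completely, exhaustively"; "approximately" means "roughly, about"; "randomly" means "at random, arbitrarily"; "occasionally" means "now and then, from time to time".
In the era of big data, we need to analyze large amounts of information thoroughly, so the answer is A.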
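2. Big data can provide _____ insights into customer behavior.
A. precious  B. valuable  C. worthless  D. trivial
Answer: B.
"Precious" means "treasured, of great worth" and usually describes objects or feelings; "valuable" means "having value" and can describe information, advice, and the like; "worthless" means "having no value"; "trivial" means "petty, unimportant".
Big data can provide valuable insights into customer behavior, so the answer is B.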
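3. The analysis of big data requires powerful _____ tools.
A. computational  B. manual  C. primitive  D. ineffective
Answer: A.
"Computational" means "relating to computation"; "manual" means "done by hand"; "primitive" means "crude, early-stage"; "ineffective" means "not producing the desired result".
Big data analysis requires powerful computational tools, so the answer is A.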
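4. Big data analytics can help businesses make more _____ decisions.
A. informed  B. uninformed  C. random  D. hasty
Answer: A.
"Informed" means "well-founded, based on knowledge"; "uninformed" means "ignorant, not having been told"; "random" means "made by chance"; "hasty" means "rushed".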
Value Function Approximation on Non-Linear Manifolds for Robot Motor Control

Masashi Sugiyama, Hirotaka Hachiya, Christopher Towell, and Sethu Vijayakumar

Abstract: The least squares approach works efficiently in value function approximation, given appropriate basis functions. Because of its smoothness, the Gaussian kernel is a popular and useful choice as a basis function. However, it does not allow for the discontinuity that typically arises in real-world reinforcement learning tasks. In this paper, we propose a new basis function based on geodesic Gaussian kernels, which exploits the non-linear manifold structure induced by the Markov decision processes. The usefulness of the proposed method is successfully demonstrated in simulated robot arm control and Khepera robot navigation.

I. INTRODUCTION

Value function approximation is an essential ingredient of reinforcement learning (RL), especially in the context of solving Markov Decision Processes (MDPs) using policy iteration methods [1]. In problems with large discrete state spaces or continuous state spaces, it becomes necessary to use function approximation methods to represent the value functions. A least squares approach using a linear combination of predetermined under-complete basis functions has been shown to be promising in this task [2]. Fourier functions (trigonometric polynomials), Gaussian kernels [3], and wavelets [4] are popular basis function choices for general function approximation problems. Both Fourier bases (global functions) and Gaussian kernels (localized functions) have certain smoothness properties that make them particularly useful for modeling inherently smooth, continuous functions. 
Wavelets provide basis functions at various different scales and may also be employed for approximating smooth functions with local discontinuity.

Typical value functions in RL tasks are predominantly smooth with some discontinuous parts [5]. To illustrate this, let us consider a toy RL task of guiding an agent to a goal in a grid world (see Fig. 1(a)). In this task, a state corresponds to a two-dimensional Cartesian position of the agent. The agent cannot move over the wall, so the value function of this task is highly discontinuous across the wall. On the other hand, the value function is smooth along the maze since neighboring reachable states in the maze have similar values (see Fig. 1(b)). Due to the discontinuity, simply employing Fourier functions or Gaussian kernels as basis functions tends to produce undesired, non-optimal results around the discontinuity, affecting the overall performance significantly. Wavelets could be a viable alternative, but they are over-complete bases: one has to appropriately choose a subset of basis functions, which is not a straightforward task in practice.

The authors acknowledge financial support from MEXT (Grant-in-Aid for Young Scientists 17700142 and Grant-in-Aid for Scientific Research (B) 18300057), the Okawa Foundation, and EU Erasmus Mundus Scholarship.
Department of Computer Science, Tokyo Institute of Technology, 2-12-1, O-okayama, Meguro-ku, Tokyo, 152-8552, Japan. sugi@cs.titech.ac.jp
School of Informatics, University of Edinburgh, The King’s Buildings, Mayfield Road, Edinburgh EH9 3JZ, UK. H.Hachiya@, C.C.Towell@, sethu.vijayakumar@

Recently, the article [5] proposed considering value functions defined not on the Euclidean space, but on graphs induced by the MDPs (see Fig. 1(c)). Value functions which usually contain discontinuity in the Euclidean domain (e.g., across the wall) are typically smooth on graphs (e.g., along the maze). Hence, approximating value functions on graphs can be expected to work better than approximating them in the Euclidean domain. Spectral graph theory [6] showed that Fourier-like smooth bases on graphs are given as minor eigenvectors of the graph-Laplacian matrix. However, their global nature implies that the overall accuracy of this method tends to be degraded by local noise. The article [7] defined diffusion wavelets, which possess a natural multi-resolution structure on graphs. The paper [8] showed that diffusion wavelets could be employed in value function approximation, although the issue of choosing a suitable subset of basis functions from the over-complete set is not discussed; this is not straightforward in practice due to the lack of a natural ordering of basis functions.

In the machine learning community, Gaussian kernels seem to be more popular than Fourier functions or wavelets because of their locality and smoothness [3], [9], [10]. Furthermore, Gaussian kernels have ‘centers’, which alleviates the difficulty of basis subset choice, e.g., uniform allocation [2] or sample-dependent allocation [11]. In this paper, we therefore define Gaussian kernels on graphs (which we call geodesic Gaussian kernels), and propose using them for value function approximation. Our definition of Gaussian kernels on graphs employs the shortest paths between states rather than the Euclidean distance, which can be computed efficiently using the Dijkstra algorithm [12], [13]. Moreover, an effective use of Gaussian kernels opens up the possibility to exploit the recent advances in using Gaussian processes for temporal difference learning [11].

When basis functions defined on the state space are used for approximating the 
state-action value function, they should be extended over the action space. This is typically done by simply copying the basis functions over the action space [2], [5]. In this paper, we propose a new strategy for this extension, which takes into account the transition after taking actions. This new strategy is demonstrated to work very well when the transition is predominantly deterministic.

Fig. 1. An illustrative example of an RL task of guiding an agent to a goal in the grid world. (a) Black areas are walls over which the agent cannot move, while the goal is represented in gray. Arrows on the grids represent one of the optimal policies. (b) Optimal state value function (in log-scale). (c) Graph induced by the MDP and a random policy.

II. FORMULATION OF THE REINFORCEMENT LEARNING PROBLEM

In this section, we briefly introduce the notation and reinforcement learning (RL) formulation that we will use across the manuscript.

A. Markov Decision Processes

Let us consider a Markov decision process (MDP) $(\mathcal{S}, \mathcal{A}, P, R, \gamma)$, where $\mathcal{S}$ is a finite¹ set of states, $\mathcal{A}$ is a finite set of actions, $P(s' \mid s, a)$ is the conditional probability of making a transition to state $s'$ if action $a$ is taken in state $s$, $R(s, a, s')$ is an immediate reward for making a transition from $s$ to $s'$ by action $a$, and $\gamma$ is the discount factor for future rewards. The expected reward $R(s, a)$ for a state-action pair $(s, a)$ is given as

$$R(s, a) = \sum_{s' \in \mathcal{S}} P(s' \mid s, a)\, R(s, a, s'). \tag{1}$$

Let $\pi(s)$ be a deterministic policy which the agent follows. In this paper, we focus on deterministic policies since there always exists an optimal deterministic policy [2]. 
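
As a small illustration of the expected reward in Eq. (1), the sketch below averages the immediate rewards of one state-action pair under its transition probabilities. The two-state MDP, its numbers, and the name `expected_reward` are made up for this example and do not come from the paper.

```python
# Hypothetical two-state, two-action MDP; every number here is illustrative only.
# P[s][a][s2] = probability of reaching state s2 when action a is taken in state s.
P = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[1.0, 0.0], [0.0, 1.0]],
]
# R[s][a][s2] = immediate reward for the transition (s, a, s2).
R = [
    [[0.0, 1.0], [0.0, 1.0]],
    [[0.0, 0.0], [0.0, 0.0]],
]


def expected_reward(P, R, s, a):
    """Expected immediate reward of (s, a): sum over s2 of P(s2|s,a) * R(s,a,s2)."""
    return sum(p * r for p, r in zip(P[s][a], R[s][a]))


# Action 1 in state 0 reaches the rewarding state more often than action 0.
r_a0 = expected_reward(P, R, 0, 0)
r_a1 = expected_reward(P, R, 0, 1)
```

In practice these probabilities are not given and would be estimated from samples, as the paper notes later.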
Let $Q^\pi(s, a)$ be a state-action value function for policy $\pi$, which indicates the expected long-term discounted sum of rewards the agent receives when the agent takes action $a$ in state $s$ and follows policy $\pi$ thereafter. $Q^\pi(s, a)$ satisfies the following Bellman equation:

$$Q^\pi(s, a) = R(s, a) + \gamma \sum_{s' \in \mathcal{S}} P(s' \mid s, a)\, Q^\pi(s', \pi(s')). \tag{2}$$

The goal of RL is to obtain a policy which results in the maximum amount of long-term rewards. The optimal policy $\pi^*(s)$ is defined as $\pi^*(s) = \arg\max_{a \in \mathcal{A}} Q^*(s, a)$, where $Q^*(s, a)$ is the optimal state-action value function defined by $Q^*(s, a) = \max_{\pi} Q^\pi(s, a)$.

¹ For the moment, we focus on discrete state spaces. In Sec. III-D, we extend the proposed method to the continuous state space.

B. Least Squares Policy Iteration

In practice, the optimal policy $\pi^*(s)$ cannot be directly obtained since $R(s, a)$ and $P(s' \mid s, a)$ are usually unknown; even when they are known, direct computation of $\pi^*(s)$ is often computationally intractable.

To cope with this problem, the paper [2] proposed approximating the state-action value function $Q^\pi(s, a)$ using a linear model:

$$\hat{Q}^\pi(s, a; w) = \sum_{i=1}^{k} w_i \phi_i(s, a) = w^\top \phi(s, a), \tag{3}$$

where $k$ is the number of basis functions, which is usually much smaller than the number of states, $w = (w_1, w_2, \ldots, w_k)^\top$ are the parameters to be learned, $^\top$ denotes the transpose, and $\phi(s, a) = (\phi_1(s, a), \phi_2(s, a), \ldots, \phi_k(s, a))^\top$ are pre-determined basis functions. Note that the linear model can depend on the policy $\pi$, but we do not show the dependence explicitly for the sake of simplicity. Assume we have roll-out samples from a sequence of actions: $\{(s_i, a_i, r_i, s'_i)\}_{i=1}^{t}$, where each tuple denotes the agent experiencing a transition from $s_i$ to $s'_i$ on taking action $a_i$ with immediate reward $r_i$. Under the Least Squares Policy Iteration (LSPI) formulation [2], the parameter $w$ is learned so that the Bellman equation (2) is optimally approximated in the least squares sense². 
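
The least squares fit can be sketched as follows. This is a minimal illustration of the fixed-point (LSTDQ-style) solve described in [2], not the authors' implementation: it accumulates $A = \sum_i \phi(s_i, a_i)\,(\phi(s_i, a_i) - \gamma\,\phi(s'_i, \pi(s'_i)))^\top$ and $b = \sum_i \phi(s_i, a_i)\, r_i$ over the samples and solves $A w = b$. The names `lstdq` and `solve_linear`, the one-hot features, and the two-state chain are assumptions made for this example; a numerical library's linear solver would normally replace the hand-written elimination.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def lstdq(samples, phi, policy, gamma, k):
    """Fixed-point least squares estimate of w for Q(s, a) ~ w . phi(s, a)."""
    A = [[0.0] * k for _ in range(k)]
    b = [0.0] * k
    for s, a, r, s_next in samples:
        f = phi(s, a)
        f_next = phi(s_next, policy(s_next))  # features of the successor pair
        for i in range(k):
            b[i] += f[i] * r
            for j in range(k):
                A[i][j] += f[i] * (f[j] - gamma * f_next[j])
    return solve_linear(A, b)


# Hypothetical two-state chain with a single action: 0 -> 1 (reward 0), 1 -> 1 (reward 1).
samples = [(0, 0, 0.0, 1), (1, 0, 1.0, 1)]
phi = lambda s, a: [1.0 if s == i else 0.0 for i in range(2)]  # one-hot in the state
policy = lambda s: 0
w = lstdq(samples, phi, policy, gamma=0.5, k=2)
```

With one-hot features the weights are the Q-values themselves, so `w` can be checked by hand: $Q(1) = 1 + 0.5\,Q(1) = 2$ and $Q(0) = 0.5\,Q(1) = 1$.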
Consequently, based on the approximated state-action value function with learned parameter $\hat{w}$, the policy is updated as

$$\pi(s) \leftarrow \arg\max_{a \in \mathcal{A}} \hat{Q}^\pi(s, a; \hat{w}). \tag{4}$$

Approximating the state-action value function and updating the policy are iteratively carried out until some convergence criterion is met.

² There are two alternative approaches: Bellman residual minimization and fixed point approximation. We take the latter approach following the suggestion in the reference [2].

III. GAUSSIAN KERNELS ON GRAPHS

In the LSPI algorithm, the choice of basis functions $\{\phi_i(s, a)\}_{i=1}^{k}$ is an open design issue. Gaussian kernels have traditionally been a popular choice [2], [11], but they cannot approximate discontinuous functions well. Recently, more sophisticated methods of constructing suitable basis functions have been proposed, which effectively make use of the graph structure induced by MDPs [5]. In this section, we introduce a novel way of constructing basis functions by incorporating the graph structure, while the relation to the existing graph-based methods is discussed in a separate report [14].

A. MDP-Induced Graph

Let $G$ be a graph induced by an MDP, where states $\mathcal{S}$ are nodes of the graph and the transitions with non-zero transition probabilities from one node to another are edges. The edges may have weights determined, e.g., based on the transition probabilities or the distance between nodes. 
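
A minimal sketch of such a graph, built from observed transitions, is given below together with the shortest-path computation that the geodesic Gaussian kernel of Sec. III-C relies on. The function names, the undirected unit-weight edges, and the four-cell U-shaped corridor are assumptions made for this example; the heap is Python's binary heap, not the Fibonacci heap discussed in the paper.

```python
import heapq
import math


def build_graph(transitions, weight=1.0):
    """Adjacency dict from observed (state, next_state) pairs (assumed reversible)."""
    graph = {}
    for s, s_next in transitions:
        graph.setdefault(s, {})
        graph.setdefault(s_next, {})
        if s != s_next:
            graph[s][s_next] = weight
            graph[s_next][s] = weight
    return graph


def shortest_paths(graph, source):
    """Dijkstra's algorithm; returns a dict mapping each reachable node to its distance."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def geodesic_gaussian(graph, center, sigma):
    """Gaussian of the shortest-path distance to `center`, for every reachable state."""
    dist = shortest_paths(graph, center)
    return {s: math.exp(-d * d / (2.0 * sigma ** 2)) for s, d in dist.items()}


# U-shaped corridor: cells (0, 0) and (0, 1) are adjacent but separated by a wall.
transitions = [((0, 0), (1, 0)), ((1, 0), (1, 1)), ((1, 1), (0, 1))]
graph = build_graph(transitions)
ggk = geodesic_gaussian(graph, center=(0, 0), sigma=1.0)
ogk_across_wall = math.exp(-1.0 ** 2 / 2.0)  # Euclidean distance across the wall is 1
```

Here the cell across the wall is three steps away along the corridor, so its geodesic kernel value is far smaller than the Euclidean one, which is the behavior the discontinuity argument calls for.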
The graph structure corresponding to an example grid world shown in Fig. 1(a) is illustrated in Fig. 1(c). In practice, such graph structure (including the connection weights) is estimated from samples of a finite length. We assume that the graph $G$ is connected. Typically, the graph is sparse in RL tasks, i.e., $\ell \ll n(n-1)/2$, where $\ell$ is the number of edges and $n$ is the number of nodes.

B. Ordinary Gaussian Kernels

Ordinary Gaussian kernels (OGKs) on the Euclidean space are defined as

$$K(s, s') = \exp\left(-\frac{ED(s, s')^2}{2\sigma^2}\right), \tag{5}$$

where $ED(s, s')$ is the Euclidean distance between states $s$ and $s'$; for example, $ED(s, s') = \|x - x'\|$ when the Cartesian positions of $s$ and $s'$ in the state space are given by $x$ and $x'$, respectively. $\sigma^2$ is the variance parameter of the Gaussian kernel.

The above Gaussian function is defined on the state space $\mathcal{S}$, where $s'$ is treated as a center of the kernel. In order to employ the Gaussian kernel in the LSPI algorithm, it needs to be extended over the state-action space $\mathcal{S} \times \mathcal{A}$. This is usually carried out by simply ‘copying’ the Gaussian function over the action space [2], [5]. More precisely: let the total number $k$ of basis functions be $mp$, where $m$ is the number of possible actions and $p$ is the number of Gaussian centers. For the $i$-th action $a^{(i)}$ and for the $j$-th Gaussian center $c^{(j)}$, the $(i + (j-1)m)$-th basis function is defined as

$$\phi_{i+(j-1)m}(s, a) = I(a = a^{(i)})\, K(s, c^{(j)}), \tag{6}$$

where $I(\cdot)$ is the indicator function, i.e., $I(a = a^{(i)}) = 1$ if $a = a^{(i)}$ and $0$ otherwise.

Gaussian kernels are shift-invariant, i.e., they do not directly depend on the absolute positions $x$ and $x'$, but depend only on the difference $x - x'$ between two positions; more specifically, Gaussian kernels depend only on the distance between two positions.

C. Geodesic Gaussian Kernels

On graphs, a natural definition of the distance would be the shortest path. So we define Gaussian kernels on graphs based on the shortest path:

$$K(s, s') = \exp\left(-\frac{SP(s, s')^2}{2\sigma^2}\right), \tag{7}$$

where $SP(s, s')$ denotes the shortest path from state $s$ to state $s'$. The shortest path on a graph can be interpreted as a discrete approximation to the geodesic distance on a non-linear manifold [6]. For this reason, we call Eq. (7) a geodesic Gaussian kernel (GGK).

Shortest paths on graphs can be efficiently computed using the Dijkstra algorithm [12]. With its naive 
implementation, the computational complexity of computing the shortest paths from a single node to all other nodes is $O(n^2)$, where $n$ is the number of nodes. If the Fibonacci heap is employed, the computational complexity can be reduced to $O(n \log n + \ell)$ [13], where $\ell$ is the number of edges. Since the graph in value function approximation problems is typically sparse (i.e., $\ell \ll n^2$), using the Fibonacci heap provides significant computational gains. Furthermore, there exist various approximation algorithms which are computationally very efficient (see [15] and references therein).

Analogous to OGKs, we need to extend GGKs to the state-action space $\mathcal{S} \times \mathcal{A}$ for using them in the LSPI method. A naive way is to just employ Eq. (6), but this can cause a ‘shift’ in the Gaussian centers since the state usually changes when some action is taken. To incorporate this transition, we propose defining the basis functions as the expectation of Gaussian functions after transition, i.e.,

$$\phi_{i+(j-1)m}(s, a) = I(a = a^{(i)}) \sum_{s' \in \mathcal{S}} P(s' \mid s, a)\, K(s', c^{(j)}). \tag{8}$$

This shifting scheme is expected to work well when the transition is predominantly deterministic (see Sec. IV and Sec. V-A for experimental evaluation).

D. Extension to Continuous State Spaces

So far, we focused on discrete state spaces. However, the concept of GGKs can be naturally extended to continuous state spaces, which is explained here. First, the continuous state space is discretized, which gives a graph as a discrete approximation to the non-linear manifold structure of the continuous state space. Based on the graph, we construct GGKs in the same way as the discrete case. Finally, the discrete GGKs are interpolated, e.g., using a linear method, to give continuous GGKs.

Although this procedure discretizes the continuous state space, it must be noted that the discretization is only for the purpose of obtaining the graph as a discrete approximation of the continuous non-linear manifold; the resulting basis functions themselves are continuously interpolated and hence the state space is still treated as continuous, as opposed to other conventional discretization 
procedures.

Fig. 2. (a) Sutton’s maze. (b) Three-room maze.

Fig. 3. Mean squared error of approximated value functions averaged over trials for the Sutton and three-room mazes. In the legend, the standard deviation of GGKs and OGKs is denoted in the bracket.

Fig. 4. Fraction of optimal states averaged over trials for the Sutton and three-room mazes.

IV. EXPERIMENTAL COMPARISON

In this section, we report the results of extensive and systematic experiments for illustrating the difference between GGKs and other basis functions. We employ two standard grid world problems illustrated in Fig. 2, and evaluate the goodness of approximated value functions by computing the mean squared error (MSE) with respect to the optimal value function and the goodness of obtained policies by calculating the fraction of states from which the agent can get to the goal optimally (i.e., in the shortest number of steps). Series of random walks are gathered as training samples, which are used for estimating the graph as well as the transition probability and expected reward. We set the edge weights in the graph equal to the Euclidean distance between the two nodes. We test GGKs, OGKs, graph-Laplacian eigenfunctions (GLEs) [5], and diffusion wavelets (DWs) [8]. This simulation is repeated multiple times for each maze and each method, randomly changing training samples in each run. The mean 
of the above scores as a function of the number of bases is plotted in Fig. 3 and Fig. 4. Note that the actual number of bases is four times larger because of the extension of basis functions over the action space (see Eq. (6) and Eq. (8)). GGKs and OGKs are tested with small/medium/large Gaussian widths.

Fig. 3 depicts MSEs of the approximated value functions for each method. They show that MSEs of GGKs and OGKs with small width, GLEs, and DWs are very small and decrease as the number of kernels increases. On the other hand, MSEs of GGKs and OGKs with medium/large width are large and increase as the number of kernels increases. Therefore, from the viewpoint of approximation quality of the value functions, the width of GGKs and OGKs should be small.

Fig. 4 depicts the fraction of optimal states in the obtained policy. They show that overall GGKs with medium/large width give much better policies than OGKs, GLEs, and DWs. An interesting finding from the graphs is that GGKs tend to work better if the Gaussian width is large, while OGKs show the opposite trend; this may be explained as follows. Tails of OGKs extend across the wall. Therefore, OGKs with large width tend to produce undesired value functions and erroneous policies around the partitions. This tail effect can be alleviated if the Gaussian width is made small. However, this in turn makes the approximated value function fluctuate, so the resulting policies are still erroneous. The fluctuation problem with a small Gaussian width seems to be improved if the number of bases is increased, while the tail effect with a large Gaussian width still remains even when the number of bases is increased. On the other hand, GGKs do not suffer from the tail problem thanks to the geodesic construction. Therefore, GGKs allow us to make the width large without being affected by the discontinuity across the wall. Consequently, smooth value functions along the maze are produced and hence better policies can be obtained by GGKs with large widths. This result highlights a helpful 
property since it alleviates the practical issue of determining the values of the Gaussian width parameter.

V. APPLICATIONS

As discussed in the previous section, the proposed GGKs bring a number of preferable properties for making value function approximation effective. In this section, we investigate the application of the GGK-based method to the challenging problems of (simulated) robot arm control and mobile robot navigation and demonstrate its usefulness.

A. Robot Arm Control

We use a simulator of a two-joint robot arm (moving in a plane) illustrated in Fig. 5(a). The task is to lead the end effector (‘hand’) of the arm to an object while avoiding the obstacles. Possible actions are to increase or decrease the angle of each joint (‘shoulder’ and ‘elbow’) by a fixed number of degrees in the plane, simulating coarse stepper motor joints. Thus the state space is the two-dimensional discrete space consisting of two joint angles as illustrated in Fig. 5(b). The black area in the middle corresponds to the obstacle in the joint angle state space. The action space involves four actions: increase or decrease one of the joint angles. We give a positive immediate reward when the robot’s end effector touches the object; otherwise the robot receives no immediate reward. Note that actions which make the arm collide with obstacles are not allowed. The discount factor is set to. 
In this environment, the joint angle changes by exactly the commanded amount, so the environment is deterministic. However, because of the obstacles, it is difficult to explicitly compute an inverse kinematic model; furthermore, the obstacles introduce discontinuity in value functions. Therefore, this robot arm control task is an interesting test bed for investigating the behaviour of GGKs.

We collected training samples from series of random arm movements, where the start state is chosen randomly in each trial. In the graph induced by the above MDP, we assigned uniform weights to the edges. There are multiple goal states in this environment (see Fig. 5(b)), so we put the first Gaussian centers at the goals and the remaining centers are chosen randomly in the state space. For GGKs, kernel functions are extended over the action space using the shifting scheme (see Eq. (8)) since the transition is deterministic in this experiment.

Fig. 6 illustrates the value functions approximated using GGKs and OGKs³. The graphs show that GGKs give a nice smooth surface with obstacle-induced discontinuity sharply preserved, while OGKs tend to smooth out the discontinuity. This makes a significant difference in avoiding the obstacle: from ‘A’ to ‘B’ in Fig. 5(b), the GGK-based value function results in a trajectory that avoids the obstacle (see Fig. 6(a)). 
On the other hand,the OGK-based value function yields a trajectory that tries to move the arm through the obstacle by following the gradient upward(see Fig.6(b)).The latter causes the arm to get stuck behind the obstacle.Fig.7summarizes the performance of GGKs and OGKs measured by the percentage of successful movements(i.e., the end effector reaches the target)averaged over indepen-dent runs.More precisely,in each run,totally training samples are collected using a different random seed,a policy is then computed by the GGK-or OGK-based method using LSPI,and the obtained policy is tested.This graph shows that GGKs remarkably outperform OGKs since the arm can successfully avoid the obstacle.The performance of OGK does not go beyond even when the number of kernels is increased.This is caused by the‘tail effect’of ordinary Gaussian functions;the OGK-based policy can not lead the end effector to the object if it starts from the bottom-left half of the state spaceWhen the number of kernels is increased,the performance of both GGKs and OGKs once gets worse at around.This would be caused by our kernel center allocation strategy:thefirst kernels are put at the goal states and the remaining kernel centers are chosen randomly.When is less than or equal to,the approximated value function tends to have a unimodal profile since all kernels are put at the goal states.However,when is larger than,this unimodality is broken and the surface of the approximated value function gets slightlyfluctuated.This smallfluctuation can cause an error in policies and therefore the performance is degraded at around.This performance degradation tends to be improved as the number of kernels is further increased.Overall,the above result shows that when GGKs are combined with our kernel center allocation strategy,almost perfect policies can be obtained with a very small number of kernels.Therefore,the proposed method is computationally very advantageous.B.Robot Agent NavigationThe above simple robot 
arm control simulation shows that the GGK method is promising.Here we apply GGKs to a more challenging task of a mobile robot navigation,which involves a high-dimensional and continuous state space. We employ a Khepera robot illustrated in Fig.8(a)on a navigation task.A Khepera is equipped with infra-red 3For illustration purposes,let us display the state value function ,which is the expected long-term discounted sum of rewards the agent receives when the agent takes actions following policy from state. From the definition,it can be confirmed that is expressed.(a)Aschematic(b)State space(a)Geodesic Gaussian kernels(b)Ordinary Gaussian kernelsFig.6.Approximated value functions.Fig.7.Number of successful trials.sensors (‘s1’to ‘s8’in the figure)which measure the strength of the reflected light returned from surrounding obstacles.Each sensor produces a scalar value between and(which may be regarded as continuous):the sensor obtains the maximum value if an obstacle is just in front of the sensor and the value decreases as the obstacle gets farther till it reaches the minimum value .Therefore,the state space is -dimensional and continuous.The Khepera has two wheels and takes the following defined actions:forward,left-rotation,right-rotation and backward (i.e.,the action space contains actions).The speed of the left and right wheels for each action is described in Fig.8(a)in the bracket (the unit is pulse per 10milliseconds).Note that the sensor values and the wheel speed are highly stochastic due to the change of the ambient light,noise,the skid etc.Furthermore,perceptual aliasing occurs due to the limited range and resolution of sensors.Therefore,the state transition is highly stochastic.We set the discount factor to .The goal of the navigation task is to make the Khepera explore the environment as much as possible.To this end,we give a positive reward when the Khepera moves forward and a negative reward when the Khepera collides with an obstacle.We do not give any 
reward to the left/right rotation and backward actions.This reward design encourages the Khepera to go forward without hitting obstacles,through which extensive exploration in the environment could be achieved.We collected training samples fromseries of random movements in a fixed environment with several ob-stacles (see Fig.9(a)).Then we constructed a graph from the gathered samples by discretizing the continuous state space using the Self-Organizing Map (SOM)[16].The number of nodes (states)in the graph is set to (equivalent with the SOM map size of );this value is computed by the standard rule-of-thumb formula [17],where is the number of samples.The connectivity of the graph is determined by the state transition probability computed from the samples,i.e.,if there is a state transition from one node to another in the samples,an edge is established between these two nodes and the edge weight is set according to the Euclidean distance between them.Fig.8(b)illustrates an example of the obtained graph structure—for visualization purposes,we projected the -dimensional state space onto a -dimensional subspace spanned by(9)The -th element in the above bases corresponds to the output of the -th sensor (see Fig.8(a)).Therefore,the projection onto this subspace roughly means that the horizontal axis corresponds to the distance to the left/right obstacle,while the vertical axis corresponds to the distance to the front/back obstacle.For clear visibility,we only displayed the edges whose weight is less than .This graph has a notable feature:the nodes around the region ‘B’in the figure are(a)A schematic(b)State space projected onto a -dimensional subspace for visualization.Fig.8.Khepera robot.(a)Training(b)TestFig.9.Simulation environment(a)Geodesic Gaussian kernels(b)Ordinary Gaussian kernels Fig.10.Examples of obtained policy .Fig.11.Average amount of exploration.putation time.directly connected to the nodes at ‘A ’,but are not directly connected to the nodes at ‘C’,‘D’,and 
‘E’.This implies that the geodesic distance from ‘B’to ‘C’,‘D’,or ‘E’is large,although the Euclidean distance is small.Since the transition from one state to another is highly stochastic in the current experiment,we decided to simply duplicate the GGK function over the action space (see Eq.(6)).For obtaining continuous GGKs,GGK functions need to be interpolated (see Sec.III-D).We may employ a simple linear interpolation method in general.However,the current experiment has unique characteristics—at least one of the sensor values is always zero since the Khepera is never completely surrounded by obstacles.Therefore,samples are always on the surface of the -dimensional hypercube-shaped state space.On the other hand,the node centers determined by the SOM are not generally on the surface.This means thatany sample is not included in the convex hull of its nearest nodes and we need to extrapolate the function value.Here,we simply add the Euclidean distance between the sample and its nearest node when computing kernel values;more precisely,for a state that is not generally located on a node center,the GGK-based basis function is defined as(10)where is the node closest to in the Euclidean distance.Fig.10illustrates an example of actions selected at each node by the GGK-based and OGK-based policies.We usedkernels and set the width to .The symbols ‘’,’’,‘’,and ‘’in the figure indicates forward,backward,left rotation,and right rotation actions.This shows that there is。
Shapes can be completed in a fraction of the time manually lofted patterns would allow and importantly, they are fabricated “right first Simply model the item in 3D T HE F AST WAY TO P RODUCE P ATTERNS R EADY FOR C UTTING“R IGHT F IRST T IME ”Take out the guesswork and eliminate the need for Calculations! This computer based systemcan handle thousands morecomputations which isreflected in the capacity and accuracy of the system. Particularly when compared to hand lofted pattern developments. FastSHAPES ® produces clean and precise geometry that you can export as a DXF (CAD) fileor into your CAM system forediting and/or nesting ifdesired.Highly accurate components offer a marked improvementin weld qu a lit y a n d consistency resulting in better fitup. ®System I NSTANT P LATE D EVELOPMENTSFastSHAPES ® completed this complex hydro-electric component in only minutesready for burning, with all forming detailsand even plate edge preparation (PEP).H IGHLIGHTS :• S PECIFICALLY DESIGNED FORTHICK PLATE!• I NCLUDES C OMMON , C OMPLEX AND H EAVY C ONSTRUCTION S HAPES • PRODUCTION-READY FLAT PATTERNS• F ULL M ANUFACTURING AND W ELD D ETAILS• A UTOMATIC & E ASY !• M ULTI -LANGUAGE SUPPORT • M ICROSOFT W INDOWSNuts and Bolts:Dynamic Display changes shape as data is enteredFull 3D Viewing inc stereoscopic, Plan, Elevation & Isometric ‘Cutaway’ feature lets you see inside the development Calculates Bend Lines, Forming Angle, Longitudinal Seam Off-sets and moreEvery component of a construc-tion can be viewed independently Plotted layouts2D & 3D DXF CAD output of shapes and flat patternIdeal for Boilermakers, Draftsmen & EngineersCommon ShapesFull welding preparation detailsFastSHAPES® is for medium to heavy plate manufacturing where the main jointing technology is welding. 
Complex multi componentsFastSHAPES ® has proven to offer substantial time & material savings ranging from mining buckets to wind-farm towers.FastSHAPES ® has also been used in the development of Naval Frigates and nuclear and conventional submarines.D ESIGNED FOR R EALWORLD A PPLICATIONSGreen Allowances Forming Angle Back to Back “Stitch” CuttingMarking of bend lines Mass of all Parts Assembly ListLongitudinal Seam OffsetsF ULL M ANUFACTURING D ETAILS TO I NTERNATIONAL S TANDARDSPower Transmission Power Transmission Chain Sprocket, Universal Sprocket, Rack and Pinion, Involute Gears.Heavy/Complex Transitions Heavy/Complex TransitionsTapered & Transforming Lobster-back Bends, Offset & Mitred Transformers, Conical Branches,Conical & Tubular Runs, Simple & Complex Birfircations.Common Blowpipe Common BlowpipeRectangle to Round, Elbow,Pipe Branch, Lobster, Multi Segment Cone, Oblique Cone, Cone Bifircate, Plate Branches.1976FastSHAPES ® offers themost dramatic savings when working with : • Large fabricated shapes which require a lot of additional fabrication information (often multi-segmented). • Complex equation based developments such as gears. • Predictable multi-component develop-ments.F a s t S H A P E S ® w a s specifically designed to be able to handle medium and thick plate. The program has been used in real world production solutions since 1989.Pipe branches, transitions, heavy ducting, anything for Hydro-electric, mining infra-structure or general heavy fabrication are handled superbly.FastSHAPES ® allows for multi strake, variable thickness construction with internal, mean or outside data entry.D EVELOP A FULL RANGE OF ‘P RODUCTION R EADY ’ F LAT P ATTERNSNote: With over 20 shapes in the library, FastSHAPES ® covers most of the shape development needs of fabricators and service centers.Full fabrication development includes Welds, Full/Partial Penetrations, Fillet, Butt/Groove. 
These large conical structures are used to support Turbines at a Wind Farm.FastSHAPES ® can rapidly produce useful estimation for a particular development via a Parts List (Cut Distance, Marking Distance, Mass of Cut Components).Annual Service Maintenance Agreements are also available. SMA’s give you Priority Support, Upgrades and M e m b e r D i s c o u n tEntitlements.Contact FastCAM for current pricing.The FastSHAPES ® System includes 30 days FREE technical support*technical support* to ensure that you are up and runningquickly. Free self-help is available via our web site and pay per incident support is availableas and when required.Training can be tailored to meet your needs. CAD experience is not r e q u i r e d h o w e v e rFastSHAPES ® does presumea working knowledge of fabrication techniques. Contact your local FastCAMOffice or reseller for more information. S PEED , A CCURACY & D ETAILING BENEFITS E STIMATION AND P ROTOTYPINGO FFERING S OLID S ERVICE , M AINTENANCE & S UPPORTA VAILABLE IN 3 DIFFERENT VERSIONS TO MEET YOUR N EEDS* Technical support by phone, fax, email. Excludes on-site, installation and O/S support.1976Tradesman in a Box™ Tradesman in a Box™Full shape development set to print co ordinates andtemplates.Ideal for hand cutting oroptical pattern production.FastSHAPES ® System SystemFull Shape Set. 
Multi Plate, Multi ThicknessOutput: 2D & 3D DXF, FastCAMfile, Coordinate Table.FastSHAPES ® / / FastCAM FastCAM ® NC Bundle NC BundleCombines FastSHAPES ® with the FastCAM ® System for a complete NC programming and nesting solution with integrated post processors.Refer to the FastCAM ® System brochure for more details.Note: For more details refer the FastSHAPES® Complete Technical Reference..FastSHAPES ® lets you develop even the most com-plex parts in minutes, model the item in 3D (if required) and produce a flat pattern with all manufacturing details.Design Draftsmen and Engineers use FastSHAPES ® to prototype designs knowing that they can be easily fabricated from plate. 2D & 3D DXF CAD outputsare very useful to take designs and patterns back into a CAD system for d o c u m e n t a t i o n a n d verification.Note: FastSHAPES ® can be used in conjunction with a CAD system however CAD is not a requirement.FastCAM Inc.—USA FastCAM Pty. Ltd.—Asia Pacific FastCAM China 8700 West Bryn Mawr, 96 Canterbury Road, No.34, 377 Chenhui Road, Suite 800 South, Middle Park, Victoria, 3206, Zhangjuang Tomson Garden, Chicago, 60631 3507 Australia Zhangjuang High Tech, Phone: 312 715 1535 Phone: 61 3 9699 9899 Pudong, Shanghai, 201203 Fax: 312 715 1536 Fax: 61 3 9699 7501 Phone: 8621 5080 3069 Email: fastcam@ Email: fastcam@.au Fax: 8621 5080 3071Email: fastcam@ Web: FastCAM has been supplying PC-based software for Burning, Shearing and Sawing/Drilling machines for over 25 years. The flagship product FastCAM ® offers unique integrated postprocessors, NC verification and NC code nesting that still sets it apart from other CAM and CAD/CAM systems. The new generation of FastCAM ® software is used in many countries, in many languages and in many different environments.Today the product line has been expanded to include dozens of trademarked products offering a wide range of solutions for metal fabrication and Service Center operation. 
FastCAM has OEM and Business Partners in North America, South America, China, Australia, New Zealand and Europe (Poland, UK, Czech, Hungary). We welcome all enquiries.F AST CAM W ORLDWIDE O FFICESFASTCAM SUPPORTS: BURNY, LYNX, WESTINGHOUSE, ESAB, PICOPATH, CREONICS, HANCOCK , FANUC, C & G, PCS, KOIKE SANSO, TANAKA, MESSER, FAGOR, FARLEY, ANCA,SIEMENS, JHE, HYBRID, MYNUC, ANCA, PDF32 AND MANY OTHERS. THIS IS NOT A COMPLETE LISTING — CONTACT YOUR RESELLER OR FASTCAM DIRECTLY FOR MORE. FastCAM ® will operate on most Pentium based PC’s with Microsoft Windows 98/ME/XP or NT4/2000 however the recommended system, particularly for nesting, is a Pentium IV 256/512Mb RAM 80Gb HDD 17” Monitor & Windows XP.Using FastSHAPES ®, this square to round transition was completed in less than 30 seconds ready to send to the burning machine. The development even included the line marking and number of degrees for forming every bend! Handles simple and complex shapes. All read directly into 2D and 3D CAD systems.。
A Partial Order Approach to Noisy Fitness FunctionsG¨unter RudolphDepartment of Computer ScienceUniversity of Dortmund44221Dortmund/Germanyrudolph@LS11.cs.uni-dortmund.deAbstract-If thefitness values are perturbed by noise then they do not have a definitive total order.As a consequence,tra-ditional selection procedures in evolutionary algorithms may lead to obscure solutions.A potential remedy is as follows:Construct a partial order on the set of noisyfit-ness values and apply those evolutionary algorithms that have been designed forfinding the minimal elements of partially ordered sets.These minimal elements are the only reasonable candidates for the unperturbed true so-lution.A method for reducing the number of candidate solutions is suggested.From a theoretical point of view it is worth mentioning that all convergence results for evo-lutionary algorithms with partially orderedfitness sets re-main valid for the approach considered here.1IntroductionThe Gaussian distribution is the predominant choice for mod-eling noise frequently observable in measurings of various kinds.Here,we hold the view that a noise distribution with unbounded support(like the Gaussian,Cauchy,Laplace,Lo-gistic,and others)may be quite unrealistic.Actually it is at least equally plausible to assume that the noise cannot exceed certain limits due to technical characteristics of the involved measurement unit.Even if a distributional shape close to a Gaussian appears reasonable we can resort to a symmetrical Beta distribution which can converge weakly to a Gaussian distribution under continuously increasing but bounded sup-port(see e.g.Evans et al.1993,p.36).This assumption will have significant theoretical and practical impacts on the evo-lutionary algorithms(EAs)considered here.Traditional measures for coping with noisyfitness func-tions in evolutionary algorithms include the resampling of the randomfitness value with averaging,the appropriate adjust-ment(i.e.,enlargement)of the population 
size,and in case of continuous search spaces also the rescaling of inherited mu-tations;see Beyer(2000)for a summary of work on EAs for noisyfitness functions.Here,we add yet another avenue for dealing with noisy fitness functions:Instead of using a selection procedure that is based on the totally ordered set of noisyfitness values we endow the probabilisticfitness set with an appropriate par-tial order and deploy EAs with those selection methods being explicitly designed for coping with arbitrary partially ordered fitness sets(Rudolph1998,2001;Rudolph and Agapie2000).Section2offers a brief introduction to partially orderedsets in general and in particular to interval orders(Fishburn 1985)which constitute thefirst step towards the partial or-der to be used later on.Since the noise is supposed to have bounded support we can easily equip these intervals with a probability measure(representing the noise distribution).Thus, the interval order turns to a partial order on random variables.This subject is detailed in Section3which also contains the presentation of the EA along with a discussion of the in-herited theoretical properties from the general case(Rudolph 2001).Preliminary experimental results can be found in Sec-tion4.2Partially Ordered SetsLet be a set.A reflexive,antisymmetric,and transitive re-lation“”on is termed a partial order relation whereas a strict partial order relation“”must be antireflexive,asym-metric,and transitive.The latter relation may be obtained by the former relation by setting.After these preparations one is in the position to turn to the actual objects of interest.Definition1Let be some set.If the partial order relation “”is valid on then the pair is called a partially ordered set(or short:poset).If for some thenis said to dominate.Distinct points are said to be comparable when either or.Otherwise,andare incomparable which is denoted by.If each pair of distinct points of a poset is comparable thenis called a totally ordered set or a 
chain.Dually,if each pair of distinct points of a poset are incomparable then is termed an antichain.For example,let be the set of closed intervals of and defineiffiffiffIt is easily seen that is a partially ordered set in which distinct intervals with a nonvoid intersection are incompara-ble.Similarly,the infinitely large but countable setwith withis a poset with incomparable elements whereas with is totally ordered and therefore a chain.An example for an antichain is the set of“minimal elements”introduced next.1Definition2An element is called a minimal element of the poset if there is no such that. The set of all minimal elements,denoted,is said to be complete if for each there is at least onesuch that.In contrast to infinitely large posets the completeness of is guaranteed forfinitely large posets.Of course, completeness of infinitely large posets is not precluded.For example,the set with is infinitely large and the set of minimal elements is complete.3Coping with Noisy Fitness Functions3.1AssumptionsLet be thefinite search set and assume that the determinis-ticfitness function is perturbed by additive noise ,i.e.,for.As mentioned ear-lier,here we insist that random variable has bounded and known support in form of a closed interval of.For exam-ple,may have a uniform or symmetric beta distribution on its support with.Atfirst it is assumed that every point/individual is evaluated only ter on this assumption is dropped.3.2Partial Order ApproachWhen an individual is evaluated viathen the noisyfitness value is an element of the interval.Since the EA only has knowledge of the support bound and in no case of the truefitness value,the noisy evaluation of only leads to the information that the truefitness value must be in the interval.Thus,each point or individual is associated with a realization of a random interval.Next we declare a strict partial order on these intervals and thereby also a strict partial order on the individuals.Let and w.l.o.g..If(1) then we define 
and thereby.This choice is reasonable because we can immediately infer fromthat with probability.One should mention that this partial order is a special case of a partial order introduced in Guddat et al.(1985),p.29.Moreover,notice that the connection to interval orders gets evident by the equivalence between equation(1)and(2) Thus,whenever two intervals as those above have a nonvoid intersection then the noisyfitness values and therefore also the individuals are incomparable,in symbols:resp..It remains to examine whether the set of minimal elements of such posets represents a reasonable and useful set of can-didate solutions.For this purpose definewith andwithIn general,and are random objects.But since it is as-sumed that each element is evaluated only once,one can hold the view that each element of has been evalu-ated already before the EA is run such that the set and the quantity are deterministic during the run of the EA.In this manner one obtains a unique partial order on and on for each run.The set of minimal elements is then given by(3)Needless to say,it is reasonable to postulate that the noisy image of an unperturbed optimal point is con-tained in the set of minimal elements.As shown below,this requirement is fulfilled.Theorem1For all with holds regard-less of the value of.Proof:First notice the equivalencewhich is easily deduced from equation(3).Since the support of random variable is the relation(4)holds with probability.For the same reason one obtains.Insertion in equation(4)leads to.The next result offers an assessment of the’solutions’con-tained in the set.Theorem2.Proof:Owing to equation(3)each element of is upper bounded by.Since the support of is one obtains.Putting all together yields the desired inequality.Notice that under this partial order the set of minimal ele-ments of a givenfinite population is determinable in linear time:Find the individual with the smallest perturbedfitness value in the population.This takes time.Each in-dividual 
withfitness moves to the set of minimal elements.Since this takes time the entire run time is .23.3The Base AlgorithmThe pseudo code given in Figure1is taken from Rudolph (2001).Notice that an individual of a population at generation gathers all quantities of interest,i.e.,.The partial ordering of the individuals is based on their noisyfitness values of course.The expressionin phase1encapsulates the operation of generating new individuals from the current population of size with at generation.initialize;setrepeat(*PHASE1*)(*PHASE2*)for each:if thenendifif for all thenendifendfor(*PHASE3*)if thenfill with elements from:1.2.3.untilendifuntil stopping criterion fulfilledFigure1:Pseudo code of the evolutionary algorithm with par-tially orderedfitness.3.4The Theoretical Property InheritedSince the base algorithms’properties are valid for arbitrary partially orderedfitness sets,any instantiation inherits the the-oretical properties from the general case(Rudolph2001). Theorem3Let the search space of the base algorithm in Figure1befinite and the partial order of thefitness set be as described in Section3.2.If every collection of offspring can be generated from an arbitrary collection of parents with some positive minimum probability,then the entire popula-tion will enter the set after afinite number of generations with probability and stays there forever.As we know from Theorems1and2the set of minimal elements contains the noisy version of the global mini-mum and each member of is at most away from the true minimum.In this sense,we may call also the set of-optimal solutions with.3.5The Virtue of Resampling RevisitedA-optimal solution may be sufficient or may not.For the latter case one should look for a method of decreasing this bound.Here we use the technique of resampling that is common practice in EAs with noisyfitness ually each point/individual is sampled several times and the thereby obtained noisyfitness values are averaged.This makes the es-timator of the 
truefitness value more and more reliable by re-ducing its variance by a factor ofwhere denotes the th smallest outcome of samples in total.Thus,after samples one knows for sure that thetrue value is somewhere in the interval given in equation(5).The uncertainty interval shrinks to for.The speed of narrowing can be deter-mined as follows:Let and. Then is the relative size of the uncertainty or incomparability interval after samples and the probability that it is then still larger than percent of its initial size is given byPP(6)then E.Thus,the closer should re-semble Gaussian noise the slower is the narrowing of the uncertainty or incomparability interval.4First Numerical Experiments4.1Instantiation of the Base AlgorithmThe search space is afinite subset of the-dimensional set of integers with box constraints.An individual is represented by a-tuple of integers(the chromosomes)and the bounds of the confidence interval that is needed for the comparison of the individuals according to equation(2).Mutations of the chromosomes obey a bilateral geometrical distribution on the integers(Rudolph1994)that is truncated at the box con-straints.If the self-adaptation of the mutation distribution is switched on then an additional parameter(chromosome)must be represented in the individual.The recombination of two chromosomes is realized by uniform crossover,the potential real-valued chromosomes for self-adaptation are averaged.4.2Realization of the Noise GeneratorThe noise used here is represented by a Beta random.A Beta random variable on with pa-rameter and probability density function as given in equation(6)can be generated via whereandwith uniformly distributed random numbers.4.3Preliminary ResultsThe search space is for the test functionFigure 2shows a typical run for a population size ofand a support bound for the uniformly dis-tributed noise ,whereas the support bound was in-creased to in Figure 3and in Figure 4.051015202530354045500500100015002000250030003500400045005000n 
u m b e r o f s u r v i v i n g p a r e n t sgenerations051015202530354045500500100015002000250030003500400045005000n u m b e r o f (3a )-o p t i m a l s o l u t i o n sgenerationsFigure 3:Number of surviving parents and -optimal so-lutions for noise bound .As can be seen in all three cases the number of survivingparents rapidly increases at about the maximum size as soon as the population enters the set of -optimal solutions.It appears plausible that this happens the earlier the larger is the noise bound ,but it must be noted that there is no statistical support for this conjecture at the moment.Nevertheless,we dare to use the number of surviving par-ents as an indicator for the event of entering the -optimal set.When this happens then the individuals cannot be com-pared due to the size of the noise interval.Therefore,the re-sampling of the fitness values should begin right now.Many rules for the indicator mechanism are possible.Here,we introduce a width parameter that is initiallyset to.If two individuals are incomparable and at least one of them has a confidence interval larger than ,then the individual with the largest interval is re-evaluated and its interval bounds are updated.This is repeated as long as the individuals are incomparable or both confidence intervals are smaller than .The decrease of by provided that.The EA is stopped as soon as and no parent has been replaced in the last selection process.Of course,these parameters are chosen arbitrarily for thenext experiment—other values may yields far better results.The identification of “good”parameter settings is not the goal here;rather,we like to gain first insights concerning the be-havior of this EA.51015202530354045500500100015002000250030003500400045005000n u m b e r o f s u r v i v i n g p a r e n t sgenerations051015202530354045500500100015002000250030003500400045005000n u m b e r o f (3a )-o p t i m a l s o l u t i o n sgenerationsFigure 4:Number of surviving parents and-optimal so-lutions for 
noise bound .For this purpose the noise bound is set to and the EA is run.Figure 5shows some characteristic quantities recorded during a typical run of the EA.At about generation 2000the population begins to enter the set of -optimal solutions.No re-evaluation of the in-dividuals was triggered until now.The number of surviving parents oscillates considerably while the width parameter is continuously decreased.This leads to a rapid increase of the number of re-evaluations along with a significant improve-ment of the solutions as can be seen from the number of par-ents below certain (normally unavailable)true fitness values.The number of re-evaluations,however,is much too large for practical use.This might be caused by an unlucky choice of the EA parameters.In the next experiment the width thresh-old was chosen a magnitude larger,namely,.Table 1summarizes the results obtained from 50independent runs.The number of re-evaluations is considerably smaller now,but the optimal true solution was never contained in the finalpopulation,in contrast to some other tests with.Needless to say,some parameter studies are necessary to find5mean std.dev.skew.Table1:Summary of results for50runs with.a useful parametrization of this EA.AcknowledgmentsThis work was supported by the Deutsche Forschungsgemein-schaft(DFG)as part of the Collaborative Research Center “Computational Intelligence”(SFB531). 
Bibliography

B. C. Arnold, N. Balakrishnan, and H. N. Nagaraja (1992). A First Course in Order Statistics. New York: Wiley.

H.-G. Beyer (2000). Evolutionary algorithms in noisy environments: Theoretical issues and guidelines for practice. Computer Methods in Applied Mechanics and Engineering 186(2-4), 239–267.

M. Evans, N. Hastings, and B. Peacock (1993). Statistical Distributions (2nd ed.). New York: Wiley.

P. C. Fishburn (1985). Interval Orders and Interval Graphs: A Study of Partially Ordered Sets. New York: Wiley.

J. Guddat, F. Guerra Vasquez, K. Tammer, and K. Wendler (1985). Multiobjective and Stochastic Optimization Based on Parametric Optimization. Berlin: Akademie-Verlag.

G. Rudolph (1994). An evolutionary algorithm for integer programming. In Y. Davidor, H.-P. Schwefel, and R. Männer (Eds.), Parallel Problem Solving From Nature, 3, pp. 139–148. Berlin and Heidelberg: Springer.

G. Rudolph (1998). Evolutionary search for minimal elements in partially ordered finite sets. In V. W. Porto, N. Saravanan, D. Waagen, and A. E. Eiben (Eds.), Evolutionary Programming VII, Proceedings of the 7th Annual Conference on Evolutionary Programming, pp. 345–353. Berlin: Springer.

G. Rudolph (2001). Evolutionary search under partially ordered fitness sets. In Proceedings of the International Symposium on Information Science Innovations in Engineering of Natural and Artificial Intelligent Systems (ENAIS 2001). ICSC Academic Press.

G. Rudolph and A. Agapie (2000). Convergence properties of some multi-objective evolutionary algorithms. In A. Zalzala et al. (Eds.), Proceedings of the 2000 Congress on Evolutionary Computation (CEC 2000), Vol. 2, pp. 1010–1016. Piscataway (NJ): IEEE Press.

Figure 5: From top to bottom: number of surviving parents, number of parents below certain fitness thresholds (< 30000, < 3000, < 300, < 30, < 3, < 0.3, < 0.03), decay of the width threshold, and cumulated number of re-evaluations, each plotted against generations.
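An Efficiency Optimization Algorithm Based on Cross-Modal Retrieval

Xu Mingliang (徐明亮); Yu Xiaosheng (余肖生)
Journal: 《计算机技术与发展》 (Computer Technology and Development)
Year (volume), issue: 2019 (029) 011
Pages: 4 (pp. 67-70)
Keywords: cross-modal retrieval; semantic gap; canonical correlation analysis; principal component analysis; subspace projection
Affiliation: College of Computer and Information Technology, China Three Gorges University, Yichang, Hubei 443002
Language of the text: Chinese
Chinese Library Classification: TP301

0 Introduction

As Internet technology keeps developing, people pay more and more attention to the exchange of information.
The demand for information has grown from text-only news to pictures, video, audio and more.
On network platforms these different types of data are interwoven, complement each other, and are correlated to some degree.
The same piece of information may be presented as different types of data.
Cross-modal information retrieval emerged to find, across different types of data, the items that express the same information.
Traditional information retrieval extracts feature vectors from data of a single type, measures their similarity, and ranks by similarity to achieve single-modal retrieval.
Cross-modal retrieval instead builds a model of the implicit relationship between modalities, so that data from different modalities can be compared in one common space in the same way as single-modal data, which makes mutual retrieval between modalities possible.
Because feature vectors are extracted differently for each modality, projecting them into a common space and matching them there requires a very large amount of work.
To address the heavy computation that traditional cross-modal retrieval algorithms incur on high-dimensional data, this paper proposes an optimization method for cross-modal information retrieval.
Experiments show that, compared with the original algorithm, the method greatly reduces the amount of computation and improves retrieval efficiency while keeping precision essentially unchanged.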
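1 Related Work

Cross-modal information retrieval consists of three main steps: (1) extract feature information from each modality to build feature subspaces; (2) apply some algorithm to determine how the data in the feature subspaces of the different modalities are correlated; (3) measure similarity in the feature subspace to obtain the result.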
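The third step can be sketched in a few lines. This is only an illustration, assuming that steps (1) and (2) have already mapped a text query and several images into the same low-dimensional space; the function names and the toy vectors are hypothetical, and cosine similarity is one common choice of measure rather than the one the paper necessarily uses.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_candidates(query_vec, candidate_vecs):
    """Return candidate indices ordered from most to least similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), idx)
              for idx, vec in enumerate(candidate_vecs)]
    # Sort by descending similarity; ties are broken by the lower index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored]


# Hypothetical vectors, assumed to be already projected into the shared subspace.
text_query = [0.9, 0.1, 0.0]
image_vectors = [[0.0, 1.0, 0.0], [1.0, 0.2, 0.0], [0.5, 0.5, 0.5]]
ranking = rank_candidates(text_query, image_vectors)
print(ranking)  # prints [1, 2, 0]
```

Because the ranking needs one similarity computation per candidate, its cost grows with the dimension of the shared space, which is why reducing that dimension lowers retrieval cost.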
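1.1 Feature Representation of Modal Information

To extract information from different types of data, features must first be extracted from the raw data to obtain its feature vectors.
In terms of image feature representation, image features fall into two broad classes: global features and local features.
For global features, the commonly used descriptors are the color histogram and the gray-level texture matrix; for local features, the commonly used descriptors include scale-invariant features and the histogram of oriented gradients.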
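As an illustration of a global feature, the following minimal sketch builds a normalized color histogram. It assumes the image is given as a flat list of 8-bit (R, G, B) tuples; the function name and the default of 4 bins per channel are hypothetical choices, not values taken from the paper.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized joint RGB histogram of an iterable of (r, g, b) tuples in 0..255."""
    step = 256 // bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    count = 0
    for r, g, b in pixels:
        # Clamp so that 255 still lands in the last bin when 256 is not divisible by the bin count.
        ri = min(r // step, bins_per_channel - 1)
        gi = min(g // step, bins_per_channel - 1)
        bi = min(b // step, bins_per_channel - 1)
        hist[(ri * bins_per_channel + gi) * bins_per_channel + bi] += 1.0
        count += 1
    if count:
        hist = [h / count for h in hist]
    return hist


# Three hypothetical pixels: two reddish, one blue.
example = color_histogram([(255, 0, 0), (250, 10, 5), (0, 0, 255)], bins_per_channel=2)
```

The resulting vector has a fixed length (`bins_per_channel ** 3`) regardless of image size, which is what makes it usable as a feature vector for the whole image.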
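Environmental Engineering, 2019·05, p. 137. 当代化工研究 (Modern Chemical Research), Technology Application and Research

the effective flow of steam, which reduces kinetic energy; third, the steam often becomes supercooled, and when the wetness loss is large the steam inlet edges of the turbine's moving blades can be severely damaged, because they are subjected to the impact of wet steam.
Based on the causes above, the corresponding countermeasures are as follows: first, install moisture-removal devices; second, make effective use of the intermediate reheat cycle, and use extraction steam for heating or for raising the feedwater temperature, which reduces energy loss and waste; third, improve the erosion resistance of the unit itself; finally, use new materials as a preventive measure at locations where water may accumulate.
(4) Make full use of the reheat phenomenon. The reheat phenomenon concerns the energy that is lost or does no useful work in a turbine stage: this energy can be used a second time in the next stage, and the effect is generally exploited in multistage turbines.
When the heat stored in one stage of a multistage turbine is not used effectively, the remaining heat is passed on to the next stage, which raises the inlet steam enthalpy of that stage and so improves the utilization of heat.
However, this approach is still at the stage of theoretical research, and deviations are hard to avoid when it is applied in actual power plant production. In addition, the heat reuse rate of the equipment does not fully match the actual heat recovery rate, so a large amount of heat is still lost.
Further research on this phenomenon should therefore take the actual conditions of the thermal power plant into account and analyze and summarize the reheat phenomenon in depth, so that the efficiency of thermal energy and power engineering in the plant can be raised while the safety and stability of power generation are guaranteed.
(5) Regulate the nozzles effectively. Nozzles are indispensable equipment in a thermal power plant. When regulating them, the regulation method must be adjusted appropriately and its principle properly understood; only then can thermal power be used rationally.
The differences between nozzle-governing schemes are very noticeable in operation, and the number of governing valves also affects and changes the behavior.
With multiple governing valves, the maximum flow through each valve also differs.
If there is a governing stage, the efficiency obtained under changing load is somewhat higher than with throttle governing.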
Journal of Tongji University (Natural Science) (同济大学学报(自然科学版)), Vol. 49, No. 5, May 2021

An Efficient Algorithm for Weighted Total Least Squares Method

WANG Jianmin (1, 2), NI Fuze (3), ZHAO Jianjun (2)
(1. College of Mining Engineering, Taiyuan University of Technology, Taiyuan 030024, China; 2. State Key Laboratory of Geohazard Prevention and Geoenvironment Protection, Chengdu University of Technology, Chengdu 610059, China; 3. Aerial Photogrammetry and Remote Sensing Research Institute Co., Ltd., Xi'an 710100, China)

Abstract: The weighted total least-squares (WTLS) adjustment is a rigorous method for estimating parameters in the errors-in-variables (EIV) model. However, its computational efficiency is limited when the data set is large. Exploiting the structural characteristics of the design matrix in the EIV model, a partially weighted total least-squares (PWTLS) algorithm is derived under the least-squares criterion by assigning weights only to the random columns of the design matrix. PWTLS obtains an exact solution of the EIV model directly, without resorting to Lagrange multipliers. In addition, PWTLS reduces the dimensions of the cofactor matrix and avoids estimating the random errors of the design matrix during iteration, which reduces the amount of matrix computation and improves computational efficiency. Finally, real and simulated examples are used to compare the proposed algorithm with seven other algorithms of the same kind. The results show that PWTLS attains the same accuracy as those algorithms with significantly higher computational efficiency, which verifies the feasibility of the algorithm.

Key words: total least-squares; errors-in-variables; computational efficiency; coordinate transformations
Chinese Library Classification: P207. Document code: A.

The Gauss–Markov model has been applied successfully in many engineering settings, and the design matrix in the model is usually assumed to be free of error.
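To make the errors-in-variables setting concrete, here is a minimal sketch of the simplest total least-squares problem: fitting a straight line when both coordinates carry random error of equal variance. This is classical orthogonal regression with its well-known closed form; it is not the PWTLS algorithm of the paper, and the function name is hypothetical.

```python
import math


def orthogonal_line_fit(xs, ys):
    """Fit y = intercept + slope * x by minimizing perpendicular distances.

    Assumes errors in x and y are independent with equal variance.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least two points")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    syy = sum((y - y_mean) ** 2 for y in ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    if sxy == 0.0:
        # No linear co-variation: the best line is horizontal if x spreads at least as much as y.
        if sxx > 0.0 and sxx >= syy:
            slope = 0.0
        else:
            raise ValueError("best-fit line is vertical or undefined")
    else:
        slope = (syy - sxx + math.sqrt((syy - sxx) ** 2 + 4.0 * sxy ** 2)) / (2.0 * sxy)
    intercept = y_mean - slope * x_mean
    return intercept, slope


intercept, slope = orthogonal_line_fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

Ordinary least squares would treat the x values as exact; the orthogonal fit differs from it whenever the points do not lie exactly on a line, which is the situation the EIV model is meant to handle.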
Original text:

Roadheader applications in mining and tunneling industries

H. Copur (1), L. Ozdemir (2), and J. Rostami (3)
(1) Graduate Student, (2) Director and Professor, and (3) Assistant Professor
Earth Mechanics Institute, Colorado School of Mines, Golden, Colorado, 80401

ABSTRACT

Roadheaders offer a unique capability and flexibility for the excavation of soft to medium strength rock formations and are therefore widely used in underground mining and tunneling operations. A critical issue in successful roadheader application is the ability to develop accurate and reliable estimates of machine production capacity and the associated bit costs. This paper presents and discusses the recent work completed at the Earth Mechanics Institute of Colorado School of Mines on the use of historical data as a performance predictor model. The model is based on extensive field data collected from different roadheader operations in a wide variety of geologic formations. The paper also discusses the development of this database and the resultant empirical performance prediction equations derived to estimate roadheader cutting rates and bit consumption.

INTRODUCTION

The more widespread use of mechanical excavation systems is a trend set by increasing pressure on the mining and civil construction industries to move away from conventional drill and blast methods to improve productivity and reduce costs. The additional benefits of mechanical mining include significantly improved safety, reduced ground support requirements and fewer personnel. These advantages, coupled with recent enhancements in machine performance and reliability, have resulted in mechanical miners taking a larger share of the rock excavation market.

Roadheaders are the most widely used underground partial-face excavation machines for soft to medium strength rocks, particularly for sedimentary rocks. They are used for both development and production in the soft rock mining industry (i.e. main haulage drifts, roadways, cross-cuts, etc.), particularly in coal, industrial minerals and evaporitic rocks. In civil construction, they find extensive use for excavation of tunnels (railway, roadway, sewer, diversion tunnels, etc.) in soft ground conditions, as well as for enlargement and rehabilitation of various underground structures. Their ability to excavate almost any profile opening also makes them very attractive to those mining and civil construction projects where various opening sizes and profiles need to be constructed.

In addition to their high mobility and versatility, roadheaders are generally low capital cost systems compared to most other mechanical excavators. Because of higher cutting power density due to a smaller cutting drum, they offer the capability to excavate rocks harder and more abrasive than their counterparts, such as the continuous miners and the borers.

ROADHEADERS IN LAST 50 YEARS

Roadheaders were first developed for mechanical excavation of coal in the early 50s. Today, their application areas have expanded beyond coal mining as a result of continual performance increases brought about by new technological developments and design improvements. The major improvements achieved in the last 50 years consist of steadily increased machine weight, size and cutterhead power, improved design of boom, muck pick up and loading system, more efficient cutterhead design, metallurgical developments in cutting bits, advances in hydraulic and electrical systems, and more widespread use of automation and remote control features. All these have led to drastic enhancements in machine cutting capabilities, system availability and service life.

Machine weights have reached up to 120 tons, providing more stable and stiffer (less vibration, less maintenance) platforms from which higher thrust forces can be generated for attacking harder rock formations. The cutterhead power has increased significantly, approaching 500 kW to allow for higher torque capacities. Modern machines have the ability to cut cross-sections over 100 m2 from a stationary point. Computer aided cutterhead lacing design has developed to a stage that enables the design of an optimal bit layout to achieve the maximum efficiency in the rock and geologic conditions to be encountered. The cutting bits have evolved from simple chisel to robust conical bits. The muck collection and transport systems have also undergone major improvements, increasing attainable production rates. The loading apron can now be manufactured as an extendible piece providing for more mobility and flexibility. The machines can be equipped with rock bolting and automatic dust suppression equipment to enhance the safety of personnel working at the heading. They can also be fitted with laser-guided alignment control systems, computer profile controlling and remote control systems allowing for reduced operator sensitivity coupled with increased efficiency and productivity. Figure-1 shows a picture of a modern transverse type roadheader with telescopic boom and bolting system.

Figure-1: A Transverse Cutterhead Roadheader (Courtesy of Voest Alpine)

Mobility, flexibility and the selective mining capability constitute some of the most important application advantages of roadheaders leading to cost effective operations. Mobility means easy relocation from one face to another to meet the daily development and production requirements of a mine. Flexibility allows for quick changes in operational conditions such as different opening profiles (horse-shoe, rectangular, etc.), cross-sectional sizes, gradients (up to 20, sometimes 30 degrees), and the turning radius (can make an almost 90 degree turn). Selectivity refers to the ability to excavate different parts of a mixed face where the ore can be mined separately to reduce dilution and to minimize waste handling, both contributing to improved productivity. Since roadheaders are partial-face machines, the face is accessible, and therefore cutters can be inspected and changed easily, and the roof support can be installed very close to the face. In addition to these, high production rates in favorable ground conditions, improved safety, and reduced ground support and ventilation requirements, all resulting in reduced excavation costs, are the other important advantages of roadheaders.

The hard rock cutting ability of roadheaders is the most important limiting factor affecting their applications. This is mostly due to the high wear experienced by drag bits in hard, abrasive rocks. Present-day heavy-duty roadheaders can economically cut most rock formations up to 100 MPa (~14,500 psi) uniaxial compressive strength (UCS) and rocks up to 160 MPa (~23,000 psi) UCS if favorable jointing or bedding is present with low RQD numbers. Increasing frequency of joints or other rock weaknesses makes the rock excavation easier as the machine simply pulls or rips out the blocks instead of cutting them. If the rock is very abrasive, or the pick consumption rate is more than 1 pick/m3, then roadheader excavation usually becomes uneconomical due to frequent bit changes coupled with increased machine vibrations and maintenance costs.

A significant amount of effort has been placed over the years on increasing the ability of roadheaders to cut hard rock. Most of these efforts have focused on structural changes in the machines, such as increased weight, stiffer frames and more cutterhead power. Extensive field trials of these machines showed that the cutting tool is still the weakest point in hard rock excavation. Unless a drastic improvement is achieved in bit life, true hard rock cutting is still beyond the realm of possibility with roadheaders.
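The rule-of-thumb limits quoted above (100 MPa UCS, 160 MPa with favorable jointing, and 1 pick/m3) can be written as a simple screening check. This is only a sketch of those stated thresholds, not the performance prediction model of the paper; the function names and the example figures are hypothetical.

```python
def pick_consumption_rate(picks_changed, excavated_volume_m3):
    """Picks changed per cubic metre of rock excavated."""
    if excavated_volume_m3 <= 0:
        raise ValueError("excavated volume must be positive")
    return picks_changed / excavated_volume_m3


def screen_roadheader(ucs_mpa, picks_per_m3, favorable_jointing=False):
    """Return (is_plausibly_economical, reasons) using the thresholds quoted in the text."""
    # 100 MPa in general; up to 160 MPa when favorable jointing or bedding is present.
    ucs_limit_mpa = 160.0 if favorable_jointing else 100.0
    reasons = []
    if ucs_mpa > ucs_limit_mpa:
        reasons.append("UCS %.0f MPa exceeds the %.0f MPa limit" % (ucs_mpa, ucs_limit_mpa))
    if picks_per_m3 > 1.0:
        reasons.append("pick consumption %.2f picks/m3 exceeds 1 pick/m3" % picks_per_m3)
    return (not reasons), reasons


# Hypothetical job: 30 picks changed while excavating 60 m3 of 140 MPa rock.
rate = pick_consumption_rate(30, 60.0)
massive_rock = screen_roadheader(140.0, rate)
jointed_rock = screen_roadheader(140.0, rate, favorable_jointing=True)
```

A check like this only flags cases outside the stated limits; it says nothing about attainable cutting rates within them.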
The Earth Mechanics Institute (EMI) of the Colorado School of Mines has been developing a new cutter technology, the Mini-Disc Cutter, to implement the hard rock cutting ability of disc cutters on roadheaders, as well as other types of mechanical excavators (Ozdemir et al, 1995). The full-scale laboratory tests with a standard transverse cutterhead showed that Mini-Disc Cutters could increase the ability of the roadheaders for hard rock excavation while providing for fewer cutter changes and maintenance stoppages. This new cutting technology holds great promise for application on roadheaders to extend their capability into economical excavation of hard rocks. In addition, using the mini-disc cutters, a drum miner concept has been developed by EMI for application to hard rock mine development. A picture of the drum miner during full-scale laboratory testing is shown in Figure-2.

Figure-2: Drum Miner Cutterhead

FIELD PERFORMANCE DATABASE

Performance prediction is an important factor for successful roadheader application. This deals generally with machine selection, production rate and bit cost estimation. Successful application of roadheader technology to any mining operation dictates that accurate and reliable estimates are developed for attainable production rates and the accompanying bit costs. In addition, it is of crucial importance that the bit design and cutterhead layout are optimized for the rock conditions to be encountered during excavation.

Performance prediction encompasses the assessment of instantaneous cutting rates, bit consumption rates and machine utilization for different geological units. The instantaneous cutting rate (ICR) is the production rate during actual cutting time (tons or m3 / cutting hour). Pick consumption rate refers to the number of picks changed per unit volume or weight of rock excavated (picks / m3 or ton). Machine utilization is the percentage of time used for excavation during the project

Table-I: Classification of the Information in the Database
INFORMATION GROUP: DETAILS
General Information: Type/purpose of excavation (roadway, railway, sewer, mining gallery, etc.), contractor, owner, consultant, location, starting and

the Istanbul Technical University has established an extensive database related to the field performance of roadheaders with the objective of developing empirical models for accurate and reliable performance predictions. The database contains field data from numerous mining and civil construction projects worldwide and includes a variety of roadheaders and different geotechnical conditions.

The empirical performance prediction methods are principally based on past experience and the statistical interpretation of previously recorded case histories. To obtain the required field data in a usable and meaningful format, a data collection sheet was prepared and sent to major contractors, owners, consultants, and roadheader manufacturers. In addition, data was gathered from available literature on roadheader performance and through actual visits to job sites. This data collection effort is continuing.

The database includes six categories of information, as shown in Table-I. The geological parameters in the database consist generally of rock mass and intact rock properties. The most important and pertinent rock mass properties contained in the database include Rock Quality Designation (RQD), bedding thickness, strike and dip of joint sets and hydrological conditions. The intact rock properties are uniaxial compressive strength, tensile strength, quartz content, texture and abrasivity. The rock formations are divided into separate zones to minimize the variations in the machine performance data to provide for more accurate analysis.
This also simplifies the classification of the properties for each zone and the analysis of the field performance data.

The major roadheader parameters included are the machine type (crawler mounted, shielded), machine weight, cutterhead type (axial, transverse), cutterhead power, cutterhead-lacing design, boom type (single, double, telescopic, articulated), and the ancillary equipment (i.e. grippers, automatic profiling, laser guidance, bit cooling and dust suppression by water jets, etc.).

The operational parameters generally affect the performance of the excavator through machine utilization. The most important operational parameters include ground support, back up system (transportation, utility lines, power supply, surveying, etc.), ground treatment (water drainage, grouting, freezing, etc.), labor (availability and quality), and organization of the project (management, shift hours, material supply, etc.).

CONCLUSIONS

The evaluation and analysis of the data compiled in the roadheader field performance database has successfully yielded a set of equations which can be used to predict the instantaneous cutting rate (ICR) and the bit consumption rate (BCR) for roadheaders. A good relationship was found to exist between these two parameters and the machine power (P), weight (W) and the rock compressive strength (UCS). Equations were developed for these parameters as a function of P, W and UCS. These equations were found mainly applicable to soft rocks of evaporitic origin. The current analysis is being extended to include harder rocks with or without joints to make the equations more universal. In jointed rock, the RQD value will be utilized as a measure of rock mass characteristics from a roadheader cuttability viewpoint. It is believed that these efforts will lead to the formulation of an accurate roadheader performance prediction model which can be used in different rock types where the roadheaders are economically applicable.

Chinese translation: Roadheader applications in mining and tunneling industries

Abstract

Roadheaders offer a unique capability for conveniently excavating medium-hard rock.
The Role of Producer Service Outsourcing in the Innovation Performance of New York State Manufacturing Firms

Alan MacPherson
Canada-United States Trade Center, Department of Geography, University at Buffalo

This paper assesses the contribution of external technical services to the innovation initiatives of New York State manufacturing firms. The results of a spatially and sectorally stratified postal survey of more than 400 manufacturing firms are presented. A major finding of the paper is that specialized technical services can support the product development efforts of innovative firms. The empirical results also point to significant spatial variations in technical service utilization. Some of these variations reflect different supply and accessibility conditions among the state's major regions and urban centers. The survey results are discussed in the context of recent empirical and theoretical findings on the role of producer services in urban and regional development. Particular attention is given to the empirical connection between producer service accessibility and industrial innovation. Key Words: external technical services, new product development, New York State manufacturing firms, regional patterns of service consumption.

A substantial body of literature now highlights the importance of advanced producer services to urban and regional development (see Harrington 1995). Specialist firms in this sector of the economy typically supply other business units with high-order informational inputs (e.g., management advice or market intelligence).
Significantly, recent reviews by Daniels (1989), Goe (1993), Hansen (1994), and Illeris (1994) suggest an international convergence of opinion regarding the contribution of these types of services to the operational efficiency and/or commercial performance of client firms, including manufacturers.

In this regard, policy interest in the industrial role of producer services has expanded quickly over the last few years (Britton 1993; Kelley and Brooks 1991; National Research Council 1993). In the U.S., a general lack of product and/or process innovation has seriously constrained the international competitiveness of domestically owned firms (Shapira et al. 1995), many of which lack the in-house skills to keep pace with current rates of technological change (U.S. General Accounting Office 1995). Recourse to external assistance is a partial corrective to this problem (Feldman 1994), notably for small and medium-sized firms (SMFs). From a policy standpoint, growing attention has focused upon the extent to which independent consultants can assist the commercial and/or technological efforts of SMFs (Shapira 1990). Evidence is mounting that we can trace part of the vitality of the SMF sector to specialists that sell technical expertise in such spheres as production engineering (O'Farrell 1995), contract R&D (Haour 1992), industrial design (O'Connor 1996), and management support (Sinkula 1990). Across most of the advanced market economies, innovative industrial firms have become important buyers of these types of inputs, suggesting a technological interface between goods production and producer service activity (Freeman 1991).

Annals of the Association of American Geographers, 87(1), 1997, pp. 52–71. ©1997 by Association of American Geographers. Published by Blackwell Publishers, 350 Main Street, Malden, MA 02148, and 108 Cowley Road, Oxford, OX4 1JF, UK.

Although nonindustrial clients are the main buyers of high-order producer services (see Beyers and Lindahl 1994), recent work on industrial demand suggests that service-to-manufacturing linkages are especially important in terms of scientific and technical (S&T) interactions (Illeris 1994). According to Tyson (1993), these types of interactions can play a key role in the innovation performance of users, notably with regard to new-product development. Simply stated, the evidence suggests that professionally qualified consultants can deliver strategic benefits to industrial clients, often at remarkably low cost (Berman 1995; Smallbone et al. 1993).

Keeping these points in mind, this paper examines the role of technical outsourcing in the innovation performance of New York State (NYS) manufacturing firms. The results of a recent postal survey show that the propensity to successfully bring new products to the marketplace is often contingent upon the use of different blends of external expertise. The results also suggest that a firm's regional context can influence the depth and nature of its outsourcing activity. On balance, firms that reside in service-poor regions tend to be relatively weak performers in terms of successful product development. And, as we shall see later, part of the explanation for this pattern can be traced to supply-side problems in terms of the local availability of external technical and/or management inputs.

Set against this backdrop, the following analysis tackles three main questions. First, to what extent does technical service demand by industrial firms vary across regions of different types? Second, to what degree is technical service consumption a positive factor in the innovation performance of buyers? And, third, does the geography of supply and demand affect the structure of service-to-industry linkages at the regional scale? Partial answers to these questions come from an empirical investigation of NYS industrial firms across a range of locational settings.
Additional insights come from a series of telephone interviews with a subsample of firms. Before reviewing the results, however, it is appropriate to sketch a brief theoretical context for the inquiry.

Theoretical Context

Three interlinked strands of theory inform the empirical thrust of the paper. First, internal diseconomies of scope typically prohibit firms from achieving in-house competence across a full range of S&T functions (Rothwell 1992). This idea flows from the ongoing division of labor that has been taking place between and within business units since the advent of industrialization (Scott 1986; Walker 1985; Young 1928). As such, there is nothing new about the rising segmentation of tasks between different parts of the economy. Rajan and Pearson (1986) trace at least some of the post-1945 growth of specialized producer services to an efficiency-driven fragmentation of jobs between firms, following the classical logic of structural change outlined by Adam Smith. A formal economic model proposed by Lentnek et al. (1992) presents this logic from a spatial standpoint. According to this model, firms will choose external vendors whenever the relative costs of in-house supply are higher. This model also implies that time lags in the delivery of key inputs will force vendors to optimize their location with respect to client demand, leaving peripheral buyers in a potentially disadvantaged position.

A second line of theory comes from policy-oriented work on the technical links between innovative industrial firms and external consultants (Bessant and Rush 1995; Lefebvre et al. 1991). Evidence dating back to the 1950s shows that many firms can sharpen their technological edge by subcontracting specialized work to outside experts (Carter and Williams 1957; Myers and Marquis 1969).
This literature indicates that in-house resources are best allocated toward core activities that match the firm's existing skills, whereas esoteric or infrequently required jobs are better handled by independent vendors (Britton 1989). This body of theory differs from the question of scope economies in that outside talent is often solicited in response to in-house technical limitations (Haour 1992). Unlike the fragmentation thrust noted above, then, this second strand of theory stems from the idea that technological factors may force certain types of firms to seek outside help in fields that go beyond their in-house competence (Feldman 1994).

A third body of theory comes from a number of spatially focused ideas that have cropped up in the recent work on producer services and regional economic change. Several authors have proposed that regions with weak producer service endowment are unlikely to support major levels of new industrial expansion, innovation, and/or job growth (Coffey and Bailly 1993; Hitchens et al. 1994). An implication here is that certain types of firms may need quick access to a locally rooted supply of advanced services (Harrington and Lombard 1989). While certain types of inputs can be moved up or down the urban system via electronic means (or by mail), others require face-to-face meetings for proper delivery (Daniels 1989). For companies that depend upon a wide mix of services, then, the spatial implications that flow from alternate delivery options are partially analogous to a Weberian problem. In this case, however, the raw materials consist of knowledge, information, or skills, rather than physical resources.

From a policy viewpoint, current interest in the locational relations between industrial and producer service establishments flows from a suspicion that close proximity between these sectors is a necessary condition for efficient interplay (Porter 1990).
We can trace empirical support for this idea to Ellwein and Bruder (1982), Feldman and Florida (1994), and Meyer-Krahmer (1985), while theoretical support has come from Britton (1989), Daniels (1989), and Goddard (1978). Other things being equal, manufacturers in service-poor regions are likely to exhibit weaker performance than comparable firms in large urban centers (O'Farrell et al. 1995). Moreover, if we accept the balance of technological evidence noted by Feldman (1994), Malecki (1994), and Rothwell (1991), then a further possibility is that certain types of firms may be locationally disposed toward inferior performance. Although there is little doubt that successful SMFs can be found in a variety of regional settings, including unfavorable ones (Vaessen and Keeble 1995), few analysts would deny that spatial proximity to the human capital resources of major metropolitan centers can offer strategic benefits to potential innovators (Britton 1991; Feldman 1994; Malecki and Tootle 1996).

On this note, four sets of empirical results from earlier studies offer a comparative context for the paper. First, Rothwell's (1977, 1992) work shows that innovative firms usually obtain at least some of their S&T inputs from external sources. Significantly, Rothwell's data suggest an important role for such linkages, especially in spheres that pertain to technology development, product design, and/or management. Second, successful recourse to outside help is often contingent upon a firm's ability to identify, specify, and evaluate its internal weaknesses across key areas of production and marketing (Sinkula 1990). The evidence also shows that in-house technical competence is a first requirement for successful retrieval of outside help (Rothwell and Dodgson 1991). Third, the need for specialized support varies by type of client.
At one extreme, for example, plants that belong to multinational firms can often bypass the external service environment by tapping the internal resources of the corporation as a whole (Malecki 1991). Fourth, the types of outside inputs that manufacturers seek vary considerably in terms of function, cost, and impact (Chandra 1992). Here the evidence implies that a firm's position along the product life-cycle (PLC) can influence the structure of external input demand. In terms of innovation support, for instance, firms in mature markets typically want services that relate to process improvement (new production methods), whereas firms in younger markets more often demand inputs that assist product development (Britton 1989). This is not to deny the existence of intermediate positions among firms of different types, nor is it to suggest that a focus upon one mode of outside help is better than another. Rather, the suggestion is that a firm's PLC position may influence the balance of inputs demanded (Utterback and Abernathy 1975).

Taken together, these findings imply that single-plant firms in peripheral regions are less likely to enjoy good access to high-order services than comparable firms in more central places. While firms can trade almost all types of advanced producer services between regions, the option to import is far from universally applied (Malecki 1994). A limiting factor is that strategic services often require face-to-face interactions for efficient delivery. According to O'hUallachaín (1991), for instance, the need for face-to-face discussions varies directly with the potential ambiguity of the information sought. For peripheral firms that need external help in complex areas, then, human skills must often be imported, either by sending in-house people to the supply-source or by bringing the vendor's boffins to the production site.
Significantly, there is evidence that these types of contact requirements can limit the external options of peripheral firms, especially those operating with restricted financial resources (Gertler 1995).

Today, of course, the distance separating vendors from buyers might seem rather trivial as far as interaction potential is concerned. After all, modern telecommunications technologies have surely rendered some aspects of relative location less crucial than before (Hepworth 1989). Lest we get too cozy with the notion of a wired world, however, recent work on producer service delivery suggests a continuing role for face-to-face meetings at the consumption point (Goe 1991). Contemporary research on technology diffusion also points to a major role for physical and/or cultural proximity between buyers and sellers (Cornish 1997). For detailed examples of the transactional problems that can hinder peripheral firms, see Gertler (1995) in the context of after-sales-servicing contracts for Canadian users of advanced machinery; for more general examples, see Lundvall (1988), Sabel et al. (1987), and Porter (1990). Additional evidence from Canada shows that more than 40 percent of the technical services consumed by goods-producing firms in Montreal come from independent producer service establishments (Coffey et al. 1994). These authors also show that face-to-face interaction is the main mode of input delivery, with about half of all such contacts taking place at the client's production site.

On balance, then, the recent literature implies that service accessibility has a potentially important bearing upon innovation success. In addition, service accessibility would also appear to play a role in the inclination of industrial firms to seek external help in the first place (Chandra 1992).
If these types of connections have validity beyond the regional contexts covered by the studies cited thus far, then a logical conjecture is that peripheral firms must expend more effort on competitive positioning than their metropolitan counterparts. Is this the case?

In attempting to answer this question, however, I should note that the taxonomy of producer services employed in this paper is a narrow one (Table 1). For instance, the financial, insurance, real estate, and legal subsectors of the producer services have been ignored, because the original goal of the project was to examine only those services known to contribute directly toward the scientific, technical, and/or management dimensions of the production/innovation efforts of individual plants. While lawyers and bankers are important from an enabling point of view (try launching a new product without a good line of credit or a patent search), they rarely contribute to the hands-on work of problem solving on the shopfloor. In employment terms, then, the taxonomy shown in this paper captures only a small part of the producer services, not the whole sector.

Table 1. Classes of External Producer Services for the New York State Surveyᵃ

Service Category            Selected Studies                              Examples of User Impact

Private services
Industrial design           Chandra (1992), O’Connor (1994)               New or better products
Contract R&D                Haour (1992), Lawton-Smith (1993)             New products or procedures
Management consulting       Berman (1995), O’Farrell et al. (1995)        Better ways of doing business
Marketing                   Sinkula (1990), Coffey et al. (1994)          Improved sales performance
Advertising                 Beyers and Lindahl (1994)                     Finding new customers
Export counseling           Berman (1995), Britton (1989)                 Finding new export markets
Equipment repair            Lentnek et al. (1992)                         Reduced downtime/lower costs
Data processing             Hepworth (1989), Phillips (1995)              Lower costs/professional quality
Business software           Phillips (1995), Yap et al. (1992)            Improved management efficiency
Laboratory testing          Feldman and Florida (1994)                    Essential product information
Production engineering      Rothwell (1992), Britton (1993)               New or better production methods

Public services
Government agencies         Chrisman and Katrishen (1995)                 Market data and business planning
Hospital research units     Chandra (1992), MacPherson (1995)             Clinical trials and research
Technical colleges          Lawton-Smith (1993)                           Applied R&D, engineering help
Universities                Haour (1992), Rothwell (1991)                 Basic and applied research

Informal/nonmarket services
Other manufacturing firms   Lipparini and Sobrero (1994)                  New ideas and engineering advice
Informal business networks  Malecki and Veldhoen (1993), Malecki (1994)   Market leads, business information
Suppliers                   Gertler (1995), Soni et al. (1993)            Innovative inputs, new ideas
Customers                   Von Hippell (1978, 1988)                      Feedback on design flaws
Distributors                Glasmeier (1990)                              Hints on customer/market needs

ᵃ This table is not designed to supply a comprehensive or representative summary of the recent empirical or theoretical contributions by scholars in this field. Instead, the intent is simply to provide a snapshot of the types of inquiries conducted, along with some of the general impacts identified either explicitly or implicitly.

Methodology

In order to assess the service-to-industry relationship at the firm level, I mailed self-administered questionnaires to more than 1,700 New York State manufacturers across four sectors (furniture, scientific instruments, fabricated metals, and electrical industrial products).
These sectors were chosen for several reasons: first, to obtain a technological cross section of firms, notably in terms of R&D effort, market focus, and export activity; second, to find a sector-mix that would closely mirror the structure of industrial employment across the state as a whole; and third, to focus on sectors in which SMFs¹ enjoy a prominent economic role. Earlier studies have shown that SMFs are potentially prime targets for external support, if only because this size-class often lacks a full range of in-house skills (Shapira 1990). In sum, the sample was designed to reflect the typical scale and sector-mix of manufacturing activity within the state’s main regions.

The survey consisted of two rounds, spread over a period of fourteen months. In the first phase of the project (September/October 1994), questionnaires were mailed to a systematic sample of 1,700 firms (covering roughly half of the total population across the four sectors). This phase was designed to obtain detailed information on service spending, geographical sourcing, delivery methods, and user impact (among other things). Of the 1,700 firms in the sampling frame, 326 were subsequently eliminated as a result of either incorrect SIC listings (n = 75), recent business failure (n = 62), or job-shop status (n = 189), bringing the N-size down to 1,374. A total of 472 valid returns were received, giving a 34 percent response rate. A second survey was mailed in the autumn of 1995, covering the other half of the population. This phase comprised an abbreviated and categorically structured version of the original questionnaire (focusing upon key areas of variation gleaned from the earlier survey). Because this phase of the project remains a work in progress (to be reported upon at a later date), the discussion that follows confines itself to the results of the first survey.
Additional data come from the results of 255 telephone interviews with business executives from a range of sectors and locations. In a preliminary effort to assess regional patterns of service demand, I divided New York State into a variety of areal groupings to test for scale and zoning effects. Here the goal was to find a set of regional boundaries that would best reflect the underlying spatial features of the data. While the resulting regionalization (Figure 1) does not provide a perfect delineation as far as aggregation problems are concerned, the three divisions provide acceptable delineations for the purposes of this paper (for a discussion of the regionalization process and its attendant methodological problems, see Curtis and MacPherson 1996).

Figure 1. New York: the study regions and main urban centers.

On this note, Table 2 shows that the spatial and sectoral pattern of responses broadly matches the population distribution for each region. While this implies an element of representativeness in terms of regional and sectoral coverage, the data mask a number of distortions. For one, the sample exhibits a size-mix that is biased toward SMFs (Table 3). Although this is not too surprising, especially in light of the rising prominence of SMFs in the state’s industrial base, the relatively low response rate for larger firms is troublesome.

A second caveat is that this SMF bias is strongest for the electrical products sector, notably within the Buffalo and New York City metropolitan areas (for further details, see MacPherson 1997). This is problematic because these two areas contain the lion’s share of the state’s largest electrical products firms. Although several efforts were made to mitigate these biases, the discussion that follows should be treated with caution.
In particular, I should note that the results pertain mainly to business units with 500 or fewer workers (Table 3), most of which (84 percent) are single-plant firms as opposed to branches of multilocational companies (16 percent). While several distinctions emerged between these two groups (i.e., single- vs. multi-plant units), plant status did not turn out to be a significant variable in regard to the key factors discussed later.² Keeping these caveats in mind, then, I will summarize some of the main results of the survey below.

Survey Results

Table 4 shows the regional pattern of external service spending by sector (annual averages for the period 1989–1993), aggregated for the full range of private input classes listed in Table 1.³ At least three notable patterns can be discerned here. First, aggregate levels of spending vary appreciably across the state. A locational rank-size effect can be seen, in that the largest region in terms of economic activity (the New York City metropolitan area) exhibits the highest spending estimates overall, whereas the smallest region (Upstate/Central) exhibits the lowest estimates. These differences are statistically significant at p < 0.05 for all regional combinations (one-tailed t-tests). At the county level, moreover, a positive correlation (r = 0.5952; p = 0.05) was found between the external service expenditures of the survey firms (scaled as a proportion of their 1993 sales) and the regional distribution of business service employment (scaled as a percentage of total county employment). Although several important outliers emerged from this exercise, it is fair to say that the geography of external spending closely matches the distribution of business service supply at the county level. In short, external spending is generally higher in supply-rich locations.

A second aspect of the data is that the rank-orders for sectoral spending also vary by region.
For example, the scientific instruments sector emerged as the biggest spender in the NYC region (US$68,000 per annum/per firm). In contrast, the biggest spenders from the Upstate/Central region (UC) were metal fabricators (US$33,000), whereas the dominant spenders from Western New York (WNY) were electrical equipment producers (US$56,000). Interestingly, the data also point to regional variations in the types of services that firms buy across the four sectors. To keep the description as simple as possible, the service-specific examples listed in Table 4 refer to the single most important external spending categories by sector. On this basis, NYC furniture producers allocate more of their external budgets toward management services than their counterparts to the north (where equipment repair is the single most expensive external link). In the case of the electrical products sector, moreover, the data show that NYC firms are more oriented toward engineering consulting services than comparably sized firms elsewhere in the state (the top-ranking inputs for electrical products manufacturers in UC and WNY are testing and repair services, respectively). With the exception of the scientific instruments industry, where design services consistently rank first across all three regions, it would appear that NYC firms exhibit more sophisticated external purchasing practices than their UC and WNY counterparts.⁴

Table 2. Response Rates by Sector and Location

              New York City        Upstate/Central      Western New York     All Regions
Sector        N     n     %        N     n     %        N     n     %        N      n     %
Metal         159   56    35.2     128   41    34.1     188   52    27.6     475    149   31.3
Electrical    206   71    34.4     144   41    28.4     122   49    40.1     472    161   34.1
Instruments   106   38    35.8     39    15    38.4     89    28    31.4     234    81    34.6
Furniture     78    24    30.7     64    29    45.3     51    28    54.9     193    81    41.9
Total         549   189   34.4     375   126   33.6     450   157   34.8     1374   472   34.3

N = sampling frame population; n = number of valid responses; % = response rate.

Table 3. Size-Classes, Response Rates, and Plant Status of the Survey Respondents

Employment    Sampling Frame      Valid               Single-Plant       Response
Rangeᵃ        Population          Responsesᵇ          Firms              Ratesᶜ
              N       %           n       %           n       %          %
1–49          689     50.1        254     53.8        232     91.3       36.8
50–99         274     19.9        103     21.8        87      84.4       37.5
100–199       217     15.8        67      14.2        54      80.5       30.8
200–499       116     8.5         31      6.6         19      61.2       26.7
500+          78      5.7         17      3.6         3       17.6       21.7
Total         1374    100.0       472     100.0       395     83.6       34.3

ᵃ Full-time job counts (ranges taken from the Commerce Register [1994] database).
ᵇ Purged of incorrect listings.
ᶜ Response by size-class.

A third feature of the data is that average spending on external help does not appear to amount to very much in absolute dollar terms.⁵ For this sample, in fact, average total spending per annum roughly translates into no more than the annual salary equivalent of a single skilled worker. At first glance, then, this finding might appear to support the contention that industrial demand for external technical inputs has not been a potent factor in the recent growth of producer service activity. In spite of this, the data conceal other facets of industrial demand that ought to be considered.

To begin with, Table 4 provides a snapshot of average spending for the sample as a whole. Note that there is major variability between the survey firms. Among service users themselves (defined as firms that annually spent at least US$5,000 on external help from private sources), 4 percent spent more than US$250,000 per year, and 5 percent spent between US$100,000 and US$249,000, while a further 5 percent spent between US$75,000 and US$99,000. In short, the sample has captured a small but significant nucleus of “important spenders.” Second, the data pertain to a sample of only 472 respondents.
If these estimates are even remotely reflective of typical spending levels for the manufacturing sector as a whole, then it is safe to infer that industrial clients generate multibillion dollar earnings for the state’s technical advisory units. Third, and perhaps more important, modest spending on external help does not necessarily translate into modest impact. Indeed, as shall be shown presently, some of these seemingly minor expenditures are of considerable significance to the production and/or innovation initiatives of buyers.

A crude but suggestive illustration of this point is shown in Table 5, which collates the incidence of successful new product development against the presence of external linkages.⁶ Here the term “innovation” refers to the introduction and subsequent commercialization of a new or substantially improved product. Given

Table 4. Average Annual Spending on Producer Service Inputs (US$000s)ᵃ

              New York City              Upstate/Central            Western New York           All Regions
Sector        Allᵇ   Users  Main Inputᶜ  All    Users  Main Input   All    Users  Main Input   All    Users  Main Input
Metal         20.4   32.6   PE           21.8   32.9   ER           33.3   48.2   ER           25.3   38.3   ER
Electrical    47.6   66.2   PE           17.1   25.0   TS           41.7   56.3   MG           38.1   53.1   PE
Instruments   59.7   68.7   DS           4.0    14.1   DS           23.5   34.6   DS           36.2   52.3   DS
Furniture     7.2    14.4   MG           7.9    17.7   ER           14.5   33.9   ER           10.0   21.9   ER
Total         36.6   52.7   MG           15.0   26.1   ER           30.8   46.9   ER           28.9   44.5   DS

ᵃ Average annual spending on producer services over the period 1989–1993.
ᵇ Where All = average spending across all firms in each group; Users = spending by firms that have at least one significant link (> US$5000) to private producer service vendors.
ᶜ The single largest external spending category by sector.
Where: PE = production engineering; MG = management consulting; DS = industrial design; ER = equipment repair; TS = testing services.
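
The response-rate arithmetic reported in the Methodology section (1,700 questionnaires mailed, 326 firms eliminated, 472 valid returns) can be retraced with a short calculation. What follows is only a minimal Python sketch of that arithmetic; the variable names and the helper function response_rate are illustrative assumptions and are not taken from the original study.

```python
# Figures quoted in the Methodology section of the paper.
mailed = 1700
eliminated = {
    "incorrect SIC listing": 75,
    "recent business failure": 62,
    "job-shop status": 189,
}
valid_returns = 472


def response_rate(returns: int, frame: int) -> float:
    """Return valid responses as a percentage of the sampling frame."""
    return 100.0 * returns / frame


# Effective sampling frame after removing ineligible firms (the paper's "N-size").
frame = mailed - sum(eliminated.values())
rate = response_rate(valid_returns, frame)

print(f"Effective sampling frame: {frame}")
print(f"Response rate: about {rate:.0f} percent")
```

The same helper could be applied row by row to the N and n columns of Tables 2 and 3 to retrace the sector-level and size-class response rates.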
第四篇汉英常用采矿工程词汇(分类按汉语拼音顺序排列)煤柱尺寸pillar dimension灰色关联分析gray relationship analysis地质构造geological tectonic水文地质hydrogeology极不规则煤层extremely anomalistic coal seam预紧力、预应力pre-tightening force, pre-stress含水的hydrous水力采煤hydraulic cutting coal主(副)井main (auxiliary) shaft敏感性sensitivity国内外at home and abroad瓦斯抽放gas drainage机电一体化mechanical-electrical integration稳定性stability微地震micro-seism动载dynamic loading流变特征rheological characteristic蠕变creep人工神经网络artificial neural network定位orientation保水开采mining without detroying water resource高压注水water injection with high pressure中硬煤层medium hard coal seam层次分析法hierarchy anallysis proceeding煤与瓦斯突出coal and gas bursting模糊综合评判fuzzy comprehensive evaluation综(放)采工作面full-mechanized (caving) mining face炮采工作面blasting mining face机采工作面mechanized mining face(急)倾斜煤层(steeply) inclined coal seam走向strike倾向direction dip,dip,inclination矿山压力underground pressure支承压力abutment pressure上覆岩层overlying rock strata薄、中厚、厚煤层thin,medium thick,thick coal seam(松散)含水层(loose) aquifer裂缝带crack zone垮落带caving zone顶、底板roof,floor顶煤top coal(掩护式)液压支架(shield) powered support割煤机coal cutter刮板机scrapper胶带输送机belt conveyor片帮coal slide冒顶roof fall岩爆(矿震、冲积矿压)rock burst自燃spontaneous combustion 防水(砂)煤柱water(sand)-proof coal pillar陷落柱collapse column煤(岩)巷coal(rock) roadway沿空巷道roadway along gob围岩surrounding rock突水water bursting"三软" soft roof,soft coal,soft floor地应力ground stress(开采)地表沉陷(mining) surface subsidence高产高效high production and high efficiency回风巷ventilation roadway运输巷transportation roadway火成岩(岩浆岩)igneous rock厚冲积层thick alluvium条带开采strip mining移动、变形movement,distortion极限平衡区limit equilibrium area可靠性,可行性,合理性,适应性reliability, feasibility, rationality,adaptability采出率、回收率mining rate,recovery rate埋管注氮nitrogen injection with buried pipe复合顶板combined roof断层fault喷浆grouting动压dynamic pressure地质灾害geological disasterEH-4电导率成像系统EH-4 electro-conducibility imaging system透气性(渗透性)permeability倾向性liability井下underground设备配套equipment match选型设计lectotype design撤架dismantling supports加固reinforcing安全阀safety valve积水seeper柔性支架flexible 
support伪斜apparent dip绞车房hoist room(winder chamber)暗立(斜)井blind (slope) shaft底臌floor heave破碎顶板cracked roof解放层protective coal seam电阻应变仪resistance strain instument坑口电厂pithead power plant1.煤炭科学技术总论采矿系统工程mining system engineering 采矿系统优化模型model of mining systemoptimization采矿岩石学mining rock mechanics采煤coal m ining, coalextraction, coal getting采煤学 coal mining 地下开采 underground mining矸石 refuse, waste, dirt,debris固体可燃矿产 solid combustible mineral褐煤 lignite, brown c oal化石燃料 fossil fuel洁净煤技术 clean coal technology矿 [underground] mine矿床模型 model of mineral deposits矿井 [underground] mine矿井通风[学] mine ventilation,underground ventilation矿区 mining area矿山 mine矿山测量[学] mine surveying矿山电气工程 mine electric engineering,mine electro-technics矿山机械工程 mine mechanical engineering矿山建设 mine construction矿山提升[学] mine hoi s ting, winding矿山运输[学] mine transportation, minehaulage矿田 mine filed露天开采, '露天开采学 surface mining露天矿 surface mine煤 coal煤当量 coal equivalent煤的综合利用 comprehensive utilization ofcoal, coal utilization煤化学 coal chemistry煤矿 coal mine, colliery煤矿安全 mine safety 煤矿机械 coal mine machinery 煤转化 coal conversion 煤炭地下技术 underground coal gasification 煤炭环境保护 coal environmental control, environmental protection in coal mining 煤炭技术 coal technology 煤炭加工 coal processing 煤炭科学 coal science 煤田 coalfield 煤田地质勘探 coal e xploration, coal prospecting 煤田地质学 coal geology 煤岩学 coal petrology 石煤 stone-like coal 无烟煤 anthracite 选煤 coal preparation, coal cleaning 烟煤 bituminous coal 硬煤 hard coal2.地质测量 A 级储量 grade A reservesB 级储量 grade B reservesC 级储量 grade C reservesD 级储量 grade D reserves 边界角 limit angle 采场验收测量 pit acceptance survey 采出率 ratio recovery 采动系数 factor of full extraction 采掘工程平面图 mine map 采煤工作面测量 coal face survey 采区测量 mining district survey 采区联系测量 transfer survey in mining district 超前影响角 advance angle of influence 沉井凿井法施工测量 construction survey for drop shaft sinking 充分采动 full subsidence, critical/supercritical extraction充分采动角 ZHANGCHIZZubsidence储量管理 reserves management导入高程测量 elevation transfer survey 导水断裂带 water conducted 
zone 点下对中 centering under roof station 地表移动 surface m ovement, ground movement地表移动观测站 observation station of groundmovement地表移动盆地 subsidence trough,subsidence basin顶板测点 roof station定向连接测量 connection survey for shaftorientation 定向连接点 connection point for shaft orientation 动用储量 mining-employed reserves 冻结凿井法施工测量 construction survey for shaft sinking by freezing method 断层几何制图 fault geometrisation 断裂带 fractured zone 防水煤岩柱 safety pillar under water-bodies 非充分采动 subcritical extraction 刚性结构措施 structure rigidity-strengthening measures 工业场地平面图 mine yard plan 贯通测量 holing-through survey 光电测距导线 EDM traverse 拐点偏移距 deviation of inflection point 滑动层 sliding layer 缓冲沟 buffer trench 灰分等值线图 isogram of ash content 回采煤量 workable reserves激光指向 laser guide几何定向orientation by shaft plumbing 近井点near shaft control point井底车场平面图shaft bottom plan井上下对照图location map, site plan井田区域地形图topographic map of [underground] mine field井筒煤柱开采shaft pillar extraction井筒延深测量shaft deepening survey井下测量underground survey井下平面控制测量underground horizontal control survey开采沉陷mining subsidence开拓煤量developed reserves抗变形建筑物deformation resistant structure垮落带caving zone矿区控制测量control survey of mining area 矿山工程测量mine survey矿体几何学geometry of ore deposits矿体几何制图geometrisation of ore deposits矿田区域地形图topographic map of mine field立井定向shaft orientation立井施工测量construction survey for shaft sinking立井十字中线标定setting out cross line of shaft center立井中心标定setting out shaft center联系测量transfer survey连接三角形法connection triangle method 裂缝角angle of break临界变形值critical deformation values露天矿测量open-pit survey煤层等厚线图isothickness map of coal seam煤层等深线图isobath map of coal seam煤层褶皱几何制图fold geometrisation of coal seam瞄直法alignment method逆转点法reversal point method柔性结构措施structure yielding measures 三量Three Class of Reserves设计损失allowable loss实际损失储量actual loss of reserves水平移动系数displacement factor损失率loss percentage投点shaft plumbing陀螺方位角gyroscopic azimuth陀螺经纬仪定向gyroscopic orientation survey 弯曲带sagging zone围护带safety berm下沉系数subsidence factor巷道坡度线标定sitting out roadway gradient 
巷道碎部测量detail survey for mine workings巷道验收测量footage measurement of roadway巷道中线标定setting out center line of roadway岩层移动strata movement移动角angle of critical deformation 移动盆地主断面major cross-section of subsidence trough影响传播角propagation angle中天法transit method主要影响半径major influence radius主要影响角正切tangent of major influence angle注浆凿井法施工测量construction survey for grouting sinking准备煤量prepared reserves钻井凿井法施工测量construction survey for shaft boring最大下沉角angle of maximum subsidence最大下沉速度角angle of maximum subsidence velocity3.地下开采安全平盘safety berm安全水头safety water head暗井blind s haft, staple s haft薄煤层thin seam爆破采煤工艺blast-winning technology爆破装煤blasting loading变形压力rock deformation pressure闭路供水closed-circuit water supply闭式落煤顺序close-type winning sequence 不规则垮落带irregularly caving zone部分开采partial extraction侧支撑压力side abutment pressure采场延伸pit deepening采动裂隙mining-induced fissure采动应力mining-induced stress采垛winning-block采垛角angle of winning-block采段extracting zone采段流煤上山extracting block coal-waterrise采高mining height采掘带cut采掘区block采宽cut width采矿站ore station采空区goaf, gob, waste采装loading采煤方法coal w ining m ethod, coal mining method采煤工作面coal face, working face采区district, panel采区车场district station, district inset采区准备preparation in district采区设计mining-district design, panel design采区上山district rise采区石门district cross-cut采区下山district dip仓储采煤法shrinkage stoping长壁放顶采煤法longwall mining with sublevel caving长壁工作面longwall face超前巷道advance heading充采比stowing ratio充填倍线stowing gradient充填步距stowing interval充填沉缩率setting ratio充填法stowing method充填能力stowing capacity冲击地压rock burst, pressure bump 承压含水层上采煤coal mining above aquifer初撑力强度setting l o ad de nsity, SLD 初次放顶initial caving初次来压first weighting出气孔production well出气强度production well capacity出入沟main access储量备用系数reserve factor of mine reserve 垂直切片terrace cut slice大巷main roadway带压开采mining under safe water pressure of aquifer单侧沟hillside ditch单煤层大巷main roadway for single seam单体液压支柱hydraulic prop倒台阶采煤法overhand mining底板floor底板载荷集度floor load intensity底帮foot slope底部境界线floor boundary line底鼓floor 
heave地表境界线surface boundary line地下煤气发生场underground gasifier地下气化工作面underground gasification face 地下气化效率efficiency of underground gasification叠加应力superimposed stress顶板roof顶板单位破碎度specific roof flaking ratio顶板回弹roof rebound顶板垮落roof caving顶板垮落角roof caving angle 顶板冒落roof fall顶板破碎度roof flaking ratio顶板弱化roof weakening顶板台阶下沉roof step顶板稳定性roof stability顶板压力roof pressure顶帮top slope顶底板移近量roof-to-floor convergence顶底板移近率roof-to-floor convergence ratio动压巷道workings subject to dynamic pressure动载系数dynamic load coefficient端帮end slope端面距tip-to-face distance端面冒顶roof flaking短壁采煤法shortwall mining断壁工作面shortwall face墩柱heavy-duty pier, breaker props房顶柱breaker props房式采煤法room mining, chamber mining房柱式采煤法room and pillar mining封闭圈closed level放采比drawing ratio放顶caving the roof放顶距caving interval放煤步距drawing internal放煤顺序drawing sequence非工作帮non-working slope非工作帮坡面non working slope face风力充填pneumatic stowing分层开采slicing分层巷道slice drift, sliced gateway 分带strip分带集中斜巷strip main incline drift分带斜巷strop inclined drift分段sublevel分段平巷sublevel e ntry, longitudinal subdrift分级提运separate transport and hoisting分流站distribution station分期开采mining by stages分区开采mining by areas分区域开拓areas development辅助水平subsidiary level矸石带waste pa ck, strip pa ck, interval of moving monitor,unit advance of monitor高压大射流high pressure large diameter jet干馏----干燥带distillation and drying zone 工作帮working slope工作帮坡面working slope face工作阻力yield l o ad, working resistance工作面working face工作面顶板控制roof control工作面端头face end工作面回风巷tailentry, tailgate, return airway工作面运输巷headentry, headgate, haulage gateway工作平盘working berm工作线front构造裂隙tectonic fissure构造应力tectonic stress规则垮落带regularly caving zone固定路线permanent haulage line管道水力运输pipeline hydrotransport横向前移cross removal恒阻支柱yielding prop厚煤层thick seam后退式开采retreating mining后支撑压力rear abutment pressure滑移顶梁支架slipping bar composite support还原带reduction zone缓沟easy access缓慢下沉法gradual sagging method缓斜煤层gently inclined seam回采巷道gateway, entry, gate回柱prop drawing基本顶main roof机械充填mechanical stowing集中大巷gathering main 
roadway急斜煤层steeply pitching s eam, steep seam坚硬岩层strong strata, hard strata 间断开采工艺discontinuous mining technology架后充填backfill建筑物下开采coal mining under buildings 阶段horizon阶段垂高horizon interval阶段流煤上山extracting block coal-water roadway阶段流煤巷main coal-water rise阶段斜长inclined length of horizon近距离煤层contiguous seams近水平煤层flat seam井田尺寸[underground] mine field size 井田境界[[underground] mine field boundary井田开拓[underground]mine field development静压巷道workings subject to static pressure 掘进率drivage ratio局部冒顶partial roof fall开采水平mining level, gallery level 开采水平垂高lift, level interval开路供水open-circuit water supply开切眼open-off cut开式落煤顺序open-type winning sequence 开拓巷道development roadway抗压入强度press-in strength垮采比caving-height ratio垮落法caving method跨采over-the-roadway extraction 矿井初步设计preliminary [underground]mine design矿井服务年限[underground]mine life矿井井型production scale of [underground] mine矿井开拓设计[underground]mine development design矿井可采储量workable [underground]mine reserves矿井可行性研究[underground]mine feasibility study矿井设计[underground]mine design矿井设计储量designed [underground]mine reserves矿井设计能力designed[underground]mine capacity矿井施工设计[underground]mine construction design矿井延深shaft deepening矿区地面总体设计general surface layout of mining area矿区规模mining area capacity矿区开发可行性研究feasibility study forming area exploitation矿区总体设计general design of mining area 矿山压力rock pressure in mine矿山压力显现strata behaviors控顶距face width离层bed separation离层带注浆充填grouting in separated-bed立井开拓vertical shaft development联络巷crossheading连续开采工艺continuous mining technology溜井draw shaft溜煤石煤coal-water cross-cut溜眼chute临界滑面critical sliding surface漏顶face roof collapse with cavity 锚梁网支护roof bolting with bar and wire mesh煤壁wall煤层产出能力coal-seam productivecapacity煤房room, chamber煤浆coal slurry煤门in-seam cross-cut煤面清扫cleaning煤水泵slurry pump煤水比coal-water ratio煤水仓coal-slurry sump煤水洞室coal slurry preparation room煤炭地下气化站underground coal gasification station煤岩固化coal/rock reinforcement煤柱coal pillar煤柱支撑法pillar supporting method明槽水力运输flume hydrotransport摩檫支柱frictional 
prop逆向冲采contrary efflux片帮rib s palling, sloughing平盘berm平硐开拓drift development平行推进parallel advance破煤coal br e aking, coal c utting 破碎顶板fractured roof, friable roof 普氏系数Protodyakonov coefficient普通机械化采煤工conventionally-mechanized coal winning technology气化贯通linkage气囊支架air-bag support前进式开采advancing mining前支撑压力front abutment pressure强制放顶forced caving倾倒toppling倾斜长壁采煤法longwall mining to the dip or to the rise倾斜短壁水力采煤法shortwall hydraulic mining in the dip倾斜分层采煤法inclined slicing切口stable, niche区段district sublevel区段集中平巷district sublevel gathering entry, district main entry区段平巷district sublevel entry,district longtudinal subdrift区域性切冒extensive roof collapse全柱开采full pillar extraction柔性掩护支架flexible shield support人工顶板artificial roof入换spotting, train e xchange入换站exchange station三下"开采" coal mining under buildings, railways and water-bodies扫清平盘cleaning berm山坡露天采场mountain surface mine射流打击力jet impact force射流烛心动压力jet axis dynamical pressure 扇型推进fan advance上覆岩层overlying strata上装upper level loading上山rise, raise上挖up digging上行式开采ascending mining上行式开采upward mining十字定梁cross bar石门cross-cut始采线beginning l i ne, mining starting line双工作面double-unit f a ce, double face水采回采巷道stopping entry水力采煤hydraulic coal mining technology水平分层采煤法horizontal slicing水平分段放顶采煤法top-sliming system of sublevel caving水平切片dropping cut slice水枪monitor水体下采煤coal mining underwater-bodies松动压力broken-rock pressure松软岩层weak strata松软岩层soft strata塌落fall台阶坡面线bench angle掏槽slotting挑顶roof ropping特种支柱specific props铁路下开采coal mining under railways 推垮型冒顶thrust roof fall往复式开采reciprocating mining挖底floor dinting挖掘系数excavation factor围岩surrounding rock围岩稳定性stability of surrounding rock 伪顶false roof伪倾斜柔性掩护支架采煤法 flexible shield mining in the false dip伪斜长壁采煤法oblique longwall mining喂煤机coal feeder无井式地下气化法shaftless gasification无煤柱护巷non-chain-pillar entry protection无特种柱放顶caving without specific props, caving without breaker props无支柱距prop-free front distance下装lower lever loading下山dip下挖down digging下行式开采descending m ining, downward mining巷道断面缩小率roadway 
reduction ratio
巷旁充填 roadside packing
巷旁支护 roadside support
限厚开采 limited thickness extraction
限制区间 limit section
协调开采 harmonic extraction
斜井开拓 inclined shaft development
斜切分层采煤法 oblique slicing
旋转式推进 revolving mining, turning longwall
循环 working cycle
循环进度 advance of working cycle
循环平均阻力 time-weighted mean resistance
循环平均阻力 mean load per unit cycle
岩层控制 strata control
岩石内摩擦角 internal friction angle of rock
岩石软化系数 softening factor of rock
岩石碎胀系数 bulking factor, swell factor
岩石粘聚力 rock cohesion
岩应力降低区 stress-relaxed area
氧化带 oxidation zone
沿空巷道 gob-side entry
掩护支架采煤法 shield mining
移道步距 shift spacing
移动坑线 temporary working ramp
移动线路 shiftable haulage line
应力增高区 stress-concentrated area
迎山角 prop-setting angle
有井式地下气化法 shaft gasification
有效支撑能力 practical supporting capacity
原岩 virgin rock, virgin rock mass
原岩应力 initial stress, stress in virgin rock mass
原生裂隙 initial fissure
运输大巷 main haulage roadway, main haulageway
运输平盘 haulage berm
增阻支柱 late bearing prop
再倒退 overcasting
再生顶板 mat, regenerated roof
整层开采 full-seam mining
正台阶采煤法 heading-and-bench mining
支撑效率 supporting efficiency
支撑压力 abutment pressure
支垛 crib
支护刚度 support rigidity
支护强度 supporting intensity
支架可缩量 nominal yield of support
支柱密度 prop density
直接顶 immediate roof, nether roof
直进坑线 straight ramp
中厚煤层 medium-thick seam
中斜煤层 inclined seam, pitching seam
终采线 terminal line
周期来压 periodic weighting
主石门 main cross-cut
主要上山 main rise
主要下山 main dip
注砂井 storage-mixed bin
准备巷道 preparation roadway
自然平衡拱 dome of natural equilibrium, natural arch
自重充填 gravity stowing
自重应力 gravity stress
综合机械化采煤工艺 fully-mechanized coal winning technology
综合开采工艺 combined mining technology
综合开拓 combined development
总回风巷 main return airway
纵向前移 longitudinal removal
走向长壁采煤法 longwall mining on the strike
走向短壁水力采煤法 shortwall hydraulic mining on the strike
组合台阶 bench group

4. 矿山机械工程 (Mine Machinery Engineering)

安全绞车 safety winch
安全制动 safety braking, emergency braking
扒爪 collecting-arm
耙斗 scraper bucket
耙斗装载机 slusher, gathering-arm loader, collecting-arm loader, gathering-arm, scraper loader
耙式浓缩机 rake thickener
包角 wrap angle, angle of contact
刨刀 bit, plough cutter
刨链 plough chain
刨煤机 Kohlenhobel, coal loader, plough, plow
刨头 plough head
刨削阻力 ploughing resistance
刨削深度 ploughing depth
刨削速度 ploughing speed
本架控制 local control
变位质量 equivalent mass
不平衡提升 unbalanced hoisting
部分断面掘进机 selective roadheader, partial-size tunneling machine
侧卸式装载机 side discharge loader
采煤机 coal winning machine
采煤机械 coal winning machinery, coal getting machinery
采煤联动机 coal-face winning aggregate
缠绕式提升 drum winding
铲斗 bucket
铲斗装载机 bucket loader
铲装板 apron
铲煤板 ramp plate
铲入力 bucket thrust force
沉降过滤式离心脱水机 screen-bowl centrifuge
沉降式离心脱水机 bowl centrifuge
冲压式成型机 briquetting machine
成组控制 batch control, bank control
成型机 briquetting machine
出绳角 elevation angle
弛张筛 flip-flop screen
齿轨机车 rack track car, rack track locomotive
齿辊破碎机 toothed roll crusher
串车 trip, train
串车 train
磁选机 magnetic separator
错绳圈 live turns
带式输送机 belt conveyor
带压移架 sliding advance of support
单轨吊车 overhead monorail, overhead rope monorail
挡车栏 arrester
挡煤板 spillplate
底座 base
电气牵引 electrical haulage
电液控制 electro-hydraulic control
等厚筛 banana vibrating screen
低速刨煤 slow-speed ploughing
调高 vertical steering
调向油缸 lifting ram
调斜 roll steering
定距控制 fixed-distance control
定压控制 fixed-pressure control
动力刨煤机 dynamischer Hobel, dynamic plough, activated plough
多段提升 multistage hoisting
多水平提升 multilevel hoisting, multilevel winding
垛式支架 chock
端头支架 face-end support
对辊成型机 roller briquetting machine
二级制动 two stage braking, two period braking
翻车机 tippler, rotary car dumper
反井钻机 raise boring machine
防倒装置 tilting prevention device
防滑安全系数 antiskiding factor
防滑装置 slippage prevention device
防坠器 safety catches, parachute
放顶煤支架 sublevel caving hydraulic support
风镐 air pick, pneumatic pick
高速刨煤 high-speed ploughing, rapid ploughing
钢丝绳牵引运输 wire rope haulage
工作机构 working mechanism, operating organ, service braking
轨道运输 track transport, track haulage
刮板链 scraper chain, flight chain
刮板输送机 scraper conveyor, flight conveyor
滚筒 drum, pulley
滚筒采煤机 shearer, shearer-loader
过渡槽 ramp pan
过放 overfall
过放高度 overfall height, overfall distance, overfall clearance
过卷 overwind, overtravel
过卷高度 overwind height, overwind distance, overraise clearance
过滤机 filter
过煤高度 under clearance, passage height under machine
过速 overspeed
后牵引 rear haulage
弧形筛 sieve bend
护帮板 face guard
滑架 guiding ramp, plough guide
滑行刨煤机 Gleithobel, sliding plough
滑行拖钩刨煤机 Gleit-schwerthobel, sliding drag-hook plough
环式成型机 impact briquetting machine
机面高度 machine height
机头部 drive head unit
机尾部 drive end unit
机械搅拌式浮选机 subaeration flotation machine, agitation froth machine
机械牵引 mechanical haulage
即时前移支架 immediate forward support, IFS, one-web back system
挤压成型机 single lead-screw extruding briquetting machine
间隔圈 interval turns
检验圈 inspection cutting turns
架节 support unit, support section
胶套轮机车 rubber-tyred locomotive
截槽 kerf
截齿 pick, bit
截齿配距 lacing pattern, pick arrangement
进刀 sumping
截割比能耗 specific energy of cutting
截割部 cutting unit
截割高度 cutting height
截割滚筒 cutting drum
截割阻抗 cutting resistance
截割速度 cutting speed, bit speed
截割头 cutting head
截距 intercept
截链 cutting chain
截煤机 coal cutter
截盘 cutting jib, cutting bar
截深 web, web depth
截线 cutting line
节式支架 frame [support]
经济提升量 economic hoisting capacity, economic winding capacity
经济提升速度 economic hoisting speed, economic winding speed
井架 headframe, hoist tower, shaft tower
井筒掘进机 down-the hole shaft boring machine
静力刨煤机 statischer Hobel, static plough
卷筒 winding drum, hoist drum
掘进机械 road heading machinery, driving machinery
卡轨车 road railer, coolie car
可爬行坡度 passable gradient
可伸缩带式输送机 extensible belt conveyor
可弯曲刮板输送机 flexible flight conveyor, armored face conveyor, AFC
矿车 mine car, pit tub
矿井提升机 mine winder, mine hoist
矿井提升绞车 mine hoist
矿井提升阻力 winding resistance of mine
矿井提升阻力系数 coefficient of winding resistance of mine
矿用机车 mine locomotive
矿用绞车 mine winch, mine winder
空气脉动跳汰机 air pulsating jig
空气室 air chamber
离地间隙 ground clearance of machine
离心脱水机 centrifuge
立轮重介质分选机 vertical lifting wheel separator
连续采煤机 continuous miner
链牵引 chain haulage
邻架控制 adjacent control
落道 derailment
迈步支架 walking support
螺旋滚筒 screw drum, helical vane drum
锚杆钻机 roof bolter
锚固支架 anchor support
煤电钻 electric coal drill
煤浆准备器 pulp preprocessor
磨擦轮 friction pulley, Koepe wheel
磨擦圈 holding turns, spare turns, dead turns
磨擦式提升 friction hoisting, Koepe hoisting
磨砺性系数 coefficient of abrasiveness
内喷雾 internal spraying
内牵引 internal traction, integral haulage
爬车机 creeper
爬底板采煤机 off-pan shearer, floor-based shearer
喷射式浮选机 jet flotation machine
偏角 fleet angle
平衡提升 balanced hoisting
铺网支架 support with mesh-laying device
骑槽式采煤机 conveyor-mounted shearer
普通机械化采煤机组 conventionally-mechanized coal winning face unit
气力输送 air conveying, pneumatic conveying
气腿 airleg
牵引力 haulage pull, haulage speed
牵引链 haulage chain, pulling chain
牵引速度 travel speed
钎杆 stem
钎头 bit, bore bit
钎尾 shank, bit shank
前牵引 front haulage
前探梁 forepole, cantilever roof bar
潜孔冲击器 down hole hammer
潜孔钻机 down-the hole drill, percussive drill
桥式转载机 stage loader
切槽 cutting groove
切削深度 cutting depth
驱动装置 drive unit
全断面掘进机 tunnel boring machine, TBM, full facer
乳化液泵站 emulsion power pack
人车 man car
伸缩梁 extensible canopy
深锥浓缩机 deep cone thickener
上出绳 overlap
上漂 climbing
输送 conveying
输送带 conveying belt
输送机 conveyor
双扭线机构 lemniscate linkage
双速刨煤 dual-speed ploughing
水力输送 hydraulic conveying
顺序控制 sequential control
梭行矿车 shuttle car
探钻装置 probe drilling system
提升不均衡系数 hoisting unbalance factor
提升富裕系数 hoisting abundant factor
提升容器 hoisting conveyance
提升容器自重减轻系数 unbalance coefficient of sole weight of hoisting conveyance
提升容器载重减轻系数 unloading coefficient of hoisting conveyance
搪瓷溜槽 enameled trough
天轮 head sheave, sheave wheel
推车机 car pusher, ram
推移装置 pusher jack
拖板 base plate, articulated
拖钩刨煤机 reibhakenhobel, drag-hook plough
拖缆装置 cable handler
托辊 carrying idler, supporting roller
外喷雾 external spraying
外牵引 external traction, independent haulage
尾绳 tail rope, balance rope
无轨运输 trackless transport
无极绳牵引运输 endless-rope haulage
无链牵引 chainless haulage
下出绳 underlap
下切深度 dinting depth, undercut
下扎 dipping, penetration
巷道掘进机 roadheader, heading machine, tunneling machine
行走部 travel unit, traction unit
行走机构 travel mechanism, traction mechanism
行走驱动装置 travel driving unit
悬臂 boom
悬臂式掘进机 boom-type roadheader, boom roadheader
压滤器 press filter
岩石电钻 electric rock drill
掩护梁 caving shield
掩护式支架 shield [support]
摇臂 ranging arm
摇床 shaking table, concentrating table
液压牵引 hydraulic haulage
液压支架 hydraulic support, powered support
圆盘式真空过滤机 disk-type filter
凿岩机 hammer drill, percussion rock drill
凿岩台车 drill jumbo, drill carriage
张紧装置 tensioner, bridge conveyor, take-up device
真空过滤机 vacuum filter
振动筛 vibrating screen
支撑式支架 standing support
支撑掩护式支架 chock-shield [support]
支架伸缩比 support extension ratio
制动空行程时间 time lag, dead time
制动系统 brake system
中部槽 linepan
终端载荷 end load
重力运输 gravitational conveying, gravity haulage
主顶梁 main canopy
主绳 main rope, head rope
主尾绳牵引运输 main-and-tail rope haulage, main-and-tail haulage
专用车 special car, specialty car
装煤机 coal loader
装岩机 rock loader
装岩机 muck loader
装载机械 loader
自然加速度 natural acceleration
自然减速度 natural deceleration
综合机械化采煤机组 fully-mechanized coal winning face unit
阻车器 car stop, retarder
钻杆 drill rod
钻井机 shaft boring machine, shaft borer
钻孔采煤机 auger, auger machine
钻孔机械 drilling machine
钻装机 drill loader
钻头 bit, bore bit
钻巷机 drift boring machine
钻削采煤机 trepanner
钻削头 trepan wheel
最大工作高度 maximum working height
最大结构高度 maximum constructive height
最小工作高度 minimum working height
最小结构高度 minimum constructive height
最小转弯半径 minimal curve radius

5. 煤矿安全 (Coal Mine Safety)

保护层 protective seam
被保护层 protected seam
避难硐室 refuge chamber
并联网络 parallel network
测风站 air measuring station
残存瓦斯 residual gas
尘肺病 pneumoconiosis
抽出式通风 exhaust ventilation
串联通风 series ventilation
串联网络 series network
电化学式瓦斯测定器 electrochemical type gas detector
等积孔 equivalent orifice
低瓦斯矿井 low gaseous mine
调压室 pressure balance chamber
独立风流 separate airflow
惰性气体防灭火 inert gas for fire extinguishing
对角式通风 radial ventilation
反风 reversing the air
反风道 air-reversing way
反风风门 doors for air reversing
防爆门 breakaway explosion door
防尘口罩 dust mask
防火门 fire-proof door
防火墙 fire stopping, water [proof] dam
防水闸门 water door, bulkhead
风表校正曲线 calibration curve of anemometer
风窗 air regulator
风电闭锁装置 fan-stoppage breaker
风电甲烷闭锁装置 fan-stoppage methane-monitor breaker
风阻特性曲线 airway characteristic curve
风量 air flow, air quantity
风量按需分配 air distribution
风量调节 air regulation
风量自然分配 natural distribution of airflow
风门 air door
风墙 air stopping
风桥 air crossing
风筒 air duct
风硐 fan drift
风障 air brattice
分区通风 parallel ventilation, separate ventilation
粉尘 dust
粉尘采样器 dust sampler
粉尘粒度分布 dust size distribution
粉尘浓度 dust concentration
浮尘 airborne dust
辅助通风机 booster fan
负压 negative pressure
高瓦斯矿井 gassy mine
隔爆 explosion suppression
灌浆 grouting
呼吸器 respirator
呼吸性粉尘 respirable dust
混合式通风 compound ventilation
火风压 fire-heating air pressure
火区 sealed fire area
火灾气体 fire gases
回风 return airflow
机械通风 mechanical ventilation
甲烷报警器 methane alarm
甲烷断电仪 methane-monitor breaker
甲烷遥测仪 remote methane monitor
检定管 detector tube
角联网络 diagonal network
进风 intake airflow
局部阻力 shock resistance
局部阻力系数 coefficient of shock resistance
局部通风 local ventilation
局部通风机 auxiliary fan
绝对瓦斯涌出量 absolute gas emission rate
均压防灭火 pressure balance for air control
可燃性气体 inflammable gases
矿尘 mine dust
矿井堵水 water-blocking in mines, sealing off mine water
矿井防治水 mine water management, prevention of mine water
矿井火灾 mine fire
矿井空气 mine air
矿井空气调节 mine air conditioning
矿井内部漏风 underground leakage
矿井内部漏风率 underground leakage rate
矿井气候条件 climatic condition in mine
矿井通风 mine ventilation
矿井突水 water bursting in mines
矿井瓦斯 mine gas
矿井瓦斯涌出量 [underground] mine gas emission rate
矿井外部漏风 surface leakage
矿井外部漏风率 surface leakage rate
矿井有效风量 effective air quantity, ventilation efficiency
矿山救护队 mine rescue team
溃浆 burst of mortar
扩散器 fan diffuser
扩散通风 diffusion ventilation
漏风 air leakage
落尘 settled dust
煤(岩)与瓦斯突出 coal (rock) and gas outburst mine
煤(岩)与瓦斯突出矿井 gas explosion
煤层注水 coal seam water infusion
煤层透气性 gas permeability of coal seam
煤层瓦斯含量 gas content in coal seam
煤层瓦斯压力 coalbed gas pressure
煤尘 coal dust
煤尘爆炸 coal dust explosion
煤尘爆炸危险煤层 coal seam liable to dust explosion
煤的自燃倾向性 coal spontaneous combustion tendency
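Evaluation of Complex Industrial Process Operating State Based on Static-dynamic Cooperative Perception

CHU Fei 1,2  XU Yang 1,2  SHANG Chao 3  WANG Fu-Li 4  GAO Fu-Rong 5  MA Xiao-Ping 1

Acta Automatica Sinica, Vol. 49, No. 8, August 2023

Abstract  Current process monitoring and operation performance evaluation methods suffer from inadequate capture of process information as well as severe missed and false alarms. Based on an in-depth analysis of methods for concurrently monitoring the static and dynamic characteristics of industrial data, this paper proposes a key performance indicator (KPI)-driven slow feature analysis (SFA) algorithm. It integrates KPI information into the SFA model in order to concurrently capture static-dynamic characteristic changes of complex industrial processes. The similarity between latent variables and that between their first-order differences are computed to evaluate the optimality of steady-state and transitional operations. On this basis, a unified framework for process operation performance assessment is established based on an integrated perception of static-dynamic characteristics. A sparse learning-based non-optimal factor identification method is proposed to effectively highlight root-cause variables that cause unsatisfactory performance. The feasibility and effectiveness of the proposed method are validated on data collected from a real-world dense medium coal preparation process and on the Tennessee Eastman (TE) process.

Key words  Complex industrial process, operation performance assessment, static-dynamic cooperative, slow feature analysis (SFA), sparse learning

Citation  Chu Fei, Xu Yang, Shang Chao, Wang Fu-Li, Gao Fu-Rong, Ma Xiao-Ping. Evaluation of complex industrial process operating state based on static-dynamic cooperative perception. Acta Automatica Sinica, 2023, 49(8): 1621-1634

DOI  10.16383/j.aas.c201035

Manuscript received December 14, 2020; accepted June 6, 2021

Supported by National Natural Science Foundation of China (61973304, 62003187, 62073060, 61873049), Jiangsu Science and Technology Plan Project (BK20191339), Six Talent Peak Projects of Jiangsu Province (DZXX-045), Science and Technology Innovation Plan Project of Xuzhou (KC19055), and Open Foundation of State Key Laboratory of Process Automation in Mining and Metallurgy (BGRIMM-KZSKL-2019-10)

Recommended by Associate Editor XIE Yong-Fang

1. School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116
2. Underground Space Intelligent Control Engineering Research Center of the Ministry of Education, China University of Mining and Technology, Xuzhou 221116
3. Department of Automation, Tsinghua University, Beijing 100084
4. College of Information Science and Engineering, Northeastern University, Shenyang 110819
5. Department of Chemical Engineering, Hong Kong University of Science and Technology, Hong Kong 999077

Process monitoring has long been a topic of intense interest in industry as a means of safeguarding production safety and product quality and of maximizing overall economic benefit. Process monitoring techniques have been applied successfully to important manufacturing processes such as mineral processing, metallurgy, and petrochemicals, with measurable economic benefit [1-2]. Traditional process monitoring, however, attends only to the occurrence of abnormal conditions. Even when no significant abnormality is present, disturbances and uncertainties frequently drive the process into non-optimal or even poor operating states. In typical raw-material processing industries in China, such as mineral processing and metallurgy, raw materials change frequently, operating environments are complex and harsh, operating conditions fluctuate sharply, equipment is often in poor condition, and product quality and process parameters cannot be measured comprehensively in real time. As a result, non-optimal states occur frequently and operational control struggles to meet actual production requirements [3-5].

Operating state evaluation methods, which have emerged in recent years, are of great significance for keeping complex industrial processes running safely and efficiently and for improving the overall economic benefit of enterprises [6-11], and related results continue to appear. For example, Ref. [6] proposes a probabilistic framework for evaluating the operating safety and optimality of multimode industrial processes: a Gaussian mixture model characterizes the modes, safety and optimality indices are constructed, and the process is divided into steady-state grades for assessment. For non-optimal operating states, however, that method supports only qualitative analysis, not quantitative analysis. Ref. [7] proposes an online evaluation method based on Fisher discriminant analysis that assigns samples of unknown state to the most similar state by computing the similarity between data, which reduces differences between data and improves the accuracy of state identification; however, it does not extract information related to the comprehensive economic performance index and cannot go on to identify non-optimal factors. Ref. [8] proposes a coordinated optimization method for key indices based on a soft-sensor model, achieving optimized operation of the production process. Ref. [9] proposes an online operating state evaluation method based on the similarity of a comprehensive economic index, applying total projection to latent structures to operating state evaluation. By further decomposing the variation in the partial least squares (PLS) subspace and in the residual space, it extracts information related to the comprehensive economic index, divides the data into steady-state grades accordingly, and thereby achieves online evaluation. Ref. [10] proposes an operating optimality assessment and non-optimal factor identification strategy for non-Gaussian processes, which uses information from transitional stages for online assessment and identifies non-optimal factors from the contribution rates of the manipulated variables [11]. Ref. [12] proposes a robust optimal-state evaluation method based on total-latent robust partial M-estimation, which weights the sample data to eliminate the influence of outliers and improves robustness when modeling low-quality industrial data.

However, because real production units exhibit pronounced dynamics and feedback regulation, process measurements are strongly correlated in time and behave as typical multivariate time series. Static analysis alone often cannot perceive the operating conditions fully, which leads to poor model generalization and serious false-alarm and missed-alarm problems [13-15]. Researchers have therefore proposed a variety of dynamic statistical modeling methods, such as dynamic PLS [16], multiscale principal component analysis [17], and state-space equations [18]. None of these methods, however, separates dynamic information from steady-state information clearly, so they are insensitive to dynamic anomalies. Slow feature analysis (SFA) is an effective unsupervised learning method for extracting dynamic information from multivariate time series. The latent variables of SFA vary slowly and capture the intrinsic dynamics of the process. Ref. [13] uses SFA to build separate statistical models of the steady-state distribution P(s) of the latent variables s and the dynamic distribution P(ṡ) of ṡ, using P(s) to describe the static characteristics of the process variables and P(ṡ) to describe the dynamic characteristics of the process. This gives the monitoring statistics distinct physical meanings and provides more comprehensive monitoring information. In addition, SFA is deeply connected with modern control theory and has therefore been widely applied to process data analytics in recent years [19-20]. Ref. [21] combines canonical variate analysis with slow feature analysis to extract static and dynamic features fully and thus monitor the operating state effectively. Ref. [22] considers dynamic feature extraction under closed-loop control to distinguish different operating states finely. Ref. [23] extracts dynamic information through slow feature analysis and proposes a dynamic-static cooperative random forest method for accurate fault classification.

In actual production, an experienced engineer's qualitative assessment of the operating state can be summarized as an abstract key performance indicator (KPI) that carries deep information about the operating state. The KPI is used to represent, in a unified way, the indices that decision makers care about most in a large, complex industrial process; together with traditional parameters such as temperature, pressure, flow, and level, it forms multi-source, multi-rate process data. In essence, however, a KPI describes the quantitative relationship between the performance of low-level technologies or components and high-level production quality, production efficiency, and energy and raw-material consumption [24-25], and such information is usually difficult to obtain online in real time; operating state evaluations in particular require post hoc analysis by experienced operators. One feasible approach is to build, from historical data, a mathematical model relating operating state evaluation results to the fast-sampled process variables, so that the KPI can be predicted in real time and the operating state evaluated online.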
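For example, Ref. [26] monitors process performance indices effectively by introducing a performance index to decompose the dynamic information of SFA.

To this end, this paper proposes a new KPI-driven SFA model that perceives the static and dynamic characteristics of the process cooperatively and makes effective use of operating state evaluation information. On this basis, a new operating state evaluation framework for complex industrial processes is established. SFA itself is an unsupervised learning algorithm. Building on the SFA optimization objective, this paper further exploits the correlation between the latent variables and the KPI, so that operating state evaluation information is incorporated into the slow features. Static and dynamic information is perceived cooperatively while the correlation between the slow features and the KPI increases markedly. By computing the similarity between latent variables t and between their first-order differences ṫ, the operating state is described in detail from both the steady-state and the dynamic perspective, which deepens engineers' understanding of the process operating state.

Once a non-optimal state has been detected, operators are most concerned with identifying the key variables responsible for it. Traditional identification relies mainly on contribution plots, whose accuracy is low when the variable dimension is high and sampling noise and disturbances are significant. Building on the evaluation method, this paper further proposes a group-lasso-based non-optimal factor identification method that removes the influence of irrelevant variables and identifies the non-optimal variables accurately, giving engineers useful guidance for targeted adjustment and maintenance.

The main contributions of this paper are as follows: 1) The KPI-driven SFA algorithm is used to perceive the static and dynamic characteristics of the process. Extracting the intrinsic characteristics of the process reduces the influence of process noise and other uncertainties, so the method produces fewer false and missed alarms than general operating state evaluation methods. 2) A unified operating state evaluation framework based on static-dynamic cooperative perception is proposed. A dynamic index is designed and analyzed together with the static index, so that the process state is evaluated accurately and the direction of state change is identified. 3) A sparse-learning-based non-optimal factor identification method is proposed that recovers group-sparse non-optimal variables from the original process variables. The group lasso penalizes the overall contribution of each variable over a past period and forces the overall contribution of irrelevant variables toward zero, so that non-optimal variables are identified and located accurately.

1 Framework of the operating state evaluation method based on static-dynamic cooperative perception

The operating state evaluation method based on static-dynamic cooperative perception consists of three steps: offline modeling, online evaluation, and non-optimal factor identification.

To overcome the inability of traditional methods to perceive operating conditions fully, this paper proposes the KPI-driven SFA offline modeling algorithm. It extracts static and dynamic information from the modeling data cooperatively, decomposes the feature space according to its correlation with the economic index, separates out the feature space directly related to the economic index, and computes the first-order differences of the latent variable samples to build the offline evaluation model. On this basis, an online evaluation method based on static-dynamic cooperative perception is proposed, which addresses the difficulty that current methods have in resolving static and dynamic characteristics fully. With suitable evaluation rules, the operating state is identified online from the static evaluation index and the trend of the operating state is identified online from the dynamic evaluation index, giving a comprehensive evaluation of each state and of the transitions between states. Because non-optimal factor identification tends to flag non-key variables by mistake, this paper proposes sparse-learning-based identification: the group lasso penalizes the overall contribution of each variable over a past period, forces the overall contribution of irrelevant variables to approximately zero, and identifies the non-optimal variables from the group-wise contribution (GWC).

1.1 Offline operating state evaluation model based on KPI-driven SFA

Suppose there are m process variables, X is the data matrix composed of m-dimensional input vectors, and Y is the process KPI vector. Operating state evaluations are usually divided into grades. Without loss of generality, assume three grades {optimal, good, fair}; the scalar values {1, 2, 3} can then represent the evaluation results, and these values make up the KPI vector Y.

SFA is an unsupervised learning method that maps X to a low-dimensional space by solving the following optimization problem, yielding several mutually uncorrelated slow features:

[equation]

subject to the following constraints:

[Equations (2)-(4)]

For a discrete-time signal, ṡ_j(t) = s_j(t) − s_j(t−1) denotes its difference and ⟨·⟩_t denotes the expectation over time. Constraints (2) and (3) require the features to have zero mean and unit variance, and constraint (4) ensures that the features carry no redundant information. If g_j(·) is a linear mapping g_j(x) := w_j^T x and only the slowest feature is sought, the problem can be written approximately as:

[equation]

where ΔX is the matrix formed from the first-order differences ẋ(t) of the inputs. To address the limitation that the latent variables are unrelated to product quality indices, Ref. [27] proposed a quality-related supervised learning method. Building on it, this paper proposes a more widely applicable slow feature analysis method driven by key performance indicators, KPI-driven SFA, which extracts latent variables that vary slowly and are closely related to the KPI. Its optimization problem is:

[equation]

where the first term of the objective maximizes the covariance between the latent variable vector S = Xw and the KPI vector Y, so that the slow features are strongly cross-correlated with the KPI, and the second term is the original SFA objective. α > 0 is a regularization penalty factor that trades off the two objectives so that the slow features describe the information in both the process variables X and the KPI. A reasonable value of α is bounded above: as α increases, the weight matrix of the quadratic term in the objective, X^T Y Y^T X − α ΔX^T ΔX, eventually ceases to be positive definite, and the prediction error begins to increase and oscillate [27].

To simplify the solution, this paper modifies the nonlinear iterative partial least squares algorithm [28] to obtain the optimal solution of problem (5). The basic steps are given in Algorithm 1.

Algorithm 1. KPI-driven SFA algorithm
Input. Training samples X, ΔX, Y.
Output. Regression coefficient matrix R, score vector t.
1) Randomly select a column of Y_a as the initial u_a and a column of ΔX_a as the initial r_a;
2) Compute the projection direction w_0: [equation]
3) Obtain the normalized vector w_a: [equation]
4) Project X_a onto w_a to obtain the score of X_a, that is, the slow feature; project ΔX_a onto w_a to obtain the score of ΔX_a: [equation]
5) The remaining steps are the same as in the conventional nonlinear iterative partial least squares algorithm.

The slow features s(t) obtained with KPI-driven SFA contain both the static and the dynamic information in the process data. When the KPI comes from operating state evaluation results, s(t) also contains evaluation information, which makes accurate evaluation of the operating state possible. From the process data of each state grade, the corresponding slow features s are computed, and their first-order differences ṡ are taken to describe changes in the dynamic characteristics, thereby achieving cooperative perception of static and dynamic characteristics.

1.2 Online operating state evaluation method based on static-dynamic cooperative perception

For online evaluation, the online data in a sliding window of length H are first used to compute the centre distance between their latent variables s and the latent variables s of each state grade:

[Equation (7)]

and the centre distance between the first-order differences ṡ of the latent variables:

[Equation (8)]

In Eqs. (7) and (8), s_n^c denotes the N_c latent variable samples collected under steady operating state c, and ṡ_n^c denotes the corresponding first-order difference samples. d_k^c represents the similarity between the current operating state and operating state c, and ḋ_k^c represents the tendency to move between operating states. E is an offset that encodes the direction of the trend. If the current trend is opposite to the vector E, the value of ḋ_k^c is small. Fig. 1 shows the flow chart of the online evaluation.

Fig. 1 Flow chart of online evaluation of the operating status of complex industrial processes based on static-dynamic cooperative perception

1.3 Non-optimal factor identification based on sparse learning

Existing non-optimal factor identification methods are based on static variable contribution analysis and consider only the process variables responsible for the non-optimal state at the current sampling instant. Complex industrial processes, however, usually have large disturbances, strong noise, and heavily coupled variables, and the industrial data collected are generally of low quality. Traditional static identification is easily affected by disturbances and noise and often flags non-key variables as non-optimal factors, which misleads engineers' operating decisions [29-33]. This paper therefore proposes a new sparse-learning-based method for identifying non-optimal variables. A non-optimal operating state is usually caused by a small number of variables, so the non-optimal variables are sparse, and their identification can be viewed as a sparse learning problem, for which lasso regularization is the most commonly used technique. In online use, however, a sliding window is often applied to raise the signal-to-noise ratio and reduce the effect of noise on the evaluation; the window contains not only the current sample but also several historical samples. The input variables of the model then form groups, each group holding all samples of one variable within the window. The non-optimal variables are therefore group sparse, and lasso regularization cannot produce group-sparse estimates. For this reason, this paper adopts the group-lasso-regularized fault diagnosis method of Ref. [34]: the group lasso penalizes the overall contribution of each variable within the window, forces the overall contribution of irrelevant variables toward zero, and recovers group-sparse non-optimal variables from the original process variables. First, the similarity d_k^c is expressed in the quadratic form Index(x) = x^T M x. By minimizing d_k^c with a group lasso penalty on the overall contribution of each variable within the sliding window, the approximate location of the non-optimal variables responsible for the state change is identified from the group contributions, and the variable contribution rates are then examined to determine the non-optimal variables.

Following the idea of group lasso regularization, all process variables can be divided by time into m groups:

[equation]

where the dimension of each group vector x_k equals the sliding window length H. The identification of non-optimal variables is then transformed into the following unconstrained optimization problem:

[Equation (10)]

where the group lasso penalty Σ_{k=1}^{m} √H ∥f_k∥_2 forces the solution to be sparse at the group level. The regularization factor λ > 0 balances the two objectives. If λ is too large, the contributions of all variables are forced to approximately zero; if λ is too small, f → x and the identification result becomes meaningless. The choice of λ therefore has a significant effect on the result. One feasible approach is to select λ from a finite monotone sequence {λ*_1 > λ*_2 > ··· > λ*_J}, where each λ* corresponds to a different group sparsity. If f is estimated well, the reconstructed variable x* = x − f has a high similarity d_k^c to the optimal state, so the value of λ can be determined by the following criterion [34]:

[equation]

where γ_M is the control limit of the quadratic index Index(x) under optimal operating conditions. Because the quadratic index Index(x*) follows a chi-square distribution with M degrees of freedom, the statistic falls within the control limit if λ is chosen appropriately. This indicates that the non-optimal variables have been identified essentially correctly and that, after reconstruction of the input variables, the data distribution is close to that of the optimal operating condition.

Solving the regularized least squares problem in Eq. (10) [35] yields a concise, clear, low-complexity identification result and improves interpretability. The group-wise contribution is defined as:

[Equation (12)]

The group-wise contribution of an irrelevant variable is usually zero, so the non-optimal variables can be determined by inspecting the group-wise contributions. Fig. 2 shows the steps of the sparse-learning-based non-optimal factor identification algorithm.

Fig. 2 Traceability flowchart of non-optimal factors based on sparse learning

In summary, the specific steps of the operating state evaluation of complex industrial processes based on static-dynamic cooperative perception are as follows:

Step 1. Take historical data (x, y) in which the state grade c of the data (x_c, y_c) is known;
Step 2. After standardizing X and Y, build the algorithm model and solve objective function (6): [equation]
Step 3. Compute the comprehensive KPI index R(i): [equation]
Step 4. Use the index R(i) to remove the quality-unrelated features from R, obtaining R_q;
Step 5. Compute the score vector of each state: [equation]
Step 6. Apply time-lag augmentation to the score vector of each state and compute its first-order difference, obtaining ṫ_c;
Step 7. Construct the sliding data window at time K, build the time-lag augmented matrix, and standardize it;
Step 8. Compute the score vector of the online data: [equation]
Step 9. Apply time-lag augmentation to the score vector of the online data and compute its first-order difference, obtaining ṫ_on,q;
Step 10. Use Eq. (7) to compute the centre distance between the score vector and each state grade;
Step 11. Use Eq. (8) to compute the centre distance between the first-order difference of the score vector and the first-order difference of each state grade in the quality-related set;
Step 12. From the distance d_k^c, define the evaluation index of the online data with respect to each state grade: [equation]
Step 13. From the distance ḋ_k^c, define the evaluation index of the online data with respect to each state grade: [equation]
Step 14. Evaluate the operating state online from the evaluation indices:
1) When γ_k^c̃ = max_{1≤c≤C} {γ_k^c} > δ, the quality-related process variation in the online data is consistent with that of the state grade, and the operating state can be judged to be c̃;
2) When 1) does not hold but the condition γ_{k−w+1}^c̃ < ··· < γ_k^c̃ holds, the operating state is in transition between state grades, that is, the process is changing gradually;
3) When γ̇_k = max{γ̇_k} = 1/2, the rate of change of the quality-related process variation in the online data has not changed noticeably, and the process state can be judged to be unchanged;
4) When γ̇_k = max{γ̇_k} < 1/2, the rate of change of the quality-related process variation in the online data has changed; the process can be judged to be in transition and beginning to deteriorate;
5) When γ̇_k = max{γ̇_k} > 1/2, the rate of change of the quality-related process variation in the online data has changed; the process can be judged to be in transition and beginning to improve.
Step 15. Combine the results of Step 14 to give the evaluation result based on static-dynamic cooperative perception:
1) When 1) and 3) of Step 14 hold, the static information gives state grade c̃ and the dynamic information indicates that the state has not changed, so the operating state can be judged to be c̃;
2) When 3) of Step 14 holds but 2) does not, the static information indicates that the state grade has not changed and γ_k^c̃ = max_{1≤c≤C} {γ_k^c}, and the dynamic information indicates that the state has not changed, so the operating state can be judged to be c̃;
3) When 2) and 4) of Step 14 hold, the static information indicates that the state is changing and the dynamic information indicates that it is moving in a worse direction, so the operating state can be judged to be moving from the previous state toward a worse state;
4) When 2) and 5) of Step 14 hold, the static information indicates that the state is changing and the dynamic information indicates that it is improving, so the operating state can be judged to be evolving from the previous state toward a better state;
5) If none of 1) to 5) of Step 14 holds, the evaluation result of the previous instant is retained.
Step 16. For a process evaluated as non-optimal, carry out non-optimal factor identification: group the process data by sampling time as (B_1, B_2, ···, B_m), construct the unconstrained optimization function for identification, and solve objective function (10);
Step 17. Compute each group-wise contribution according to Eq. (12);
Step 18. From the computed contributions, identify the variables responsible for the non-optimal operating state.

2 Case study: dense medium coal preparation process

The proposed evaluation method is applied to a dense medium coal preparation process, and its effectiveness is verified on data from an actual plant.

2.1 Overview of the dense medium coal preparation process

Dense medium coal preparation is a typical process industry operation running in a harsh, open environment. Uncertainties and disturbances are frequent, so data quality is low. Coal preparation, also called coal washing, aims to remove impurities from raw coal and reduce the ash content in order to supply clean coal and steam coal to users.

The preparation flowsheet involves many instruments. In outline, raw coal is crushed and screened; once its particle size meets the requirement, it is mixed in the mixing tank with dense medium from the qualified medium tank and pumped under pressure into the dense medium cyclone for separation, which yields low-ash clean coal and high-ash tailings. The underflow and overflow of the cyclone pass over medium-draining screens for medium recovery; after water and magnetite powder are added to adjust the density, the medium re-enters the separation process, as shown in Fig. 3 [36]. The slime content is a relatively important variable. In the coal industry, coal quality is generally evaluated by ash content, for both raw coal and clean coal.

Suspension density is the key controlled variable in the separation stage. Its main role is density feedback, which keeps the density stable and safeguards the quality of the washed product. In general, the clean coal obtained after washing is typically examined by rapid ash assay to …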
Mining Cross-graph Quasi-cliques in Gene Expression and Protein Interaction Data∗

Jian Pei
Simon Fraser University
jpei@cs.sfu.ca

Daxin Jiang
State University of New York at Buffalo
djiang3@

Aidong Zhang
State University of New York at Buffalo
azhang@

∗ This research is partly supported by the Endowed Research Fellowship and the President Research Grant from Simon Fraser University, NSF grants IIS-0308001, DBI-0234895, and NIH grant 1P20GM067650-01A1. All opinions, findings, conclusions and recommendations in this paper are those of the authors and do not necessarily reflect the views of the funding agencies.

1 Problem Description and Model

A protein is the product of a gene. From the gene expression data, we can find co-expressed genes, which are groups of genes that demonstrate coherent patterns on samples. On the other hand, from the protein interaction data, we can find groups of proteins that frequently interact with each other. If we can conduct a joint mining of both gene expression data and protein interaction data, then we may find clusters of genes that are co-expressed and whose proteins also interact.

Such clusters found from the joint mining are interesting and meaningful for at least two reasons. First, both the gene expression data and the protein data are very noisy. The clusters confirmed by both data sets will strongly indicate the correlation/connection among the genes in a cluster. In other words, the clusters found from the joint mining are more reliable. We may thus have high confidence that the genes in a cluster found as such are regulated by the same mechanism or belong to the same biological process.

Second, although highly related, gene expression data and protein interaction data still carry different biological meaning. The coincidence of co-expressed genes and interacting proteins is biologically significant. As indicated in [5], many pathways exhibit two properties: their genes exhibit a similar gene expression profile, and the protein products of the genes often interact.

1.1 Model

Technically, a gene expression data set is a matrix W = {w_ij} for a set G of n genes and a set S of m samples, where w_ij (1 ≤ i ≤ n, 1 ≤ j ≤ m) is the expression level of gene g_i on sample s_j.

Two genes g_1 and g_2 are called coherent if they show similar expression patterns on the set of samples. There are different methods to measure the similarity (or distance) between gene expression patterns as required by the application domain, such as Euclidean distance, Pearson's correlation coefficient, KL-distance [2], and pattern-based similarity measures [1, 6]. Without loss of generality, in this paper, we simply assume that a similarity measure sim(·, ·) is specified, and the higher the similarity value, the more similar the genes.

We can define a binary relation ∼ on the set of genes. For genes g_1 and g_2, g_1 ∼ g_2 if sim(g_1, g_2) ≥ δ, where δ is a user-specified minimum similarity threshold.

Naturally, the relation ∼ can be represented as a gene expression graph geneG = (G, E): the genes are the vertices, and (g_1, g_2) ∈ E if g_1 ∼ g_2.

Similarly, for a set of proteins P, if we have the data about the interactions between proteins, we can define a protein interaction graph proteinG = (P, I): the proteins are treated as vertices, and (p, p′) ∈ I if proteins p and p′ interact with each other.

For gene expression data, a subset of genes forms a perfect cluster if each gene in the subset is similar to all the others in the same subset. For protein interaction data, a subset of proteins forms a perfect cluster if each protein in the subset interacts with all the others in the same subset. To generalize, in gene expression/protein interaction graphs, a perfect cluster is a clique [footnote 1].

Footnote 1: In this paper, we follow the terminology usage that a clique is a maximal subset of mutually adjacent vertices in a graph.

However, due to the noise in the data sets, we may not be able to expect perfect clusters. Instead, a user may be interested in a subset of genes/proteins as a cluster such that each gene in the subset is similar to most of the other genes in the cluster, and each protein in the subset interacts with most of the other proteins in the cluster.

To quantify, for a user-specified threshold γ (0 < γ ≤ 1), a subset C of k genes forms a γ-quasi-cluster if each gene g ∈ C is similar to at least γ · (k − 1) other genes in C. Similarly, we can define a γ-quasi-cluster for protein interaction data. Clearly, a maximal γ-quasi-cluster is a γ-quasi-clique in the corresponding gene expression/protein interaction graph.

Since a protein is a product of a gene, there is a mapping f from the set of proteins P to the set of genes G: f(p) = g if protein p is the product of gene g.

We are particularly interested in subsets of proteins C such that C is a γ_1-quasi-cluster in the protein interaction data and {f(c) | c ∈ C} is a γ_2-quasi-cluster in the gene expression data, where γ_1 and γ_2 are user-specified parameters. We call C a cross-data set cluster. Moreover, C is particularly interesting if it is maximal.

1.2 Why Is the Problem Challenging?

One may ask, "Can we solve the joint mining problem by a simple extension of the existing techniques?" Unfortunately, the answer is no.

A natural idea is as follows. We can integrate the multiple graphs into one based on a similarity function between data objects. The integrated similarity function combines the similarity between data objects in different data sets in some weighted manner. Then, we can find quasi-cliques in the integrated graph.

However, the above naïve method does not work at all.
The key is that vertices of cross-graph quasi-cliques can be connected in different ways in individual graphs. Therefore, the integrated graph cannot capture the cross-graph quasi-cliques. It is easy to come up with a counter example to show that a cross-graph quasi-clique is not a quasi-clique in the integrated graph.

2 Experimental Results

We use the cell-cycle gene expression data CDC28 and the corresponding protein-protein interaction data from DIP as the data set. We found 4,668 matched gene-protein pairs between CDC28 and DIP. For the CDC28 data set, we set the coherence threshold ρ = 0.5 (using Pearson's correlation coefficient as the measure). As a result, the gene graph G_E contains 865,080 edges whose both endpoints (genes) appear in the matched gene-protein pairs. After removing the self-interacting protein pairs, the protein graph G_P contains 15,115 edges whose both endpoints (proteins) appear in the matched gene-protein pairs.

In our experiments, we find the complete set of quasi-cliques across the gene graph G_E and the protein graph G_P. We set γ_E = 1 for G_E, γ_P = 0.5 for G_P, and min_s = 5. That is, we are interested in a subset of at least 5 genes whose expression patterns are coherent with each other, and whose corresponding proteins frequently interact with each other.

Figure 1 shows an example pattern Q (γ_E = 1 and γ_P = 0.4). The induced graph of G_E (the gene expression graph) on Q is a perfect clique, so we only show the induced graph of G_P (the protein interaction graph) on Q here. The pattern contains 11 vertices. We use the ORF (Open Reading Frame) names to identify the corresponding genes and proteins.

Figure 1. A cluster of 11 proteins.

Although the exact biological meaning of this pattern is still under investigation, it is very interesting in biology since these 11 genes are highly coherent and the corresponding 11 proteins are intensively interacting.

3 Related Work

For more examples on joint mining of multiple sources, Page and Craven [3] surveyed the biological applications of mining multiple tables, such as pharmacophore discovery, gene regulation, information extraction from text and sequence analysis.

Recently, joint mining of multiple biological data sets has received intense interest. As a pioneering work, Segal et al. [5] proposed a unified probabilistic model to learn the pathways from gene expression data and protein interaction data. However, their method requires the users to input the number of pathways, which is usually unknown in advance.

4 About the Full Version of This Paper

In the full version of this paper [4], we built a general model, investigated the properties of the problem and the computational complexity, and developed an effective and efficient algorithm to tackle the problem. A systematic performance study was also reported.

References

[1] Cheng, Y. and Church, G.M. Biclustering of expression data. Proceedings of ISMB'00, 8:93–103, 2000.
[2] Kasturi, J., Ramanathan, M. and Acharya, R. An information theoretic approach for analyzing temporal patterns of gene expression. Bioinformatics.
[3] Page, D. and Craven, M. Biological Applications of Multi-relational Data Mining. SIGKDD Explorations, 5(1):69–79, July 2003.
[4] Pei, J., Jiang, D. and Zhang, A. On Mining Cross-graph Quasi-cliques. Technical Report TR 2004-15, School of Computing Science, Simon Fraser University.
[5] Segal, E., Wang, H. and Koller, D. Discovering molecular pathways from protein interaction and gene expression data. Bioinformatics, 19:i264–i272, 2003.
[6] Wang, H., Wang, W., Yang, J. et al. Clustering by Pattern Similarity in Large Data Sets. In Proceedings of SIGMOD'02, pages 394–405, 2002.
A Partial Join Approach for Mining Co-location Patterns∗

Jin Soung Yoo
Department of Computer Science and Engineering
University of Minnesota
200 Union ST SE 4-192
Minneapolis, MN 55414
jyoo@

Shashi Shekhar
Department of Computer Science and Engineering
University of Minnesota
200 Union ST SE 4-192
Minneapolis, MN 55414
shekhar@

ABSTRACT

Spatial co-location patterns represent the subsets of events whose instances are frequently located together in geographic space. We identified the computational bottleneck of a current co-location mining algorithm: a large fraction of the execution time of the join-based co-location miner is devoted to computing joins to identify instances of candidate co-location patterns. We propose a novel partial-join approach for mining co-location patterns efficiently. It transactionizes continuous spatial data while keeping track of the spatial information not modeled by transactions. It uses a transaction-based Apriori algorithm as a building block and adopts the instance join method for residual instances not identified in transactions. We show that the algorithm is correct and complete in finding all co-location rules which have prevalence and conditional probability above the given thresholds. An experimental evaluation using synthetic datasets and a real dataset shows that our algorithm is computationally more efficient than the join-based algorithm.

Categories and Subject Descriptors
H.2.8 [Database Applications]: Data mining, GIS and Spatial databases

General Terms
Algorithms

Keywords
Spatial data mining, Association rule, Co-location, Join

∗ This work was partially supported by the Digital Technology Center of the University of Minnesota, NASA grant No. NCC21231 and the Army High Performance Computing Research Center under the auspices of the Department of the Army, Army Research Laboratory cooperative agreement number DAAD19-01-2-0014, the content of which does not necessarily reflect the position or the policy of the government, and no official endorsement should be inferred.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
GIS'04, November 12–13, 2004, Washington, DC, USA.
Copyright 2004 ACM 1-58113-979-9/04/0011...$5.00.

1. INTRODUCTION

A co-location represents a subset of spatial boolean events whose instances are often located in a neighborhood. Boolean spatial events describe the presence or absence of geographic object types at different locations in a two dimensional or three dimensional metric space, e.g., the surface of the Earth. Examples of boolean spatial events include business types, mobile service requests, disease, crime, climate, plant species, etc. Spatial co-location patterns may yield important insights for many applications. For example, a mobile service provider may be interested in service patterns frequently requested in a close location, e.g., ‘today sales’ and ‘nearby stores’. The frequent neighboring request sets may be used for providing attractive location-sensitive advertisements, promotions, etc. Figure 1 shows the locations of different types of business in a downtown area of Minneapolis, Minnesota. We can notice two prevalent co-location patterns, i.e., {‘auto dealers’, ‘auto repair shops’} and {‘department stores’, ‘gift stores’}. Other application domains for co-locations are Earth science, environmental management, government services, public health, public safety, transportation, tourism, etc.

Figure 1: An example of co-location patterns found among different types of businesses in a city. A = Auto dealers, R = auto Repair shops, D = Department stores, G = Gift stores and H = Hotels.

Figure 2: Examples to illustrate different approaches to discover co-location patterns. (a) An example dataset; {A, B, C}'s instances = {(A.2, B.4, C.2), (A.3, B.3, C.1)}. (b) An explicit transactionization of a spatial dataset can split instances of co-locations; {A, B, C}'s instances = {(A.2, B.4, C.2)}. (c) The non-overlapping grouping method can generate sets of different instances; {A, B, C}'s instances = {(A.2, B.4, C.2)} or {(A.2, B.4, C.2), (A.3, B.3, C.1)}. (d) The instance join method generates complete instances but computation is expensive; {A, B, C}'s instances = {(A.2, B.4, C.2), (A.3, B.3, C.1)}.

Co-location rule discovery is a process to identify co-location patterns from an instance dataset of spatial boolean events. It is not trivial to adopt association rule mining algorithms [1, 8, 13, 18] to mine co-location patterns since instances of spatial events are embedded in a continuous space and share a variety of spatial relationships. Reusing association rule algorithms may require transactionizing spatial datasets, which is challenging due to the risk of transaction boundaries splitting co-location pattern instances across distinct transactions. Figure 2(a) shows an example spatial dataset with three spatial events, A, B, and C. Each instance is represented by its event type and unique instance id, e.g., A.1. Solid lines show neighbor relationships over event instances.
For example, {A.2, B.4, C.2} and {A.3, B.3, C.1} are the instances of co-location {A, B, C} since their event instances are neighbors of each other. Figure 2(b) shows the problem of explicit transactionization. Rectangular grids are used to produce transactions over the spatial dataset. As can be seen by the solid line circle, the only identified instance of co-location {A, B, C} is {A.2, B.4, C.2}. The instance {A.3, B.3, C.1} is missed due to the split caused by the transaction boundaries.

Related Work: In previous work on co-location pattern discovery, a few approaches have been developed to identify instances of candidate co-location patterns. One approach [12] groups neighboring instances arbitrarily with a non-overlapping instance grouping constraint. This disjoint grouping method may yield different instance sets depending on the order of grouping. For example, Figure 2(c) illustrates different instance sets of co-location {A, B, C} resulting from the order in which instances of the size 2 co-location {A, B} are grouped. If the instance {A.4, B.3} is grouped first, the instance {A.3, B.3} is not identified, even though it is a neighborhood instance, since B.3 already belongs to instance {A.4, B.3}. Consequently, the instance {A.3, B.3, C.1} of co-location {A, B, C} is also not found.

Another approach [15] generates instances of candidate co-locations without missing any by using an instance join method. For example, in Figure 2(d), the instances of co-location {A, B} and the instances of co-location {A, C} are joined and their neighbor relations are checked for generating instances of co-location {A, B, C}. {A.2, B.4, C.2} and {A.3, B.3, C.1} are correctly generated. The join-based algorithm may be useful in analyzing datasets of sparse instances. However, scaling the algorithm to substantially large dense spatial datasets is challenging due to the increasing number of co-location patterns and their instances.
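The instance join described above can be sketched in a few lines of Python. This is a minimal illustration and not the implementation of [15]: the function name join_instances and the representation of the neighbor relation as a set of unordered pairs are assumptions made for this example. The table instances are the ones listed for Figure 2(d).

```python
from itertools import product

def join_instances(table_ab, table_ac, neighbors):
    """Join two size-2 table instances that share their first event instance,
    keeping only combinations whose remaining two instances are neighbors."""
    result = []
    for (a1, b), (a2, c) in product(table_ab, table_ac):
        if a1 == a2 and frozenset((b, c)) in neighbors:
            result.append((a1, b, c))
    return sorted(result)

# Table instances of the size-2 co-locations, as listed for Figure 2(d).
table_ab = [("A.1", "B.1"), ("A.2", "B.4"), ("A.3", "B.3"), ("A.4", "B.3")]
table_ac = [("A.1", "C.1"), ("A.2", "C.2"), ("A.3", "C.1"), ("A.3", "C.3")]
table_bc = [("B.3", "C.1"), ("B.4", "C.2")]

# Hypothetical representation: the B-C neighbor pairs as unordered pairs.
neighbors = {frozenset(pair) for pair in table_bc}

table_abc = join_instances(table_ab, table_ac, neighbors)
print(table_abc)
```

Under these inputs the join keeps (A.2, B.4, C.2) and (A.3, B.3, C.1) and discards (A.1, B.1, C.1) and (A.3, B.3, C.3), whose B and C instances are not neighbors. The nested pairing over both tables also hints at why the join becomes expensive as table instances grow.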
Other co-location mining work [17] presents a framework for extended spatial objects, e.g., polygons and line strings. It also uses an instance join method to identify nearby spatial objects. This paper proposes a novel approach for efficient co-location pattern mining. We make the following contributions.

Our Contributions: First, we identified the computational bottleneck in the execution time of the join-based co-location mining algorithm [15]. A large fraction of the algorithm's execution time is devoted to computing joins to identify instances of candidate co-location patterns. Second, we propose a novel partial-join approach for mining co-location patterns efficiently. It transactionizes continuous spatial data while keeping track of the spatial information not modeled by transactions. This approach is based on an important observation that only event instances having at least one cut neighbor relation are related to co-location instances split over transactions. Third, we present an efficient co-location mining algorithm to concretize the partial-join approach. It uses a transaction-based Apriori algorithm [1] as a building block and adopts the instance join method [15] of the join-based co-location mining algorithm for generating residual co-location instances not identified by transactions. Fourth, we prove that the partial join algorithm is correct and complete in finding all co-location rules with prevalence and conditional probability above the given thresholds. Fifth, we provide an algebraic cost model to characterize the dominance zone of the performance between our partial-join algorithm and the join-based algorithm. Finally, we conducted experiments using a real dataset as well as synthetic datasets. The experimental evaluation shows that our algorithm is computationally more efficient than the full join-based mining algorithm.

The remainder of the paper is organized as follows. Section 2 presents an overview of basic concepts of co-location pattern mining. In Section 3, we present the partial
join approach for efficient co-location mining. Section 4 describes the partial join co-location mining algorithm. The proofs of correctness and completeness of the algorithm, and an algebraic cost model are given in Section 5. Section 6 presents experimental evaluations. We give the conclusion and discuss future work in Section 7.

2. CO-LOCATION PATTERN MINING: BASIC CONCEPTS
This section describes the basic concepts for mining co-location patterns.

Given a set of boolean spatial events E = {e_1, ..., e_k}, a set S of their instances {i_1, ..., i_n}, and a reflexive and symmetric neighbor relation R over S, a co-location C is a subset of boolean spatial events, i.e., C ⊆ E, whose instances I ⊆ S form a clique [3] using neighbor relation R. For simplicity, we use a metric-based neighbor relation R: two event instances i_1 and i_2 are neighbors, neighbor(i_1, i_2), if their Euclidean distance is at most a user-specified threshold.

A co-location rule is of the form C_1 → C_2 (p, cp), where C_1 and C_2 are disjoint co-locations, p is a value representing the prevalence measure, and cp is the conditional probability.

A neighborhood instance I of a co-location C is a row instance (simply, instance) of C if I contains instances of all events in C and no proper subset of I does so. For example, in Figure 2(d), {A.1, B.1} is a row instance of co-location {A, B}. {A.3, C.1, C.3} is a neighborhood in Figure 2(a) but it is not a row instance of co-location {A, C} because its subset {A.3, C.1} contains instances of all events in {A, C}. The table instance of a co-location C is the collection of all row instances of C. For example, the table instance of {B, C} in Figure 2(d) has two row instances, {B.3, C.1} and {B.4, C.2}.

The conditional probability, Pr(C_1 | C_2), of a co-location rule C_1 → C_2 is the probability of finding an instance of C_2 in the neighborhood of an instance of C_1. Formally, it is estimated as

$$\frac{|\pi_{C_1}(\text{table instance of } C_1 \cup C_2)|}{|\text{table instance of } C_1|},$$

where π is a projection operation with duplication elimination.

The participation index, Pi(C), is used as a
co-location prevalence measure. The participation index of a co-location C = {e_1, ..., e_k} is defined as $Pi(C) = \min_{e_i \in C} \{Pr(C, e_i)\}$, where Pr(C, e_i) is the participation ratio for event type e_i in a co-location C. Pr(C, e_i) is the fraction of instances of e_i which participate in any instance of co-location C, i.e.,

$$Pr(C, e_i) = \frac{|\pi_{e_i}(\text{table instance of } C)|}{|\text{table instance of } e_i|},$$

where π is a projection operation with duplication elimination. For example, in Figure 2(a), the total number of instances of event type A is 4 and the total number of instances of event type C is 3. From Figure 2(d), the participation index of co-location c = {A, C} is min{Pr(c, A), Pr(c, C)} = 3/4 because Pr(c, A) is 3/4 and Pr(c, C) is 3/3. A high participation index value indicates that the spatial events in a co-location pattern likely show up together.

Lemma 1. The participation ratio and the participation index are monotonically non-increasing as the size of the co-location increases.

Proof. Please refer to [15] for the proof.

Lemma 1 ensures that the participation index can be used to effectively prune the search space of co-location pattern mining.

3. PARTIAL JOIN APPROACH FOR CO-LOCATION PATTERN MINING
This section defines our partial join approach for efficient co-location pattern mining.

3.1 Problem Definition
We formalize the co-location mining problem as follows:
Given:
1) A set of k spatial event types E = {e_1, ..., e_k} and a set of their instances S = {i_1, ..., i_n}, where each i ∈ S is a vector <instance id, spatial event type, location> and location ∈ a spatial framework
2) A symmetric and reflexive neighbor relation R over locations
3) A minimal prevalence threshold (min_prev) and a minimal conditional probability threshold (min_cond_prob)
Find:
Find a correct and complete set of co-location rules with participation index > min_prev and conditional probability > min_cond_prob.
Objective:
Minimize computation cost.
Constraints:
1) R is a distance metric based neighbor relation.
2) Ignore edge effects in R.
3) Correct and complete in finding all co-location rules satisfying given thresholds.
4) Spatial
dataset is a point dataset.

3.2 Partial Join Approach
The basic idea of the partial join approach is to reduce the number of instance joins for identifying instances of candidate co-locations by transactionizing a spatial dataset under a neighbor relationship and tracing only residual neighborhood instances cut apart via the transactions. The key component of our approach is how we identify instances of co-locations split across explicit transactions. It is based on an observation that only event instances having at least one cut neighbor relationship are related to the neighborhood instances split over transactions. To formalize this idea, we provide a set of definitions of key terms related to the partial join approach.

Definition 1. A neighborhood transaction (simply, transaction) is a set of instances T ⊆ S that forms a clique using a neighbor relation R. A spatial dataset S is partitioned into a set of disjoint transactions {T_1, ..., T_n}, where T_i ∩ T_j = ∅ for i ≠ j and ∪(T_1, ..., T_n) = S.

We assume a spatial dataset S can be partitioned into a set of distinct transactions, i.e., each event instance i ∈ S belongs to exactly one transaction. For example, Figure 4 shows a set of transactions on the same example spatial dataset of Figure 2(a). The dashed circle represents a neighborhood region centered at an arbitrary location on a spatial framework. The instances within the dashed circle are neighbors of each other and thus form a transaction. For example, B.2 and B.5 form a transaction. A spatial dataset can be transactionized differently according to the partitioning method used. Thus the transactions generated using rectangular grids in Figure 2(b) differ slightly from the transactions illustrated in Figure 4. For example, in Figure 4, {A.3, C.1, C.3} forms a single transaction. By contrast, in Figure 2(b), it is divided into two transactions, {A.3, C.3} and {C.1}. We will examine the effect of different transactionization methods in future work.

Definition 2. A row instance I of a co-location C is an intraX row
instance (simply, intraX instance) of C if all instances i ∈ I belong to a common transaction T. The intraX table instance of C is the collection of all intraX row instances of C.

For example, in Figure 4, {A.3, C.1} is an intraX instance of co-location {A, C} but {A.1, C.1} is not since its event instances A.1 and C.1 are members of different transactions. The intraX table instance of {A, C} consists of {A.3, C.1}, {A.3, C.3} and {A.2, C.2}.

Definition 3. A neighbor relation r ∈ R between two event instances i_1, i_2 ∈ S, i_1 ≠ i_2, is called a cut neighbor relation if i_1 and i_2 are neighbors of each other but belong to distinct transactions.

Figure 4 presents cut neighbor relations as dotted lines. {A.1, C.1}, {A.3, B.3} and {B.3, C.1} have cut neighbor relations.

Definition 4. A row instance I of a co-location C is an interX row instance (simply, interX instance) of C if all instances i ∈ I have at least one cut neighbor relation. The interX table instance of C is the collection of all interX row instances of C.

For example, in Figure 4, {A.3, B.3} is an interX instance of co-location {A, B} because A.3 has a cut neighbor relation with B.3 and B.3 also has cut neighbor relations with A.3 and with C.1. Note that {A.3, C.1} is an interX instance as well as an intraX instance of {A, C}. The interX table instance of {A, C} has two interX instances, {A.1, C.1} and {A.3, C.1}.

Figure 3: The cases of possible instances of size 3 and of size 4 co-locations over transactions

Figure 4: An illustration of the partial join co-location mining algorithm

Figure 3 illustrates the possible instances of size 3 co-locations and of size 4 co-locations located over neighborhood transactions. Black dots signify event instances, circles are transactions, and lines show neighbor relations between two event instances. In particular, dotted lines signify cut neighbor relations. There are two types of instances of co-locations.
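Definitions 2 to 4 can be made concrete with a small sketch that tells the two types apart mechanically. The code below is illustrative only: the function names are hypothetical, and the transaction ids are an assumption (the text does not number the transactions of Figure 4), chosen so that they agree with every Figure 4 example given in the definitions above.

```python
def cut_relations(edges, trans_of):
    """Neighbor pairs whose two endpoints lie in different transactions."""
    return {e for e in edges if len({trans_of[i] for i in e}) == 2}

def classify(row_instance, trans_of, has_cut):
    """Return (is_intraX, is_interX) for one row instance (Definitions 2, 4)."""
    is_intra = len({trans_of[i] for i in row_instance}) == 1
    is_inter = all(i in has_cut for i in row_instance)
    return is_intra, is_inter

# ASSUMPTION: transaction ids are invented for illustration; only the
# grouping matters, and it matches the Figure 4 examples in the text.
trans_of = {"A.1": 1, "B.1": 1,
            "A.2": 2, "B.4": 2, "C.2": 2,
            "A.3": 3, "C.1": 3, "C.3": 3,
            "A.4": 4, "B.3": 4,
            "B.2": 5, "B.5": 5}

# Neighbor pairs taken from the Figure 2(d) table instances and the text.
neighbor_pairs = [("A.1", "B.1"), ("A.2", "B.4"), ("A.3", "B.3"),
                  ("A.4", "B.3"), ("A.1", "C.1"), ("A.2", "C.2"),
                  ("A.3", "C.1"), ("A.3", "C.3"), ("B.3", "C.1"),
                  ("B.4", "C.2"), ("B.2", "B.5"), ("C.1", "C.3")]
edges = {frozenset(p) for p in neighbor_pairs}

cut = cut_relations(edges, trans_of)
has_cut = {i for e in cut for i in e}  # instances with a cut relation

table_ac = [("A.1", "C.1"), ("A.2", "C.2"), ("A.3", "C.1"), ("A.3", "C.3")]
for row in table_ac:
    print(row, classify(row, trans_of, has_cut))
```

With this grouping the cut neighbor relations are exactly {A.1, C.1}, {A.3, B.3} and {B.3, C.1}, and {A.3, C.1} is reported as both an intraX and an interX instance, as noted after Definition 4.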
In the first type, all event instances of a co-location instance belong to a single transaction. In the second type, the event instances are distributed across two or more transactions. The former is the case of an intraX instance and the latter is an interX instance. We can observe that all event instances of interX instances are related to at least one cut neighbor relation (dotted lines).

Lemma 2. For a co-location C, the table instance of C is the union of the intraX table instance of C and the interX table instance of C.

Proof. The table instance of a co-location C is the collection of all (row) instances of C. First, we will show that any instance I = {i_1, ..., i_n} of C is an intraX instance of C or an interX instance of C. Since I forms a clique using a neighbor relation, all event instances of I can be included in a single neighborhood transaction according to Definition 1. In that case, I is an intraX instance. By contrast, if the event instances of I are not all in a single transaction, each member has at least one cut neighbor relation with a member in a different transaction because of their clique relation. Thus, I is an interX instance. Second, all instances of the intraX table instance and the interX table instance of C are row instances whose event instances form a clique according to Definition 2 and Definition 4.

4. PARTIAL JOIN CO-LOCATION MINING ALGORITHM
This section describes the partial join co-location mining algorithm. A transaction-based Apriori algorithm [1] is used as a building block to identify all intraX instances of co-locations. InterX instances are generated using the generalized apriori_gen function [15] of the join-based co-location mining algorithm. This approach is expected to provide a framework for efficient co-location mining since all instances in the transaction are neighbors of each other and no spatial operation or combinatorial operation, i.e., join, is required to find instances of candidate co-locations within a transaction, i.e., intraX instances. The computation cost of instance join operations
for generating only interX instances not identified in the transactions is relatively cheaper than that of finding all instances of co-locations. The partial-join mining algorithm for co-location patterns is described as follows.

Inputs
  E: a set of boolean spatial event types
  S: a set of instances <event type, event instance id, location>
  R: a spatial neighbor relation
  min_prev: prevalence value threshold
  min_cond_prob: conditional probability threshold
Output
  A set of all prevalent co-location rules with participation index greater than min_prev and conditional probability greater than min_cond_prob
Variables
  k: co-location size
  T: a set of transactions
  C_k: a set of size k candidate co-locations
  P_k: a set of size k prevalent co-locations
  R_k: a set of size k co-location rules
  IntraX_k: intraX table instances of C_k
  InterX_k: interX table instances of C_k, P_k
Method
  1) (T, InterX_2) = transactionize(S, R);
  2) k = 1; C_1 = E; P_1 = E;
  3) while (not empty P_k) do {
  4)   C_{k+1} = gen_candidate_co-location(P_k);
  5)   for all transaction t ∈ T
  6)     IntraX_{k+1} = gather_intraX_instances(C_{k+1}, t);
  7)   if k ≥ 2
  8)     InterX_{k+1} = gen_interX_instances(C_{k+1}, InterX_k, R);
  9)   P_{k+1} = select_prevalent_co-location
  10)      (C_{k+1}, IntraX_{k+1} ∪ InterX_{k+1}, min_prev);
  11)  R_{k+1} = gen_co-location_rule(P_{k+1}, min_cond_prob);
  12)  k = k + 1;
  13) }
  14) return (R_2, ..., R_{k+1});
Algorithm 1: Partial join co-location algorithm

Transactionization of a spatial dataset: Given a spatial dataset and a neighbor relation, the spatial dataset is partitioned for generating neighborhood transactions. There are several partitioning methods that can be adopted for neighborhood transactions, e.g., grids [14], maximal cliques [3], max-clique agglomerative clustering [20], min cut partitioning [6], etc.
The ideal case is a method that generates a set of maximal cliques while minimizing the number of edges cut by partitions. In the case of a simple grid partitioning, rectangular grids of a proximity neighborhood size d × d, where d is a neighbor distance metric, are imposed on a spatial framework, and event instances in each cell are gathered for a transaction. Cut neighbor relations can be detected by examining all pairs (i_1, i_2) of instances in neighboring cells, i.e., (i_1, i_2) ∈ R and i_1.trans_no ≠ i_2.trans_no, where R is a neighbor relation. It can be implemented using geometric approaches, e.g., plane sweep [2], space partitioning [9], tree matching [10]. Size 2 interX instances are generated from all pairs (i_1, i_2) of instances having cut neighbor relations in each transaction, i.e., i_1 ∈ B, i_2 ∈ B and i_1.trans_no = i_2.trans_no, where B is a set of event instances having cut neighbor relations, as well as from the cut neighborhood instances.

Generation of candidate co-locations: We use apriori_gen [1] for generating candidate co-location sets. Size k+1 candidate co-locations are generated from size k prevalent co-locations. The anti-monotonic property of the participation index makes event level pruning feasible.

Scanning transactions and gathering intraX instances: In each iteration step, the transactions are scanned and the intraX instances of candidate co-locations are enumerated. This step is similar to the Apriori algorithm. However, notice that the transactions of a spatial event dataset differ from the transactions of a market basket dataset. The traditional market basket data transaction has only boolean item types, i.e., an item is present in a transaction or not. By contrast, each item of our neighborhood transaction consists of an event type and its instance id as described in Figure 4. One event type can have several instances in a transaction.
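Returning to the transactionization step, the grid partitioning and the generation of size 2 interX instances can be sketched as follows. This is a hedged illustration, not the paper's implementation: the function names, the point coordinates and the brute-force pair scan are all assumptions. One deliberate departure from the d × d cells in the text is that the default cell side here is d/√2, so that any two points in a cell are within distance d and each cell satisfies the clique requirement of Definition 1.

```python
import math
from itertools import combinations

def transactionize(points, d, cell=None):
    """points: {instance_id: (x, y)}. Returns (trans_of, cut_pairs)."""
    # ASSUMPTION: cell side d/sqrt(2) keeps every cell a clique under R.
    cell = d / math.sqrt(2) if cell is None else cell
    trans_of = {i: (math.floor(x / cell), math.floor(y / cell))
                for i, (x, y) in points.items()}
    cut_pairs = set()
    # Brute-force scan for brevity; the text suggests plane sweep etc.
    for i, j in combinations(sorted(points), 2):
        if trans_of[i] != trans_of[j] and math.dist(points[i], points[j]) <= d:
            cut_pairs.add((i, j))
    return trans_of, cut_pairs

def size2_interx(trans_of, cut_pairs):
    """Cut pairs plus same-transaction pairs of instances that have cuts,
    restricted to pairs of different event types."""
    has_cut = sorted({i for pair in cut_pairs for i in pair})
    same_trans = {(i, j) for i, j in combinations(has_cut, 2)
                  if trans_of[i] == trans_of[j]}
    etype = lambda inst: inst.split(".")[0]
    return {(i, j) for i, j in cut_pairs | same_trans if etype(i) != etype(j)}

# Hypothetical coordinates, not taken from the paper's figures.
points = {"A.1": (0.1, 0.1), "B.1": (0.5, 0.3),
          "C.1": (0.9, 0.2), "A.2": (3.0, 3.0)}
trans_of, cut_pairs = transactionize(points, d=1.0)
interx2 = size2_interx(trans_of, cut_pairs)
print(sorted(cut_pairs))
print(sorted(interx2))
```

In this toy input A.1 and B.1 share a cell while C.1 falls in the adjacent one, so both pairs involving C.1 are cut neighbor relations, and {A.1, B.1} is added as a size 2 interX instance because both of its members have a cut relation.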
To reuse an efficient trie data structure [4, 7] in determining instances of candidate co-locations in a transaction, we convert several items of the same event type with different instance ids to one event type item having a bitmap structure [5] in which the corresponding instance id bits are set. The converted transactions are searched for gathering intraX instances of co-locations. Figure 4 shows a conceptual set of intraX table instances. In practice, all instances are enumerated in the trie structure of itemsets using bitmaps.

Generation of interX table instances: The interX table instances of C_{k+1}, k ≥ 2, are generated from the interX table instances of C_k using the generalized apriori_gen function [15]. The SQL-like syntax is described below.

forall co-location c_{k+1} ∈ C_{k+1}
  insert into c_{k+1}.interX_table_instance
  select p.instance_1, p.instance_2, ..., p.instance_k, q.instance_k
  from c_k.interX_table_instance_1 p, c_k.interX_table_instance_2 q
  where (p.instance_1, ..., p.instance_{k-1}) = (q.instance_1, ..., q.instance_{k-1})
    and (p.instance_k, q.instance_k) ∈ R;
end;

In Figure 4, an interX table instance of {A, B} having {A.3, B.3} and an interX table instance of {A, C} having {A.1, C.1} and {A.3, C.1} are joined to produce the interX table instance of {A, B, C}.

Selection of Prevalent Co-locations: The participation index of co-location C_{k+1} is calculated from the union of intraX table instance(C_{k+1}) and interX table instance(C_{k+1}).
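As a small worked illustration of this calculation, the sketch below computes the participation index from the union of an intraX and an interX table instance. The function name and data layout are assumptions for this example. The instance counts come from the text: 4 instances of A and 3 of C (Section 2), and 5 instances of B (implied by the 2/5 ratio in the Figure 4 example).

```python
from fractions import Fraction

def participation_index(table_instance, counts):
    """table_instance: collection of row instances (tuples of ids such as
    'A.1'); counts: total number of instances per event type."""
    if not table_instance:
        return Fraction(0)
    participating = {}
    for row in table_instance:
        for inst in row:
            # Projection with duplicate elimination, per event type.
            participating.setdefault(inst.split(".")[0], set()).add(inst)
    return min(Fraction(len(ids), counts[etype])
               for etype, ids in participating.items())

counts = {"A": 4, "B": 5, "C": 3}

# {A, C}: note that (A.3, C.1) is both intraX and interX; the set union
# keeps it only once.
intra_ac = {("A.2", "C.2"), ("A.3", "C.1"), ("A.3", "C.3")}
inter_ac = {("A.1", "C.1"), ("A.3", "C.1")}
pi_ac = participation_index(intra_ac | inter_ac, counts)

# {B, C}: one intraX and one interX instance.
intra_bc = {("B.4", "C.2")}
inter_bc = {("B.3", "C.1")}
pi_bc = participation_index(intra_bc | inter_bc, counts)

min_prev = Fraction(1, 2)
print(pi_ac, pi_bc, pi_bc > min_prev)  # prints: 3/4 2/5 False
```

Exact fractions are used so that the values can be compared directly with the ratios quoted in the text.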
Candidate co-locations are pruned using a given prevalence threshold, min_prev. In Figure 4, co-location {B, C} has two instances, i.e., one is an intraX instance, {B.4, C.2}, and the other is an interX instance, {B.3, C.1}. The participation index of co-location {B, C} is min{2/5, 2/3} = 2/5. If min_prev is given as 1/2, the candidate co-location {B, C} is pruned because its prevalence measure is less than 1/2.

Generation of Co-location Rules: This step generates all co-location rules with conditional probability above a given min_cond_prob.

5. ANALYSIS OF THE PARTIAL JOIN CO-LOCATION MINING ALGORITHM
In this section, we analyze the partial join co-location mining algorithm for completeness, correctness and computational complexity. Completeness implies that no co-location rule satisfying given prevalence and conditional probability thresholds is missed. Correctness means that the participation index values and conditional probability of generated co-location rules meet the user specified thresholds.

5.1 Completeness and Correctness
Lemma 3. The partial join co-location mining algorithm is correct.

Proof. The partial join co-location mining algorithm is correct if the co-location patterns produced by Algorithm 1 meet the thresholds of prevalence value and conditional probability. First, we will show that intraX instances and interX instances are correct in the neighbor relation. Step 1 in Algorithm 1 generates neighborhood transactions according to Definition 1. Thus the intraX instances gathered in step 6 are correct in the neighbor relation. The correctness of the interX instances generated in step 8 follows from the correctness of the generalized apriori_gen algorithm [15]. That is, all instances of a generated interX instance are neighbors of each other. Second, step 9 ensures that only prevalent co-location sets are selected. Thus step 11 returns co-location rules above given thresholds correctly.

Lemma 4. The partial join co-location mining algorithm is complete.

Proof. We prove that if a co-location is prevalent, it is found by Algorithm 1. First, the monotonicity
of the participation index in Lemma 1 proves the completeness of the event level pruning of candidate co-locations using apriori_gen in step 4. Second, we will show that the intraX table instances and the interX table instances generated by Algorithm 1 are complete, which will imply that all instances of co-locations are complete according to Lemma 2. All intraX table instances are completely found by the Apriori algorithm in step 6. Size 2 interX table instances generated in step 1 are a superset of all neighboring instances necessary to generate size k+1, k ≥ 2, interX instances. In step 8, the completeness of the instance join method to generate interX instances is the same as that of generalized apriori_gen [15]. In step 11, enumeration of the subsets of each of the prevalent co-locations ensures that no spatial co-location rules satisfying given prevalence and conditional probabilities are missed.

5.2 Computational Complexity Analysis
This section compares the computational cost of the join-based co-location mining algorithm and the partial join algorithm. Let T_jb(k+1) and T_pj(k+1) represent the costs of iteration k of the join-based algorithm and the partial join algorithm respectively.

$$T_{jb}(k+1) = T_{gen\_candi}(P_k) + T_{gen\_inst}(\text{table insts of } P_k) + T_{prune}(C_{k+1}) \approx T_{gen\_inst}(\text{table insts of } P_k)$$

$$T_{pj}(k+1) = T_{gen\_candi}(P_k) + T_{gath\_intraX\_inst}(\text{transactions}) + T_{gen\_interX\_inst}(\text{interX table insts of } P_k) + T_{prune}(C_{k+1}) \approx T_{gen\_interX\_inst}(\text{interX table insts of } P_k)$$

In the above equations, T_gen_candi(P_k) represents the cost of generating size k+1 candidate co-locations with the prevalent size k co-locations. T_gen_inst(table insts of P_k) represents the cost of generating table instances of size k+1 candidate co-locations with size k table instances. T_gath_intraX_inst(transactions) is the cost of scanning transactions and gathering the instances of the size k+1 candidate co-locations. T_gen_interX_inst(interX table insts of P_k) is the cost of generating interX table instances of the size k+1 candidate
co-locations with size k interX table instances. T_prune(C_{k+1}) represents the cost for pruning non-prevalent size k+1 co-locations.

The bulk of the time is consumed in generating instances. We assume that the cost of gathering intraX instances from transactions is relatively cheaper than the instance join cost, and that the other factors, T_gen_candi(P_k) and T_prune(C_{k+1}), are negligible. Thus the computational ratio of the partial join algorithm over the join-based algorithm can be simplified as

$$\frac{T_{pj}(k+1)}{T_{jb}(k+1)} \approx \frac{T_{gen\_interX\_inst}(\text{interX table insts of } P_k)}{T_{gen\_inst}(\text{table insts of } P_k)}$$

The computational ratio is affected by the size of the interX table instances and the size of the table instances of co-locations P_k. The dominant factors affecting the number of interX instances and the number of total instances are the number of cut neighbor relations and the data density of the neighborhood area. When the number of cut neighbor relations is fixed and the data density in a neighborhood area grows, the size of table instances increases rapidly and the cost to generate the table instances is much greater than the cost to generate interX table instances. By contrast, as the number of cut neighbor relations increases, the size of interX table instances increases. Thus the average cost to generate interX table instances grows. When all instances have cut neighbor relations, they are all involved in interX table instances, and thus the cost to generate the interX table instances is similar to the cost to generate table instances in the join-based algorithm. In our experiments, as described in the next section, we use the data density in the neighborhood area and the ratio of cut neighbor relations as key parameters to evaluate the algorithms. We can expect that the partial join approach is likely more efficient than the join-based method when the locations of spatial events are clustered in neighborhood areas and the number of cut neighbor relations is smaller.

6. EXPERIMENTAL EVALUATION
We evaluated the performance of the partial
join algorithm with the join-based approach using synthetic and real datasets. In Subsection 6.1, we describe an overall experimental design and a synthetic data generator. In Subsection 6.2, we evaluate the computational efficiency gained