A review on time series data mining

Tak-chung Fu
Department of Computing, Hong Kong Polytechnic University, Hunghom, Kowloon, Hong Kong

Article history: Received 19 February 2008; received in revised form 14 March 2010; accepted 4 September 2010.

Keywords: Time series data mining; Representation; Similarity measure; Segmentation; Visualization

Abstract: Time series is an important class of temporal data objects and it can be easily obtained from scientific and financial applications. A time series is a collection of observations made chronologically. The nature of time series data includes: large data size, high dimensionality and the need for continuous updating. Moreover, time series data, which is characterized by its numerical and continuous nature, is always considered as a whole instead of as individual numerical fields. The increasing use of time series data has initiated a great deal of research and development attempts in the field of data mining. The abundance of research on time series data mining in the last decade could hamper the entry of interested researchers, due to its complexity. In this paper, a comprehensive review of the existing time series data mining research is given. The works are generally categorized into representation and indexing, similarity measure, segmentation, visualization and mining. Moreover, state-of-the-art research issues are also highlighted. The primary objective of this paper is to serve as a glossary for interested researchers to gain an overall picture of current time series data mining development and to identify their potential research directions for further investigation. © 2010 Elsevier Ltd. All rights reserved.

1. Introduction

Recently, the increasing use of temporal data, in particular time series data, has initiated various research and development attempts in the field of data mining. Time series is an important class of temporal data objects, and it can be easily obtained from scientific and financial applications (e.g. electrocardiogram (ECG), daily temperature, weekly sales totals, and prices of mutual funds and stocks). A time series is a collection of observations made chronologically. The nature of time series data includes: large data size, high dimensionality and the need for continuous updating. Moreover, time series data, which is characterized by its numerical and continuous nature, is always considered as a whole instead of as individual numerical fields. Therefore, unlike traditional databases, where similarity search is exact match based, similarity search in time series data is typically carried out in an approximate manner.

There are various kinds of time series data related research, for example, finding similar time series (Agrawal et al., 1993a; Berndt and Clifford, 1996; Chan and Fu, 1999), subsequence searching in time series (Faloutsos et al., 1994), dimensionality reduction (Keogh, 1997b; Keogh et al., 2000) and segmentation (Abonyi et al., 2005). These problems have been studied in considerable detail by both the database and the pattern recognition communities for different domains of time series data (Keogh and Kasetty, 2002).

In the context of time series data mining, the fundamental problem is how to represent the time series data. One of the common approaches is to transform the time series to another domain for dimensionality reduction, followed by an indexing mechanism. Moreover, similarity measure between time series or time series subsequences and segmentation are two core tasks for various time series mining tasks. Based on the time series representation, different mining tasks can be found in the literature and they can be roughly classified into
four fields: pattern discovery and clustering, classification, rule discovery and summarization. Some of the research concentrates on one of these fields, while other work may focus on more than one of the above processes. In this paper, a comprehensive review of the existing time series data mining research is given. Three state-of-the-art time series data mining issues, namely streaming, multi-attribute time series data and privacy, are also briefly introduced.

The remaining part of this paper is organized as follows: Section 2 contains a discussion of time series representation and indexing. The concept of similarity measure, which includes both whole time series and subsequence matching, based on the raw time series data or the transformed domain, will be reviewed in Section 3. The research work on time series segmentation and visualization will be discussed in Sections 4 and 5, respectively. In Section 6, various time series data mining tasks and recent time series data mining directions will be reviewed, whereas the conclusion will be drawn in Section 7.

2. Time series representation and indexing

One of the major reasons for time series representation is to reduce the dimension (i.e. the number of data points) of the original data. The simplest method perhaps is sampling (Astrom, 1969). In this method, a rate of m/n is used, where m is the length of the time series P and n is the dimension after dimensionality reduction (Fig. 1). However, the sampling method has the drawback of distorting the shape of the sampled/compressed time series if the sampling rate is too low.

Fig. 1. Time series dimensionality reduction by sampling. The time series on the left is sampled regularly (denoted by dotted lines) and displayed on the right with a large distortion.

An enhanced method is to use the average (mean) value of each segment to represent the corresponding set of data points. Again, with time series $P = (p_1, \ldots, p_m)$ and n being the dimension after dimensionality reduction, the "compressed" time series $\hat{P} = (\hat{p}_1, \ldots, \hat{p}_n)$ can be obtained by

$\hat{p}_k = \frac{1}{e_k - s_k + 1} \sum_{i=s_k}^{e_k} p_i \qquad (1)$

where $s_k$ and $e_k$ denote the starting and ending data points of the k-th segment in the time series P, respectively (Fig. 2). That is, the segmented means are used to represent the time series (Yi and Faloutsos, 2000). This method is also called piecewise aggregate approximation (PAA) by Keogh et al. (2000); it was originally called piecewise constant approximation (Keogh and Pazzani, 2000a). Keogh et al. (2001a) propose an extended version called adaptive piecewise constant approximation (APCA), in which the length of each segment is not fixed, but adapts to the shape of the series. A signature technique with similar ideas is proposed by Faloutsos et al. (1997). Besides using the mean to represent each segment, other methods have been proposed. For example, Lee et al. (2003) propose to use the segmented sum of variation (SSV) to represent each segment of the time series. Furthermore, a bit level approximation is proposed by Ratanamahatana et al. (2005) and Bagnall et al. (2006), which uses a single bit to represent each data point.

Fig. 2. Time series dimensionality reduction by PAA. The horizontal dotted lines show the mean of each segment.
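To make the segmented-mean representation concrete, the following is a minimal Python sketch of PAA as defined in Eq. (1); the function name and the NumPy-based implementation are our own illustration, not code from the cited papers.

```python
import numpy as np

def paa(series, n_segments):
    # Split the series into n_segments (near-)equal-width segments and
    # represent each segment by its mean value, as in Eq. (1).
    segments = np.array_split(np.asarray(series, dtype=float), n_segments)
    return np.array([seg.mean() for seg in segments])

# 64 data points compressed to 8 segment means
p = np.sin(np.linspace(0, 4 * np.pi, 64))
print(paa(p, 8))
```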
To reduce the dimension of time series data, another approach is to approximate a time series with straight lines. Two major categories are involved. The first one is linear interpolation. A common method is piecewise linear representation (PLR), also called piecewise linear approximation (PLA) (Keogh, 1997b; Keogh and Smyth, 1997; Smyth and Keogh, 1997). The approximating line for the subsequence $P(p_i, \ldots, p_j)$ is simply the line connecting the data points $p_i$ and $p_j$. It tends to closely align the endpoints of consecutive segments, giving a piecewise approximation with connected lines. PLR is a bottom-up algorithm. It begins by creating a fine approximation of the time series, so that m/2 segments are used to approximate a time series of length m, and iteratively merges the lowest-cost pair of adjacent segments until the required number of segments is met. When the pair of adjacent segments $S_i$ and $S_{i+1}$ is merged, the cost of merging the new segment with its right neighbor and the cost of merging the $S_{i-1}$ segment with its new, larger neighbor are calculated. Ge (1998) extends PLR to a hierarchical structure. Furthermore, Keogh and Pazzani enhance PLR by considering the weights of the segments (Keogh and Pazzani, 1998) and relevance feedback from the user (Keogh and Pazzani, 1999). The second approach is linear regression, which represents the subsequences with the best-fitting lines (Shatkay and Zdonik, 1996).

Furthermore, reducing the dimension by preserving the salient points is a promising method. These points are called perceptually important points (PIP). The PIP identification process was first introduced by Chung et al. (2001) and used for pattern matching of technical (analysis) patterns in financial applications. With the time series P, there are n data points: $P_1, P_2, \ldots, P_n$. All the data points in P can be reordered by their importance by going through the PIP identification process. The first data point $P_1$ and the last data point $P_n$ of the time series are the first two PIPs. The next PIP is the point in P with the maximum distance to the first two PIPs. The fourth PIP is the point in P with the maximum vertical distance to the line joining its two adjacent PIPs, either in between the first and second PIPs or in between the second and the last PIPs. The PIP location process continues until all the points in P are attached to a reordered list L or the required number of PIPs is reached (i.e. the series is reduced to the required dimension).
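The sketch below illustrates the PIP reordering loop described above. For simplicity it uses the vertical distance to the line joining the two neighbouring PIPs throughout, which is one of several distance choices discussed in the PIP literature; treat it as an illustrative variant rather than the exact procedure of Chung et al. (2001), and note that the function name is ours.

```python
import numpy as np

def pip_identify(series, n_pips):
    # The first two PIPs are the endpoints; each further PIP is the point
    # with the maximum vertical distance to the line joining its two
    # adjacent PIPs already selected.
    y = np.asarray(series, dtype=float)
    pips = [0, len(y) - 1]
    while len(pips) < n_pips:
        ordered = sorted(pips)
        best, best_d = None, -1.0
        for a, b in zip(ordered[:-1], ordered[1:]):
            for x in range(a + 1, b):
                line_y = y[a] + (y[b] - y[a]) * (x - a) / (b - a)
                d = abs(y[x] - line_y)
                if d > best_d:
                    best, best_d = x, d
        if best is None:  # no interior points left to pick
            break
        pips.append(best)
    return sorted(pips)

series = [0.0, 1.0, 3.0, 2.0, 5.0, 1.0, 0.0, 2.0, 4.0, 3.0]
print(pip_identify(series, 5))  # indices of the 5 most important points
```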
Seven PIPs are identified from the sample time series in Fig. 3. Detailed treatment can be found in Fu et al. (2008c).

Fig. 3. Time series compression by data point importance. The time series on the left is represented by seven PIPs on the right.

The idea is similar to a technique proposed about 30 years ago for reducing the number of points required to represent a line by Douglas and Peucker (1973) (see also Hershberger and Snoeyink, 1992). Perng et al. (2000) use a landmark model to identify the important points in the time series for similarity measure. Man and Wong (2001) propose a lattice structure to represent the identified peaks and troughs (called control points) in the time series. Pratt and Fink (2002) and Fink et al. (2003) define extrema as the minima and maxima in a time series and compress the time series by selecting only certain important extrema and dropping the other points. The idea is to discard minor fluctuations and keep major minima and maxima. The compression is controlled by a compression ratio parameter R, which is always greater than one; an increase of R leads to the selection of fewer points. That is, given indices i and j with $i \le x \le j$, a point $p_x$ of a series P is an important minimum if $p_x$ is the minimum among $p_i, \ldots, p_j$, and $p_i / p_x \ge R$ and $p_j / p_x \ge R$. Similarly, $p_x$ is an important maximum if $p_x$ is the maximum among $p_i, \ldots, p_j$, and $p_x / p_i \ge R$ and $p_x / p_j \ge R$. This algorithm takes linear time and constant memory. It outputs the values and indices of all important points, as well as the first and last points of the series. The algorithm can also process new points as they arrive, without storing the original series. It identifies important points based on local information from each segment (subsequence) of the time series. Recently, a critical point model (CPM) (Bao, 2008) and a high-level representation based on a sequence of critical points (Bao and Yang, 2008) have been proposed for financial data analysis. On the other hand, special points are introduced to restrict the error of PLR (Jia et al., 2008). Key points are suggested to represent time series in Leng et al. (2009) for anomaly detection.

Another common family of time series representation approaches converts the numeric time series to symbolic form: first discretizing the time series into segments, then converting each segment into a symbol (Yang and Zhao, 1998; Yang et al., 1999; Motoyoshi et al., 2002; Aref et al., 2004). Lin et al. (2003, 2007) propose a method called symbolic aggregate approximation (SAX) to convert the result of PAA to a symbol string. The distribution space (y-axis) is divided into equiprobable regions. Each region is represented by a symbol, and each segment can then be mapped to the symbol corresponding to the region in which it resides. The time series $\hat{P}$ transformed by PAA is finally converted to a symbol string $SS(s_1, \ldots, s_w)$. Two parameters must be specified for the conversion: the length of the subsequence w and the alphabet size A (the number of symbols used).
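The following Python sketch illustrates the SAX pipeline just described (z-normalisation, PAA, then symbol mapping). The breakpoints are hard-coded here for an alphabet of size A = 4 (the quartiles of the standard normal distribution, roughly -0.67, 0 and 0.67), and the helper names are our own assumptions, not the authors' code.

```python
import numpy as np

def sax(series, w, breakpoints=(-0.67, 0.0, 0.67), alphabet="abcd"):
    # z-normalise, reduce to w segment means (PAA), then map each mean to
    # the symbol of the equiprobable region it falls into.
    y = np.asarray(series, dtype=float)
    y = (y - y.mean()) / y.std()
    means = [seg.mean() for seg in np.array_split(y, w)]
    return "".join(alphabet[int(np.searchsorted(breakpoints, m))] for m in means)

# An 8-symbol word over a 4-letter alphabet
print(sax(np.sin(np.linspace(0, 4 * np.pi, 64)), w=8))
```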
Besides using the means of the segments to build the alphabet, another method uses volatility change to build the alphabet. Jonsson and Badal (1997) use the "Shape Description Alphabet (SDA)", with symbols such as highly increasing transition, stable transition, and slightly decreasing transition. Qu et al. (1998) use gradient alphabets like upward, flat and downward as symbols. Huang and Yu (1999) suggest transforming the time series to a symbol string using the change ratio between contiguous data points. Megalooikonomou et al. (2004) propose to represent each segment by a codeword from a codebook of key-sequences. This work has been extended to a multi-resolution setting (Megalooikonomou et al., 2005). Morchen and Ultsch (2005) propose an unsupervised discretization process based on quality scores and persisting states. Instead of ignoring the temporal order of values like many other methods, their Persist algorithm incorporates temporal information. Furthermore, subsequence clustering is a common method to generate the symbols (Das et al., 1998; Li et al., 2000a; Hugueney and Meunier, 2001; Hebrail and Hugueney, 2001). A multiple abstraction level mining (MALM) approach, based on the symbolic form of the time series, is proposed by Li et al. (1998). The symbols in this approach are determined by clustering the features of each segment, such as regression coefficients, mean square error and higher order statistics based on the histogram of the regression residuals.

Most of the methods described so far represent time series directly in the time domain. Representing time series in a transformation domain is another large family of approaches. One of the popular transformation techniques in time series data mining is the discrete Fourier transform (DFT), first proposed for use in this context by Agrawal et al. (1993a). Rafiei and Mendelzon (2000) develop similarity-based queries using DFT. Janacek et al. (2005) propose to use likelihood ratio statistics to test the hypothesis of difference between series, instead of a Euclidean distance in the transformed domain. Recent research uses the wavelet transform to represent time series (Struzik and Siebes, 1998). Here, the discrete wavelet transform (DWT) has been found to be effective in replacing the DFT (Chan and Fu, 1999), and the Haar transform is usually selected (Struzik and Siebes, 1999; Wang and Wang, 2000). The Haar transform is a series of averaging and differencing operations on a time series (Chan and Fu, 1999). The average and the difference between every two adjacent data points are computed. For example, given a time series P = (1, 3, 7, 5), a dimension of 4 data points is the full resolution (i.e. the original time series); at a dimension of two coefficients, the averages are (2, 6) with the coefficients (-1, 1), and at a dimension of one coefficient, the average is 4 with coefficient (-2).
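A minimal Python sketch of this averaging-and-differencing step, reproducing the worked example above; the function name is our own.

```python
def haar_step(series):
    # One level of the Haar transform: pairwise averages and the
    # corresponding difference (detail) coefficients.
    averages = [(a + b) / 2 for a, b in zip(series[0::2], series[1::2])]
    details = [(a - b) / 2 for a, b in zip(series[0::2], series[1::2])]
    return averages, details

p = [1, 3, 7, 5]
avg1, det1 = haar_step(p)     # ([2.0, 6.0], [-1.0, 1.0])
avg2, det2 = haar_step(avg1)  # ([4.0], [-2.0])
```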
A multi-level representation of the wavelet transform is proposed by Shahabi et al. (2000). Popivanov and Miller (2002) show that a large class of wavelet transformations can be used for time series representation. Dasha et al. (2007) compare different wavelet feature vectors. On the other hand, comparisons between DFT and DWT can be found in Wu et al. (2000b) and Morchen (2003), and a combined use of Fourier and wavelet transforms is presented in Kawagoe and Ueda (2002). An ensemble index, which ensembles two or more representations for indexing, is proposed by Keogh et al. (2001b) and Vlachos et al. (2006).

Principal component analysis (PCA) is a popular multivariate technique used for developing multivariate statistical process monitoring methods (Yang and Shahabi, 2005b; Yoon et al., 2005), and it is applied to analyze financial time series by Lesch et al. (1999). In most of the related works, PCA is used to eliminate the less significant components or sensors, reduce the data representation to only the most significant components, and plot the data in two dimensions. As the PCA model defines a linear hyperplane, it can be considered the multivariate extension of PLR. PCA maps the multivariate data into a lower dimensional space, which is useful in the analysis and visualization of correlated high-dimensional data. Singular value decomposition (SVD) (Korn et al., 1997) is another transformation-based approach. Other time series representation methods include modeling time series using hidden Markov models (HMMs) (Azzouzi and Nabney, 1998); moreover, a compression technique for multiple streams is proposed by Deligiannakis et al. (2004), based on a base signal which encodes piecewise linear correlations among the collected data values. In addition, a recent biased dimension reduction technique is proposed by Zhao and Zhang (2006) and Zhao et al. (2006).

Moreover, many of the representation schemes described above are combined with different indexing methods. A common approach is to adopt an existing multidimensional indexing structure (e.g. the R-tree proposed by Guttman (1984)) for the representation. Agrawal et al. (1993a) propose the F-index, which adopts the R*-tree (Beckmann et al., 1990) to index the first few DFT coefficients. The ST-index (Faloutsos et al., 1994) extends this work for subsequence handling. Agrawal et al. (1995a) adopt both the R*- and the R+-tree (Sellis et al., 1987) as indexing structures. A multi-level distance-based index structure for time series represented by PCA is proposed by Yang and Shahabi (2005a). Vlachos et al. (2005a) propose a Multi-Metric (MM) tree, a hybrid indexing structure on Euclidean and periodic spaces. The minimum bounding rectangle (MBR) is also a common technique for time series indexing (Chu and Wong, 1999; Vlachos et al., 2003). An MBR is adopted in Rafiei (1999), where an MT-index is developed based on the Fourier transform, and in Kahveci and Singh (2004), where a multi-resolution index is proposed based on the wavelet transform. Chen et al. (2007a) propose an indexing mechanism for the PLR representation. On the other hand, Kim et al. (1996) propose an index structure called the TIP-index (TIme series Pattern index) for manipulating time series pattern databases. The TIP-index is developed by improving the extended multidimensional dynamic index file (EMDF) (Kim et al., 1994).
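As an illustration of this indexing pipeline, the sketch below extracts the feature vector that an F-index-style scheme would insert into a spatial index such as an R*-tree: the first few coefficients of a normalized DFT. With this normalization, Parseval's theorem guarantees that the Euclidean distance between the truncated feature vectors lower-bounds the true distance, so the filter step can produce false alarms but no false dismissals. The function name and parameter choices are our own assumptions; the spatial index itself is assumed to come from elsewhere.

```python
import numpy as np

def dft_features(series, n_coeffs=3):
    # Normalised DFT; keep the first n_coeffs coefficients as a
    # 2*n_coeffs-dimensional point (real and imaginary parts).
    x = np.asarray(series, dtype=float)
    coeffs = np.fft.fft(x) / np.sqrt(len(x))
    return np.array([v for c in coeffs[:n_coeffs] for v in (c.real, c.imag)])

rng = np.random.default_rng(0)
print(dft_features(rng.standard_normal(128)))  # a 6-d point for the index
```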
iSAX (Shieh and Keogh, 2009), developed on top of SAX, is proposed to index massive time series. A multi-resolution indexing structure, which can be adapted to different representations, is proposed by Li et al. (2004).

To sum up, for a given index structure, the efficiency of indexing depends only on the precision of the approximation in the reduced dimensionality space. However, in choosing a dimensionality reduction technique, we cannot simply choose an arbitrary compression algorithm; a technique that produces an indexable representation is required. For example, many time series can be efficiently compressed by delta encoding, but this representation does not lend itself to indexing. In contrast, SVD, DFT, DWT and PAA all lend themselves naturally to indexing, with each eigenwave, Fourier coefficient, wavelet coefficient or aggregate segment mapping onto one dimension of an index tree. Post-processing is then performed by computing the actual distance between sequences in the time domain and discarding any false matches.

3. Similarity measure

Similarity measure is of fundamental importance for a variety of time series analysis and data mining tasks. Most of the representation approaches discussed in Section 2 also come with a similarity measure defined on the transformed representation. In traditional databases, similarity measure is exact match based. However, in time series data, which is characterized by its numerical and continuous nature, similarity measure is typically carried out in an approximate manner. Considering stock time series, one may expect queries like:

Query 1: find all stocks which behave "similar" to stock A.

Query 2: find all "head and shoulders" patterns lasting for a month in the closing prices of all high-tech stocks.

The query results are expected to provide useful information for different stock analysis activities. Queries like Query 2 are in fact tightly coupled with the patterns frequently used in technical analysis, e.g. double top/bottom, ascending triangle, flag and rounded top/bottom.

In the time series domain, devising an appropriate similarity function is by no means trivial. There are essentially two ways in which the data might be organized and processed (Agrawal et al., 1993a). In whole sequence matching, the whole length of all time series is considered during the similarity search. It requires comparing the query sequence to each candidate series by evaluating the distance function and keeping track of the sequence with the smallest distance. In subsequence matching, where a query sequence Q and a longer sequence P are given, the task is to find the subsequences in P that match Q. Subsequence matching requires that the query sequence Q be placed at every possible offset within the longer sequence P.
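A naive Python sketch of subsequence matching under the Euclidean distance: the query is slid across every offset of the longer series and the best-matching offset is reported. This brute-force scan is exactly what the indexing techniques of Section 2 aim to avoid; the function name is ours.

```python
import numpy as np

def best_subsequence(p, q):
    # Place q at every possible offset within p and keep the offset
    # with the smallest Euclidean distance.
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    n = len(q)
    dists = [np.linalg.norm(p[i:i + n] - q) for i in range(len(p) - n + 1)]
    i = int(np.argmin(dists))
    return i, dists[i]

rng = np.random.default_rng(1)
p = rng.standard_normal(500)
q = p[200:232] + 0.01 * rng.standard_normal(32)  # noisy copy of a subsequence
print(best_subsequence(p, q))                    # offset close to 200
```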
With respect to Query 1 and Query 2 above, they can be considered as whole sequence matching and subsequence matching, respectively. Gavrilov et al. (2000) study the usefulness of different similarity measures for clustering similar stock time series.

3.1. Whole sequence matching

To measure the similarity/dissimilarity between two time series, the most popular approach is to evaluate the Euclidean distance on a transformed representation like the DFT coefficients (Agrawal et al., 1993a) or the DWT coefficients (Chan and Fu, 1999). Although most of these approaches guarantee a lower bound on the Euclidean distance to the original data, the Euclidean distance is not always the suitable distance function in certain domains (Keogh, 1997a; Perng et al., 2000; Megalooikonomou et al., 2005). For example, stock time series have their own characteristics compared with other time series data (e.g. data from scientific areas like ECG), in that the salient points are important.

Besides Euclidean-based distance measures, other distance measures can easily be found in the literature. A constraint-based similarity query is proposed by Goldin and Kanellakis (1995), which extends the work of Agrawal et al. (1993a). Das et al. (1997) apply computational geometry methods to similarity measure. Bozkaya et al. (1997) use a modified edit distance function for time series matching and retrieval. Chu et al. (1998) propose to measure the distance based on the slopes of the segments in order to handle amplitude and time scaling problems. A projection algorithm is proposed by Lam and Wong (1998). A pattern recognition method based on the building blocks (primitives) of the time series is proposed by Morrill (1998).
Ruspini and Zwir (1999) address the automated identification of significant qualitative features of complex objects. They propose the discovery and representation of interesting relations between those features, and the generation of structured indexes and textual annotations describing the features and their relations. The discovery of knowledge through the analysis of collections of qualitative descriptions is then achieved. They focus on methods for the succinct description of interesting features lying on an effective frontier. Generalized clustering is used for extracting the features of interest to domain experts. Generalized Markov models are adopted for waveform matching in Ge and Smyth (2000). A content-based query-by-example retrieval model called FALCON, which incorporates a feedback mechanism, is proposed by Wu et al. (2000a).

Indeed, one of the most popular and field-tested similarity measures is the "time warping" distance measure. Based on the dynamic time warping (DTW) technique, the method proposed in Berndt and Clifford (1994) predefines some patterns to serve as templates for the purpose of pattern detection. To align two time series, P and Q, using DTW, an n-by-m matrix M is first constructed. The (i, j) element of the matrix, $m_{ij}$, contains the distance $d(q_i, p_j)$ between the two points $q_i$ and $p_j$; a Euclidean distance is typically used, i.e. $d(q_i, p_j) = (q_i - p_j)^2$. It corresponds to the alignment between the points $q_i$ and $p_j$. A warping path, W, is a contiguous set of matrix elements that defines a mapping between Q and P. Its k-th element is defined as $w_k = (i_k, j_k)$, and

$W = w_1, w_2, \ldots, w_k, \ldots, w_K \qquad (2)$

where $\max(m, n) \le K < m + n - 1$.

The warping path is typically subject to the following constraints: boundary conditions, continuity and monotonicity. The boundary conditions are $w_1 = (1, 1)$ and $w_K = (m, n)$; this requires the warping path to start and finish in the diagonally opposite corner cells. The next constraint is continuity: given $w_k = (a, b)$ and $w_{k-1} = (a', b')$, it requires $a - a' \le 1$ and $b - b' \le 1$, which restricts the allowable steps of the warping path to adjacent cells, including the diagonally adjacent cell. Finally, the constraints $a - a' \ge 0$ and $b - b' \ge 0$ force the points in W to be monotonically spaced in time.

There is an exponential number of warping paths satisfying the above conditions. However, only the path that minimizes the warping cost is of interest. This path can be found efficiently by using dynamic programming (Berndt and Clifford, 1996) to evaluate the following recurrence, which defines the cumulative distance $g(i, j)$ as the distance $d(q_i, p_j)$ in the current cell plus the minimum of the cumulative distances of the adjacent elements:

$g(i, j) = d(q_i, p_j) + \min\{g(i-1, j-1), g(i-1, j), g(i, j-1)\} \qquad (3)$

The warping path W that minimizes the "distance" between the two series can then be obtained as

$DTW(Q, P) = \min_W \left[ \sum_{k=1}^{K} d(w_k) \right] \qquad (4)$

where $d(w_k)$ can be defined as

$d(w_k) = d(q_{i_k}, p_{j_k}) = (q_{i_k} - p_{j_k})^2 \qquad (5)$

Detailed treatment can be found in Kruskall and Liberman (1983).
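The recurrence in Eq. (3) translates directly into a dynamic-programming table. Below is a minimal Python sketch (our own illustration) that fills the cumulative-distance matrix g and returns the DTW cost of Eq. (4).

```python
import numpy as np

def dtw(q, p):
    # Fill the cumulative distance table g of Eq. (3); g[i][j] is the
    # cheapest warping cost aligning q[:i] with p[:j].
    n, m = len(q), len(p)
    g = np.full((n + 1, m + 1), np.inf)
    g[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (q[i - 1] - p[j - 1]) ** 2
            g[i, j] = d + min(g[i - 1, j - 1], g[i - 1, j], g[i, j - 1])
    return g[n, m]

print(dtw([1, 2, 3, 4], [1, 1, 2, 3, 4]))  # 0.0: the series align perfectly
```

Both loops visit every cell, so the cost is O(nm), which motivates the speedup techniques reviewed next.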
As DTW is computationally expensive, different methods have been proposed to speed up the DTW matching process. Various constraint (banding) methods, which control the subset of the matrix that the warping path is allowed to visit, are reviewed in Ratanamahatana and Keogh (2004). Yi et al. (1998) introduce a technique for approximate indexing of DTW that utilizes a FastMap technique, which filters out the non-qualifying series. Kim et al. (2001) propose an indexing approach under the DTW similarity measure. Keogh and Pazzani (2000b) introduce a modification of DTW which integrates with PAA and operates on a higher level of abstraction of the time series. An exact indexing approach, based on representing the time series by PAA for the DTW similarity measure, is further proposed by Keogh (2002). An iterative deepening dynamic time warping (IDDTW) is suggested by Chu et al. (2002); it builds a probabilistic model of the approximation errors for all levels of approximation prior to the query process. Chan et al. (2003) propose a filtering process based on the Haar wavelet transformation, using low resolution approximations of the true time warping distance. Shou et al. (2005) use an APCA approximation to compute lower bounds for the DTW distance. They improve the global bound proposed by Kim et al. (2001), which can be used to index the segments, and propose a multi-step query processing technique. FastDTW is proposed by Salvador and Chan (2004). This method uses a multi-level approach that recursively projects a solution from a coarse resolution and refines the projected solution. Similarly, a fast DTW search method, FTW, is proposed by Sakurai et al. (2005) for efficiently pruning a significant number of search candidates. Ratanamahatana and Keogh (2005) clarify some points about DTW related to lower bounding and speed. Euachongprasit and Ratanamahatana (2008) also focus on this problem. A sequentially indexed structure (SIS) is proposed by Ruengronghirunya et al. (2009) to balance the tradeoff between indexing efficiency and I/O cost during DTW similarity measure; a lower bounding function for groups of time series, LBG, is adopted.
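To illustrate the constraint (banding) idea, the sketch below restricts the recurrence of the previous example to a classic Sakoe-Chiba band of half-width r, so that only cells with |i - j| <= r are visited. This is a generic illustration of banding, not the specific method of any paper cited above.

```python
import numpy as np

def dtw_banded(q, p, r):
    # Same recurrence as before, but cells outside the band |i - j| <= r
    # stay at infinity and are never computed: O(n*r) work instead of O(n*m).
    n, m = len(q), len(p)
    g = np.full((n + 1, m + 1), np.inf)
    g[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - r), min(m, i + r) + 1):
            d = (q[i - 1] - p[j - 1]) ** 2
            g[i, j] = d + min(g[i - 1, j - 1], g[i - 1, j], g[i, j - 1])
    return g[n, m]

print(dtw_banded([1, 2, 3, 4], [1, 1, 2, 3, 4], r=1))  # 0.0 with a narrow band
```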
On the other hand, Keogh and Pazzani (2001) point out potential problems of DTW: it can lead to unintuitive alignments, where a single point on one time series maps onto a large subsection of another time series. Also, DTW may fail to find obvious and natural alignments in two time series because of a single feature (i.e. peak, valley, inflection point, plateau, etc.). One of the causes is a great difference between the lengths of the series being compared. Therefore, besides improving the performance of DTW, methods have also been proposed to improve its accuracy. Keogh and Pazzani (2001) propose a modification of DTW that considers the higher level feature of shape for better alignment. Ratanamahatana and Keogh (2004) propose to learn arbitrary constraints on the warping path. Regression time warping (RTW) is proposed by Lei and Govindaraju (2004) to address the challenges of shifting, scaling, robustness and complexity. Latecki et al. (2005) propose a method called minimal variance matching (MVM) for elastic matching. It determines a subsequence of the time series that best matches a query series by finding the cheapest path in a directed acyclic graph. A segment-wise time warping distance (STW) is proposed by Zhou and Wong (2005) for time scaling search. Fu et al. (2008a) propose a scaled and warped matching (SWM) approach for handling both DTW and uniform scaling simultaneously. Different customized DTW techniques have been applied to the field of music research for query by humming (Zhu and Shasha, 2003; Arentz et al., 2005).

Focusing on problems similar to those of DTW, the Longest Common Subsequence (LCSS) model (Vlachos et al., 2002) has been proposed. The LCSS is a variation of the edit distance, and the basic idea is to match two sequences by allowing them to stretch, without rearranging the order of the elements but allowing some elements to be unmatched. One of the important advantages of LCSS over DTW is its treatment of outliers. Chen et al. (2005a) further introduce a distance function based on the edit distance on real sequence (EDR), which is robust against data imperfection. Morse and Patel (2007) propose a Fast Time Series Evaluation (FTSE) method which can be used to evaluate the threshold value of these kinds of techniques in a faster way.

Threshold-based distance functions are proposed by Aßfalg et al. (2006). The proposed function compares time series by considering the intervals during which the time series exceed a certain threshold, rather than using the exact time series values. A T-Time application is developed (Aßfalg et al., 2008) to demonstrate its usage. Fu et al. (2007) further suggest introducing rules to govern the pattern matching process if a priori knowledge exists in the given domain.

A parameter-light distance measure method based on Kolmogorov complexity theory is suggested in Keogh et al. (2007b); the compression-based dissimilarity measure (CDM), proposed by Keogh et al. (2004) to compare the co-compressibility between data sets, is adopted there. Chen et al. (2005b) present a histogram-based representation for similarity measure. Similarly, a histogram-based similarity measure, bag-of-patterns (BOP), is proposed by Lin and Li (2009). The frequency of occurrences of each pattern in
Claude McKay's Transnational Writing in Home to Harlem
Shu Jinyan
Abstract (Chinese version, translated): Claude McKay's Home to Harlem portrays the black transnational experience of the early twentieth century. Scholarship has mainly examined how the author's own transnational experience and black internationalist thought shaped the novel's main characters, while neglecting the nationality of Ray, the protagonist of the subplot, and the significance of his sojourn in Harlem. Ray's experience of transnational migration both re-enacts McKay's complex transnational feelings and identity, and illuminates the spatial significance of Harlem as an ideal home for the African diaspora and as an urban black community. The paper proposes that Harlem has three dimensions: a geographic space in which migrants evoke historical memory, a political space for constructing transnational identity, and a multicultural space that accommodates difference. It examines the experience of modernity that migrants undergo in transnational movement, showing how, by changing their given identities and redefining themselves, they strive to shake off the constraints of traditional conceptions of nation, race and class and the confusion of identity, and thereby take part in the production of racial space in American cities.
Keywords: Claude McKay; Home to Harlem; transnational writing
Funding: This paper is a partial result of the National Social Science Foundation major project "Literary-Historical Research and Disciplinary Construction of American Literary Geography" (No. 16ZDA197) and of the Tianjin Postgraduate Research and Innovation Project "Transnational Space in American Neo-Realist Fiction" (No. 19YJSB039).
About the author: Shu Jinyan is a Ph.D. candidate at the College of Foreign Languages, Nankai University, and an associate professor at the School of Foreign Studies, Kashi University; her main research field is American literature.
Title: Claude McKay's Transnational Writing in Home to Harlem
Abstract: Claude McKay's Home to Harlem depicts the black transnational experience of the early 20th century. Academics have mainly studied the influence of McKay's personal transnational experience and black internationalist thinking on his main characters, but have neglected the minor plot's protagonist Ray, his nationality, and the significance of his sojourn in Harlem. Ray's transnational migration experience not only embodies McKay's complex transnational feeling and identity experience, but also reflects Harlem's spatial significance as an ideal home for the African diaspora and as an urban black community. The paper examines Caribbean immigrants' experience of modernity in Harlem, which is interpreted as the geographic space in which immigrants evoke historical memories, the political space for constructing transnational identities, and the multicultural space for accommodating differences. It aims to show that they manage to extricate themselves from the shackles of traditional concepts of nation, race and class, and from their confusion of identity, by changing their established identity and redefining themselves, and thus participate in the production of racial space in American cities.
Key words: Claude McKay; Home to Harlem; transnational writing
Author: Shu Jinyan is a Ph.D. candidate at the College of Foreign Languages, Nankai University (Tianjin 300071, China), and associate professor at the School of Foreign Studies, Kashi University (Kashi 844000, China). Her major academic research interest is American literature. E-mail: ******************
In 1925, Alain Locke described Harlem in The New Negro anthology as a cosmopolitan cultural capital whose importance rivaled that of the capitals of Europe's emerging nation-states.
IfM Publication 2-407: G.I. Schuëller. Developments in stochastic structural mechanics. Archive of Applied Mechanics, published online, 2006.

Developments in Stochastic Structural Mechanics

G.I. Schuëller
Institute of Engineering Mechanics, Leopold-Franzens University, Innsbruck, Austria, EU

Abstract  Uncertainties are a central element in structural analysis and design. But even today they are frequently dealt with in an intuitive or qualitative way only. However, as already suggested 80 years ago, these uncertainties may be quantified by statistical and stochastic procedures. In this contribution it is attempted to shed light on some of the recent advances in the now established field of stochastic structural mechanics and also to solicit ideas on possible future developments.

1 Introduction

The realistic modeling of structures and of the expected loading conditions, as well as of the mechanisms of their possible deterioration with time, is undoubtedly one of the major goals of structural and engineering mechanics. It has been recognized that this should also include the quantitative consideration of the statistical uncertainties of the models and the parameters involved [56]. There is also a general agreement that probabilistic methods should be strongly rooted in the basic theories of structural engineering and engineering mechanics and hence represent the natural next step in the development of these fields.

It is well known that modern methods leading to a quantification of uncertainties of stochastic systems require computational procedures. The development of these procedures goes in line with the computational methods in current traditional (deterministic) analysis for the solution of problems required by engineering practice, where computational procedures certainly dominate. Hence, their further development within computational stochastic structural analysis is a most important requirement for the dissemination of stochastic concepts into engineering practice. Most naturally, procedures to deal with stochastic systems are computationally considerably more involved than their deterministic counterparts, because the parameter set assumes a (finite or infinite) number of values, in contrast to a single point in the parameter space. Hence, in order to be competitive and tractable in practical applications, the computational efficiency of the procedures utilized is a crucial issue. Its significance should not be underestimated. Improvements in efficiency can be attributed to two main factors, i.e. improved hardware in terms of ever faster computers, and improved software, which means improving the efficiency of computational algorithms, including parallel processing and computer farming respectively. For a continuous increase of their efficiency through software developments, computational procedures of stochastic analysis should follow a path similar to the one taken in the 1970s and 1980s when the deterministic FE approach was developed.
One important aspect in this fast development was the focus on numerical methods adjusted to the strengths and weaknesses of numerical computational algorithms. In other words, traditional ways of structural analysis developed before the computer age have been dropped, redesigned and adjusted respectively to meet the new requirements posed by the computational facilities.

Two main streams of computational procedures in stochastic structural analysis can be observed. The first of these main classes is the generation of sample functions by Monte Carlo simulation (MCS). These procedures might be categorized further according to their purpose:

– Realizations of prescribed statistical information: samples must be compatible with prescribed stochastic information such as spectral density, correlation, distribution, etc. Applications are: (1) unconditional simulation of stochastic processes, fields and waves; (2) conditional simulation compatible with observations and a priori statistical information.

– Assessment of the stochastic response for a mathematical model with prescribed statistics (random loading/system parameters) of the parameters. Applications are: (1) representative samples for the estimation of the overall distribution, i.e. indiscriminate (blind) generation of samples and numerical integration of SDEs; (2) representative samples for the reliability assessment by generating adverse rare events with positive probability, i.e. by (a) variance reduction techniques controlling the realizations of RVs, or (b) controlling the evolution in time of sampling functions.

The other main class provides numerical solutions to analytical procedures. Grouping again according to the respective purpose, the following classification can be made: numerical solutions of the Kolmogorov equations (Galerkin's method, Finite Element method, Path Integral method), moment closure schemes, computation of the evolution of moments, maximum entropy procedures, and asymptotic stability of diffusion processes.

In the following, some of the outlined topics will be addressed, stressing new developments. These topics are described within the next six subject areas, each focusing on a different issue, i.e. representation of stochastic processes and fields, structural response, stochastic FE methods and parallel processing, structural reliability and optimization, and stochastic dynamics.
In this context it should be mentioned that, aside from the MIT conference series, the USNCCM, ECCM and WCCM conferences also devote a considerable share of their sessions to computational stochastic issues.

2 Representation of Stochastic Processes

Many quantities involving random fluctuations in time and space may be adequately described by stochastic processes, fields and waves. Typical examples of engineering interest are earthquake ground motion, sea waves, wind turbulence, road roughness, imperfections of shells, fluctuating properties in random media, etc. For this setup, probabilistic characteristics of the process are known from various measurements and investigations in the past. In structural engineering, the available probabilistic characteristics of random quantities affecting the loading or the mechanical system often cannot be utilized directly to account for the randomness of the structural response, due to its complexity. For example, in the common case of strong earthquake motion, the structural response will in general be non-linear, and it might be too difficult to compute the probabilistic characteristics of the response by means other than Monte Carlo simulation. For the purpose of Monte Carlo simulation, sample functions of the involved stochastic process must be generated. These sample functions should represent accurately the characteristics of the underlying stochastic process or field, and might be stationary or non-stationary, homogeneous or non-homogeneous, one-dimensional or multi-dimensional, uni-variate or multi-variate, Gaussian or non-Gaussian, depending very much on the requirements on the accuracy of a realistic representation of the physical behavior and on the available statistical data.

The main requirement on the sample function is its accurate representation of the available stochastic information of the process. The associated mathematical model can be selected in any convenient manner, as long as it reproduces the required stochastic properties. Therefore, quite different representations have been developed and might be utilized for this purpose.
The most common representations are, e.g.: ARMA and AR models; filtered white noise (SDE); shot noise and filtered Poisson white noise; covariance decomposition; Karhunen-Loève and Polynomial Chaos expansions; spectral representation; wavelet representation.

Among the various methods listed above, the spectral representation methods appear to be most widely used (see e.g. [71,86]). According to this procedure, samples with specified power spectral density information are generated. For the stationary or homogeneous case, Fast Fourier Transform (FFT) techniques are utilized for a dramatic improvement of the computational efficiency (see e.g. [104,105]). Advances in this field provide efficient procedures for the generation of 2D and 3D homogeneous Gaussian stochastic fields using the FFT technique (see e.g. [87]). The spectral representation method generates ergodic sample functions, each of which fulfills exactly the requirements of a target power spectrum. These procedures can be extended to the non-stationary case, to the generation of stochastic waves, and to incorporate non-Gaussian stochastic fields by a memoryless nonlinear transformation together with an iterative procedure to meet the target spectral density.

The above spectral representation procedures for the unconditional simulation of stochastic processes and fields can also be extended to conditional simulation techniques for Gaussian fields (see e.g. [43,44]) employing the conditional probability density method. The aim of this procedure is the generation of Gaussian random variates U_n under the condition that (n-1) realizations u_i of U_i, i = 1, 2, ..., (n-1), are specified and the a priori known covariances are satisfied. An alternative procedure is based on the so-called Kriging method, used in geostatistical applications and applied also to conditional simulation problems in earthquake engineering (see e.g. [98]). The Kriging method has been improved significantly (see e.g. [36]), which has made it theoretically clearer and computationally more efficient. The differences and similarities of the conditional probability density method and the (modified) Kriging method are discussed in [37], showing the equivalence of both procedures if the process is Gaussian with zero mean.

A quite general spectral representation utilized for Gaussian random processes and fields is the Karhunen-Loève expansion of the covariance function (see e.g. [54,33]). This representation is applicable for stationary (homogeneous) as well as for non-stationary (inhomogeneous) stochastic processes (fields). The expansion of a stochastic process (field) u(x, θ) takes the form

\[ u(x,\theta) = \bar{u}(x) + \sum_{i=1}^{\infty} \xi_i(\theta)\,\sqrt{\lambda_i}\,\varphi_i(x) \tag{1} \]

where the symbol θ indicates the random nature of the corresponding quantity, ū(x) denotes the mean, φ_i(x) are the eigenfunctions and λ_i the eigenvalues of the covariance function. The set {ξ_i(θ)} forms a set of orthogonal (uncorrelated) zero-mean random variables with unit variance. The Karhunen-Loève expansion is mean-square convergent irrespective of its probabilistic nature, provided it possesses a finite variance. For the important special case of a Gaussian process or field, the random variables {ξ_i(θ)} are independent standard normal random variables. In many practical applications where the random quantities vary smoothly with respect to time or space, only a few terms are necessary to capture the major part of the random fluctuation of the process. Its major advantage is the reduction from a large number of correlated random variables to a few most important uncorrelated ones. Hence this representation is especially suitable for band-limited colored excitation and for the stochastic FE representation of random media, where random variables are usually strongly correlated. It might also be utilized to represent the correlated stochastic response of MDOF systems by a few most important variables, hence achieving a space reduction. A generalization of the above Karhunen-Loève expansion has been proposed for applications where the covariance function is not known a priori (see [16,33,32]). The stochastic process (field) u(x, θ) then takes the form

\[ u(x,\theta) = a_0(x)\,\Gamma_0 + \sum_{i_1=1}^{\infty} a_{i_1}(x)\,\Gamma_1(\xi_{i_1}(\theta)) + \sum_{i_1=1}^{\infty}\sum_{i_2=1}^{i_1} a_{i_1 i_2}(x)\,\Gamma_2(\xi_{i_1}(\theta),\xi_{i_2}(\theta)) + \cdots \tag{2} \]

which is denoted as the Polynomial Chaos expansion. Introducing a one-to-one mapping to a set with ordered indices {Ψ_i(θ)} and truncating eqn. (2) after the p-th term, the above representation reads

\[ u(x,\theta) = \sum_{j=0}^{p} u_j(x)\,\Psi_j(\theta) \tag{3} \]

where the symbol Γ_n(ξ_{i_1}, ..., ξ_{i_n}) denotes the Polynomial Chaos of order n in the independent standard normal random variables. These polynomials are orthogonal, so that the expectation (or inner product) ⟨Ψ_i Ψ_j⟩ = δ_ij, where δ_ij is the Kronecker symbol. For the special case of a Gaussian random process, the above representation coincides with the Karhunen-Loève expansion. The Polynomial Chaos expansion is adjustable in two ways: increasing the number of random variables {ξ_i} results in a refinement of the random fluctuations, while an increase of the maximum order of the polynomial captures non-linear (non-Gaussian) behavior of the process. However, the relation between accuracy and numerical effort still remains to be shown.

The spectral representation by Fourier analysis is not well suited to describe local features in the time or space domain. This disadvantage is overcome by wavelet analysis, which provides an alternative way of breaking a signal down into its constituent parts. For more details on this approach, the reader is referred to [24,60].

In some cases of application, the physics or the data might be inconsistent with the Gaussian distribution. For such cases, non-Gaussian models have been developed employing various concepts to meet the desired target distribution as well as the target correlation structure (spectral density). Certainly the most straightforward procedure is the above-mentioned memoryless non-linear transformation of Gaussian processes utilizing the spectral representation. An alternative approach utilizes linear and non-linear filters to represent normal and non-Gaussian processes and fields excited by Gaussian white noise. Linear filters excited by polynomial forms of Poisson white noise have been developed in [59] and [34]. These procedures allow the evaluation of moments of arbitrary order without having to resort to closure techniques.
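Before moving on, the discrete form of the Karhunen-Loève reduction in eq. (1) is easy to demonstrate numerically. The following is a minimal sketch, not taken from the paper: it simulates a zero-mean Gaussian process on [0, 1] with an assumed exponential covariance kernel; the grid size, correlation length and 95% variance threshold are illustrative choices only.

```python
import numpy as np

# Discrete Karhunen-Loeve expansion of a zero-mean Gaussian process on [0, 1]
# with an assumed exponential covariance C(s, t) = exp(-|s - t| / lc).
n, lc = 200, 0.3                       # grid points, correlation length (assumed)
x = np.linspace(0.0, 1.0, n)
C = np.exp(-np.abs(x[:, None] - x[None, :]) / lc)

# Eigen-decomposition of the covariance matrix, weighted by dx to approximate
# the continuous eigenvalue problem; eigh returns ascending order, so reverse.
dx = x[1] - x[0]
lam, phi = np.linalg.eigh(C * dx)
lam, phi = lam[::-1], phi[:, ::-1]

# Truncate after the m leading terms that capture 95% of the total variance.
m = int(np.searchsorted(np.cumsum(lam) / lam.sum(), 0.95)) + 1

# u(x, theta) = sum_i xi_i sqrt(lambda_i) phi_i(x), xi_i ~ N(0, 1); cf. eq. (1)
# with zero mean. Few uncorrelated variables reproduce the correlated field.
rng = np.random.default_rng(0)
xi = rng.standard_normal(m)
u = (phi[:, :m] * np.sqrt(lam[:m])) @ xi
print(f"{m} KL terms retain 95% of the variance on a {n}-point grid")
```

The point of the sketch is the one made in the text: a strongly correlated field over 200 grid points collapses to a handful of independent standard normal variables.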
Non-linear filters are utilized to generate a stationary non-Gaussian stochastic process in agreement with a given first-order probability density function and the spectral density [48,15]. In the Kontorovich-Lyandres procedure, as used in [48], the drift and diffusion coefficients are selected such that the solution fits the target probability density, and the parameters in the solution form are then adjusted to approximate the target spectral density. The approach by Cai and Lin [15] simplifies this procedure by matching the spectral density through adjusting only the drift coefficients, which is then followed by adjusting the diffusion coefficient to approximate the distribution of the process. The latter approach is especially suitable and computationally highly efficient for the long-term simulation of stationary stochastic processes, since the computational expense increases only linearly with the number n of discrete sample points, while the spectral approach has a growth rate of n ln n when applying the efficient FFT technique. For generating samples of the non-linear filter represented by stochastic differential equations (SDEs), well-developed numerical procedures are available (see e.g. [47]).

3 Response of Stochastic Systems

The assessment of the stochastic response is the main theme in stochastic mechanics. Contrary to the representation of stochastic processes and fields, which is designed to fit available statistical data and information, the output of the mathematical model is not prescribed and needs to be determined in some stochastic sense. Hence the mathematical model cannot be selected freely but is specified a priori. For stochastic systems, the model involves random system parameters and/or random loading. Please note that, due to space limitations, the question of model validation cannot be treated here.
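To make the notion of assessing the stochastic response concrete, the following minimal sketch, which is not from the paper, propagates a single random parameter (a lognormal stiffness) through a deterministic time-stepping analysis of a linear SDOF oscillator and estimates the first two moments of the peak response by blind MCS. All numerical values are assumed for illustration.

```python
import numpy as np

# Indiscriminate (blind) Monte Carlo simulation: draw random parameters,
# run a deterministic analysis per sample, estimate response statistics.
rng = np.random.default_rng(1)
n_samples = 500
m, c = 1.0, 0.4                                   # mass, damping (assumed)
k = rng.lognormal(mean=np.log(40.0), sigma=0.1, size=n_samples)  # random stiffness

t = np.linspace(0.0, 10.0, 1001)
dt = t[1] - t[0]
load = np.sin(2.0 * np.pi * t)                    # deterministic excitation (assumed)

peaks = np.empty(n_samples)
for i in range(n_samples):
    u = v = u_max = 0.0
    for f in load:                                # semi-implicit Euler stepping
        a = (f - c * v - k[i] * u) / m
        v += a * dt
        u += v * dt
        u_max = max(u_max, abs(u))
    peaks[i] = u_max

print(f"E[max|u|] ~ {peaks.mean():.4f}, Std ~ {peaks.std(ddof=1):.4f}")
```

Nothing in the loop depends on the response being linear or Gaussian, which is exactly the robustness argument made for MCS below.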
For the characterization of the available numerical procedures, some classification with regard to the structural model, the loading and the description of the stochastic response is most instrumental. Concerning the structural model, a distinction between the properties, i.e. whether the model is deterministic or stochastic, linear or non-linear, as well as the number of degrees of freedom (DOF) involved, is essential. As a criterion for the feasibility of a particular numerical procedure, the number of DOFs of the structural system is one of the most crucial parameters. Therefore, a distinction between dynamical-system models and general FE discretizations is suggested, where dynamical systems are associated with a low state-space dimension of the structural model. FE discretization has no essential restriction regarding its number of DOFs. The stochastic loading can be grouped into static and dynamic loading. Stochastic dynamic loading might be characterized further by its distribution and correlation and by its independence of or dependence on the response, resulting in categorizations such as Gaussian and non-Gaussian, stationary and non-stationary, white noise or colored, additive and multiplicative (parametric) excitation properties. Apart from the mathematical model, the terms in which the stochastic response is required play an essential role, ranging from assessing the first two moments of the response to reliability assessments and stability analysis. The large number of possibilities for evaluating the stochastic response as outlined above does not allow for a discussion of the entire subject. Therefore only some selected advances and new directions will be addressed. As already mentioned above, one can distinguish between two main categories of computational procedures treating the response of stochastic systems. The first is based on Monte Carlo simulation, and the second provides numerical solutions of analytical procedures for obtaining quantitative results. Regarding the numerical solutions of analytical procedures, a clear distinction between dynamical-system models and FE models should be made. Current research efforts in stochastic dynamics focus to a large extent on dynamical-system models, while there are few new numerical approaches concerning the evaluation of the stochastic dynamic response of, e.g., FE models. Numerical solutions of the Kolmogorov equations are typical examples belonging to dynamical-system models, where available approaches are computationally feasible only for state-space dimensions one to three, and in exceptional cases for dimension four. Galerkin's method, the Finite Element (FE) method and Path Integral methods, respectively, are generally used to solve numerically the forward (Fokker-Planck) and backward Kolmogorov equations. For example, in [8,92] the FE approach is employed for stationary and transient solutions, respectively, of the mentioned forward and backward equations for second-order systems. First-passage probabilities have been obtained by employing a Petrov-Galerkin FE method to solve the backward and the related Pontryagin-Vitt equations. An instructive comparison of the computational efforts of Monte Carlo simulation and the FE method is given, e.g., in an earlier IASSAR report [85].

The Path Integral method follows the evolution of the (transition) probability function over short time intervals, exploiting the fact that short-time transition probabilities for normal white noise excitations are locally Gaussian distributed. All existing path integration procedures utilize certain interpolation schemes, where the probability density function (PDF) is represented by values at discrete grid points. In a wider sense, cell mapping methods (see e.g. [38,39]) can be regarded as special setups of the path integral procedure.

As documented in [9], the cumulant neglect closure described in section 7.3 has been automated. Computational procedures for the automated generation and solution of the closed set of moment equations have been developed. The method can be employed for an arbitrary number of states and closed at arbitrary levels. The approach, however, is limited by the available computational resources, since the computational cost grows exponentially with respect to the number of states and the selected closure level.

The above-discussed developments of numerical procedures deal with low-dimensional dynamical systems, which are employed for investigating strong non-linear behavior subjected to (Gaussian) white noise excitation. Although dynamical-system formulations are quite general and extendible to treat non-Gaussian and colored (filtered) excitation of larger systems, the computational expense grows exponentially, rendering most numerical approaches unfeasible for larger systems. This so-called "curse of dimensionality" has not yet been overcome, and it is questionable whether it ever will be, despite the fast-developing computational possibilities.

For this reason, the alternative approach based on Monte Carlo simulation (MCS) gains importance. Several aspects favor procedures based on MCS in engineering applications: (1) a considerably smaller growth rate of the computational effort with dimensionality than for analytical procedures; (2) general applicability, good suitability for parallel processing (see section 5.1) and computational straightforwardness; (3) non-linear complex behavior does not complicate the basic procedure; (4) manageability for complex systems.

Contrary to numerical solutions of analytical procedures, the employed structural model and the type of stochastic loading do not play a decisive role for MCS. For this reason, MCS procedures might be structured according to their purpose, i.e. whether sample functions are generated for the estimation of the overall distribution or for generating rare adverse events for an efficient reliability assessment. In the former case, the probability space is covered uniformly by an indiscriminate (blind) generation of sample functions representing the random quantities. Basically, a set of random variables is generated by a pseudo-random number generator, followed by a deterministic structural analysis. Based on the generated random numbers, realizations of the random processes, fields and waves addressed in section 2 are constructed and utilized without any further modification in the subsequent structural analysis.

The situation may not be considered straightforward, however, in the case of discriminate MCS for the reliability estimation of structures, where the rare events contributing considerably to the failure probability must be generated. Since the effectiveness of direct indiscriminate MCS is not satisfactory for producing a statistically relevant number of low-probability realizations in the failure domain, the generation of samples is restricted or guided in some way. The most important class are the variance reduction techniques, which operate on the probability of realizations of the random variables. The most widely used representative of this class in structural reliability assessment is importance sampling, where a suitable sampling distribution controls the generation of realizations in the probability space. The challenge in importance sampling is the construction of a suitable sampling distribution, which depends in general on the specific structural system and on the failure domain (see e.g. [84]). Hence, the generation of sample functions is no longer independent of the structural system and the failure criterion, as it is for indiscriminate direct MCS. Due to these dependencies, computational procedures for an automated establishment of sampling distributions are urgently needed. Adaptive numerical strategies utilizing importance directional sampling (e.g. [11]) are steps in this direction. The effectiveness of the importance sampling approach depends crucially on the complexity of the system response as well as on the number of random variables (see also section 5.2). Static problems (linear and nonlinear) with few random variables might be treated effectively by this approach. Linear systems where the randomness is represented by a large number of RVs can also be treated efficiently by employing first-order reliability methods (see e.g. [27]). This approach, however, is questionable for the case of non-linear stochastic dynamics involving a large set of random variables, where the computational effort required to establish a suitable sampling distribution might exceed the effort needed for indiscriminate direct MCS.

Instead of controlling the realization of random variables, the evolution of the generated samples can alternatively be controlled [68]. This approach is limited to stochastic processes and fields with Markovian properties and utilizes an evolutionary programming technique for the generation of more "important" realizations in the low-probability domain. It is especially suitable for white noise excitation and non-linear systems, where importance sampling is rather difficult to apply. Although the approach cannot deal with spectral representations of stochastic processes, it is capable of making use of linearly and non-linearly filtered excitation. Again, this is just contrary to importance sampling, which can be applied to spectral representations but not to white-noise-filtered excitation.

4 Stochastic Finite Elements

As the name suggests, stochastic finite elements are structural models represented by finite elements whose properties involve randomness. In static analysis, the stiffness matrix might be random due to unpredictable variations of some material properties, random coupling strength between structural components, uncertain boundary conditions, etc. For buckling analysis, shape imperfections of the structure have an additional important effect on the buckling load [76]. Considering structural dynamics, in addition to the stiffness matrix, the damping properties and sometimes also the mass matrix might not be predictable with certainty.

When discussing numerical stochastic finite element procedures, two categories should be distinguished clearly. The first is the representation of stochastic finite elements and their global assemblage as random structural matrices. The second category addresses the evaluation of the stochastic response of the FE model due to its randomness.

Focusing first on the stochastic FE representation, several representations, such as the midpoint method [35], the interpolation method [53], the local average method [97], as well as the weighted-integral method [94,25,26], have been developed to describe spatial random fluctuations within the element. As a tendency, the midpoint method leads to an overestimation of the variance of the response, the local average method to an underestimation, and the weighted-integral method leads to the most accurate results. Moreover, the so-called mesh-size problem can be resolved utilizing this representation. After assembling all finite elements, the random structural stiffness matrix K, taken as a representative example, assumes the form

\[ K(\alpha) = \bar{K} + \sum_{i=1}^{n} K_i^{\mathrm{I}}\,\alpha_i + \sum_{i=1}^{n}\sum_{j=1}^{n} K_{ij}^{\mathrm{II}}\,\alpha_i\alpha_j + \cdots \tag{4} \]

where K̄ is the mean of the matrix, K_i^I and K_ij^II denote the deterministic first and second rates of change with respect to the zero-mean random variables α_i and α_j, and n is the total number of random variables. For normally distributed sets of random variables {α}, the correlated set can be represented advantageously by the Karhunen-Loève expansion [33], and for non-Gaussian distributed random variables by its Polynomial Chaos expansion [32],

\[ K(\theta) = \bar{K} + \sum_{i=0}^{M} \hat{K}_i\,\Psi_i(\theta) \tag{5} \]

where M denotes the total number of chaos polynomials, K̂_i the associated deterministic fluctuation of the matrix, and Ψ_i(θ) a polynomial of standard normal random variables ξ_j(θ), where θ indicates the random nature of the associated variable.

In a second step, the random response of the stochastic structural system is determined. The most widely used procedure for evaluating the stochastic response is the well-established perturbation approach (see e.g. [53]). It is well adapted to the FE formulation and capable of evaluating first and second moment properties of the response in an efficient manner. The approach, however, is justified only for small deviations from the center value. Since this assumption is satisfied in most practical applications, the obtained first two moment properties are evaluated satisfactorily. However, the tails of the
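The discussion of the perturbation approach breaks off above; nevertheless, its first-order form can be illustrated compactly. The following minimal sketch is assembled for illustration rather than taken from the paper: it applies a first-order expansion of eq. (4) to a small static system, obtaining the mean response from the mean stiffness matrix and the response covariance from the sensitivities with respect to the zero-mean variables α_i. All matrices and the covariance of α are assumed.

```python
import numpy as np

# First-order perturbation for a static stochastic FE problem K(alpha) u = f,
# with K(alpha) = K0 + sum_i KI[i] * alpha_i as in eq. (4), second-order terms
# neglected. The 3-DOF system and all numbers are assumed for illustration.
K0 = np.array([[ 2.0, -1.0,  0.0],
               [-1.0,  2.0, -1.0],
               [ 0.0, -1.0,  1.0]])
KI = [0.1 * K0, np.diag([0.2, 0.0, 0.0])]      # first-rate-of-change matrices
cov_a = np.array([[0.04, 0.01],
                  [0.01, 0.09]])                # Cov(alpha_i, alpha_j), zero mean
f = np.array([0.0, 0.0, 1.0])

u0 = np.linalg.solve(K0, f)                     # mean (zeroth-order) response
# Sensitivities du/d(alpha_i) = -K0^{-1} KI_i u0, evaluated at the center value.
du = np.column_stack([-np.linalg.solve(K0, Ki @ u0) for Ki in KI])
cov_u = du @ cov_a @ du.T                       # first-order response covariance

print("mean displacements:", u0)
print("std of displacements:", np.sqrt(np.diag(cov_u)))
```

As the text notes, such a first-order scheme captures the first two moments well near the center value, but says little about the tails of the response distribution.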
Open Stoping with Subsequent Backfill Mining Method

English answer:

The in-situ filling mining method is a mining technique that involves filling the mined-out areas with waste materials or backfill materials to support the surrounding rock and prevent collapse. This method is commonly used in underground mining operations where the extracted ore leaves voids behind.

One advantage of the in-situ filling mining method is that it provides structural support to the underground mine, reducing the risk of cave-ins and ensuring the safety of miners. By filling the empty spaces with waste materials or backfill materials, the stability of the remaining rock is enhanced, making the mining operation safer and more efficient.

Another benefit of this mining method is the environmental advantage it offers. Instead of leaving the mined-out areas open and exposed, the in-situ filling mining method allows for the reclamation of the land. By filling the voids with waste materials, the land can be restored and used for other purposes such as agriculture or construction.

Let me give you an example to illustrate how the in-situ filling mining method works. Imagine a scenario where a company is mining for gold in an underground mine. As the gold is extracted, voids are created in the mine. To prevent the collapse of the mine and ensure the safety of the miners, the company decides to use the in-situ filling mining method.

They start by identifying suitable waste materials or backfill materials that can be used to fill the voids. These materials can be anything from sand and gravel to mine tailings or even cement. Once the materials are identified, they are transported to the mine and placed into the voids using specialized equipment.

As the voids are filled, the stability of the mine improves, reducing the risk of cave-ins. The waste materials or backfill materials also provide support to the remaining rock, preventing it from collapsing. This allows the mining operation to continue safely and efficiently.

Chinese answer (translated): Open stoping with subsequent backfill is a mining technique in which the mined-out areas are filled with waste or backfill materials to support the surrounding rock and prevent collapse.
First-Principles Research on the Physical Properties of the Wide-Bandgap Semiconductor ZnS

Abstract (Chinese version, translated): Zinc sulfide (ZnS) is a novel II-VI wide-bandgap, electron-excess intrinsic semiconductor material with a band gap of 3.67 eV and good photoluminescence and electroluminescence properties. At room temperature the band gap is about 3.7 eV, and the material offers good optical transmission and low dispersion in the visible and infrared range. ZnS and ZnS-based alloys have attracted increasingly broad attention in semiconductor research. Owing to their wide direct band gap and large exciton binding energy, they hold great promise for optoelectronic devices. This thesis reviews the current state of research on the wide-bandgap semiconductor ZnS at home and abroad, together with its structural properties and technological applications. It presents the fundamentals of density functional theory, summarizes in detail the theoretical basis of first-principles calculations, and employs the plane-wave pseudopotential method within the generalized gradient approximation (GGA) of density functional theory, as implemented in the CASTEP code, to calculate the electronic structure and optical properties of zincblende ZnS. The electronic structure comprises the band structure and density of states of the zincblende ZnS crystal; the optical properties comprise the reflectivity, absorption spectrum, complex refractive index, dielectric function, optical conductivity spectrum and energy-loss function spectrum. The study of the band structure shows that zincblende ZnS is a direct-bandgap semiconductor, and the analysis of the series of optical spectra provides a good basis for predictions in further studies of zincblende ZnS.
Keywords: ZnS; wide-bandgap semiconductor; first-principles; zincblende structure

First-principles Research on Physical Properties of the Wide Bandgap Semiconductor ZnS

Abstract

Zinc sulfide (ZnS) is a new member of the II-VI family of wide-bandgap, electron-excess intrinsic semiconductor materials, with good photoluminescence and electroluminescence properties. At room temperature its band gap is 3.7 eV, and it shows good optical transmission and low dispersion in the visible and infrared range. ZnS and ZnS-based alloys have been paid more and more attention in the field of semiconductor research. Because of their wide direct band gap and large exciton binding energy, they have good prospects for optoelectronic devices.

This thesis describes the current research status, the structural properties and the technical applications of the wide-bandgap semiconductor ZnS. It describes the basic principles of density functional theory, gives a detailed summary of the theoretical basis of first-principles calculations, and applies the plane-wave pseudopotential method within the generalized gradient approximation (GGA) of density functional theory, using the CASTEP code, to calculate the electronic structure and optical properties of the zincblende ZnS crystal. The electronic structure includes the band structure and density of states of zincblende ZnS; the optical properties include the reflectance, absorption spectrum, complex refractive index, dielectric function, optical conductivity spectrum and loss function spectrum. From the study of the band structure it is known that zincblende ZnS is a direct-bandgap semiconductor, and the series of analyses of the optical spectra provides a good basis for predictions in further studies of zincblende ZnS.

Keywords: ZnS; wide-bandgap semiconductor; first-principles; zincblende structure

Contents (translated; page numbers omitted)
Abstract (Chinese)
Abstract (English)
Chapter 1  Introduction
  1.1 Research background of ZnS semiconductor materials
  1.2 Basic properties and applications of ZnS
  1.3 Research directions and progress of ZnS materials
  1.4 ZnS crystals
    1.4.1 Crystal structure of ZnS
    1.4.2 Band structure of ZnS
  1.5 Luminescence mechanism of ZnS
  1.6 Purpose and main content of this study
Chapter 2
  2.1 Theoretical background
    2.1.1 Density functional theory
    2.1.2 Approximations to the exchange-correlation functional
  2.2 Total energy calculations
    2.2.1 Pseudopotential plane-wave method
    2.2.2 Structure optimization
  2.3 Features of the CASTEP package
Chapter 3  Electronic structure and optical properties of ZnS crystals
  3.1 Electronic structure of zincblende ZnS
    3.1.1 Lattice structure
    3.1.2 Band structure
    3.1.3 Density of states
  3.2 Optical properties of the zincblende ZnS crystal
Conclusion
Acknowledgements
References
Appendix A
Appendix B

Chapter 1  Introduction

1.1 Research background of ZnS semiconductor materials

Si is the most widely used semiconductor material; the successful spread of modern large-scale integrated circuits rests on the breakthroughs of Si semiconductors in electronic devices.
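As a small illustration of the starting point of such calculations, the sketch below builds the two-atom zincblende ZnS primitive cell. It uses the third-party ASE package rather than CASTEP itself (ASE is assumed to be installed), and the commonly cited experimental lattice constant of 5.41 Å stands in for a value that a GGA run would relax.

```python
from ase.build import bulk

# Zincblende ZnS primitive cell; a = 5.41 Angstrom is the experimental value,
# used here only as a reasonable starting guess before structure optimization.
zns = bulk("ZnS", crystalstructure="zincblende", a=5.41)

print(zns.get_chemical_formula())     # -> ZnS (two-atom primitive cell)
print(zns.cell.cellpar())             # lattice parameters a, b, c, alpha, beta, gamma
print(zns.get_scaled_positions())     # fractional basis: Zn (0,0,0), S (1/4,1/4,1/4)
```

From such an Atoms object, ASE can export input for a plane-wave DFT code, where the GGA band structure and density of states described in the abstract would then be computed.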
2024 Yunnan Province Grade Five English Midterm Review Paper (First Semester), with Answers and Guidance

Part One: Listening (12 questions, 2 points each, 24 points)

1. What is the weather like today?
A. It's sunny.
B. It's cloudy.
C. It's rainy.
Answer: A
Explanation: The question asks about the weather today. The correct answer is "It's sunny," which means option A.

2. How do you spell "cat"?
A. K-A-T
B. C-A-T
C. Q-A-T
Answer: B
Explanation: The question asks how to spell the word "cat." The correct answer is "C-A-T," which means option B.

3. Listen to the dialogue and answer the question.
A. What is the weather like today?
B. Where is Tom going?
C. How is Tom feeling?
Answer: A. What is the weather like today?
Explanation: In the dialogue, the speaker mentions, "It's a sunny day, isn't it?" which indicates that they are discussing the weather. Therefore, the correct answer is about the weather.

4. Listen to the story and choose the correct option.
A. The story is about a dog.
B. The story is about a cat.
C. The story is about a rabbit.
Answer: B. The story is about a cat.
Explanation: The story begins with the line, "Once upon a time, there was a cat named Whiskers," which clearly identifies the main character of the story. Thus, the correct answer is that the story is about a cat.

5. What is the name of the capital city of France?
A) London
B) Paris
C) Rome
D) Berlin
Answer: B) Paris
Explanation: The capital city of France is Paris, making option B the correct answer. London is the capital of the United Kingdom, Rome is the capital of Italy, and Berlin is the capital of Germany, which makes options A, C, and D incorrect.

6. Listen to the following conversation between two students and choose the best answer to the question below.
Student A: Hi, John. How was your science project?
Student B: It was great! We learned about different layers of the Earth.
Student A: That sounds interesting. What layer did you focus on?
Student B: The crust. It's the outermost layer of the Earth and varies in thickness.
What did Student B focus on in their science project?
A) The atmosphere
B) The crust
C) The ocean
D) The core
Answer: B) The crust
Explanation: In the conversation, Student B mentions that they focused on the crust, which is the outermost layer of the Earth. Therefore, option B is the correct answer. The atmosphere refers to the layers of gases surrounding the Earth, the ocean is a body of saltwater, and the core is the innermost layer of the Earth; these are not mentioned in the conversation, which makes options A, C, and D incorrect.

7. You will hear a conversation between a student and a teacher about a school project. Listen carefully and choose the correct answer.
A. The project is about painting.
B. The project is due next week.
C. The project requires group work.
Answer: B. The project is due next week.
Explanation: The teacher mentions in the conversation that the project is due on Friday, which is next week.

8. Listen to a short passage about different sports. Choose the sport that is mentioned in the passage.
A. Basketball
B. Swimming
C. Tennis
Answer: C. Tennis
Explanation: The passage mentions that tennis is a popular sport for both adults and children. Basketball and swimming are not mentioned in the passage.

9. You will hear a short conversation between two students about their weekend plans. Listen carefully and choose the best answer to the question.
Question: What does the second student plan to do on Saturday afternoon?
A) Go to the movies.
B) Visit a museum.
C) Go shopping.
Answer: C) Go shopping.
Explanation: In the conversation, the second student mentions, "I think I'll go shopping on Saturday afternoon," which indicates the correct answer.

10. You will hear a short dialogue between a teacher and a student discussing the student's homework. Listen carefully and answer the following question.
Question: What is the main issue with the student's homework?
A) The student didn't do it.
B) The student turned it in late.
C) The student didn't follow the instructions.
Answer: C) The student didn't follow the instructions.
Explanation: The teacher says, "Your homework seems to be missing some details. Did you not follow the instructions?" This indicates that the main issue is the student not following the given instructions for the homework.

11. What is the weather like today?
A. It's sunny.
B. It's cloudy.
C. It's rainy.
Answer: A
Explanation: Listen to the question and choose the correct answer. The question asks about the weather today, and the correct answer is "It's sunny."

12. Who is playing the piano in the music room?
A. The teacher.
B. The students.
C. The principal.
Answer: B
Explanation: Listen to the question and choose the correct answer. The question asks who is playing the piano in the music room. The correct answer is "The students," as it is common for students to play musical instruments in the music room.

Part Two: Multiple Choice (12 questions, 2 points each, 24 points)

1. Choose the word that does not belong in the following group:
A. apple
B. banana
C. tomato
D. grape
Answer: C
Explanation: The words "apple," "banana," and "grape" are all types of fruit. "Tomato," on the other hand, is often considered a vegetable in culinary terms, so it does not belong in this group of fruits.

2. Select the sentence that correctly uses the past tense:
A. I watch television yesterday.
B. She go to the park last week.
C. We visited the museum last weekend.
D. They were eating ice cream at the movie theater.
Answer: C
Explanation: The correct past tense sentence is "We visited the museum last weekend." The other options contain grammatical errors: "I watch television yesterday" should be "I watched television yesterday," and "She go to the park last week" should be "She went to the park last week," while "They were eating ice cream at the movie theater" is in the past continuous rather than the simple past tense.

3. Choose the correct word to complete the sentence.
Tom is going to the beach with his family. He will need a _.
A. hat
B. pen
C. umbrella
D. shoe
Answer: A
Explanation: The correct answer is "hat" because it is an item that is commonly used at the beach to protect from the sun. The other options (pen, umbrella, shoe) are not typically associated with beach activities.

4. Select the sentence that correctly uses the past tense.
A. She runs quickly.
B. She is running quickly.
C. She had run quickly.
D. She will run quickly.
Answer: C
Explanation: The correct answer is "She had run quickly," as it is in the past perfect tense, indicating that the action of running quickly had already occurred before another past action. The other options are either in the present continuous tense (B) or the future simple tense (D), which do not convey past completion of the action. The present tense (A) is also incorrect, as it does not indicate a past action.

5. What is the capital city of France?
A. London
B. Paris
C. Rome
D. Berlin
Answer: B
Explanation: The capital city of France is Paris. London is the capital of the United Kingdom, Rome is the capital of Italy, and Berlin is the capital of Germany.

6. Which of the following is a planet in our solar system?
A. Venus
B. Jupiter
C. Moon
D. Sun
Answer: A
Explanation: Venus is one of the eight planets in our solar system. Jupiter is also a planet, but the Moon is Earth's only natural satellite, not a planet. The Sun is the star at the center of our solar system, not a planet.

7. What is the capital city of France?
A. London
B. Paris
C. Rome
D. Madrid
Answer: B
Explanation: The capital city of France is Paris. London is the capital of the United Kingdom, Rome is the capital of Italy, and Madrid is the capital of Spain.

8. Which of the following is a compound word?
A. Apple
B. Banana
C. Swim
D. Happy
Answer: D
Explanation: "Happy" is a compound word because it is formed by combining two words, "hap" and "py." "Apple" and "banana" are single words, and "swim" is also a single word but can be a verb or a noun depending on the context.

9. What is the main difference between a noun and a verb?
A) Nouns are people, verbs are actions.
B) Nouns are objects, verbs are subjects.
C) Nouns are things, verbs show action or state.
D) Nouns are feelings, verbs are thoughts.
Answer: C
Explanation: The main difference between a noun and a verb is that nouns are used to name people, places, things, and ideas, while verbs show action or state. Therefore, option C is the correct answer.

10. Which sentence correctly uses the word "because" as a conjunction?
A) The cat is sleeping, because it's tired.
B) The cat is sleeping; because it's tired.
C) The cat is sleeping, it's tired.
D) The cat is sleeping; it's tired, because.
Answer: A
Explanation: The word "because" is a conjunction used to introduce a reason for something. In option A, "because" is correctly placed before the reason ("it's tired") to show the cause-and-effect relationship. Options B and D have incorrect punctuation, and option C does not use "because" as a conjunction.

11. What is the capital city of France?
A. New York
B. London
C. Paris
D. Tokyo
Answer: C. Paris
Explanation: Paris is the capital city of France. New York is a major city in, but not the capital of, the United States; London is the capital city of the United Kingdom, and Tokyo is the capital city of Japan.

12. Which of the following is not a planet in our solar system?
A. Mercury
B. Venus
C. Earth
D. Pluto
Answer: D. Pluto
Explanation: Pluto is no longer considered a planet in our solar system. It was reclassified as a dwarf planet in 2006. Mercury, Venus, and Earth are all major planets in our solar system.

Part Three: Cloze (10 points)

Task 3: Cloze Test
Read the passage and choose the best word to fill in each blank from the options given.

The story of "The Ugly Duckling" is a classic tale that teaches us about self-acceptance and the importance of being true to ourselves. One day, a mother duck had a special egg that was different from all the others. The egg was very large, and it took a long time for it to hatch. When the little duck finally came out, the other ducks were surprised to see it was ___________.
1. A) beautiful  2. B) cute  3. C) unique  4. D) ugly

After hatching, the little duck felt ___________ because it was ___________ compared to the other ducks. It had ___________ feathers and was not as fast as them. The other ducks would laugh at it and call it names.
5. A) happy  6. B) sad  7. C) energetic  8. D) curious

The little duck tried to ___________ with the other ducks, but it was difficult. It spent many days feeling ___________ and wondering if it would ever ___________.
9. A) blend in  10. B) escape  11. C) become friends  12. D) find a home

One day, the little duck saw a group of beautiful swans on a nearby lake. It was then that it realized that it was not as ___________ as it thought. The little duck finally ___________.
13. A) realized its true beauty  14. B) swam away  15. C) hid from the other ducks  16. D) lost hope

Answer Key:
1. D) ugly
2. B) sad
3. D) ugly
4. C) unique
5. B) sad
6. B) sad
7. A) blend in
8. B) sad
9. D) find a home
10. B) sad
11. A) realized its true beauty

Part Four: Reading Comprehension (26 points)

Title: The Magic of Books

Once upon a time, in a small village, there was a young boy named Tom. Tom was an avid reader and loved nothing more than escaping into the magical worlds of books. Every day after school, he would rush home to dive into another adventure. His favorite book series was about a group of friends who traveled through time and space, solving mysteries along the way.

One day, Tom's teacher, Mrs. Green, announced that the class was going to have a special project. They were asked to choose a book that they loved and write a report on it. Tom was excited about this project because he had just started reading a new book called "The Enchanted Forest." This book was about a girl named Lily who discovers a hidden world behind her house and goes on a quest to save her village.

Here is an excerpt from Tom's report:
"I recently read a captivating book called 'The Enchanted Forest' by Sarah Johnson. The story is about a young girl named Lily who discovers an enchanted forest behind her house. Lily's world turns upside down when she learns that the forest is in danger of being destroyed. With the help of her friends, she embarks on a quest to save the forest and restore balance to her village."

Questions:
1. What is the name of the young boy in the story?
a) Lily
b) Tom
c) Mrs. Green
d) Sarah Johnson

2. What is Tom's favorite book series?
a) "The Enchanted Forest"
b) "Time and Space Mysteries"
c) "Lily's Quest"
d) "Tom's Adventures"

3. What is the main goal of Lily in the story?
a) To find her missing pet
b) To save her village from danger
c) To become the new queen of the enchanted forest
d) To win a contest

Answers:
1. b) Tom
2. b) "Time and Space Mysteries"
3. b) To save her village from danger

Part Five: Writing (16 points)

Title: Why Spring Is My Favorite Season

Spring is my favorite season of the year. It's the time when nature comes back to life after the cold winter months. The air becomes warmer, and everything starts to grow again. I love seeing the flowers bloom and the trees turn green. It's also the perfect time for outdoor activities like picnics in the park or bike rides with friends.

During spring, I enjoy spending more time outside playing sports and games with my family. The smell of fresh grass and blooming flowers fills my heart with joy. Spring gives me energy and hope, making every day feel new and exciting. When spring arrives, it feels like a fresh start, and that's why it's my most beloved season.

Analysis:
This essay meets the requirements by:
• Sticking to the topic of a favorite season.
• Providing specific reasons why spring is favored (nature coming back to life, outdoor activities).
• Using descriptive language to convey personal feelings (flowers bloom, trees turn green, joy).
• Keeping within the word limit while providing enough detail to paint a clear picture.
• Expressing emotions and personal experiences associated with the season.
The student demonstrates good writing skills by using varied vocabulary and sentence structures, making the essay engaging and coherent.
Rare Metals (English edition of the Journal of Rare Metals): an article on residual stress

Title: Residual Stress Analysis in Advanced Materials: A Review

Residual stress, a prevalent phenomenon in advanced materials, plays a crucial role in determining their mechanical properties and performance characteristics. Understanding and controlling residual stress have become imperative for optimizing the reliability and functionality of various engineering components. In this review, we delve into recent advancements in residual stress analysis methodologies, focusing on their applications in the field of rare metals.

X-ray diffraction (XRD) techniques have emerged as powerful tools for quantifying residual stresses in metallic materials. By analyzing the diffraction patterns produced by X-rays interacting with the crystalline lattice, researchers can accurately determine the magnitude and distribution of residual stresses within a material. Moreover, synchrotron X-ray sources have enabled high-resolution mapping of residual stresses in intricate microstructures, providing valuable insights for material design and process optimization.

Neutron diffraction represents another indispensable technique for residual stress characterization, particularly in materials with large penetration depths or complex geometries. Neutrons, being highly penetrating and non-destructive, can interrogate bulk materials without significant sample preparation. Recent developments in neutron scattering instrumentation have facilitated in-situ stress monitoring during material processing, offering unprecedented opportunities for real-time quality control and defect detection.

Furthermore, finite element analysis (FEA) has become indispensable for simulating and predicting residual stress distributions in advanced materials. By discretizing the material domain into finite elements and solving the equilibrium equations iteratively, FEA can accurately model the thermo-mechanical processes involved in residual stress generation. Incorporating advanced material models and boundary conditions, researchers can simulate various manufacturing scenarios to optimize processing parameters and minimize residual stress-induced defects.

In addition to experimental and numerical techniques, analytical models provide valuable insights into the underlying mechanisms governing residual stress formation. The superposition principle, plasticity theory, and phase transformation kinetics are commonly employed to formulate closed-form expressions for residual stress prediction in specific material systems. Although analytical models may lack the predictive accuracy of experimental and numerical methods, they offer valuable qualitative understanding and insights for guiding subsequent investigations.

Moreover, advancements in material synthesis and processing techniques have enabled tailored manipulation of residual stresses in rare metal alloys. Additive manufacturing processes, such as selective laser melting (SLM) and electron beam melting (EBM), offer unprecedented control over thermal gradients and cooling rates, thereby allowing precise adjustment of residual stress distributions within printed components. Furthermore, post-processing treatments, including shot peening and annealing, can alleviate residual stresses and enhance the mechanical properties of rare metal alloys.

In conclusion, residual stress analysis remains a critical aspect of material characterization and process optimization in the field of rare metals.
By leveraging advanced experimental techniques, numerical simulations, and analytical models, researchers can gain deeper insights into the origins and implications of residual stresses in advanced materials. Moreover, the synergistic integration of material synthesis and processing techniques enables precise manipulation of residual stress distributions, thereby unlocking new opportunities for enhancing the performance and reliability of rare metal components in diverse engineering applications.
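As an illustration of how XRD data of this kind are commonly reduced to a stress value, the following sketch applies the classical sin²ψ method under a biaxial stress assumption. It is not taken from the article; the elastic constants and the "measured" lattice spacings are made up for demonstration.

```python
import numpy as np

# Classical sin^2(psi) evaluation for XRD residual stress: lattice strain varies
# linearly with sin^2(psi), and for a biaxial stress state the slope of that
# line equals (1 + nu) / E * sigma. All input numbers below are assumed.
E, nu = 210e9, 0.30                    # elastic constants of the material (Pa)
d0 = 1.1702e-10                        # unstressed lattice spacing (m)

psi = np.deg2rad([0.0, 15.0, 30.0, 45.0])                # tilt angles
d = np.array([1.1702, 1.1706, 1.1714, 1.1722]) * 1e-10   # measured spacings (made up)

strain = (d - d0) / d0
slope, _ = np.polyfit(np.sin(psi) ** 2, strain, 1)       # linear fit vs sin^2(psi)

sigma = slope * E / (1.0 + nu)         # in-plane residual stress estimate
print(f"estimated residual stress: {sigma / 1e6:.0f} MPa")
```

In practice the same linear-regression step underlies most laboratory XRD stress analyses; synchrotron and neutron measurements differ mainly in how the strains themselves are obtained.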
How Hidden Markov Models Are Used in Finance

The hidden Markov model (HMM) is a probabilistic model for sequential data. Its application in the financial domain has received wide attention and study. In financial markets, hidden Markov models can be used to forecast stock price movements, to manage risk, and to identify latent investment opportunities. This article starts from the basic principles of the hidden Markov model and then examines in depth how it is used in finance.

A hidden Markov model is a doubly stochastic process consisting of two random processes: a hidden state sequence and an observable output sequence. In financial markets, the hidden states can be understood as the true state of the market, while the observable outputs are market data such as quotes. The basic assumption of the HMM is that the process generating the observable sequence depends on the corresponding hidden state sequence, and that the hidden state sequence is the output of a Markov chain. Modeling these two processes helps us understand the market's underlying regularities and its future direction.

In finance, hidden Markov models are mainly used for modeling and analyzing time series data. For example, we can use a hidden Markov model to forecast the movement of stock prices. Taking the historical price series as the observable sequence, the HMM lets us infer the hidden states behind the prices and thereby predict future price movements. To some extent this helps investors formulate more accurate trading strategies and improve investment returns.

Hidden Markov models also play an important role in financial risk management. The volatility of financial markets is complex and hard to predict, and a hidden Markov model can help us model and forecast market volatility. By analyzing market volatility, we can better identify and manage risk, thereby reducing the volatility and losses of an investment portfolio.

In addition, hidden Markov models can be used to identify latent investment opportunities. The uncertainty and complexity of financial markets make it difficult for investors to judge market changes and opportunities accurately. A hidden Markov model can help us mine latent investment opportunities from massive market data and provide more accurate and reliable input for investment decisions.

In practice, the use of hidden Markov models must be adapted to the characteristics and needs of financial markets. First, we need to collect and organize a large amount of market data, including stock prices, trading volumes, market sentiment indicators, and so on.
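As a minimal illustration of the regime-detection idea described above, the sketch below fits a two-state Gaussian HMM to synthetic daily returns using the third-party hmmlearn package (assumed installed); a real application would replace the synthetic series with actual market data and validate the number of hidden states.

```python
import numpy as np
from hmmlearn.hmm import GaussianHMM

# Synthetic daily returns: a calm regime and a turbulent regime stand in for
# real market data; the parameters below are made up for illustration.
rng = np.random.default_rng(42)
calm = rng.normal(0.0005, 0.008, 500)        # low-volatility regime
turbulent = rng.normal(-0.001, 0.025, 250)   # high-volatility regime
returns = np.concatenate([calm, turbulent, calm]).reshape(-1, 1)

# Two hidden states, Gaussian emissions: the observable returns are generated
# by whichever regime the hidden Markov chain currently occupies.
model = GaussianHMM(n_components=2, covariance_type="full",
                    n_iter=200, random_state=0)
model.fit(returns)

states = model.predict(returns)              # most likely regime per day (Viterbi)
for s in range(model.n_components):
    r = returns[states == s]
    print(f"regime {s}: mean {r.mean():+.5f}, std {r.std():.5f}, days {len(r)}")
```

The decoded state sequence recovers the calm and turbulent stretches from the returns alone, which is exactly the "infer the hidden market state from observable quotes" step described above.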
From Data Mining to Knowledge Discovery in Databases

Usama Fayyad, Gregory Piatetsky-Shapiro, and Padhraic Smyth

Copyright © 1996, American Association for Artificial Intelligence. All rights reserved. 0738-4602-1996 / $2.00

Data mining and knowledge discovery in databases have been attracting a significant amount of research, industry, and media attention of late. What is all the excitement about? This article provides an overview of this emerging field, clarifying how data mining and knowledge discovery in databases are related both to each other and to related fields, such as machine learning, statistics, and databases. The article mentions particular real-world applications, specific data-mining techniques, challenges involved in real-world applications of knowledge discovery, and current and future research directions in the field.

Across a wide variety of fields, data are being collected and accumulated at a dramatic pace. There is an urgent need for a new generation of computational theories and tools to assist humans in extracting useful information (knowledge) from the rapidly growing volumes of digital data. These theories and tools are the subject of the emerging field of knowledge discovery in databases (KDD). At an abstract level, the KDD field is concerned with the development of methods and techniques for making sense of data. The basic problem addressed by the KDD process is one of mapping low-level data (which are typically too voluminous to understand and digest easily) into other forms that might be more compact (for example, a short report), more abstract (for example, a descriptive approximation or model of the process that generated the data), or more useful (for example, a predictive model for estimating the value of future cases). At the core of the process is the application of specific data-mining methods for pattern discovery and extraction.

This article begins by discussing the historical context of KDD and data mining and their intersection with other related fields. A brief summary of recent KDD real-world applications is provided. Definitions of KDD and data mining are provided, and the general multistep KDD process is outlined. This multistep process has the application of data-mining algorithms as one particular step in the process. The data-mining step is discussed in more detail in the context of specific data-mining algorithms and their application. Real-world practical application issues are also outlined. Finally, the article enumerates challenges for future research and development and in particular discusses potential opportunities for AI technology in KDD systems.

Why Do We Need KDD?

The traditional method of turning data into knowledge relies on manual analysis and interpretation. For example, in the health-care industry, it is common for specialists to periodically analyze current trends and changes in health-care data, say, on a quarterly basis. The specialists then provide a report detailing the analysis to the sponsoring health-care organization; this report becomes the basis for future decision making and planning for health-care management. In a totally different type of application, planetary geologists sift through remotely sensed images of planets and asteroids, carefully locating and cataloging such geologic objects of interest as impact craters. Be it science, marketing, finance, health care, retail, or any other field, the classical approach to data analysis relies fundamentally on one or more analysts becoming intimately familiar with the data and serving as an interface between the data and the users and products.

For these (and many other) applications, this form of manual probing of a data set is slow, expensive, and highly subjective. In fact, as data volumes grow dramatically, this type of manual data analysis is becoming completely impractical in many domains. Databases are increasing in size in two ways: (1) the number N of records or objects in the database and (2) the number d of fields or attributes to an object. Databases containing on the order of N = 10^9 objects are becoming increasingly common, for example, in the astronomical sciences. Similarly, the number of fields d can easily be on the order of 10^2 or even 10^3, for example, in medical diagnostic applications. Who could be expected to digest millions of records, each having tens or hundreds of fields? We believe that this job is certainly not one for humans; hence, analysis work needs to be automated, at least partially.

The need to scale up human analysis capabilities to handling the large number of bytes that we can collect is both economic and scientific. Businesses use data to gain competitive advantage, increase efficiency, and provide more valuable services to customers. Data we capture about our environment are the basic evidence we use to build theories and models of the universe we live in. Because computers have enabled humans to gather more data than we can digest, it is only natural to turn to computational techniques to help us unearth meaningful patterns and structures from the massive volumes of data. Hence, KDD is an attempt to address a problem that the digital information era made a fact of life for all of us: data overload.

Data Mining and Knowledge Discovery in the Real World

A large degree of the current interest in KDD is the result of the media interest surrounding successful KDD applications, for example, the focus articles within the last two years in Business Week, Newsweek, Byte, PC Week, and other large-circulation periodicals. Unfortunately, it is not always easy to separate fact from media hype. Nonetheless, several well-documented examples of successful systems can rightly be referred to as KDD applications and have been deployed in operational use on large-scale real-world problems in science and in business.

In science, one of the primary application areas is astronomy. Here, a notable success was achieved by SKICAT, a system used by astronomers to perform image analysis, classification, and cataloging of sky objects from sky-survey images (Fayyad, Djorgovski, and Weir 1996). In its first application, the system was used to process the 3 terabytes (10^12 bytes) of image data resulting from the Second Palomar Observatory Sky Survey, where it is estimated that on the order of 10^9 sky objects are detectable. SKICAT can outperform humans and traditional computational techniques in classifying faint sky objects. See Fayyad, Haussler, and Stolorz (1996) for a survey of scientific applications.

In business, main KDD application areas include marketing, finance (especially investment), fraud detection, manufacturing, telecommunications, and Internet agents.

Marketing: In marketing, the primary application is database marketing systems, which analyze customer databases to identify different customer groups and forecast their behavior. Business Week (Berry 1994) estimated that over half of all retailers are using or planning to use database marketing, and those who do use it have good results; for example, American Express reports a 10- to 15-percent increase in credit-card use. Another notable marketing application is market-basket analysis (Agrawal et al. 1996) systems, which find patterns such as, "If customer bought X, he/she is also likely to buy Y and Z." Such patterns are valuable to retailers.

Investment: Numerous companies use data mining for investment, but most do not describe their systems. One exception is LBS Capital Management. Its system uses expert systems, neural nets, and genetic algorithms to manage portfolios totaling $600 million; since its start in 1993, the system has outperformed the broad stock market (Hall, Mani, and Barr 1996).

Fraud detection: HNC Falcon and Nestor PRISM systems are used for monitoring credit-card fraud, watching over millions of accounts. The FAIS system (Senator et al. 1995), from the U.S. Treasury Financial Crimes Enforcement Network, is used to identify financial transactions that might indicate money-laundering activity.

Manufacturing: The CASSIOPEE troubleshooting system, developed as part of a joint venture between General Electric and SNECMA, was applied by three major European airlines to diagnose and predict problems for the Boeing 737. To derive families of faults, clustering methods are used. CASSIOPEE received the European first prize for innovative applications (Manago and Auriol 1996).

Telecommunications: The telecommunications alarm-sequence analyzer (TASA) was built in cooperation with a manufacturer of telecommunications equipment and three telephone networks (Mannila, Toivonen, and Verkamo 1995). The system uses a novel framework for locating frequently occurring alarm episodes from the alarm stream and presenting them as rules. Large sets of discovered rules can be explored with flexible information-retrieval tools supporting interactivity and iteration. In this way, TASA offers pruning, grouping, and ordering tools to refine the results of a basic brute-force search for rules.

Data cleaning: The MERGE-PURGE system was applied to the identification of duplicate welfare claims (Hernandez and Stolfo 1995). It was used successfully on data from the Welfare Department of the State of Washington.

In other areas, a well-publicized system is IBM's ADVANCED SCOUT, a specialized data-mining system that helps National Basketball Association (NBA) coaches organize and interpret data from NBA games (U.S. News 1995). ADVANCED SCOUT was used by several of the NBA teams in 1996, including the Seattle Supersonics, which reached the NBA finals.

Finally, a novel and increasingly important type of discovery is one based on the use of intelligent agents to navigate through an information-rich environment. Although the idea of active triggers has long been analyzed in the database field, really successful applications of this idea appeared only with the advent of the Internet. These systems ask the user to specify a profile of interest and search for related information among a wide variety of public-domain and proprietary sources.
For example, FIREFLY is a personal music-recommendation agent: It asks a user his/her opinion of several music pieces and then suggests other music that the user might like (<http://www.ffl/>). CRAYON (/>) allows users to create their own free newspaper (supported by ads); NEWSHOUND (<http://www. /hound/>) from the San Jose Mercury News and FARCAST (</>) automatically search information from a wide variety of sources, including newspapers and wire services, and e-mail relevant documents directly to the user.

These are just a few of the numerous such systems that use KDD techniques to automatically produce useful information from large masses of raw data. See Piatetsky-Shapiro et al. (1996) for an overview of issues in developing industrial KDD applications.

Data Mining and KDD

Historically, the notion of finding useful patterns in data has been given a variety of names, including data mining, knowledge extraction, information discovery, information harvesting, data archaeology, and data pattern processing. The term data mining has mostly been used by statisticians, data analysts, and the management information systems (MIS) communities. It has also gained popularity in the database field. The phrase knowledge discovery in databases was coined at the first KDD workshop in 1989 (Piatetsky-Shapiro 1991) to emphasize that knowledge is the end product of a data-driven discovery. It has been popularized in the AI and machine-learning fields.

In our view, KDD refers to the overall process of discovering useful knowledge from data, and data mining refers to a particular step in this process. Data mining is the application of specific algorithms for extracting patterns from data. The distinction between the KDD process and the data-mining step (within the process) is a central point of this article. The additional steps in the KDD process, such as data preparation, data selection, data cleaning, incorporation of appropriate prior knowledge, and proper interpretation of the results of mining, are essential to ensure that useful knowledge is derived from the data. Blind application of data-mining methods (rightly criticized as data dredging in the statistical literature) can be a dangerous activity, easily leading to the discovery of meaningless and invalid patterns.

The Interdisciplinary Nature of KDD

KDD has evolved, and continues to evolve, from the intersection of research fields such as machine learning, pattern recognition, databases, statistics, AI, knowledge acquisition for expert systems, data visualization, and high-performance computing. The unifying goal is extracting high-level knowledge from low-level data in the context of large data sets.

The data-mining component of KDD currently relies heavily on known techniques from machine learning, pattern recognition, and statistics to find patterns from data in the data-mining step of the KDD process. A natural question is, How is KDD different from pattern recognition or machine learning (and related fields)? The answer is that these fields provide some of the data-mining methods that are used in the data-mining step of the KDD process. KDD focuses on the overall process of knowledge discovery from data, including how the data are stored and accessed, how algorithms can be scaled to massive data sets and still run efficiently, how results can be interpreted and visualized, and how the overall man-machine interaction can usefully be modeled and supported. The KDD process can be viewed as a multidisciplinary activity that encompasses techniques beyond the scope of any one particular discipline such as machine learning. In this context, there are clear opportunities for other fields of AI (besides machine learning) to contribute to KDD. KDD places a special emphasis on finding understandable patterns that can be interpreted as useful or interesting knowledge. Thus, for example, neural networks, although a powerful modeling tool, are relatively difficult to understand compared to decision trees. KDD also emphasizes scaling and robustness properties of modeling algorithms for large noisy data sets.

Related AI research fields include machine discovery, which targets the discovery of empirical laws from observation and experimentation (Shrager and Langley 1990) (see Kloesgen and Zytkow [1996] for a glossary of terms common to KDD and machine discovery), and causal modeling for the inference of causal models from data (Spirtes, Glymour, and Scheines 1993). Statistics in particular has much in common with KDD (see Elder and Pregibon [1996] and Glymour et al. [1996] for a more detailed discussion of this synergy). Knowledge discovery from data is fundamentally a statistical endeavor. Statistics provides a language and framework for quantifying the uncertainty that results when one tries to infer general patterns from a particular sample of an overall population. As mentioned earlier, the term data mining has had negative connotations in statistics since the 1960s when computer-based data analysis techniques were first introduced. The concern arose because if one searches long enough in any data set (even randomly generated data), one can find patterns that appear to be statistically significant but, in fact, are not. Clearly, this issue is of fundamental importance to KDD. Substantial progress has been made in recent years in understanding such issues in statistics. Much of this work is of direct relevance to KDD. Thus, data mining is a legitimate activity as long as one understands how to do it correctly; data mining carried out poorly (without regard to the statistical aspects of the problem) is to be avoided. KDD can also be viewed as encompassing a broader view of modeling than statistics. KDD aims to provide tools to automate (to the degree possible) the entire process of data analysis and the statistician's "art" of hypothesis selection.

A driving force behind KDD is the database field (the second D in KDD). Indeed, the problem of effective data manipulation when data cannot fit in the main memory is of fundamental importance to KDD. Database techniques for gaining efficient data access, grouping and ordering operations when accessing data, and optimizing queries constitute the basics for scaling algorithms to larger data sets. Most data-mining algorithms from statistics, pattern recognition, and machine learning assume data are in the main memory and pay no attention to how the algorithm breaks down if only limited views of the data are possible.

A related field evolving from databases is data warehousing, which refers to the popular business trend of collecting and cleaning transactional data to make them available for online analysis and decision support. Data warehousing helps set the stage for KDD in two important ways: (1) data cleaning and (2) data access.

Data cleaning: As organizations are forced to think about a unified logical view of the wide variety of data and databases they possess, they have to address the issues of mapping data to a single naming convention, uniformly representing and handling missing data, and handling noise and errors when possible.

Data access: Uniform and well-defined methods must be created for accessing the data and providing access paths to data that were historically difficult to get to (for example, stored offline).

Once organizations and individuals have solved the problem of how to store and access their data, the natural next step is the question, What else do we do with all the data? This is where opportunities for KDD naturally arise. A popular approach for analysis of data warehouses is called online analytical processing (OLAP), named for a set of principles proposed by Codd (1993). OLAP tools focus on providing multidimensional data analysis, which is superior to SQL in computing summaries and breakdowns along many dimensions. OLAP tools are targeted toward simplifying and supporting interactive data analysis, but the goal of KDD tools is to automate as much of the process as possible. Thus, KDD is a step beyond what is currently supported by most standard database systems.

Basic Definitions

KDD is the nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data (Fayyad, Piatetsky-Shapiro, and Smyth 1996). Here, data are a set of facts (for example, cases in a database), and pattern is an expression in some language describing a subset of the data or a model applicable to the subset. Hence, in our usage here, extracting a pattern also designates fitting a model to data; finding structure from data; or, in general, making any high-level description of a set of data. The term process implies that KDD comprises many steps, which involve data preparation, search for patterns, knowledge evaluation, and refinement, all repeated in multiple iterations. By nontrivial, we mean that some search or inference is involved; that is, it is not a straightforward computation of predefined quantities like computing the average value of a set of numbers.

The discovered patterns should be valid on new data with some degree of certainty. We also want patterns to be novel (at least to the system and preferably to the user) and potentially useful, that is, lead to some benefit to the user or task. Finally, the patterns should be understandable, if not immediately then after some postprocessing.

The previous discussion implies that we can define quantitative measures for evaluating extracted patterns. In many cases, it is possible to define measures of certainty (for example, estimated prediction accuracy on new data) or utility (for example, gain, perhaps in dollars saved because of better predictions or speedup in response time of a system). Notions such as novelty and understandability are much more subjective. In certain contexts, understandability can be estimated by simplicity (for example, the number of bits to describe a pattern).
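The measures just described can be made concrete in a few lines of code. The following is a minimal sketch, not taken from the article: the Pattern fields and the normalizing constant are invented placeholders for whatever a real KDD system would actually track.

```python
# Minimal sketch of the pattern-evaluation measures discussed above.
# The Pattern fields (accuracy, dollars_saved, description_bits) are
# hypothetical stand-ins, not part of any published KDD system.
from dataclasses import dataclass

@dataclass
class Pattern:
    accuracy: float          # estimated prediction accuracy on new data (certainty)
    dollars_saved: float     # gain attributed to the pattern (utility)
    description_bits: int    # bits needed to describe the pattern (simplicity proxy)

def simplicity(p: Pattern) -> float:
    # Fewer bits to describe the pattern -> higher simplicity score.
    return 1.0 / (1.0 + p.description_bits)

def evaluate(p: Pattern, max_gain: float = 1e6) -> dict:
    return {
        "certainty": p.accuracy,
        "utility": min(p.dollars_saved / max_gain, 1.0),
        "simplicity": simplicity(p),
    }

print(evaluate(Pattern(accuracy=0.92, dollars_saved=25_000, description_bits=64)))
```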
An important notion, called interestingness (for example, see Silberschatz and Tuzhilin [1995] and Piatetsky-Shapiro and Matheus [1994]), is usually taken as an overall measure of pattern value, combining validity, novelty, usefulness, and simplicity. Interestingness functions can be defined explicitly or can be manifested implicitly through an ordering placed by the KDD system on the discovered patterns or models.

Given these notions, we can consider a pattern to be knowledge if it exceeds some interestingness threshold, which is by no means an attempt to define knowledge in the philosophical or even the popular view. As a matter of fact, knowledge in this definition is purely user oriented and domain specific and is determined by whatever functions and thresholds the user chooses.

Data mining is a step in the KDD process that consists of applying data analysis and discovery algorithms that, under acceptable computational efficiency limitations, produce a particular enumeration of patterns (or models) over the data. Note that the space of patterns is often infinite, and the enumeration of patterns involves some form of search in this space. Practical computational constraints place severe limits on the subspace that can be explored by a data-mining algorithm.

The KDD process involves using the database along with any required selection, preprocessing, subsampling, and transformations of it; applying data-mining methods (algorithms) to enumerate patterns from it; and evaluating the products of data mining to identify the subset of the enumerated patterns deemed knowledge. The data-mining component of the KDD process is concerned with the algorithmic means by which patterns are extracted and enumerated from data. The overall KDD process (figure 1) includes the evaluation and possible interpretation of the mined patterns to determine which patterns can be considered new knowledge. The KDD process also includes all the additional steps described in the next section.

The notion of an overall user-driven process is not unique to KDD: analogous proposals have been put forward both in statistics (Hand 1994) and in machine learning (Brodley and Smyth 1996).

The KDD Process

The KDD process is interactive and iterative, involving numerous steps with many decisions made by the user. Brachman and Anand (1996) give a practical view of the KDD process, emphasizing the interactive nature of the process. Here, we broadly outline some of its basic steps:

First is developing an understanding of the application domain and the relevant prior knowledge and identifying the goal of the KDD process from the customer's viewpoint.

Second is creating a target data set: selecting a data set, or focusing on a subset of variables or data samples, on which discovery is to be performed.

Third is data cleaning and preprocessing. Basic operations include removing noise if appropriate, collecting the necessary information to model or account for noise, deciding on strategies for handling missing data fields, and accounting for time-sequence information and known changes.

Fourth is data reduction and projection: finding useful features to represent the data depending on the goal of the task. With dimensionality reduction or transformation methods, the effective number of variables under consideration can be reduced, or invariant representations for the data can be found.

Fifth is matching the goals of the KDD process (step 1) to a particular data-mining method. For example, summarization, classification, regression, clustering, and so on, are described later as well as in Fayyad, Piatetsky-Shapiro, and Smyth (1996).

Sixth is exploratory analysis and model and hypothesis selection: choosing the data-mining algorithm(s) and selecting method(s) to be used for searching for data patterns. This process includes deciding which models and parameters might be appropriate (for example, models of categorical data are different than models of vectors over the reals) and matching a particular data-mining method with the overall criteria of the KDD process (for example, the end user might be more interested in understanding the model than its predictive capabilities).

Seventh is data mining: searching for patterns of interest in a particular representational form or a set of such representations, including classification rules or trees, regression, and clustering. The user can significantly aid the data-mining method by correctly performing the preceding steps.

Eighth is interpreting mined patterns, possibly returning to any of steps 1 through 7 for further iteration. This step can also involve visualization of the extracted patterns and models or visualization of the data given the extracted models.

Ninth is acting on the discovered knowledge: using the knowledge directly, incorporating the knowledge into another system for further action, or simply documenting it and reporting it to interested parties. This process also includes checking for and resolving potential conflicts with previously believed (or extracted) knowledge.

The KDD process can involve significant iteration and can contain loops between any two steps. The basic flow of steps (although not the potential multitude of iterations and loops) is illustrated in figure 1. Most previous work on KDD has focused on step 7, the data mining. However, the other steps are as important (and probably more so) for the successful application of KDD in practice.

Figure 1. An Overview of the Steps That Compose the KDD Process.
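The nine steps lend themselves to a compact illustration. The sketch below walks a tiny invented data set through the process; every number, threshold, and helper here is hypothetical, since the article prescribes the steps themselves rather than any implementation.

```python
# A toy, end-to-end walk through the nine KDD steps outlined above,
# using invented weekly-sales numbers.  The logic is deliberately
# trivial; only the sequence of steps mirrors the article.
import numpy as np

rng = np.random.default_rng(1)

# Step 1: goal (domain understanding) -- describe which stores behave unusually.
# Step 2: target data set -- 52 weekly sales totals for 5 stores (synthetic).
sales = rng.normal(100, 8, size=(5, 52))
sales[3] += np.linspace(0, 40, 52)            # store 3 drifts upward

# Step 3: cleaning -- clip obvious recording errors (negative totals).
sales = np.clip(sales, 0, None)

# Step 4: reduction/projection -- two features per store instead of 52 raw values.
features = np.column_stack([sales.mean(axis=1), sales.std(axis=1)])

# Steps 5-7: the goal is matched to a description method; "mining" is a
# z-score outlier search over the reduced features.
z = (features - features.mean(axis=0)) / features.std(axis=0)
outliers = np.where((np.abs(z) > 1.5).any(axis=1))[0]

# Steps 8-9: interpret the pattern and report it to interested parties.
print(f"stores flagged as unusual: {outliers.tolist()}")
```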
Having defined the basic notions and introduced the KDD process, we now focus on the data-mining component, which has, by far, received the most attention in the literature.

The Data-Mining Step of the KDD Process

The data-mining component of the KDD process often involves repeated iterative application of particular data-mining methods. This section presents an overview of the primary goals of data mining, a description of the methods used to address these goals, and a brief description of the data-mining algorithms that incorporate these methods.

The knowledge discovery goals are defined by the intended use of the system. We can distinguish two types of goals: (1) verification and (2) discovery. With verification, the system is limited to verifying the user's hypothesis. With discovery, the system autonomously finds new patterns. We further subdivide the discovery goal into prediction, where the system finds patterns for predicting the future behavior of some entities, and description, where the system finds patterns for presentation to a user in a human-understandable form. In this article, we are primarily concerned with discovery-oriented data mining.

Data mining involves fitting models to, or determining patterns from, observed data. The fitted models play the role of inferred knowledge: Whether the models reflect useful or interesting knowledge is part of the overall, interactive KDD process where subjective human judgment is typically required. Two primary mathematical formalisms are used in model fitting: (1) statistical and (2) logical. The statistical approach allows for nondeterministic effects in the model, whereas a logical model is purely deterministic. We focus primarily on the statistical approach to data mining, which tends to be the most widely used basis for practical data-mining applications given the typical presence of uncertainty in real-world data-generating processes.

Most data-mining methods are based on tried and tested techniques from machine learning, pattern recognition, and statistics: classification, clustering, regression, and so on. The array of different algorithms under each of these headings can often be bewildering to both the novice and the experienced data analyst. It should be emphasized that of the many data-mining methods advertised in the literature, there are really only a few fundamental techniques. The actual underlying model representation being used by a particular method typically comes from a composition of a small number of well-known options: polynomials, splines, kernel and basis functions, threshold-Boolean functions, and so on. Thus, algorithms tend to differ primarily in the goodness-of-fit criterion used to evaluate model fit or in the search method used to find a good fit.

In our brief overview of data-mining methods, we try in particular to convey the notion that most (if not all) methods can be viewed as extensions or hybrids of a few basic techniques and principles. We first discuss the primary methods of data mining and then show that the data-mining methods can be viewed as consisting of three primary algorithmic components: (1) model representation, (2) model evaluation, and (3) search. In the discussion of KDD and data-mining methods, we use a simple example to make some of the notions more concrete. Figure 2 shows a simple two-dimensional artificial data set consisting of 23 cases. Each point on the graph represents a person who has been given a loan by a particular bank at some time in the past. The horizontal axis represents the income of the person; the vertical axis represents the total personal debt of the person (mortgage, car payments, and so on). The data have been classified into two classes: (1) the x's represent persons who have defaulted on their loans and (2) the o's represent persons whose loans are in good status with the bank. Thus, this simple artificial data set could represent a historical data set that can contain useful knowledge from the point of view of the bank making the loans. Note that in actual KDD applications, there are typically many more dimensions (as many as several hundreds) and many more data points (many thousands or even millions).

Figure 2. A Simple Data Set with Two Classes Used for Illustrative Purposes.
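To make the model-fitting view concrete, the following sketch generates a synthetic version of the Figure 2 loan data and fits a logistic model by plain gradient descent. All numeric values are invented, and the article itself does not prescribe any particular fitting procedure; this is just one statistical formalism of the kind discussed above.

```python
# Synthetic two-class loan data in the spirit of Figure 2, with a
# logistic model fit by gradient descent.  The generating rule
# (default when debt > 0.55 * income) is an invented stand-in.
import numpy as np

rng = np.random.default_rng(42)
n = 23                                        # Figure 2 uses 23 cases
income = rng.uniform(20, 90, n)               # horizontal axis
debt = rng.uniform(5, 60, n)                  # vertical axis
y = (debt > 0.55 * income).astype(float)      # 1 = defaulted ("x" points)

# Standardize the two features, then fit by gradient descent.
F = np.column_stack([income, debt])
F = (F - F.mean(0)) / F.std(0)
X = np.column_stack([np.ones(n), F])
w = np.zeros(3)
for _ in range(2000):
    p = 1 / (1 + np.exp(-X @ w))              # model probabilities
    w -= 0.1 * X.T @ (p - y) / n              # logistic-loss gradient step

# The fitted weights define a linear decision boundary in the
# income/debt plane -- the "inferred knowledge" of this toy example.
pred = (1 / (1 + np.exp(-X @ w))) > 0.5
print("training accuracy:", (pred == y).mean())
```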
Entropy-weighting Multi-view Fuzzy C-means With Low Rank Constraint

ZHANG Jia-Xu (1), WANG Jun (1, 2), ZHANG Chun-Xiang (1), LIN De-Fu (1), ZHOU Ta (3), WANG Shi-Tong (1)

Abstract: Effective mining of both the internal consistency and the diversity of multi-view data is important to develop multi-view fuzzy clustering algorithms. In this paper, we propose a novel multi-view fuzzy clustering algorithm called entropy-weighting multi-view fuzzy c-means with low-rank constraint (LR-MVEWFCM). On the one hand, we introduce the nuclear norm as the low-rank constraint of the fuzzy membership matrix. On the other hand, the adaptive adjustment strategy of view weight is introduced to control the differences among views according to the importance of each view. The learning criterion can be optimized by the alternating direction method of multipliers (ADMM). Experimental results on both artificial and UCI (University of California Irvine) datasets show the effectiveness of the proposed method.

Key words: Multi-view fuzzy clustering, Shannon entropy, low-rank constraint, nuclear norm, alternating direction method of multipliers (ADMM)

Citation: Zhang Jia-Xu, Wang Jun, Zhang Chun-Xiang, Lin De-Fu, Zhou Ta, Wang Shi-Tong. Entropy-weighting multi-view fuzzy C-means with low rank constraint. Acta Automatica Sinica, 2022, 48(7): 1760−1770. DOI: 10.16383/j.aas.c190350

Manuscript received May 9, 2019; accepted July 17, 2019. Supported by the National Natural Science Foundation of China (61772239) and the Natural Science Foundation of Jiangsu Province (BK20181339). Recommended by Associate Editor LIU Yan-Jun.
1. School of Digital Media, Jiangnan University, Wuxi 214122. 2. School of Communication and Information Engineering, Shanghai University, Shanghai 200444. 3. School of Electronic Information, Jiangsu University of Science and Technology, Zhenjiang 212100.

With the development of diversified information acquisition technology, the feature data of an object can be collected through different channels or from different angles, yielding multi-view data. Multi-view data contain information on the same object from different perspectives. For example, web data contain both page content and link information; video content contains both visual and audio information; image data involve image features such as colour histograms and texture features as well as the text describing the image content. Multi-view learning can effectively fuse multi-view data and avoids the problem that single-view data carry only limited information [1−4].

Multi-view fuzzy clustering is an effective unsupervised multi-view learning method [5−7]. During multi-view clustering it introduces the fuzzy membership of each sample to each class to describe the degree of uncertainty with which a sample under each view belongs to that class. Classical work includes the following. Reference [8] took the classical single-view fuzzy C-means (FCM) algorithm as the base model and used the complementary information between views to define a collaborative clustering criterion, proposing the Co-FC (collaborative fuzzy clustering) algorithm. Reference [9] followed the collaborative idea of [8] and proposed the Co-FKM (multiview fuzzy clustering algorithm collaborative fuzzy K-means) algorithm, introducing a pairwise membership penalty term between views and constructing a new unsupervised multi-view collaborative learning method. Reference [10] borrowed the pairwise-view constraint idea of Co-FKM and Co-FC, introduced view weights, and adopted an ensemble strategy to fuse the fuzzy membership matrices of the views, proposing the WV-Co-FCM (weighted view collaborative fuzzy C-means) algorithm. Reference [11] reduced the differences between views by minimizing the Euclidean distances between samples and cluster centres under pairs of views, proposing the Co-K-means (collaborative multi-view K-means clustering) algorithm within the K-means framework. On this basis, [12] proposed the fuzzy-partition-based TW-Co-K-means (two-level weighted collaborative K-means for multi-view clustering) algorithm, which adds consistency weights to the pairwise Euclidean distances of Co-K-means and obtains better multi-view clustering results than Co-K-means. All the above multi-view clustering methods construct regularization terms over pairs of views to mine the consistency and diversity information between views, and lack an overall treatment of the views as a whole.

Consistency and diversity are two important principles to consider when designing multi-view clustering algorithms [10−14]. Consistency means that the clustering results of the individual views should remain as consistent as possible during multi-view clustering; in algorithm design, collaborative and ensemble mechanisms are often used to build a global partition matrix, from which the final clustering result is obtained [14−16]. Diversity means that every view of the multi-view data reflects information about the object in a different aspect, and these pieces of information complement each other [10]; they should be fully fused when designing a multi-view clustering algorithm. Considering both factors, this paper proposes a new low-rank constrained entropy-weighting multi-view fuzzy clustering algorithm (LR-MVEWFCM), whose main novelties can be summarized in three aspects:

1) A low-rank constraint criterion oriented to view consistency is proposed within the fuzzy clustering framework. Most existing multi-view fuzzy clustering algorithms construct regularization terms from the pairwise relations between views, neglecting the overall consistency information of all views. Starting from global view consistency, this paper introduces a low-rank regularization term into the fuzzy clustering framework, obtaining a new low-rank constrained multi-view fuzzy clustering algorithm.

2) The consistency and the diversity of multi-view clustering are considered simultaneously within the fuzzy clustering framework: besides the low-rank constraint, a multi-view Shannon-entropy weighting strategy oriented to view diversity is used; during the iterative optimization, the view weight coefficients are adjusted dynamically to emphasize the views with better separability, thereby improving the clustering performance.

3) The alternating direction method of multipliers (ADMM) [15] is used for the first time within this fuzzy clustering framework to optimize and solve the LR-MVEWFCM algorithm.

In this paper, let N be the total number of samples, D the sample dimensionality, K the number of views, C the number of clusters, and m the fuzzifier. Let x_{j,k} denote the feature vector of the j-th sample under the k-th view, j = 1, ..., N, k = 1, ..., K; let v_{i,k} denote the i-th cluster centre under the k-th view, i = 1, ..., C; and let U_k = [mu_{ij,k}] denote the fuzzy membership matrix under the k-th view, where mu_{ij,k} is the fuzzy membership of the j-th sample to the i-th cluster centre under the k-th view.

Section 1 reviews, as related work, the classical single-view FCM model [17] and the multi-view fuzzy clustering Co-FKM model [9]; Section 2 combines low-rank theory with multi-view Shannon entropy and proposes the new method of this paper; Section 3 verifies the effectiveness of the algorithm on simulated datasets and UCI (University of California Irvine) datasets and gives the experimental analysis; Section 4 concludes.

1 Related work

1.1 The fuzzy C-means algorithm FCM

In the single-view setting, let x_1, ..., x_N in R^D be the samples, U = [mu_{i,j}] the fuzzy partition matrix, and V = [v_1, v_2, ..., v_C] the cluster centres. The objective function of FCM can be expressed as
$$J_{\mathrm{FCM}}=\sum_{i=1}^{C}\sum_{j=1}^{N}\mu_{i,j}^{m}\left\|x_{j}-v_{i}\right\|^{2},\quad \mathrm{s.t.}\ \sum_{i=1}^{C}\mu_{i,j}=1,\ \mu_{i,j}\in[0,1] \quad (1)$$
The necessary conditions for $J_{\mathrm{FCM}}$ to attain a local minimum are
$$\mu_{i,j}=\left[\sum_{l=1}^{C}\left(\frac{\left\|x_{j}-v_{i}\right\|^{2}}{\left\|x_{j}-v_{l}\right\|^{2}}\right)^{\frac{1}{m-1}}\right]^{-1} \quad (2)$$
$$v_{i}=\frac{\sum_{j=1}^{N}\mu_{i,j}^{m}x_{j}}{\sum_{j=1}^{N}\mu_{i,j}^{m}} \quad (3)$$
Iterating (2) and (3) makes the objective function converge to a local minimum, yielding the fuzzy partition matrix U of the samples with respect to the cluster centres.

1.2 The multi-view fuzzy clustering model Co-FKM

On the basis of the classical FCM algorithm, [9] introduced a collaborative inter-view regularization term that constrains the consistency information between views, proposing the multi-view fuzzy clustering model Co-FKM. The model must satisfy
$$\sum_{i=1}^{C}\mu_{ij,k}=1,\quad \mu_{ij,k}\in[0,1] \quad (4)$$
and its objective function $J_{\mathrm{Co\text{-}FKM}}$ is defined as
$$J_{\mathrm{Co\text{-}FKM}}=\sum_{k=1}^{K}\sum_{i=1}^{C}\sum_{j=1}^{N}\mu_{ij,k}^{m}\left\|x_{j,k}-v_{i,k}\right\|^{2}+\eta\,\Delta \quad (5)$$
where eta denotes the collaborative partition parameter and Delta the view-consistency term
$$\Delta=\frac{1}{K-1}\sum_{k'\neq k}\sum_{i=1}^{C}\sum_{j=1}^{N}\left(\mu_{ij,k}^{m}-\mu_{ij,k'}^{m}\right)\left\|x_{j,k}-v_{i,k}\right\|^{2} \quad (6)$$
By (6), Delta tends to 0 as the views become consistent.

After the memberships mu_{ij,k} of the individual views have been obtained by iteration, Co-FKM applies the geometric mean to the per-view memberships in order to obtain a single global fuzzy partition of the dataset:
$$\hat{\mu}_{ij}=\left(\prod_{k=1}^{K}\mu_{ij,k}\right)^{\frac{1}{K}} \quad (7)$$
where mu-hat_{ij} is the global fuzzy partition result.

2 Entropy-weighting multi-view fuzzy clustering with low-rank constraint

To address the shortcomings of current research on multi-view fuzzy clustering, this paper proposes the new method LR-MVEWFCM. On the one hand, a low-rank constraint term is introduced into the objective learning criterion of the multi-view fuzzy clustering algorithm to control the consistency of the views as a whole during clustering; on the other hand, based on Shannon entropy theory, an entropy weighting mechanism controls the diversity between views. The alternating direction method of multipliers is used to optimize the model.

Let the multi-view memberships U_1, ..., U_K be fused into an overall membership matrix U. Relaxing the rank function of U to the nuclear norm, the cross-view consistency problem can be converted, through a low-rank constraint on U, into a nuclear-norm minimization problem:
$$\min\ \left\|U\right\|_{*} \quad (8)$$
where U = [U_1 ... U_K]^T is the global partition matrix and the star subscript denotes the nuclear norm. The optimization in (8) enforces the low-rank constraint on the global partition matrix. Introducing the low-rank constraint remedies the limitation that most current multi-view clustering algorithms can only construct constraints over pairs of views, and thus better mines the global consistency information contained in multi-view data.

Existing multi-view clustering algorithms usually assume by default that every view contributes equally to the clustering result [11], while in fact the data of some views may have poor separability because of overlapping spatial distributions. To prevent the data of such views from unduly affecting the clustering result, this paper weights the views and constructs a Shannon-entropy regularization term that effectively adjusts the view weights during clustering, so that views with better separability obtain weight coefficients that are as large as possible. With view weight coefficients w_k satisfying the conditions that they are nonnegative and sum to one, the Shannon-entropy regularization term is
$$\sum_{k=1}^{K}w_{k}\ln w_{k} \quad (9)$$

In summary, this paper makes the following modifications: first, the global fuzzy membership matrix U introduced above is constrained to be low rank; second, the view weights w_k enter the loss function, together with the Shannon-entropy regularization term of the weight coefficients. Let U = [U_1 ... U_K]^T and let w = [w_1, ..., w_k, ..., w_K] denote the weights of the K views. The objective function of the proposed LR-MVEWFCM is
$$J_{\mathrm{LR\text{-}MVEWFCM}}=\sum_{k=1}^{K}w_{k}\sum_{i=1}^{C}\sum_{j=1}^{N}\mu_{ij,k}^{m}\left\|x_{j,k}-v_{i,k}\right\|^{2}+\lambda\sum_{k=1}^{K}w_{k}\ln w_{k}+\theta\left\|U\right\|_{*} \quad (10)$$
with the constraints
$$\sum_{i=1}^{C}\mu_{ij,k}=1,\quad \sum_{k=1}^{K}w_{k}=1,\quad w_{k}\geq 0 \quad (11)$$
The fuzzifier is taken as m = 2 throughout this paper.

2.1 The ADMM-based solving algorithm

In this section we use the ADMM method to minimize the objective function (10) through an alternating-direction iteration strategy. Introducing an auxiliary variable Z with g(Z) = theta times the nuclear norm of Z, the minimization of (10) can be rewritten as the constrained optimization problem of minimizing J_0(U, V, w) + g(Z) subject to U = Z, where J_0 collects the first two terms of (10). With the scaled multiplier Q and penalty parameter rho, the solution decomposes into the following subproblems.

1) V-subproblem. Fixing U and w and minimizing over V gives the closed-form centre update
$$v_{i,k}=\frac{\sum_{j=1}^{N}\mu_{ij,k}^{2}\,x_{j,k}}{\sum_{j=1}^{N}\mu_{ij,k}^{2}}$$

2) U-subproblem. Fixing V, w, Q and Z, the terms of the augmented Lagrangian that involve U form, for each sample j and view k, a quadratic function of the memberships under the normalization constraint in (11). The Lagrange multiplier method gives the closed-form solution
$$\mu_{ij,k}^{(t+1)}=\frac{\rho\left(z_{ij,k}-q_{ij,k}\right)+\alpha_{j,k}}{2w_{k}\left\|x_{j,k}-v_{i,k}\right\|^{2}+\rho}$$
where alpha_{j,k} is the multiplier chosen so that the memberships of sample j under view k sum to one.

3) w-subproblem. Fixing V and U, minimizing the weighted loss plus the entropy term under the simplex constraint gives the adaptive weights
$$w_{k}=\frac{\exp\left(-J_{k}/\lambda\right)}{\sum_{k'=1}^{K}\exp\left(-J_{k'}/\lambda\right)},\qquad J_{k}=\sum_{i=1}^{C}\sum_{j=1}^{N}\mu_{ij,k}^{2}\left\|x_{j,k}-v_{i,k}\right\|^{2}$$

4) Z-subproblem. Fixing Q and U, the solution is obtained by introducing the soft-thresholding operator:
$$Z^{(t+1)}=A\,S_{\theta/\rho}(\Sigma)\,B^{\mathrm{T}}$$
where $U^{(t+1)}+Q^{(t)}=A\Sigma B^{\mathrm{T}}$ is the singular value decomposition of $U^{(t+1)}+Q^{(t)}$, and the proximity operator of the nuclear norm is given by the soft-thresholding operator $S_{\theta/\rho}(\Sigma)=\mathrm{diag}(\{\max(0,\sigma_{i}-\theta/\rho)\})$, i = 1, 2, ..., N.

5) Q-subproblem. Fixing Z and U, the multiplier is updated as
$$Q^{(t+1)}=Q^{(t)}+U^{(t+1)}-Z^{(t+1)}$$

Through the above iterative process, the objective function converges to a local extremum, and the fuzzy membership matrices under the different views are obtained.
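The singular-value soft-thresholding step of the Z-subproblem can be sketched directly in NumPy. The matrix shapes and the values of theta and rho below are illustrative only; they are not taken from the paper's experiments.

```python
# Numerical sketch of the Z-update above: soft-threshold the singular
# values of U + Q by theta/rho, i.e. the proximal operator of the
# nuclear norm used inside the ADMM iteration.
import numpy as np

def svt(M, tau):
    """Soft-threshold the singular values of M by tau."""
    A, sigma, Bt = np.linalg.svd(M, full_matrices=False)
    sigma = np.maximum(sigma - tau, 0.0)
    return (A * sigma) @ Bt

rng = np.random.default_rng(0)
theta, rho = 1.0, 0.5
U = rng.random((40, 10))       # stacked membership matrix (illustrative shape)
Q = np.zeros_like(U)           # scaled dual variable
Z = svt(U + Q, theta / rho)    # Z-update of the ADMM loop

print("rank before:", np.linalg.matrix_rank(U),
      "rank after:", np.linalg.matrix_rank(Z))
```

Because small singular values are set exactly to zero, the update drives the stacked membership matrix toward low rank, which is precisely how the nuclear-norm term promotes cross-view consistency.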
Following the ensemble strategy of [10], the view weight coefficients w_k and the fuzzy membership matrices U_k are used to construct a fuzzy space partition matrix with global character:
$$\tilde{U}=\sum_{k=1}^{K}w_{k}U_{k}$$
where w_k and U_k denote, respectively, the weight coefficient and the corresponding fuzzy membership matrix of the k-th view.

The LR-MVEWFCM algorithm is described as follows:

Input. A multi-view sample set containing K views, where any view k (1 <= k <= K) corresponds to the sample set X_k = {x_{1,k}, ..., x_{N,k}}; the number of cluster centres C; the iteration threshold epsilon; the maximum number of iterations T.

Output. The cluster centres v_{i,k} of each view, the fuzzy space partition matrix U-tilde, and the weights w_k of the views.

Step 1. Randomly initialize V^(0), normalize U^(0) and w^(0), and set t = 0;
Step 2. Update v_{i,k}^(t+1) using the closed-form centre update of the V-subproblem;
Step 3. Update U^(t+1) using the closed-form solution of the U-subproblem;
Step 4. Update w_k^(t+1) using the entropy-based weight update;
Step 5. Update Z^(t+1) using the singular-value soft-thresholding step;
Step 6. Update Q^(t+1) using the multiplier update;
Step 7. If L^(t+1) − L^(t) < epsilon or t > T, terminate and exit the loop; otherwise return to Step 2;
Step 8. Using the view weights w_k and the per-view memberships U_k obtained at Step 7, compute U-tilde from the ensemble formula above.

2.2 Discussion

2.2.1 Comparison with low-rank constrained algorithms

In recent years, machine learning models based on low-rank constraints have been studied extensively. Classical work includes the LRR (low rank representation) model proposed in [16], which relaxes the rank function of a matrix to the nuclear norm and obtains a low-rank-representation-based affinity matrix by solving a nuclear-norm minimization problem; the low-rank tensor constrained multiview subspace clustering algorithm (LT-MSC) of [14], which computes subspace representation matrices with a low-rank constraint across the views; and [18], which further introduces low-rank constraints into multi-model subspace clustering with good performance. This paper combines the low-rank constraint with the multi-view fuzzy clustering framework, proposing the LR-MVEWFCM algorithm and using the low-rank constraint to realize consistency among the multi-view data. The method can be regarded as an important extension of low-rank models to the field of multi-view fuzzy clustering.

2.2.2 Comparison with the multi-view Co-FKM algorithm

Figures 1 and 2 show the workflows of the multi-view Co-FKM algorithm and of the proposed LR-MVEWFCM algorithm, respectively.

Fig. 1 Co-FKM algorithm for multi-view clustering task
Fig. 2 LR-MVEWFCM algorithm for multi-view clustering task

The proposed algorithm differs from the classical multi-view Co-FKM algorithm both in the consistency constraint on the multi-view information and in the ensemble strategy for the multi-view clustering results. For the consistency constraint, this paper extends the pairwise inter-view constraints of Co-FKM to a global consistency constraint over all views. For the ensemble strategy, instead of the simple geometric mean of the membership matrices used by Co-FKM, the per-view memberships are combined with the view weights to construct an ensemble decision function that reflects the diversity of the views.

3 Experiments and analysis

3.1 Experimental setup

Simulated datasets and real datasets from the UCI repository are used for the experimental validation. Four clustering algorithms, FCM [17], CombKM [19], Co-FKM [9] and Co-Clustering [20], are selected for comparison, with the parameter settings listed in Table 1. The experimental environment is an Intel Core i5-7400 CPU at 2.3 GHz with 8 GB of memory, programmed in MATLAB 2015b.

Table 1 Parameter definitions and settings in the experiments

Algorithm | Description | Parameter settings
FCM | classical single-view fuzzy clustering | fuzzifier m = min(N, D−1)/(min(N, D−1) − 2), where N is the number of samples and D the number of dimensions
CombKM | combined K-means | none
Co-FKM | multi-view collaborative fuzzy clustering | fuzzifier m as for FCM; collaboration coefficient eta in (0, (K−1)/K], where K is the number of views, searched with step rho = 0.01
Co-Clustering | co-clustering over samples and features | regularization coefficients lambda in {10^-3, 10^-2, ..., 10^3} and mu in {10^-3, 10^-2, ..., 10^3}
LR-MVEWFCM | proposed low-rank constrained entropy-weighting multi-view fuzzy clustering | view-weight balance factor lambda in {10^-5, 10^-4, ..., 10^5}, low-rank regularization coefficient theta in {10^-3, 10^-2, ..., 10^3}, fuzzifier m = 2
MVEWFCM | LR-MVEWFCM with the low-rank coefficient theta = 0 | view-weight balance factor lambda in {10^-5, 10^-4, ..., 10^5}, fuzzifier m = 2

Two performance indices are used to evaluate the clustering results of the algorithms.

1) Normalized mutual information (NMI) [10]:
$$\mathrm{NMI}=\frac{\sum_{i}\sum_{j}N_{i,j}\log\frac{N\,N_{i,j}}{N_{i}N_{j}}}{\sqrt{\left(\sum_{i}N_{i}\log\frac{N_{i}}{N}\right)\left(\sum_{j}N_{j}\log\frac{N_{j}}{N}\right)}}$$
where N_{i,j} measures the agreement between class i and cluster j (the number of samples they share), N_i is the number of samples belonging to class i, N_j the number of samples belonging to cluster j, and N the total number of samples.

2) Rand index (RI) [10]:
$$\mathrm{RI}=\frac{f_{00}+f_{11}}{N(N-1)/2}$$
where f_00 is the number of sample pairs that have different class labels and belong to different clusters, f_11 is the number of pairs that have the same class label and belong to the same cluster, and N is the total number of samples. Both indices take values in [0, 1]; the closer the value is to 1, the better the clustering performance. To verify robustness, every performance index reported in the tables is the average over 10 runs of the algorithm.
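The two indices defined above can be mirrored in a few lines of NumPy. The sketch below assumes integer-coded label vectors and is meant only to follow the formulas, not any particular published implementation.

```python
# NMI and Rand index computed from ground-truth labels and predicted
# cluster labels, following the formulas above.  Labels are assumed to
# be nonnegative integers.
import numpy as np

def nmi(truth, pred):
    t, p = np.asarray(truth), np.asarray(pred)
    n = len(t)
    ct = np.zeros((t.max() + 1, p.max() + 1))   # contingency table N_ij
    for a, b in zip(t, p):
        ct[a, b] += 1
    pi, pj = ct.sum(1) / n, ct.sum(0) / n
    pij = ct / n
    mask = pij > 0
    mi = (pij[mask] * np.log(pij[mask] / np.outer(pi, pj)[mask])).sum()
    hi = -(pi[pi > 0] * np.log(pi[pi > 0])).sum()
    hj = -(pj[pj > 0] * np.log(pj[pj > 0])).sum()
    return mi / np.sqrt(hi * hj)

def rand_index(truth, pred):
    t, p = np.asarray(truth), np.asarray(pred)
    n = len(t)
    same_t = t[:, None] == t[None, :]
    same_p = p[:, None] == p[None, :]
    agree = (same_t == same_p)[np.triu_indices(n, k=1)]  # f11 + f00 pairs
    return agree.mean()

# Label permutations do not matter: both indices equal 1 here.
print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]), nmi([0, 0, 1, 1], [1, 1, 0, 0]))
```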
3.2 Experiments on the simulated dataset

To evaluate the clustering performance of the proposed algorithm on multi-view data, a three-dimensional simulated dataset A (x, y, z) is constructed following the method of [10]. First, the MATLAB normal random function normrnd is used to build the data subsets A1 (x, y, z), A2 (x, y, z) and A3 (x, y, z), each corresponding to one cluster and each containing 200 samples. The first and second subsets are close in feature z, and the second and third subsets are close in feature x. The three subsets are then merged into the set A (x, y, z), 600 samples in total, and the samples are normalized. The features x, y, z are further combined in pairs as in Table 2 to obtain the multi-view data.

Table 2 Characteristic composition of the simulated dataset
View | Features
View 1 | x, y
View 2 | y, z
View 3 | x, z

The samples under each view are visualized in Fig. 3.

Fig. 3 Simulated data under multiple views: (a) dataset A; (b) view 1; (c) view 2; (d) view 3

Fig. 3 shows that the data of view 1 are well separated in space, whereas the data of views 2 and 3 both overlap to some extent in their spatial distributions, which degrades the clustering performance under those views. Several new datasets are generated by combining different views, as listed in Table 3, which also gives the averages and standard deviations of LR-MVEWFCM over 10 repeated runs.

Table 3 Performance of the proposed algorithm on the simulated datasets
No. | Views included | NMI | RI
1 | View 1 | 1.0000 ± 0.0000 | 1.0000 ± 0.0000
2 | View 2 | 0.7453 ± 0.0075 | 0.8796 ± 0.0081
3 | View 3 | 0.8750 ± 0.0081 | 0.9555 ± 0.0006
4 | View 1, View 2 | 1.0000 ± 0.0000 | 1.0000 ± 0.0000
5 | View 1, View 3 | 1.0000 ± 0.0000 | 1.0000 ± 0.0000
6 | View 2, View 3 | 0.9104 ± 0.0396 | 0.9634 ± 0.0192
7 | View 1, View 2, View 3 | 1.0000 ± 0.0000 | 1.0000 ± 0.0000

Comparing the performance of LR-MVEWFCM on datasets 1 to 3, the algorithm achieves the best result on view 1 and performs better on view 3 than on view 2, which is consistent with the spatial separability of the views in Fig. 3. Furthermore, on datasets 4 to 6, obtained by combining the views in pairs, LR-MVEWFCM always obtains better clustering results than on any single view, which shows that mining the consistency of multi-view data through the low-rank constraint can effectively improve the clustering performance.

On the multi-view dataset 7, the proposed algorithm is further compared with other classical clustering algorithms.

Table 4 Performance comparison of the algorithms on simulated dataset 7
Dataset | Index | Co-Clustering | CombKM | FCM | Co-FKM | LR-MVEWFCM
A | NMI-mean | 1.0000 | 0.9305 | 1.0000 | 1.0000 | 1.0000
A | NMI-std | 0.0000 | 0.1464 | 0.0000 | 0.0000 | 0.0000
A | RI-mean | 1.0000 | 0.9445 | 1.0000 | 1.0000 | 1.0000
A | RI-std | 0.0000 | 0.1171 | 0.0000 | 0.0000 | 0.0000

As Table 4 shows, since the simulated data are well separable in some feature spaces, the proposed algorithm as well as Co-Clustering, FCM and Co-FKM all achieve very good clustering results, while the performance of CombKM is slightly inferior. The reason is that CombKM concentrates on mining the information between samples while neglecting the collaboration among views, whereas the proposed algorithm further exploits the global consistency among views through the low-rank constraint and therefore obtains better results than CombKM.

3.3 Experiments on real datasets

Five UCI datasets are used in this section: 1) Iris; 2) Image Segmentation (IS); 3) Balance; 4) Ionosphere; 5) Wine. Since these datasets all contain features of different types, the features can be regrouped to construct corresponding multi-view datasets. Table 5 gives the information after grouping.

Table 5 Multi-view data constructed based on the UCI datasets
No. | Original dataset | View composition | Features per view | Samples | Views | Classes
8 | IS | Shape; RGB | 9; 9 | 310 | 2 | 7
9 | Iris | Sepal length and width; Petal length and width | 2; 2 | 150 | 2 | 3
10 | Balance | Left-arm weight and length; right-arm weight and length | 2; 2 | 625 | 2 | 3
11 | Iris | Each feature as a separate view | 1 | 150 | 4 | 3
12 | Balance | Each feature as a separate view | 1 | 625 | 4 | 3
13 | Ionosphere | Each feature as a separate view | 1 | 351 | 34 | 2
14 | Wine | Each feature as a separate view | 1 | 178 | 13 | 3

The multi-view clustering algorithms are run on the multi-view datasets, while FCM is run on the original datasets. The results are reported in Tables 6 and 7.

Table 6 Comparison of the NMI values of the five clustering methods (mean ± std, with P-values against LR-MVEWFCM in parentheses)
No. | Co-Clustering | CombKM | FCM | Co-FKM | LR-MVEWFCM
8 | 0.5771 ± 0.0023 (0.0019) | 0.5259 ± 0.0551 (0.2056) | 0.5567 ± 0.0184 (0.0044) | 0.5881 ± 0.0109 (3.76e-4) | 0.5828 ± 0.0044
9 | 0.7582 ± 7.4015e-17 (2.03e-24) | 0.7251 ± 0.0698 (2.32e-7) | 0.7578 ± 0.0698 (1.93e-24) | 0.8317 ± 0.0064 (8.88e-16) | 0.9029 ± 0.0057
10 | 0.2455 ± 0.0559 (0.0165) | 0.1562 ± 0.0749 (3.47e-5) | 0.1813 ± 0.1172 (0.0061) | 0.2756 ± 0.0309 (0.1037) | 0.3030 ± 0.0402
11 | 0.7582 ± 1.1703e-16 (2.28e-16) | 0.7468 ± 0.0079 (5.12e-16) | 0.7578 ± 1.1703e-16 (5.04e-16) | 0.8244 ± 1.1102e-16 (2.16e-16) | 0.8768 ± 0.0097
12 | 0.2603 ± 0.0685 (0.3825) | 0.1543 ± 0.0763 (4.61e-4) | 0.2264 ± 0.1127 (0.1573) | 0.2283 ± 0.0294 (0.0146) | 0.2863 ± 0.0611
13 | 0.1385 ± 0.0085 (2.51e-9) | 0.1349 ± 2.9257e-17 (2.35e-13) | 0.1299 ± 0.0984 (2.60e-10) | 0.2097 ± 0.0329 (0.0483) | 0.2608 ± 0.0251
14 | 0.4288 ± 1.1703e-16 (1.26e-8) | 0.4215 ± 0.0095 (7.97e-9) | 0.4334 ± 5.8514e-17 (2.39e-8) | 0.5295 ± 0.0301 (0.4376) | 0.5413 ± 0.0364

Table 7 Comparison of the RI values of the five clustering methods (mean ± std, with P-values against LR-MVEWFCM in parentheses)
No. | Co-Clustering | CombKM | FCM | Co-FKM | LR-MVEWFCM
8 | 0.8392 ± 0.0010 (1.3475e-14) | 0.8112 ± 0.0369 (1.95e-7) | 0.8390 ± 0.0115 (0.0032) | 0.8571 ± 0.0019 (0.0048) | 0.8508 ± 0.0013
9 | 0.8797 ± 0.0014 (1.72e-26) | 0.8481 ± 0.0667 (2.56e-5) | 0.8859 ± 1.1703e-16 (6.49e-26) | 0.9358 ± 0.0037 (3.29e-14) | 0.9665 ± 0.0026
10 | 0.6515 ± 0.0231 (3.13e-4) | 0.6059 ± 0.0340 (1.37e-6) | 0.6186 ± 0.0624 (0.0016) | 0.6772 ± 0.0227 (0.0761) | 0.6958 ± 0.0215
11 | 0.8797 ± 0.0014 (1.25e-18) | 0.8755 ± 0.0029 (5.99e-12) | 0.8859 ± 0.0243 (2.33e-18) | 0.9267 ± 2.3406e-16 (5.19e-18) | 0.9527 ± 0.0041
12 | 0.6511 ± 0.0279 (0.0156) | 0.6024 ± 0.0322 (2.24e-5) | 0.6509 ± 0.0652 (0.1139) | 0.6511 ± 0.0189 (0.008) | 0.6902 ± 0.0370
13 | 0.5877 ± 0.0030 (1.35e-12) | 0.5888 ± 0.0292 (2.10e-14) | 0.5818 ± 1.1703e-16 (4.6351e-13) | 0.6508 ± 0.0147 (0.0358) | 0.6855 ± 0.0115
14 | 0.7187 ± 1.1703e-16 (3.82e-6) | 0.7056 ± 0.0168 (1.69e-6) | 0.7099 ± 1.1703e-16 (8.45e-7) | 0.7850 ± 0.0162 (0.5905) | 0.7917 ± 0.0353

From the NMI and RI values in Tables 6 and 7, the clustering performance of Co-FKM is clearly superior to that of the other classical clustering algorithms. Compared with Co-FKM, LR-MVEWFCM adopts the low-rank regularization term to mine the consistency relations among the multi-view data and introduces the adaptive multi-view entropy weighting strategy to effectively control the diversity among the views; its clustering performance is evidently better and more stable, and its convergence behaviour is better. The results in Tables 6 and 7 also show that on the IS, Balance, Iris, Ionosphere and Wine datasets its NMI and RI indices improve by 3 to 5 percentage points, which again illustrates the effectiveness of the proposed algorithm for multi-view clustering.

To further illustrate the positive role of the low-rank constraint, LR-MVEWFCM and MVEWFCM are tested together; the performance comparison is shown in Fig. 4.

Fig. 4 The influence of the low-rank constraint on the performance of the algorithm ((a) RI; (b) NMI; the abscissa is the dataset number and the ordinate is the clustering performance index)

Fig. 4 shows that, on both the simulated datasets and the real UCI datasets, LR-MVEWFCM achieves better clustering results than MVEWFCM. The low-rank constraint in the learning criterion of LR-MVEWFCM can therefore effectively exploit the consistency of multi-view data to improve the clustering performance.

To study the convergence of the algorithm, convergence experiments are carried out on the same eight datasets; the evolution of the objective function is shown in Fig. 5.

Fig. 5 Convergence curves of the LR-MVEWFCM algorithm ((a) to (h): datasets 7 to 14)

Fig. 5 shows that on the real datasets the proposed algorithm stabilizes after only about 15 iterations, which indicates good practicality in scenarios with high speed requirements.

Summarizing the above experimental results: on datasets with multi-view characteristics, multi-view fuzzy clustering usually obtains better results than traditional single-view fuzzy clustering; in this paper, introducing the low-rank constraint into multi-view fuzzy clustering strengthens the consistency relations among the different views, while the Shannon entropy adjusts the view weight relations and controls the diversity among the views, leading to better clustering results than the other multi-view clustering algorithms.

3.4 Parameter sensitivity experiments

The LR-MVEWFCM algorithm contains two regularization coefficients, namely the view-weight balance factor lambda and the low-rank regularization coefficient theta. Taking the experiment of LR-MVEWFCM on simulated dataset 7 as an example, Fig. 6 shows how the performance of the algorithm changes as each coefficient varies from 0 to 1000. When the low-rank coefficient theta = 0, i.e. the low-rank term is absent, the performance of the algorithm is the worst, which verifies the effectiveness of the low-rank regularization term; as theta varies, the performance changes relatively little, showing that on this dataset the algorithm is insensitive to theta and possesses a certain robustness. When the Shannon-entropy coefficient lambda = 0, the performance is likewise poor, which justifies introducing this regularization term; as lambda grows, the performance tends to improve, indicating that on this dataset the entropy term has a relatively pronounced effect.

Fig. 6 Sensitivity analysis of the parameters on simulated dataset 7

4 Conclusion

Starting from the two aspects of consistency and diversity in multi-view clustering, this paper has proposed an entropy-weighting multi-view fuzzy clustering algorithm with a low-rank constraint. The algorithm adopts a low-rank regularization term to mine the consistency relations among multi-view data and introduces an adaptive multi-view entropy weighting strategy to effectively control the diversity among views, thereby improving performance. Experiments on both simulated and real datasets show that the clustering performance of the algorithm is superior to that of the other multi-view clustering algorithms; the algorithm also has the advantages of few iterations and fast convergence, giving it good practicality. Since this paper adopts the classical FCM framework and uses the Euclidean distance to measure the differences between data objects, the algorithm is not suitable for some high-dimensional data scenarios. How to design multi-view clustering algorithms for high-dimensional data will be the focus of our future work.

References
[1] Xu C, Tao D C, Xu C. Multi-view learning with incomplete views. IEEE Transactions on Image Processing, 2015, 24(12): 5812−5825
[2] Brefeld U. Multi-view learning with dependent views. In: Proceedings of the 30th Annual ACM Symposium on Applied Computing, Salamanca, Spain: ACM, 2015. 865−870
[3] Muslea I, Minton S, Knoblock C A. Active learning with multiple views. Journal of Artificial Intelligence Research, 2006, 27(1): 203−233
[4] Zhang C Q, Adeli E, Wu Z W, Li G, Lin W L, Shen D G. Infant brain development prediction with latent partial multi-view representation learning. IEEE Transactions on Medical Imaging, 2018, 38(4): 909−918
[5] Bickel S, Scheffer T. Multi-view clustering. In: Proceedings of the 4th IEEE International Conference on Data Mining (ICDM '04), Brighton, UK: IEEE, 2004. 19−26
[6] Wang Y T, Chen L H. Multi-view fuzzy clustering with minimax optimization for effective clustering of data from multiple sources. Expert Systems with Applications, 2017, 72: 457−466
[7] Wang Jun, Wang Shi-Tong, Deng Zhao-Hong. Survey on challenges in clustering analysis research. Control and Decision, 2012, 27(3): 321−328 (in Chinese)
[8] Pedrycz W. Collaborative fuzzy clustering. Pattern Recognition Letters, 2002, 23(14): 1675−1686
[9] Cleuziou G, Exbrayat M, Martin L, Sublemontier J H. CoFKM: A centralized method for multiple-view clustering. In: Proceedings of the 9th IEEE International Conference on Data Mining, Miami, FL, USA: IEEE, 2009. 752−757
[10] Jiang Y Z, Chung F L, Wang S T, Deng Z H, Wang J, Qian P J. Collaborative fuzzy clustering from multiple weighted views. IEEE Transactions on Cybernetics, 2015, 45(4): 688−701
[11] Bettoumi S, Jlassi C, Arous N. Collaborative multi-view K-means clustering. Soft Computing, 2019, 23(3): 937−945
[12] Zhang G Y, Wang C D, Huang D, Zheng W S, Zhou Y R. TW-Co-K-means: Two-level weighted collaborative K-means for multi-view clustering. Knowledge-Based Systems, 2018, 150: 127−138
[13] Cao X C, Zhang C Q, Fu H Z, Liu S, Zhang H. Diversity-induced multi-view subspace clustering. In: Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA: IEEE, 2015. 586−594
[14] Zhang C Q, Fu H Z, Liu S, Liu G C, Cao X C. Low-rank tensor constrained multiview subspace clustering. In: Proceedings of the 2015 IEEE International Conference on Computer Vision, Santiago, Chile: IEEE, 2015. 1582−1590
[15] Boyd S, Parikh N, Chu E, Peleato B, Eckstein J. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, 2011, 3(1): 1−122
[16] Liu G C, Lin Z C, Yan S C, Sun J, Yu Y, Ma Y. Robust recovery of subspace structures by low-rank representation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013, 35(1): 171−184
Introduction to Cryptocurrencies*

Stefan Dziembowski†
University of Warsaw
S.Dziembowski@.pl

ABSTRACT
We provide a research-oriented introduction to the cryptographic currencies. We start with a description of Bitcoin and its main design principles. We then discuss some of its weaknesses, and show some ideas for dealing with them. We also talk about the mechanics of the mining pools and ideas for discouraging the mining pool creation. We provide an introduction to the smart contracts, and give some examples of them, including the multiparty lotteries. We then present alternative currencies that were designed to remedy some of the problems of Bitcoin. In particular, we talk about the Litecoin, the Primecoin, the Permacoin, the Zerocoin, the Proofs of Stake and the Proofs of Space. We also discuss the most important research challenges in this area.

Categories and Subject Descriptors
K.4.4 [Computing Milieux]: Computers and Society - Payment schemes; Distributed commercial transactions; Cybercash, digital cash

Keywords
cryptocurrencies; distributed cryptography

* A longer version of this document is available at .pl/Dziembowski/talks/bitcoin-tutorial.pdf. Slides from this tutorial are available at .pl/Dziembowski/talks.
† Supported by the Foundation for Polish Science WELCOME/2010-4/2 grant funded within the framework of the EU Innovative Economy (National Cohesion Strategy) Operational Programme.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author(s). Copyright is held by the owner/author(s). CCS'15, October 12-16, 2015, Denver, Colorado, USA. ACM 978-1-4503-3832-5/15/10. DOI: /10.1145/2810103.2812704.

1. INTRODUCTION
The cryptographic currencies (also dubbed the cryptocurrencies) are a fascinating recent concept whose popularity exploded in the past few years. Their main distinguishing feature is that they are not controlled by any single entity. Instead, they are jointly maintained by their anonymous users connected via peer-to-peer networks. Their security is based purely on the security of the underlying cryptographic primitives, and on some global assumptions about the behavior of their users (like, e.g., an assumption that a large fraction of the computing power is controlled by the honest participants).

Historically the first, and the most prominent cryptocurrency is the Bitcoin, introduced in 2008 by Satoshi Nakamoto [9]. Probably the most intriguing technical innovations of Bitcoin are: the mechanism for reaching consensus in fully distributed peer-to-peer networks (the so-called blockchain technology), and the transaction syntax that allows to execute distributed financial operations that are much more complex than simple money transfers.

Even the sheer financial importance of the cryptocurrencies makes them an interesting research area. In our opinion, however, what makes this topic even more fascinating are the conceptual ideas behind it. These aspects will be the main focus of this tutorial.

Goal of this tutorial. The goal of this tutorial is to provide a research-oriented introduction to the cryptocurrencies. We will present the main principles of the Bitcoin design, discuss some of its weaknesses, show some ideas for its improvements and for other currencies, and discuss the most important research challenges in this area.

Intended audience and prerequisite knowledge. This tutorial will be suitable for all the ACM CCS participants, both coming from the academia and from the industry. We will assume familiarity with the basic cryptographic primitives, such as the hash functions, the signature schemes, and encryption. No prior knowledge of Bitcoin or other cryptographic currencies is required.

2. OVERVIEW

2.1 Introduction
We will start with a brief overview of the history of the cryptocurrencies. We will identify the main financial and social aspects that contributed to the success of Bitcoin, and mention some opinions of prominent economists about the cryptocurrencies, both those supporting this idea, and those that are skeptical about it. We will show how the popularity of Bitcoin grew over time, and how its price fluctuated. We will also mention some important events in the history of the Bitcoin community (like the MtGox collapse).

2.2 Bitcoin main design principles
We will introduce the transaction ledger (the blockchain), as a technique that can be used to prevent double-spending of electronic cash. We will point out that the Sybil attacks should be taken into account when designing a protocol that emulates such a ledger. We will explain what are the Proofs of Work (PoWs) and how to apply them to thwart the Sybil attacks. We will show how the Bitcoin ledger is maintained by the users called miners that solve the PoWs, in a process called mining. We will explain the concept of the hashrate and show how the total hashrate of Bitcoin users changed over the time. We will also describe how the hardness of Bitcoin's PoWs is adjusted to the changing hashrate. We will also talk about the Bitcoin transaction syntax.
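A toy illustration of the hash-puzzle style of Proof of Work sketched in Section 2.2 can be written in a few lines of Python. The puzzle form below (requiring a number of leading zero hex digits in a SHA-256 digest) only mimics the structure of Bitcoin's PoW; real Bitcoin mining compares a double SHA-256 hash against a numeric target rather than counting hex zeros.

```python
# Toy hash puzzle in the spirit of a Proof of Work: find a nonce such
# that SHA-256(block_data || nonce) starts with `difficulty` zero hex
# digits.  Purely illustrative; not Bitcoin's actual target encoding.
import hashlib

def mine(block_data: bytes, difficulty: int) -> int:
    nonce = 0
    while True:
        h = hashlib.sha256(block_data + nonce.to_bytes(8, "big")).hexdigest()
        if h.startswith("0" * difficulty):
            return nonce
        nonce += 1

nonce = mine(b"previous-hash|transactions", difficulty=4)
print("found nonce:", nonce)
```

Raising `difficulty` by one multiplies the expected number of hash evaluations by 16, which conveys why a tunable difficulty parameter can track a changing total hashrate.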
2.3 Bitcoin mining pools
Mining pools are coalitions of miners that share the reward from mining new blocks. We will discuss the economic reasons for joining the mining pools (reducing the variance of the mining reward). We will then talk about the mechanics of the mining pool reward systems and their weaknesses. We will also show how mining pools can be decentralized via the so-called peer-to-peer mining technique.

Discouraging the mining pool creation. We will explain the risks behind the pooled mining. In particular, we will point out that it leads to the centralization of the control over mining. We will then discuss some ideas for the Bitcoin modifications that discourage the mining pool creation [8].

2.4 Security weaknesses of Bitcoin
We will discuss the quality of the Bitcoin design. We will mention some incidents when programming errors led to forks that needed to be resolved "manually", and we will explain the transaction malleability problem. We will then present more fundamental problems with the blockchain technology. In particular we will explain the selfish mining attack introduced in the groundbreaking paper of Eyal and Sirer [6] and the difficulty raising attack of Lear Bahack [3] that exploits the way in which the mining difficulty is adjusted in Bitcoin. We will also talk about the bribery attacks of [5].

2.5 Smart contracts
One of the most intriguing properties of Bitcoin is the possibility of creating the so-called smart contracts (originally proposed in the 1990s by Nick Szabo). Such contracts can be viewed as financial agreements between a number of parties, whose implementations are enforced by the Bitcoin system. Simple examples of such agreements include the escrow and dispute mediation, and the assurance contracts. More advanced types of contracts are the fair multiparty protocols, and in particular the decentralized lotteries [2, 1, 4]. We will give an overview of this area.

2.6 Other cryptocurrencies
Discouraging hardware mining. One of the problems of Bitcoin is that mining in hardware is much more efficient than mining in software, and hence it is completely infeasible nowadays to be a miner without investing in specialized hardware. We will describe the Litecoin, which is a currency that was supposed to have the property that it is not economical to mine in hardware.

Less "wasteful" cryptocurrencies. Another problem with the Proofs of Work is that they require the miners to spend significant amounts of electricity on mining. There are essentially two approaches to create less "wasteful" cryptocurrencies. The first one is to create a cryptocurrency where the work is spent on some "useful" task. We will give an overview of the currencies belonging to this class: the Permacoin, and the Primecoin. The second approach is to replace "work" by some other type of resource. This includes the Proofs of Stake, and the Proofs of Space, which is used in the recent Spacecoin proposal. We will give a short overview of these approaches.

More anonymity. Since the Bitcoin's transaction ledger is public, the only anonymity in this system comes from the fact that the users are using pseudonyms, instead of their real names. As recently shown in [7] this does not provide sufficient privacy protection. We will briefly talk about this result, and then we will explain (on a high level) the main design principles of Zerocash, which is a new proposal for a currency that provides true anonymity.

2.7 Research directions
We will conclude with an overview of the open research problems in this area, like the need for better understanding of the Bitcoin security model, and improvement of the blockchain technology. We will also mention the open problems concerning the distributed contracts, in particular we will discuss what obstacles need to be overcome to make them really practical. Finally, we will also describe the problems with securely storing the Bitcoin secret keys (in the so-called wallets).

3. REFERENCES
[1] M. Andrychowicz, S. Dziembowski, D. Malinowski, and L. Mazurek. Fair two-party computations via Bitcoin deposits. In BITCOIN Workshop in association with the Financial Cryptography and Data Security conference, 2014.
[2] M. Andrychowicz, S. Dziembowski, D. Malinowski, and L. Mazurek. Secure multiparty computations on Bitcoin. In 2014 IEEE Symposium on Security and Privacy.
[3] L. Bahack. Theoretical bitcoin attacks with less than half of the computational power. arXiv, 2013.
[4] I. Bentov and R. Kumaresan. How to use bitcoin to design fair protocols. In CRYPTO 2014.
[5] J. Bonneau, E. W. Felten, S. Goldfeder, J. A. Kroll, and A. Narayanan. Why buy when you can rent? Bribery attacks on Bitcoin consensus. Manuscript, November 2014.
[6] I. Eyal and E. Gün Sirer. Majority is not enough: Bitcoin mining is vulnerable. In Financial Cryptography and Data Security, 2014.
[7] S. Meiklejohn, M. Pomarole, G. Jordan, K. Levchenko, D. McCoy, G. M. Voelker, and S. Savage. A fistful of bitcoins: Characterizing payments among men with no names. In Proceedings of the 2013 Conference on Internet Measurement Conference.
[8] A. Miller, E. Shi, A. Kosba, and J. Katz. Nonoutsourceable scratch-off puzzles to discourage bitcoin mining coalitions. Preprint, 2015.
[9] Satoshi Nakamoto. Bitcoin: A peer-to-peer electronic cash system, 2008.
Calculation of Mining Subsidence and Ground Principal Strains Using a Generalized Influence Function Method

G. Ren*, J. Li and J. Buckeridge
School of Civil, Environmental and Chemical Engineering, RMIT University
GPO Box 2476V, Melbourne 3001, Australia

Abstract
A generalized influence function method is introduced using tabulated influence weighting factors in subsidence calculation. Tabulated influence weighting factors have the advantage of being more flexible than having to find a mathematical influence function. The values of weighting factors can be readily adopted either from a local observational database if available, or from a published data source. The flexibility and adoptability of the method is demonstrated through a case study with subsidence contours, movement vectors and principal strains. It is also demonstrated that the method is a valuable tool in assessing subsidence effects on surface structures and utilities.

* Corresponding author: Tel.: 613 9925 2409; Fax: 613 9639 0138; E-mail address: gang.ren@.au

1. Introduction
The accurate prediction of ground movements associated with underground mining is essential for assessing surface damage and effective damage prevention. Various methods can be used for predicting mining subsidence and horizontal movements, including physical and numerical modeling methods, profile function and influence function methods [1]. Of these numerous prediction methods, the influence function method is increasingly favoured due to its flexibility and adoptability with computer programming [2]. Various mining configurations, including irregular-shaped panels, multiple extraction seams, inclined coal deposits and sloping ground, can be taken into account using the influence function method [3 & 4]. The influence function method can be calibrated to suit local mining conditions to achieve better analytical results, as demonstrated by Sheorey et al. [5]. Ren et al. [6] suggested that the angle of draw, inter alia, is influenced by the strength of the overburden strata. The angle of draw defines the extent of the influence of the underground extraction at the ground surface.

2. Mining Subsidence Prediction Using the Influence Function Method
The influence function method used in subsidence prediction is based on the assumption that the effect of an underground extraction on the surface follows a prescribed mathematical expression, i.e. the influence function depends on the spatial relationship between the locations of the underground extraction and the surface point in question. This is illustrated in Figure 1, where an underground extraction element creates a subsidence trough at the surface. The profile for the subsidence trough can be prescribed by an influence function (Figure 2), which can be expressed mathematically.
A general form of an influence function may be expressed as $k_z = f(x)$, where x can be either the zone angle θ or the radial distance r from the centre of the subsidence trough (see Figure 2).

A number of influence functions have been proposed by researchers, such as those of Bals, Sann, and Ehrhardt and Sauer, which were summarized by Kratzsch [7]:

Bals' influence function:

$k_z = \cos^2\theta$   (1)

where θ is the zone angle (see Figure 2).

Sann's influence function:

$k_z = 2.56\, r\, e^{-4.12\, r^2}$   (2)

Ehrhardt and Sauer's influence function:

$k_z = 0.392\, e^{-0.01\, r^{2.5}}$   (3)

where r is the radial distance from the centre of the subsidence trough (see Figure 2).

Influence functions are generally derived analytically from observations or based on assumptions. In general, these functions are mathematical expressions used to describe the effect of the removal of an underground element on the ground surface.

A stochastic influence function derived from statistical assumptions was adopted in computer programs for subsidence computation [3]. Application of this type of influence function assumes that the ground will achieve the most probable state of static equilibrium following the underground extraction. Based on this probabilistic approach, the trough profile is assumed to follow the stochastic function (Whittaker and Reddish [2], p. 477):

$k_z = \frac{1}{R^2}\, e^{-\pi r^2 / R^2}$   (4)

where R is the radius of the influence circle on the surface (Figure 2).

All influence functions, in essence, define the extent of influence of an underground extraction element on a surface point using a mathematical expression.

Extensive studies have been conducted ([2], [3] & [4]), and the application of the stochastic influence function method to practical subsidence analysis is well demonstrated in this literature. In a relatively recent publication [5], Sheorey et al. proposed a modified influence function based on observational data obtained from a specific coal field in India:

$k_z = \frac{0.5352}{R^2}\left(1 + \cos\frac{\pi r}{R}\right)$   (5)

This modified influence function gave much improved predictive results for that specific coal field [5].

As noted, the influence function method used in mining subsidence prediction and analysis has become a powerful tool since it was programmed for personal computers. Complex extraction geometries and topography can now be readily analysed and the output presented in graphical format [3 & 4].
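For orientation, the influence functions above can be written as simple Python callables. This is an illustrative sketch only, not code from the cited works; the function names are ours, and the numerical constants simply mirror those quoted in Equations (1) to (5) above.

```python
import numpy as np

def k_bals(theta):
    """Bals' function, Eq (1); theta is the zone angle in radians."""
    return np.cos(theta)**2

def k_sann(r):
    """Sann's function, Eq (2); r is the radial distance."""
    return 2.56 * r * np.exp(-4.12 * r**2)

def k_ehrhardt_sauer(r):
    """Ehrhardt and Sauer's function, Eq (3)."""
    return 0.392 * np.exp(-0.01 * r**2.5)

def k_stochastic(r, R):
    """Stochastic function, Eq (4); R is the influence radius."""
    return np.exp(-np.pi * r**2 / R**2) / R**2

def k_sheorey(r, R):
    """Modified function of Sheorey et al., Eq (5)."""
    return 0.5352 / R**2 * (1.0 + np.cos(np.pi * r / R))
```

Integrated over an extracted area, these influence densities yield the ring weighting factors developed in the next section.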
3. Generalized Influence Function Method

3.1 Subsidence Weighting Factors

The conventional influence function method generally makes use of a zone-area or grid-integration approach. Peng [1] has summarised the methodology of applying the influence function to subsidence calculation in detail. It has been demonstrated that the influence function method is a flexible tool in subsidence analysis and prediction. However, all the influence function methods presented above involve mathematical expressions to describe the effect of an underground extraction on the ground surface, such as the functions listed in Equations (1) to (5). These functions can be derived either from statistics or from curve fitting based on observational data obtained from specific coal fields.

For example, if the influence area is evenly divided into 10 rings, the stochastic influence function at Equation (4) gives the following subsidence weighting factors (Table 1) [3]:

Table 1: Weighting factors based on the stochastic function

Ring (i)*              1        2        3        4        5        6        7        8        9        10       Sum
Weight factor S(i)**   0.035    0.091    0.132    0.153    0.154    0.137    0.112    0.085    0.058    0.039    1
                      (0.031)  (0.087)  (0.128)  (0.149)  (0.149)  (0.133)  (0.108)  (0.081)  (0.055)  (0.035)  (0.957)

* Ring numbers are counted from the inner ring to the outer ring.
** Values in brackets are the weighting factors computed directly from the stochastic function, without normalization.

The modified influence function (5) of Sheorey et al. gives the following weighting factors:

Table 2: Weighting factors based on Equation (5) (after Sheorey et al. [5])

Ring (i)             1      2      3      4      5      6      7      8      9      10     Sum
Weight factor S(i)   0.030  0.085  0.131  0.161  0.171  0.160  0.129  0.086  0.040  0.007  1

Generally, in calculating the subsidence weighting factors, an infinitesimal extraction element dA will result in an amount of subsidence dS at the surface:

$dS = S_0\, k_z\, dA$   (6)

where $S_0$ is the maximum possible subsidence. An underground extraction panel of area A will produce surface subsidence S, defined by:

$S = S_0 \iint_A k_z\, dA$   (7)

When the influence circle is divided into a number of rings (i = 1 to n), the subsidence weighting factor S(i) for an individual ring can be determined from:

$S(i) = S/S_0 = \iint_{A_i} k_z\, dA$   (8)

where $A_i$ is the area of ring i. For instance, if the stochastic influence function at (4) is used, we have:

$S(i) = \iint_{A_i} k_z\, dA = \iint_{A_i} \frac{1}{R^2}\, e^{-\pi r^2/R^2}\, dA$   (9)

Ren et al. [3] gave a solution to the above integration. After transforming to polar coordinates, the weighting factor for a specific ring can be calculated as:

$S(i) = e^{-\pi r_{i-1}^2/R^2} - e^{-\pi r_i^2/R^2}$   (10)

where $r_i$ (i = 1 to n) are the radii of the rings when the influence circle is divided into n rings. If the influence circle is evenly divided, i.e. $r_0 = 0$, $r_1 = \frac{1}{10}R$, $r_2 = \frac{2}{10}R$, …, $r_{10} = \frac{10}{10}R = R$ (where R is the radius of the influence circle), the S(i) values can be computed using Equation (10) for each of the 10 rings. Thus, S(1) = 0.031, S(2) = 0.087, …, S(10) = 0.035. Table 1 lists these original weighting factor values in brackets. Note that the sum of the original S(i) does not equal 1. This is because, mathematically, the weighting factor never converges to zero, even where $r_i$ lies outside the influence circle R. To ensure that the maximum subsidence is reached under total extraction conditions, the following condition must be met [7]:

$\sum_{i=1}^{n} S(i) = 1$   (11)

This necessitates a normalization process, in which all weighting factor values (shown in brackets in Table 1) are adjusted so that the sum of all weighting factors equals unity. The normalization spreads the difference between the original sum of S(i) and 1 over the individual rings. In this case, $\frac{1}{10}(1 - 0.957)$ is added to each weighting factor, so that the sum of the normalized S(i) equals 1 (see Table 1). Note that only a fractional adjustment is required in the case of the stochastic influence function. A similar normalization process may be adopted for any other influence function.

From the above, it can be seen that the weighting factors S(i) can be calculated mathematically from the influence function adopted.
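To make the calculation concrete, the short Python sketch below evaluates Equation (10) for ten evenly divided rings and applies the normalization at Equation (11). It is an illustration of the derivation above, not the authors' program; the small last-digit differences from Table 1 are rounding.

```python
import numpy as np

def ring_weights(n=10):
    """Weighting factors S(i) for an evenly divided influence circle.

    Implements Equation (10) with ring radii normalised by R, then spreads
    the residual (1 - sum) evenly over the rings as required by Eq (11).
    """
    r = np.linspace(0.0, 1.0, n + 1)                       # r_0 ... r_n in units of R
    s = np.exp(-np.pi * r[:-1]**2) - np.exp(-np.pi * r[1:]**2)
    s += (1.0 - s.sum()) / n                               # normalization step
    return s

print(np.round(ring_weights(), 3))
# -> [0.035 0.091 0.133 0.153 0.153 0.138 0.113 0.085 0.06  0.04 ]
# Close to the normalized row of Table 1; differences are rounding.
```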
The disadvantage of this approach is the tedious mathematical procedure involved and its inability to be calibrated to suit a specific subsidence profile once an influence function has been adopted.

In fact, the calculation of weighting factors does not necessarily require a mathematical expression. A table listing the influence weighting factors suffices to facilitate the calculation of subsidence and displacement. This observation leads to the generalized influence function method, briefly described below.

Instead of finding a mathematical expression for the weighting factor S(i), the values of S(i) can be given in tabular form, as long as the condition at (11) is satisfied. The tabulated values can then be implemented in a computer program using the computational approach outlined by Ren et al. [3]. It should be noted that the sum of all weighting factors should equal unity for the case of total extraction, i.e. when all elements within the influence circle are extracted and the maximum possible subsidence value $S_0$ is reached.

This generalized approach eliminates the need to find a mathematical function in order to work out the weighting factors in the subsidence calculation. Weighting factor values S(i) that give reasonably good agreement with the observed data through a calibration process should be used. In practice, the calibration process involves the following steps:

(1) Use one of the influence functions (1) to (5) to establish initial base values for all weighting factors in Ring 1 to Ring 10;
(2) Adjust the values of the weighting factors, while maintaining the condition at (11), and observe the following effects:
    (a) Increasing the weightings towards the centre of the influence zones and reducing the weightings near the outer zones will produce a subsidence profile that is more concentrated in the centre of the trough.
    (b) Reducing the weightings in the centre of the influence zones and increasing the weightings near the outer zones will produce a flatter subsidence profile.
(3) Repeat step (2) by trial and error until satisfactory results are obtained that fit well with the observed profile.

Table 3 shows the weighting factors for the case study discussed in Section 4, where observed subsidence profiles were available. In this case, the weighting factors were initially determined using the stochastic function, as shown in Table 1, and were later calibrated using the above-mentioned steps to achieve better agreement with the measured profile.

Table 3: Weighting factors for the case study

Ring (i)             1      2      3      4      5      6      7      8      9      10     Sum
Weight factor S(i)   0.051  0.107  0.145  0.158  0.148  0.129  0.102  0.077  0.051  0.032  1

The weighting factor values of Tables 1, 2 and 3 are plotted in Figure 3 for comparison. Note that in the generalized approach the influence of the inner rings is weighted more heavily than that of the outer rings. This results in a "deep" and "narrow" subsidence profile, as often observed in a typical longwall mining coal field.
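To indicate how such a table might drive an actual computation, the sketch below superposes tabulated factors over a discretised extraction panel using a simple zone-area scheme: each ring of the influence circle contributes its weighting factor scaled by the fraction of that ring covered by extraction. This is a minimal illustration under assumed names and a crude panel discretisation, not the computational approach of Ren et al. [3] verbatim.

```python
import numpy as np

# Tabulated weighting factors (here, the calibrated values of Table 3)
S_RING = [0.051, 0.107, 0.145, 0.158, 0.148, 0.129, 0.102, 0.077, 0.051, 0.032]

def subsidence_at_point(px, py, cells, cell_area, R, s_max):
    """Subsidence at surface point (px, py) by zone-area superposition.

    cells: (x, y) centres of small extracted elements discretising the panel.
    Each ring contributes S(i) times the extracted fraction of its area.
    """
    n = len(S_RING)
    edges = np.linspace(0.0, R, n + 1)
    ring_areas = np.pi * (edges[1:]**2 - edges[:-1]**2)

    covered = np.zeros(n)
    for (ex, ey) in cells:
        d = np.hypot(ex - px, ey - py)
        if d < R:
            covered[min(int(d / R * n), n - 1)] += cell_area

    frac = np.minimum(covered / ring_areas, 1.0)   # extracted fraction per ring
    return s_max * float(np.dot(S_RING, frac))     # total extraction gives s_max

# Example: point above the centre of a 200 m x 200 m panel, 5 m square cells
xs = np.arange(-100.0, 100.0, 5.0) + 2.5
cells = [(x, y) for x in xs for y in xs]
print(subsidence_at_point(0.0, 0.0, cells, 25.0, R=120.0, s_max=1.2))
```

In a calibration loop, the entries of S_RING would be perturbed, renormalised to sum to unity, and the predicted profile compared against survey data until the fit is acceptable, exactly as in steps (1) to (3) above.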
3.2 Effect of Angle of Draw in the Influence Function Method

The influence function method uses a number of important subsidence parameters, such as the angle of draw ξ (Figure 2), in calculating subsidence values. The angle of draw determines the extent of influence of an underground extraction element on the ground surface and demarcates the boundary of the influence circle [6]. The values of the angle of draw vary principally with the geological setting, the mass of overburden and the mining configuration.

For inclined extraction panels, the angle of draw is observed to have different characteristics on the rise side and the low side of the panel [2]. The angles at both sides determine the influence area on the surface.

Direct application of the influence function method gives a subsidence value over the rib-side of half the maximum subsidence value produced by the whole extraction. Calibration is normally required to adjust this edge effect. The seam dip also influences the subsidence and strain distributions. The way in which seam dip is accounted for in the influence function method was discussed by Ren et al. [4].

3.3 Use of the Influence Function Method to Calculate Horizontal Movements and Strains

Application of the influence function method initially produces ground movement vectors which tend towards the extraction element (Figure 1), and this permits calculation of the magnitude of the vertical component $V_z$. The horizontal component $V_h$ can then be calculated from:

$V_h = V_z \tan\beta$   (12)

where β is the angle between the movement vector V at surface point P and the vertical, in relation to the extraction element (see Figure 1).

The relation between $V_h$ and $V_z$ at Equation (12) is based on the assumption that the surface point P within the elementary trough (see Figure 1) is displaced towards the extraction element. The movement vector is denoted by V in Figure 1, where $V_h = V\sin\beta$ and $V_z = V\cos\beta$, hence Equation (12).

Thus, by applying the principle of superposition [7], the overall influence of an extraction panel on a surface point can be calculated, including both vertical and horizontal components.

The computerized influence function method is able to perform calculations for a series of surface points specified by grids at any chosen interval, and the vertical components of subsidence can be presented as profiles and contours. The horizontal movements induced by underground mining are best presented using principal strains. The horizontal movements and vertical settlements are calculated for all surface points defined by the regular surface grids. Once the magnitudes of the horizontal movements at four points on a grid are known, the following formulae can be used to calculate the principal strains and their directions [8]:

$E_1 = \frac{1}{2}\left[e_1\left(1 + \frac{1}{\cos 2A_1}\right) + e_3\left(1 - \frac{1}{\cos 2A_1}\right)\right]$   (13)

$E_2 = e_1 + e_3 - E_1$   (14)

$\tan 2A_1 = \frac{2\left(e_2 - e_1\cos^2 v_2 - e_3\sin^2 v_2\right)}{(e_1 - e_3)\sin 2v_2}$   (15)

where $e_1$ and $e_3$ are the strains along the two grid directions and $e_2$ is the strain along the diagonal direction of the grid; $E_1$ and $E_2$ are the principal strains; $A_1$ is the angle between $E_1$ and $e_1$; and $v_2$ is the angle between $e_1$ and $e_2$ (Figure 4).

Equations (13) and (14) give the magnitudes of the principal strains over an extraction panel, and Equation (15) defines their orientations. Subsidence and principal strain calculations for the surface points specified by regular grids, along with the subsidence contours, can then be plotted, as illustrated in the case study in Section 4.
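As a numerical check on Equations (13) to (15), the following sketch computes the principal strains and their orientation from three grid strains; the function name and test values are illustrative only.

```python
import numpy as np

def principal_strains(e1, e2, e3, v2):
    """Principal strains E1, E2 and direction A1 from grid strains.

    e1, e3: strains along the two grid directions; e2: strain along the
    diagonal at angle v2 (radians) from e1. Implements Eqs (13)-(15);
    the 1/cos(2*A1) form assumes A1 is not 45 degrees.
    """
    num = 2.0 * (e2 - e1 * np.cos(v2)**2 - e3 * np.sin(v2)**2)
    den = (e1 - e3) * np.sin(2.0 * v2)
    A1 = 0.5 * np.arctan2(num, den)                     # Eq (15)
    sec = 1.0 / np.cos(2.0 * A1)
    E1 = 0.5 * (e1 * (1.0 + sec) + e3 * (1.0 - sec))    # Eq (13)
    E2 = e1 + e3 - E1                                   # Eq (14)
    return E1, E2, A1

# Check against a known state: E1 = 2, E2 = 1, A1 = 30 deg. With v2 = 45 deg
# the grid strains are e1 = 1.75, e2 ~ 1.933, e3 = 1.25.
E1, E2, A1 = principal_strains(1.75, 1.933, 1.25, np.radians(45.0))
print(E1, E2, np.degrees(A1))   # ~2.0, ~1.0, ~30.0
```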
3.4 Presentation of Vertical Subsidence, Horizontal Movements and Ground Strain Patterns

With the generalized influence function method discussed above, it is relatively straightforward to generate a series of grid data over the whole area affected by subsidence. Vertical subsidence values can be readily plotted and presented as subsidence contours. The horizontal displacements can be presented in a movement vector plot showing magnitude and direction, as shown in the case study in Section 4 (Figure 6). The principal strains induced by the subsidence at the surface can be plotted by a computer program using two vectors at 90° to each other to represent the magnitudes and directions (see Figures 7a & 7b). The advantage of this type of strain presentation is that it gives the principal strain distribution pattern over the whole subsidence-affected area, with both magnitude and direction shown in a single plot. The tensile and compressive strains are also shown on the same plot. The potential subsidence effects on buildings, bridges, pipelines and transportation networks can be assessed from the principal strain plots, as illustrated in the case study in Section 4 below.

4. A Case Study using the Generalized Influence Function Method

This case study involves an abandoned coal working, dating from 1940, that had been mined by the room-and-pillar method. Because the case may be sub judice, the location of the working cannot be disclosed. As the condition of the supporting pillars gradually deteriorated over the years, signs of subsidence appeared at the surface, with both vertical and horizontal movements observed. A number of residential buildings and public structures are adjacent to the old mine working (Figure 5). The relevant local authority sought a prediction of the subsidence effects should the remaining mine pillars collapse completely.

A subsidence analysis was performed using the generalized influence function method, based on the assumption that roof collapse would create a series of equivalent uniform extractions, as shown in Figure 5. The mine working geometry was rather irregular and dipped at 25° towards the south. The average depth of the working was about 100 m and the seam thickness was 2 m. Earlier subsidence records and observational data were collected, and the generalized influence function method was employed. The influence weighting factors in Table 3 were found to give reasonable agreement with the observed data and were used for the subsidence calculations. Figure 6 shows the output of vertical subsidence, represented by contours, and the horizontal displacement vectors over the mine workings. The general principal strain vectors are presented in Figure 7a, and Figure 7b shows a zoomed plot in the vicinity of the hospital building. The patterns and magnitudes of the principal strain distributions are shown in relation to the mining layout and the surface buildings. It is interesting to note in Figure 7b that the hospital building would be subjected to mainly tensile strains, whilst near the corner of the mine working the ground surface would be subjected to both tensile and compressive strains; within the mine working area, the principal strains are mainly compressive in both directions.

The surface subsidence (Figure 6) and ground strains (Figures 7a & 7b) associated with the old mine working represent the possible effects should the mine pillars collapse completely. A three-dimensional view of an exaggerated subsidence trough induced by the old mine is shown in Figure 8. Horizontal strains provide a good indicator of the effect of ground movement on surface structures. A demarcation line for a given strain level can be manually plotted based on the principal strain plots (Figures 7a & 7b).
In this case study, a demarcation line denoting 3 mm/m principal strain (mainly tensile) was drawn around the underground extraction (Figure 7a). It can be seen that part of the hospital building falls within the 3 mm/m demarcation line and would be affected by the ground movements should the mine pillars collapse. Furthermore, the subsidence effects, in terms of ground strain and vertical settlement, spread significantly beyond the edge of the panel on the lower side of the seam, owing to the seam dip of 25 degrees.

In the same case, the local transport authority was planning to construct a traffic road in the vicinity of the old mine (see Figure 7a). Technical advice was provided to the transport authority, based on the findings of the subsidence analysis, pertaining to the road alignment. It was deduced that if the road cannot be designed to sustain a 3 mm/m strain, it should be realigned away from the 3 mm/m strain demarcation line to avoid potential subsidence damage. In most cases, the 3 mm/m principal strain demarcation line can be used for a preliminary assessment of the potential mining subsidence effects on surface structures. Demarcation lines for other strain values can be produced for the relevant analysis. Furthermore, Figure 7a demonstrates that the pattern of principal strains induced by mining subsidence is rather complex. Depending on the location in relation to the mine workings, the strains can be tensile in one direction and compressive in the perpendicular direction. Generally, within the mine working area the strains are predominantly compressive in both directions.

It should be noted that the complete collapse of all mine pillars, as illustrated in this case study, may not necessarily represent the worst possible scenario in terms of subsidence effects on the surface structures. The pillars may fail randomly and create an irregular pattern of extraction, which would give rise to strain concentrations at the surface. Therefore, partial collapse of pillars, with a resultant irregular pattern of subsidence, may be more damaging to buildings, although this is difficult to quantify.

5. Discussion and Conclusion

The generalized influence function method is easy to use because it eliminates the need to find a mathematical expression for subsidence calculations. All subsidence weighting factors can be expressed in tabulated form, which can be programmed into the subsidence calculations. In practice, it is not necessary to be bound to a mathematically defined function in order to establish the weighting factors. The generalized approach can be calibrated against observational data by assigning various values to the weighting factors, thereby achieving more accurate subsidence predictions. The method is also capable of producing principal strain patterns and distribution plots over the whole mining area, which can be used as a powerful tool for subsidence effect assessment and future development planning.

Almost any values of the weighting factors can be adopted, so long as the sum of all weighting factors equals unity. It is recommended to use weighting factors that are consistent with local observational subsidence data. This can be achieved by a trial-and-error calibration process.
In essence, by using influence factors in tabular form without having to establish a mathematical function, the generalized approach makes the calibration process more flexible and thereby achieves better analytical results.

It should be noted that there is a major difference between the subsidence profile function method and the aforementioned generalized influence function method, although both methods can be expressed in tabular form. The subsidence profile function method directly defines the subsidence profile and deformation characteristics by either tables or mathematical functions. For example, Peng [1] proposed a number of tables for predicting final and dynamic subsidence basins, and he also found that a negative exponential function was applicable to the US coalfields. The generalized influence function approach, by contrast, affects the subsidence profile indirectly through the allocation of weighting factors. The advantage of this approach over the conventional profile function method is that it can be applied to all types of underground openings, including multiple seams, irregular-shaped extractions and inclined panels [4]. As demonstrated in the case study in Section 4, a complex mining configuration can be analysed using the generalized influence function approach, providing useful results for assessing potential mining effects on existing structures and future developments.

6. Acknowledgement

The authors wish to thank the anonymous reviewers of the journal for their constructive critiques and valuable comments, which helped to improve this paper.

7. References

[1] Peng S S, Surface Subsidence Engineering, Society for Mining, Metallurgy and Exploration Inc., Littleton, Colorado, 1992.
[2] Whittaker B N and Reddish D J, Subsidence: Occurrence, Prediction and Control, Elsevier, Amsterdam, 1989.
[3] Ren G, Reddish D J and Whittaker B N, Mining subsidence and displacement prediction using influence function methods, Mining Science and Technology, 5 (1987) 89-104.
[4] Ren G, Reddish D J and Whittaker B N, Mining subsidence and displacement prediction for inclined seams, Mining Science and Technology, 8 (1989) 235-252.
[5] Sheorey P R, Loui J P, Singh K B and Singh S K, Ground subsidence observation and a modified influence function method for complete subsidence prediction, International Journal of Rock Mechanics and Mining Sciences, 37 (2000) 801-818.
[6] Ren G and Li J, A study of angle of draw in mining subsidence using numerical modeling techniques, Electronic Journal of Geotechnical Engineering, Vol. 13, Bund. F, 2008.
[7] Kratzsch H, Mining Subsidence Engineering, Springer-Verlag, Berlin, 1983, pp. 206-207.
[8] Whittaker B N, Reddish D J and Fitzpatrick D, Calculation by computer program of mining subsidence ground strain patterns due to multiple longwall extractions, Mining Science and Technology, 3 (1985) 21-33.

Figure 1: 3D illustration of the influence function
Figure 2: 2D illustration of the influence function
Figure 3: Influence factor distributions from various influence functions
Figure 4: Determination of principal strains from regular grids (reproduced after Whittaker et al. 1985 [8])
Figure 5: Equivalent uniform extraction panel in relation to a surface structure
Figure 6: Contours showing the vertical subsidence in metres, and horizontal movement vectors, in relation to the old mine working
Figure 7a: Case study showing the vertical subsidence and principal strains in relation to the old mine working, existing structures and proposed developments (strains in the inserts are not to scale)
Figure 7b: Principal strains in the vicinity of the hospital building
Figure 8: Subsidence trough shown in 3D (subsidence magnified by a factor of 500)