Temporal Data Mining: An Overview

Cláudia M. Antunes 1 and Arlindo L. Oliveira 2

1 Instituto Superior Técnico, Dep. Engenharia Informática, Av. Rovisco Pais 1,

1049-001 Lisboa, Portugal

cman@rnl.ist.utl.pt

2 IST / INESC-ID, Dep. Engenharia Informática, Av. Rovisco Pais 1,

1049-001 Lisboa, Portugal

aml@inesc.pt

Abstract. One of the main unresolved problems that arise during the data mining process is treating data that contains temporal information. In this case, a complete understanding of the entire phenomenon requires that the data should be viewed as a sequence of events. Temporal sequences appear in a vast range of domains, from engineering, to medicine and finance, and the ability to model and extract information from them is crucial for the advance of the information society. This paper provides a survey on the most significant techniques developed in the past ten years to deal with temporal sequences.

1 Introduction

With the rapid increase of stored data, the interest in the discovery of hidden information has exploded in the last decade. This discovery has mainly been focused on data classification, data clustering and relationship finding. One important problem that arises during the discovery process is treating data with temporal dependencies. The attributes related to the temporal information present in this type of dataset need to be treated differently from other kinds of attributes. However, most data mining techniques tend to treat temporal data as an unordered collection of events, ignoring its temporal information.

The aim of this paper is to present an overview of the techniques proposed to date that deal specifically with temporal data mining. Our objective is not only to enumerate the techniques proposed so far, but also to classify and organize them in a way that may be of help for a practitioner looking for solutions to a concrete problem. Other overviews of this area have appeared in the literature ([25], [56]), but the first is not widely available, while the second uses a significantly different approach to classify the existing work and is considerably less comprehensive than the present work.

The paper is organized as follows: Section 2 defines the main goals of temporal data mining, related problems and concepts, and its application areas. Section 3 presents the principal approaches to represent temporal sequences, and Section 4 the methods developed to measure the similarity between two sequences. Section 5 addresses the problems of non-supervised sequence classification and relation finding, including frequent event detection and prediction. Section 6 presents the conclusions and points out some possible directions for future research.

2 Mining Temporal Sequences

One possible definition of data mining is “the nontrivial extraction of implicit, previously unknown and potentially useful information from data” [19]. The ultimate goal of temporal data mining is to discover hidden relations between sequences and sub-sequences of events. The discovery of relations between sequences of events involves mainly three steps: the representation and modeling of the data sequence in a suitable form; the definition of similarity measures between sequences; and the application of models and representations to the actual mining problems. Other authors have used a different approach to classify data mining problems and algorithms. Roddick [56] has used three dimensions: data type, mining operations and type of timing information. Although both approaches are equally valid, we preferred to use representation, similarity and operations, since it provides a more comprehensive and novel view of the field.

Depending on the nature of the event sequence, the approaches to solve the problem may be quite different. A sequence composed of a series of nominal symbols from a particular alphabet is usually called a temporal sequence, and a sequence of continuous, real-valued elements is known as a time series.

Time series or, more generally, temporal sequences, appear naturally in a variety of different domains, from engineering to scientific research, finance and medicine. In engineering, they usually arise with either sensor-based monitoring, such as telecommunications control (e.g. [14], [15]), or log-based systems monitoring (see, for instance, [44], [47]). In scientific research they appear, for example, in spatial missions (e.g. [34], [50]) or in the genetics domain (e.g. [15]).

In finance, applications on the analysis of product sales or inventory consumption are of great importance to business planning (see, for instance, [5]). Another very common application in finance is the prediction of the evolution of financial data (e.g. [3], [16], [20], [11], [7], [21], [17], [55]).

In healthcare, temporal sequences have been a reality for decades, with data originated by complex data acquisition systems like ECGs (see, for instance, [13], [21], [24], [57]) or even by simple ones like measuring a patient's temperature or treatment effectiveness ([9], [6], [12]). In recent years, with the development of medical informatics, the amount of data has increased considerably, and, more than ever, the need to react in real time to any change in the patient's behavior is crucial.

In general, applications that deal with temporal sequences serve mainly to support diagnosis and to predict future behaviors ([56]).


3 Representation of Temporal Sequences

A fundamental problem that needs to be solved is the representation of the data and the pre-processing that needs to be applied before actual data mining operations take place. The representation problem is especially important when dealing with time series, since direct manipulation of continuous, high-dimensional data in an efficient way is extremely difficult. This problem can be addressed in several different ways.

One possible solution is to use the data with only minimal transformation, either keeping it in its original form or using windowing and piecewise linear approximations to obtain manageable sub-sequences. These methods are described in Section 3.1. A second possibility is to use a transformation that maps the data to a more manageable space, either continuous (Section 3.2) or discrete (Section 3.3). A more sophisticated approach, discussed in Section 3.4, is to use a generative model, where a statistical or deterministic model is obtained from the data and can be used to answer more complex questions. Approaches that deal with transaction databases of arbitrary dimension are presented in Section 3.5.

One issue that is relevant for all the representation methods addressed is the ability to discover and represent the subsequences of a sequence. A method commonly used to find subsequences is to use a sliding window of size w and place it at every possible position of the sequence. Each such window defines a new subsequence, composed of the elements inside the window [16], as sketched below.
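
A minimal Python sketch of this windowing step (the function name and example values are ours, not from [16]):

def sliding_windows(sequence, w):
    """Return one subsequence per possible position of a width-w window."""
    return [sequence[i:i + w] for i in range(len(sequence) - w + 1)]

# A series of length 6 with w = 3 yields 4 overlapping subsequences.
print(sliding_windows([1.0, 2.0, 4.0, 3.0, 5.0, 6.0], 3))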

3.1 Time-Domain Continuous Representations

A simple approach to represent a sequence of real-valued elements (a time series) is to use the original elements, ordered by their instant of occurrence, without any pre-processing ([3], [41]).

An alternative consists of finding a piecewise linear function able to approximately describe the entire initial sequence. In this way, the initial sequence is partitioned into several segments, and each segment is represented by a linear function. In order to partition the sequence there are two main possibilities: define the number of segments beforehand, or discover that number and then identify the corresponding segments ([14], [27]). The objective is to obtain a representation amenable to the detection of significant changes in the sequence.

While this is a relatively straightforward representation, not much is gained in terms of our ability to manipulate the generated representations. One possible application of this type of representations is on change-point detection, where one wants to locate the points where a significant change in behavior takes place.

Another proposal, based on the idea that the human visual system partitions smooth curves into linear segments, has also been made. It mainly consists of segmenting a sequence by iteratively merging the two most similar adjacent segments, where the choice of which segments to merge is based on a squared error minimization criterion [34]. An extension to this model consists of associating with each segment a weight value, which represents the segment's importance relative to the entire sequence. In this manner, it is possible to compare sequences mainly by looking at their most important segments [35]. With this representation, a more effective similarity measure can be defined, and consequently, mining operations may be applied.
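
To make the merge-based segmentation concrete, a simplified bottom-up sketch in Python (the stopping threshold and the policy of merging the adjacent pair with the lowest fit error are our own choices; [34] additionally handles segment weights):

import numpy as np

def fit_cost(points):
    """Squared error of the best least-squares line through the points."""
    if len(points) < 3:
        return 0.0
    x = np.arange(len(points), dtype=float)
    coeffs = np.polyfit(x, points, 1)
    return float(np.sum((points - np.polyval(coeffs, x)) ** 2))

def bottom_up_segment(series, max_error):
    """Start from minimal segments and greedily merge the cheapest adjacent pair."""
    segments = [list(series[i:i + 2]) for i in range(0, len(series), 2)]
    while len(segments) > 1:
        costs = [fit_cost(np.array(segments[i] + segments[i + 1]))
                 for i in range(len(segments) - 1)]
        best = int(np.argmin(costs))
        if costs[best] > max_error:  # no merge stays within the error budget
            break
        segments[best] = segments[best] + segments.pop(best + 1)
    return segments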

One significant advantage of these approaches is the ability to reduce the impact of noise. However, problems with amplitude differences (scaling problems) and the existence of gaps or other time axis distortion are not addressed easily.

3.2 Transformation Based Representations

The main idea of Transformation Based Representations is to transform the initial sequences from the time domain to another domain, and then to use a point in this new domain to represent each original sequence.

One proposal [1] uses the Discrete Fourier Transform (DFT) to transform a sequence from the time domain to a point in the frequency domain. Choosing the first k frequencies, and then representing each sequence as a point in the k-dimensional space, achieves this goal. The DFT has the attractive property that the amplitude of the Fourier coefficients is invariant under shifts, which allows extending the method to find similar sequences ignoring shifts [1].
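
A sketch of this feature extraction with NumPy (the helper names are ours; [1] keeps the first few coefficients and compares them with the Euclidean distance):

import numpy as np

def dft_features(series, k):
    """Map a real-valued series to its first k Fourier coefficients."""
    return np.fft.fft(np.asarray(series, dtype=float))[:k]

def dft_distance(a, b, k):
    """Euclidean distance between the k-coefficient representations.
    For equal-length sequences this underestimates (up to normalization)
    the time-domain distance, so it can serve as a filtering step."""
    return float(np.linalg.norm(dft_features(a, k) - dft_features(b, k)))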

Other possibilities that have been proposed represent each sequence as a point in the n-dimensional space, with n being the sequence length. This representation can be viewed as a time-domain continuous representation ([20]). However, if one maps each subsequence to a point in a new space with coordinates that are the derivatives of each original sequence point, this should be considered as a transformation based representation. In this case, the derivative at each point is determined by the difference between the point and its preceding one [20].

A more recent approach uses the Discrete Wavelet Transform (DWT) to translate each sequence from the time domain into the time/frequency domain [11]. The DWT is a linear transformation that decomposes the original sequence into different frequency components, without losing the information about the instants at which the elements occur. The sequence is then represented by its features, expressed as the wavelet coefficients. Again, only a few coefficients are needed to approximately represent the sequence.
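
For concreteness, a self-contained Haar decomposition in Python (the normalization and coefficient ordering here are our own simplification of the wavelet transform used in [11]):

import numpy as np

def haar_dwt(series):
    """Full Haar wavelet decomposition; length must be a power of two.
    Coefficients are returned coarsest-first, so truncating the tail
    keeps only the low-resolution features of the sequence."""
    data = np.asarray(series, dtype=float)
    output = []
    while len(data) > 1:
        averages = (data[0::2] + data[1::2]) / np.sqrt(2.0)
        details = (data[0::2] - data[1::2]) / np.sqrt(2.0)
        output = list(details) + output  # finer details go further back
        data = averages
    return np.array(list(data) + output)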

With these kinds of representations, time series become more manageable objects, which permits the definition of efficient similarity measures and an easier application of common data mining operations.

3.3 Discretization Based Methods

A third approach to deal with time series is the translation of the initial sequence (with real-valued elements) into a discretized sequence, this time composed of symbols from an alphabet. One language to describe the symbols of an alphabet and the relationship between two sequences was proposed in [4]. This Shape Definition Language (SDL) is also able to describe shape queries to pose to a sequence database, and it allows performing ‘blurry’ matching. A blurry match is one where the important thing is the overall shape, ignoring the specific details of a particular sequence. The first step in the representation process is defining the alphabet of symbols and then translating the initial sequence into a sequence of symbols. The translation is done by considering transitions from one instant to the following one, and then assigning a symbol of the described alphabet to each transition.

The Constraint-Based Pattern Specification Language (CAPSUL) [56] is another language used to specify patterns, which, similarly to SDL, describes patterns by considering some abstractions of the values at time intervals. The key difference between these two languages is that CAPSUL uses a set of expressive constraints upon temporal objects, which allows describing more complex patterns, for instance, periodic patterns. Finally, to perform each of those abstractions the system accesses knowledge bases, where the relationships between the different values are defined.

Another approach to discretize time series [33] suggests segmenting a sequence by computing the change ratio from one time point to the following one, and representing all consecutive points with equal change ratios by a single segment. After this partition, each segment is represented by a symbol, and the sequence is represented as a string of symbols. A particular case is when change ratios are classified into one of two classes: large fluctuations (symbol 1) or small fluctuations (symbol 0). In this way, each time series is converted into a binary sequence, perfectly suited to be manipulated by genetic algorithms [63].
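
A sketch of the two-class special case in Python (the 5% threshold in the example is an arbitrary choice of ours, not a value from [33] or [63]):

def binary_discretize(series, threshold):
    """Encode each transition as '1' (large fluctuation) or '0' (small one)."""
    symbols = []
    for prev, curr in zip(series, series[1:]):
        change = abs(curr - prev) / abs(prev) if prev != 0 else 0.0
        symbols.append('1' if change > threshold else '0')
    return ''.join(symbols)

# Moves above 5% become '1', quieter transitions '0'.
print(binary_discretize([100, 101, 110, 109, 120], 0.05))  # '0101'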

A different approach to convert a sequence into a discrete representation is to use clustering. First, a set of subsequences of length w is extracted from the sequence by sliding a window of width w. Then, the set of all subsequences is clustered, yielding k clusters, and a different symbol is associated with each cluster. The discretized version of the initial sequence is obtained by substituting each subsequence with the symbol associated with the cluster to which it belongs [15].
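
A compact sketch of this idea, assuming scikit-learn's KMeans is available (any clustering algorithm would serve; [15] does not depend on a specific one):

import numpy as np
from sklearn.cluster import KMeans

def discretize_by_clustering(series, w, k):
    """Slide a width-w window, cluster the subsequences into k groups,
    and replace each subsequence by the symbol of its cluster."""
    windows = np.array([series[i:i + w] for i in range(len(series) - w + 1)])
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(windows)
    return ''.join(chr(ord('a') + int(label)) for label in labels)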

Another method to convert time series into a sequence of symbols is based on the use of self-organizing maps (SOMs) ([23], [24]). This conversion consists of three steps: first, a new series composed of the differences between consecutive values of the original time series is derived; second, windows of size d (a delay embedding) are fed to the SOM; finally, the SOM outputs the winning node for each window. Each node is associated with a symbol, which means that the resulting sequence may be viewed as a sequence of symbols.

The advantage of these methods is that the time series is partitioned in a natural way, depending on its values. However, the symbols of the alphabet are usually chosen externally: they are either imposed by the user, who has to know the most suitable symbols, or established in an artificial way. The discretized time series is more amenable to manipulation and processing than the original, continuous-valued time series.

3.4 Generative Models

A significantly different approach consists in obtaining a model that can be viewed as a generator for the sequences obtained.

Several methods that use probabilistic generators have been proposed. One such proposal [21] uses semi-Markov models to model a sequence and presents an efficient algorithm that can find appearances of a particular sub-sequence in another sequence.

A different alternative is to find partial orders between symbols and build complex episodes out of serial and parallel compositions of basic events ([43], [26], [8]). A further development [47] builds a mixture model that generates, with high probability, sequences similar to the one that one wishes to model. In this approach, an intuitively appealing model is derived from the data, but no particular applications in classification, prediction or similarity-based mining are presented. These models can be viewed as graph-based models, since one or more graphs that describe the relationships between basic events are used to model the observed sequences.

Another class of approaches aims at inferring grammars from the given time sequence. Extensive research in grammatical inference methods has led to many interesting results ([48], [31], [51]), but these have not yet found their way into a significant number of real applications. Usually, grammatical inference methods are based on discrete search algorithms that are able to infer very complex grammars. Neural network-based approaches have also been tried (see, for example, part VI of reference [49]) but have been somewhat less successful when applied to complex problems ([32], [22]). However, some recent results have shown that neural network-based inference of grammars can be used in realistic applications [23].

The inferred grammars can belong to a variety of classes, ranging from deterministic regular grammars ([39], [38], [52]) to stochastic ([30], [61], [10]) and context-free grammars ([58], [59]). It is clear that the grammar models obtained by induction from examples can be used for prediction and classification, but, so far, these methods have not been extensively applied to actual problems in data mining.

3.5 Transactional Databases with Timing Information

One type of time sequence that does not easily match any of the classes described above is the transactional dataset that incorporates timing information. For example, one might have a list of items bought by customers and would like to consider the timing information contained in the database. This type of high-dimensional discrete data cannot be modelled effectively by any of the previously described approaches.

Although the modelling and representation of this type of data is an important problem in temporal data mining, to our knowledge there exist only a few proposals for the representation of data with these characteristics. In one of the proposed representations [5], each particular set of items is considered a new symbol of the alphabet, and a new sequence is created from the original one by using this new set of symbols. These symbols are then processed using methods that aim at finding common occurrences of sub-sequences in an efficient way. This representation is effective if the objective is to find common events and significant correlations in time, but fails in applications like prediction and classification. Other authors have used a similar representation [28], although they have proposed new methods to manipulate it.

4 Similarity Measures for Sequences

After representing each sequence in a suitable form, it is necessary to define a similarity measure between sequences, in order to determine whether they match. An event is then identified when there is a significant number of similar sequences in the database.

An important issue in measuring similarity between two sequences is the ability to deal with outlying points, noise in the data, amplitude differences (scaling problems) and the existence of gaps and other time axis distortion problems. As such, the similarity measure needs to be sufficiently flexible to be applied to sequences with different lengths, noise levels, and time scales.

There are many proposals for similarity measures. The model used to represent sequences clearly has a great impact on the similarity measure adopted, and, therefore, this section follows an organization that closely matches that of Section 3.

4.1 Distances in Time-Domain Continuous Representations

When no pre-processing is applied to the original sequence, a simple approach to measure the similarity between two sequences consists of comparing the i-th element of each sequence. This simple method, however, is insufficient for many applications.

The similarity measure most often used with time-domain continuous representations is the Euclidean distance, obtained by viewing each sub-sequence with n points as a point in R^n. In order to deal with noise, scaling and translation problems, a simple improvement consists of determining the pairs of portions that agree in both sequences after some linear transformation is applied. A concrete proposal [3] achieves this by finding one window of some fixed size from each sequence, normalizing the values in them to the [-1, 1] interval, and then comparing them to determine if they match. After identifying these pairs of windows, several non-overlapping pairs can be joined, in order to find the longest match length. Normalizing the windows solves the scaling problem, and searching for pairs of windows that match solves the translation problem.
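
A sketch of the normalization step in Python (the tolerance-based match test is our simplification of the matching criterion in [3]):

import numpy as np

def normalize_window(window):
    """Rescale a window linearly to the [-1, 1] interval."""
    window = np.asarray(window, dtype=float)
    lo, hi = window.min(), window.max()
    if hi == lo:  # constant window: nothing to rescale
        return np.zeros_like(window)
    return 2.0 * (window - lo) / (hi - lo) - 1.0

def windows_match(w1, w2, eps):
    """Declare a match if the normalized windows stay within eps everywhere."""
    return bool(np.max(np.abs(normalize_window(w1) - normalize_window(w2))) <= eps)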

A more complex approach consists in determining if there is a linear function f, such that a long subsequence of one sequence can be approximately mapped into a long subsequence of the other [14]. Since, in this approach, the subsequences don’t necessarily consist of consecutive original elements but only have to appear in the same order, this similarity measure allows for the existence of gaps and outlying points in the sequence.

In order to overcome the high sensitivity of the Euclidean distance to small distortions in the time axis, the technique of Dynamic Time Warping (DTW) was proposed to deal with temporal sequences. This technique consists of aligning two sequences so that a predetermined distance measure is minimized. The goal is to create a new sequence, the warping path, whose elements are (i, j) points, with i and j the indices of matching elements in the original sequences [7]. Some algorithms that improve the performance of dynamic time warping techniques were proposed in ([36], [65]).
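
The core dynamic-programming recurrence, as a minimal Python sketch (without the speed-ups proposed in [36] and [65]):

import numpy as np

def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Each cell extends the cheapest of the three allowed moves, so one
            # element of a may align with several elements of b and vice versa.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])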

A different similarity measure consists of describing the distance between two sequences using a probabilistic model [34]. The general idea is that a sequence can be ‘deformed’, according to a prior probability distribution, to generate the other sequence. This is accomplished by composing a global shape sequence with local features, each of which is allowed some degree of deformation. In this way, some elasticity is allowed in the global shape sequence, permitting stretching in time and distortions in amplitude. The degrees of deformation and elasticity are governed by the prior probability distributions [34].

4.2 Distances in Transformation Based Methods

When Transformation Based Representations are used, a simple approach is to compare the points in the new domain that represent each sequence. For example, the similarity between two sequences may be reduced to the similarity between two points in the frequency domain, again measured using the Euclidean distance ([1], [11]). In order to allow for the existence of outlying points in the sequence, the discovery of sequences similar to a given query sequence is reduced to the selection of points in an ε-neighborhood of the query point. However, this method may allow false hits, so after choosing the neighboring points in the new domain, the distance in the time domain is computed to validate the similarity between the chosen sequences.

Notice that this method requires that the sequences have the same length; in order to deal with different lengths, the method was improved to find relevant subsequences in the original data [16]. Another important issue is the fact that, to compare different time series, the k Fourier coefficients chosen to represent the sequences must be the same for all of them, which may not be optimal for every individual sequence.

4.3 Similarity Measures in Discrete Spaces

When a sequence is represented as a sequence of discrete symbols from an alphabet, the similarity between two sequences is, most of the time, computed by comparing each element of one sequence with the corresponding one in the other sequence, as in the first approach described in Section 4.1.

However, since the discrete symbols are not necessarily ordered, some provisions must be made to allow for blurry matching.

One possible approach is to use the Shape Definition Language, described in Section 3.3, where a blurry match is automatically possible. This is achieved because the language defines a set of operators, such as any, noless or nomore, that allows describing overall shapes, solving the problem of the existence of gaps in an elegant way [4].

Another possibility is to use well-known string editing algorithms to define a distance over the space of strings of discrete symbols ([33], [45]). This distance reflects the amount of work needed to transform one sequence into another, and is able to deal with sequences of different lengths and with the existence of gaps.
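
The standard edit (Levenshtein) distance, as a Python sketch (unit costs for all operations are a common convention; [33] and [45] may weight them differently):

def edit_distance(s, t):
    """Minimum number of insertions, deletions and substitutions
    needed to turn one symbol string into the other."""
    n, m = len(s), len(t)
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if s[i - 1] == t[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,        # deletion
                             dist[i][j - 1] + 1,        # insertion (tolerates gaps)
                             dist[i - 1][j - 1] + sub)  # substitution
    return dist[n][m]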

4.4 Similarity measures for generative models

When a generative model is obtained from the datasets, the similarity measure between sequences is obtained in a natural way, by measuring how well the data fits one of the available models. For deterministic models (graph-based models [43], [26], and deterministic grammars [32], [22], [58], [59]), verifying whether a sequence matches a given model will provide a yes or no answer, and therefore the distance measures are necessarily discrete and, in many cases, will take only one of two possible values.

For stochastic generative models like Markov chains ([21]), stochastic grammars ([30], [61], [10]) and mixture models ([47]), it is possible to obtain a number that indicates the probability that a given sequence was generated by a given model.

In both cases, this similarity measure can be used effectively in classification problems, without the need to resort to complex heuristic similarity measures between sequences.

5 Mining Operations

5.1 Discovery of Association Rules

One of the most common approaches to mining frequent patterns is the apriori method [2]. When a transactional database represented as a set of sequences of transactions performed by one entity is used (as described in Section 3.5), the manipulation of temporal sequences requires that some adaptations be made to the apriori algorithm. The most important modification is to the notion of support: support is now the fraction of entities that consumed the itemset in any of their transactions [5], i.e. an entity contributes at most once to the support of each itemset, even if it consumed that itemset several times.

After identifying the large itemsets (the itemsets with support greater than the minimum support allowed), each is translated into an integer, and each sequence is transformed into a new sequence whose elements are the large itemsets of the previous one. The next step is to find the large sequences. To achieve this, the algorithm acts iteratively, like apriori: first it generates the candidate sequences, and then it chooses the large sequences from the candidate ones, until there are no more candidates. The modified notion of support is illustrated in the sketch below.
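
A minimal Python sketch of sequence containment and entity-level support (the data layout and names are ours, not from [5]):

def is_subsequence(pattern, sequence):
    """True if the pattern's itemsets appear, in order, within the sequence."""
    remaining = iter(sequence)
    return all(any(itemset <= transaction for transaction in remaining)
               for itemset in pattern)

def support(pattern, database):
    """Fraction of entities whose transaction sequence contains the pattern;
    each entity counts at most once, however often it matches."""
    return sum(1 for seq in database if is_subsequence(pattern, seq)) / len(database)

# Two customers; the pattern <{a}, {b, c}> is supported by the first only.
db = [[{'a'}, {'d'}, {'b', 'c'}], [{'b', 'c'}, {'a'}]]
print(support([{'a'}, {'b', 'c'}], db))  # 0.5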

One of the most costly operations in apriori-based approaches is candidate generation. A proposal for frequent pattern mining shows that it is possible to find frequent patterns while avoiding the candidate generation-and-test step [28]. An extension of this approach to sequential data is presented in [29].

The discovery of relevant association rules is one of the most important methods used to perform data mining on transactional databases. An effective algorithm to discover association rules is apriori [2]. Adapting this method to deal with temporal information leads to several different approaches.

Common sub-sequences can be used to derive association rules with predictive value, as is done, for instance, in the analysis of discretized, multi-dimensional time series [24].

A possible approach consists of extending the notion of a typical rule X → Y (which states: if X occurs then Y occurs) to a rule with a new meaning: X →T Y (which states: if X occurs then Y will occur within time T) [15]. Stating a rule in this new form allows controlling the impact of the occurrence of one event on the occurrence of the other, within a specific time interval, as the sketch below illustrates.
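
A toy Python estimate of such a rule's confidence over a time-stamped event stream (the event layout and symbols are ours, not from [15]):

def rule_confidence(events, x, y, t):
    """Fraction of occurrences of x followed by an occurrence of y within time t.
    events is a list of (timestamp, symbol) pairs sorted by timestamp."""
    x_times = [ts for ts, sym in events if sym == x]
    if not x_times:
        return 0.0
    y_times = [ts for ts, sym in events if sym == y]
    hits = sum(1 for tx in x_times if any(tx < ty <= tx + t for ty in y_times))
    return hits / len(x_times)

stream = [(0, 'X'), (2, 'Y'), (5, 'X'), (9, 'Y'), (20, 'X')]
print(rule_confidence(stream, 'X', 'Y', 4))  # 2 of 3 occurrences -> 0.666...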

Another method consists of considering cyclic rules [53]. A cyclic rule is one that occurs at regular time intervals, i.e. transactions that support specific rules occur periodically, for example on every first Monday of a month. In order to discover these rules, it is necessary to search for them in a restricted portion of time, since they may occur repeatedly at specific time instants but only in a small portion of the global time considered. One method to discover such rules applies an algorithm similar to apriori and, after obtaining the set of traditional rules, detects the cycles behind them. A more efficient approach to discover cyclic rules consists of inverting the process: first discover the cyclic large itemsets and then generate the rules. A natural extension to this method consists of allowing the existence of different time units, such as days, weeks or months, and is achieved by defining a calendar algebra to define and manipulate groups of time intervals. The rules discovered are designated calendric association rules [54].

A different approach to the discovery of relations in multivariate time sequences is based on the definition of N-dimensional transaction databases. Transactions in these databases are obtained by discretizing continuous attributes, if necessary [42]. This type of database can then be mined to obtain association rules. However, new definitions for association rules, support and confidence are necessary. The great difference is the notion of address, which locates each event in a multi-dimensional space and allows expressing the confidence and support levels in a new way.

5.2 Classification

Classification is one of the most typical operations in supervised learning, but it has not received much attention in temporal data mining. In fact, a comprehensive search of applications based on classification has returned relatively few instances of actual uses of temporal information.

One classification method to deal with temporal sequences is based on the merge operator [35]. This operator receives two sequences and returns “a sequence whose shape is a compromise between the two original sequences” [35]. The basic idea is to iteratively merge a typical example of a class with each positive example, building a more general model for the class. Using negative examples is also possible, but then it is necessary to emphasize the shape differences. This is accomplished by using an influence factor to control the merge operator: a positive influence factor implies the generalization of the input sequences, while a negative one leads to exaggerating the differences between the positive and negative shapes.

Since traditional classification algorithms are difficult to apply to sequential examples, mostly because there is a vast number of potentially useful features for describing each example, an interesting improvement consists of applying a pre-processing mechanism to extract relevant features. One approach [40] to implement this idea consists of discovering frequent subsequences, and then using them as the relevant features to classify sequences with traditional methods, like Naive Bayes or Winnow.

Classification is relatively straightforward if generative models are employed to model the temporal data. Deterministic and probabilistic models can be applied in a straightforward way to perform classification [39] since they give a clear answer to the question of whether a sequence matches a given model. However, actual applications in classification have been lacking in the literature.

5.3 Non-supervised Clustering

The fundamental problem in clustering temporal sequence databases consists of discovering a number of clusters, say K, able to represent the different sequences. There are two central problems in clustering: choosing the number of clusters and initializing their parameters. When dealing with temporal sequences, another important problem appears: the need for a meaningful similarity measure between sequences.

Assuming that K is known, if a sequence is viewed as being generated according to some probabilistic model, for example by a Markov model, clustering may be viewed as modeling the data sequences as a finite mixture of K models. The parameters of the mixture can be estimated through the EM algorithm, and each of the K components corresponds to a cluster [61]. Learning the value of K, if it is unknown, may be accomplished by a Monte-Carlo cross-validation approach, as suggested in [60].

A different approach proposes the use of a hierarchical clustering method to cluster temporal sequence databases [37]. The algorithm used is COBWEB [18], and it works in two steps: first grouping the elements of the sequences, and then grouping the sequences themselves. Considering a simple time series, the first step is accomplished without difficulty, but to group the sequences it is necessary to define a generalization mechanism for sequences. Such a mechanism has to be able to choose the most specific description for what is common to different sequences.

5.4 Applications in Prediction

Prediction is one of the most important problems in data mining. However, in many cases, prediction problems may be formulated as classification, association rule finding or clustering problems. Generative models can also be used effectively to predict the evolution of time series.

Nonetheless, prediction problems have some specific characteristics that differentiate them from other problems. A vast literature exists on computer-assisted prediction of time series, in a variety of domains ([17], [64]). However, we have failed to find in the data mining literature significant applications that involve prediction of time series and that do not fall into any of the previously described categories.

Granted, several authors have presented work that aims specifically at obtaining algorithms that can be used to predict the evolution of time series ([46], [42]), but the applications described, if any, serve more to illustrate the algorithms than to address an actual problem that needs to be solved. One exception is the work by Giles et al. [23], which combines discretization with grammar inference and applies the resulting automata to explicitly predict the evolution of financial time series. In the specific domain of prediction, care must be taken with the domain where prediction is to be applied. Well-known and generally accepted results on the inherent unpredictability of many financial time series [17] imply that significant gains in prediction accuracy are not possible, no matter how sophisticated the techniques used.

In other cases, however, it will be interesting to see if advances in data mining techniques will lead to significant improvements in prediction accuracy, when compared with existing parametric and non-parametric models.

6 Conclusions

We have presented a comprehensive overview of techniques for the mining of temporal sequences. This domain has relationships with many other areas of knowledge, and an exhaustive survey that includes all relevant contributions is well beyond the scope of this work.

Nonetheless, we have surveyed and classified a significant fraction of the proposed approaches to this important problem, taking into consideration the representations they utilize, the similarity measures they propose and the applications they have been applied to.

In our opinion, this survey has shown that a significant number of algorithms exist for the problem, but that the research has not been driven so much by actual problems as by an interest in proposing new approaches. It has also shown that researchers have ignored, at least for some time, some very significant contributions from related fields, such as Markov-process based modeling and grammatical inference techniques. We believe that the field will be significantly enriched if knowledge from these other sources is incorporated into data mining algorithms and applications.

References

1. Agrawal, R., Faloutsos, C., Swami, A.: Efficient Similarity Search in Sequence Databases. FODO - Foundations of Data Organization and Algorithms (1993) 69-84
2. Agrawal, R., Srikant, R.: Fast Algorithms for Mining Association Rules. VLDB (1994) 487-499
3. Agrawal, R., Lin, K., Sawhney, H., Shim, K.: Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases. VLDB (1995) 490-501
4. Agrawal, R., Giuseppe, P., Edward, W.L., Zait, M.: Querying Shapes of Histories. VLDB (1995) 502-514
5. Agrawal, R., Srikant, R.: Mining Sequential Patterns. ICDE (1995) 3-14
6. Antunes, C.: Knowledge Acquisition System to Support Low Vision Consultation. MSc Thesis, Instituto Superior Técnico, Universidade Técnica de Lisboa (2001)
7. Berndt, D., Clifford, J.: Finding Patterns in Time Series: a Dynamic Programming Approach. In: Fayyad, U., Shapiro, G., Smyth, P., Uthurusamy, R. (eds.): Advances in Knowledge Discovery and Data Mining. AAAI Press (1996) 229-248
8. Bettini, C., Wang, S.X., Jagodia, S., Lin, J.L.: Discovering Frequent Event Patterns with Multiple Granularities in Time Sequences. IEEE Transactions on Knowledge and Data Engineering, Vol. 10, No. 2 (1998) 222-237
9. Caraça-Valente, J., Chavarrías, I.: Discovering Similar Patterns in Time Series. KDD (2000) 497-505
10. Carrasco, R., Oncina, J.: Learning Stochastic Regular Grammars by means of a State Merging Method. ICGI (1994) 139-152
11. Chan, K., Fu, W.: Efficient Time Series Matching by Wavelets. ICDE (1999) 126-133
12. Coiera, E.: The Role of Knowledge Based Systems in Clinical Practice. In: Barahona, P., Christensen, J. (eds.): Knowledge and Decisions in Health Telematics - The Next Decade. IOS Press, Amsterdam (1994) 199-203
13. Dajani, R., Miquel, M., Forilini, M.C., Rubel, P.: Modeling of Ventricular Repolarisation Time Series by Multi-layer Perceptrons. 8th Conference on Artificial Intelligence in Medicine in Europe, AIME (2001) 152-155
14. Das, G., Gunopulos, D., Mannila, H.: Finding Similar Time Series. PKDD (1997) 88-100
15. Das, G., Mannila, H., Smyth, P.: Rule Discovery from Time Series. KDD (1998) 16-22
16. Faloutsos, C., Ranganathan, M., Manolopoulos, Y.: Fast Subsequence Matching in Time-Series Databases. ACM SIGMOD Int. Conf. on Management of Data (1994) 419-429
17. Fama, E.: Efficient Capital Markets: a Review of Theory and Empirical Work. Journal of Finance (May 1970) 383-417
18. Fisher, D.: Knowledge Acquisition Via Incremental Conceptual Clustering. Machine Learning, Vol. 2 (1987) 139-172
19. Frawley, W., Piatetsky-Shapiro, G., Matheus, C.: Knowledge Discovery in Databases: an Overview. AI Magazine, Vol. 13, No. 3 (1992) 57-70
20. Gavrilov, M., Anguelov, D., Indyk, P., Motwani, R.: Mining the Stock Market: Which Measure Is Best? KDD (2000) 487-496
21. Ge, X., Smyth, P.: Deformable Markov Model Templates for Time Series Pattern Matching. KDD (2000) 81-90
22. Giles, C., Chen, D., Sun, G., Chen, H., Lee, Y.: Extracting and Learning an Unknown Grammar with Recurrent Neural Networks. In: Moody, J., Hanson, S., Lippmann, R. (eds.): Advances in Neural Information Processing Systems, Vol. 4. Morgan Kaufmann (1992) 317-324
23. Giles, C., Lawrence, S., Tsoi, A.C.: Noisy Time Series Prediction using Recurrent Neural Networks and Grammatical Inference. Machine Learning, Vol. 44 (2001) 161-184
24. Guimarães, G.: The Induction of Temporal Grammatical Rules from Multivariate Time Series. ICGI (2000) 127-140
25. Gunopulos, D., Das, G.: Time Series Similarity Measures and Time Series Indexing. In WOODSTOCK (1997)
26. Guralnik, V., Wijesekera, D., Srivastava, J.: Pattern Directed Mining of Sequence Data. KDD (1998) 51-57
27. Guralnik, V., Srivastava, J.: Event Detection from Time Series Data. KDD (1999) 33-42
28. Han, J., Pei, J., Yin, Y.: Mining Frequent Patterns without Candidate Generation. ACM SIGMOD Int. Conf. on Management of Data (2000) 1-12
29. Han, J., Pei, J., Mortazavi-Asl, B., Chen, Q., Dayal, U., Hsu, M.: FreeSpan: Frequent Pattern-Projected Sequential Pattern Mining. ACM SIGKDD (2000) 355-359
30. Higuera, C.: Learning Stochastic Finite Automata from Experts. ICGI (1998) 79-89
31. Honavar, V., Slutzki, G. (eds.): Grammatical Inference. Lecture Notes in Artificial Intelligence, Vol. 1433. Springer (1998)
32. Horne, B., Giles, C.: An Experimental Comparison of Recurrent Neural Networks. In: Tesauro, G., Touretzky, D., Leen, T. (eds.): Advances in Neural Information Processing Systems, Vol. 7. MIT Press (1995) 697-704
33. Huang, Y., Yu, P.: Adaptive Query Processing for Time Series Data. KDD (1999) 282-286
34. Keogh, E., Smyth, P.: A Probabilistic Approach to Fast Pattern Matching in Time Series Databases. KDD (1997) 24-30
35. Keogh, E., Pazzani, M.: An Enhanced Representation of Time Series which Allows Fast and Accurate Classification, Clustering and Relevance Feedback. Knowledge Discovery in Databases and Data Mining (1998) 239-241
36. Keogh, E., Pazzani, M.: Scaling up Dynamic Time Warping to Massive Datasets. Principles and Practice of Knowledge Discovery in Databases (1999) 1-11
37. Ketterlin, A.: Clustering Sequences of Complex Objects. KDD (1997) 215-218
38. Juillé, H., Pollack, J.: A Stochastic Search Approach to Grammar Induction. ICGI (1998) 79-89
39. Lang, K., Pearlmutter, B., Price, R.: Results of the Abbadingo One DFA Learning Competition and a New Evidence-Driven State Merging Algorithm. ICGI (1998) 79-89
40. Lesh, N., Zaki, M., Ogihara, M.: Mining Features for Sequence Classification. KDD (1999) 342-346
41. Lin, L., Risch, T.: Querying Continuous Time Sequences. VLDB (1998) 170-181
42. Lu, H., Han, J., Feng, L.: Stock Price Movement Prediction and N-Dimensional Inter-Transaction Association Rules. ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery (1998) 12.1-12.7
43. Mannila, H., Toivonen, H., Verkamo, I.: Discovering Frequent Episodes in Sequences. KDD (1995) 210-215
44. Mannila, H., Toivonen, H.: Discovering Generalized Episodes using Minimal Occurrences. KDD (1996) 146-151
45. Mannila, H., Ronkainen, P.: Similarity of Event Sequences. TIME (1997) 136-139
46. Mannila, H., Pavlov, D., Smyth, P.: Prediction with Local Patterns using Cross-Entropy. KDD (1999) 357-361
47. Mannila, H., Meek, C.: Global Partial Orders from Sequential Data. KDD (2000) 161-168
48. Miclet, L., Higuera, C. (eds.): Grammatical Inference: Learning Syntax from Sentences. Lecture Notes in Artificial Intelligence, Vol. 1147. Springer (1996)
49. Moody, J., Hanson, S., Lippmann, R. (eds.): Advances in Neural Information Processing Systems, Vol. 4. Morgan Kaufmann (1992)
50. Oates, T.: Identifying Distinctive Subsequences in Multivariate Time Series by Clustering. KDD (1999) 322-326
51. Oliveira, A. (ed.): Grammatical Inference: Algorithms and Applications. Lecture Notes in Artificial Intelligence, Vol. 1891. Springer (2000)
52. Oliveira, A., Silva, J.: Efficient Algorithms for the Inference of Minimum Size DFAs. Machine Learning, Vol. 44 (2001) 93-119
53. Özden, B., Ramaswamy, S., Silberschatz, A.: Cyclic Association Rules. ICDE (1998) 412-421
54. Ramaswamy, S., Mahajan, S., Silberschatz, A.: On the Discovery of Interesting Patterns in Association Rules. VLDB (1998) 368-379
55. Refenes, A.: Neural Networks in the Capital Markets. John Wiley and Sons, New York (1995)
56. Roddick, J., Spiliopoulou, M.: A Survey of Temporal Knowledge Discovery Paradigms and Methods. IEEE Transactions on Knowledge and Data Engineering, Vol. 13 (2001). To appear
57. Shahar, Y., Musen, M.A.: Knowledge-Based Temporal Abstraction in Clinical Domains. Artificial Intelligence in Medicine, Vol. 8 (1996) 267-298
58. Sakakibara, Y.: Recent Advances of Grammatical Inference. Theoretical Computer Science, Vol. 185 (1997) 15-45
59. Sakakibara, Y., Muramatsu, H.: Learning Context Free Grammars from Partially Structured Examples. ICGI (2000) 229-240
60. Smyth, P.: Clustering Sequences with Hidden Markov Models. In: Tesauro, G., Touretzky, D., Leen, T. (eds.): Advances in Neural Information Processing Systems, Vol. 9. MIT Press (1997) 648-654
61. Smyth, P.: Probabilistic Model-Based Clustering of Multivariate and Sequential Data. Artificial Intelligence and Statistics (1999) 299-304
62. Stolcke, A., Omohundro, S.: Inducing Probabilistic Grammars by Bayesian Model Merging. ICGI (1994) 106-118
63. Szeto, K.Y., Cheung, K.H.: Application of Genetic Algorithms in Stock Market Prediction. In: Weigend, A.S. et al. (eds.): Decision Technologies for Financial Engineering. World Scientific (1996) 95-103
64. Weigend, A., Gershenfeld, N.: Time Series Prediction: Forecasting the Future and Understanding the Past. Addison-Wesley (1994)
65. Yi, B., Jagadish, H., Faloutsos, C.: Efficient Retrieval of Similar Time Sequences Under Time Warping. ICDE (1998) 201-208
