
Mining the Most Interesting Rules

Roberto J. Bayardo Jr.
IBM Almaden Research Center

Rakesh Agrawal
IBM Almaden Research Center

Appears in Proc. of the Fifth ACM SIGKDD Int'l Conf. on Knowledge Discovery and Data Mining, 145-154, 1999.

Abstract

Several algorithms have been proposed for finding the “best,” “optimal,” or “most interesting” rule(s) in a database according to a variety of metrics including confidence, support, gain, chi-squared value, gini, entropy gain, laplace, lift, and conviction. In this paper, we show that the best rule according to any of these metrics must reside along a support/confidence border. Further, in the case of conjunctive rule mining within categorical data, the number of rules along this border is conveniently small, and can be mined efficiently from a variety of real-world data-sets. We also show how this concept can be generalized to mine all rules that are best according to any of these criteria with respect to an arbitrary subset of the population of interest. We argue that by returning a broader set of rules than previous algorithms, our techniques allow for improved insight into the data and support more user-interaction in the optimized rule-mining process.

1. Introduction

There are numerous proposals for mining rules from data. Some are constraint-based in that they mine every rule satisfying a set of hard constraints such as minimum support or confidence (e.g. [1,2,6]). Others are heuristic in that they attempt to find rules that are predictive, but make no guarantees on the predictiveness or the completeness of the returned rule set (e.g. decision tree and covering algorithms [9,15]). A third class of rule mining algorithms, which are the subject of this paper, identify only the most interesting, or optimal, rules according to some interestingness metric [12,18,20,24]. Optimized rule miners are particularly useful in domains where a constraint-based rule miner produces too many rules or requires too much time.

It is difficult to come up with a single metric that quantifies the “interestingness” or “goodness” of a rule, and as a result, several different metrics have been proposed and used. Among them are confidence and support [1], gain [12], variance and chi-squared value [17,18], entropy gain [16,17], gini [16], laplace [9,24], lift [14] (a.k.a. interest [8] or strength [10]), and conviction [8]. Several algorithms are known to efficiently find the best rule (or a close approximation to the best rule [16]) according to a specific one of these metrics [12,18,20,24]. In this paper, we show that a single yet simple concept of rule goodness captures the best rules according to any of them. This concept involves a partial order on rules defined in terms of both rule support and confidence. We demonstrate that the set of rules that are optimal according to this partial order includes all rules that are best according to any of the above metrics, even given arbitrary minimums on support and/or confidence. In the context of mining conjunctive association rules, we present an algorithm that can efficiently mine an optimal set according to this partial order from a variety of real-world data-sets. For example, for each of the categorical data-sets from the Irvine machine learning repository (excepting only connect-4), this algorithm requires less than 30 seconds on a 400MHz Pentium-II class machine. Specifying constraints such as minimum support or confidence reduces execution time even further. While optimizing according to only a single interestingness metric could sometimes require less overhead, the approach we propose is likely to be advantageous since it supports an interactive phase in which the user can browse the optimal rule according to any of several interestingness metrics. It also allows the user to interactively tweak minimums on support and confidence. Witnessing such effects with a typical optimized rule miner requires repeated mining runs, which may be impractical when the database is large.

Another need for repeated invocations of an optimized rule miner arises when the user needs to gain insight into a broader population than what is already well-characterized by previously discovered rules. We show how our algorithm can be generalized to produce every rule that is optimal according to any of the previously mentioned interestingness metrics, and additionally, with respect to an arbitrary subset of the population of interest. Because data-mining is iterative and discovery-driven, identifying several good rules up-front in order to avoid repeatedly querying the database reduces total mining time when amortized over the entire process [13].

2. Preliminaries

2.1 Generic Problem Statement

A data-set is a finite set of records. For the purpose of this paper, a record is simply an element on which we apply boolean predicates called conditions. A rule consists of two conditions called the antecedent and consequent, and is denoted A → C, where A is the antecedent and C the consequent. A rule constraint is a boolean predicate on a rule. Given a set of constraints N, we say that a rule r satisfies the constraints in N if every constraint in N evaluates to true given r. Some common examples of constraints are item constraints [22] and minimums on support and confidence [1]. The input to the problem of mining optimized rules is a 5-tuple ⟨U, D, ≤, C, N⟩

where:

• U is a finite set of conditions;

• D is a data-set;

• ≤ is a total order on rules;

• C is a condition specifying the rule consequent;

• N is a set of constraints on rules.

When mining an optimal disjunction, we treat a set of conditions A ⊆ U as a condition itself that evaluates to true if and only if one or more of the conditions within A evaluates to true on the given record. When mining an optimal conjunction, we treat A as a condition that evaluates to true if and only if every condition within A evaluates to true on the given record. For both cases, if A is empty then it always evaluates to true. Algorithms for mining optimal conjunctions and disjunctions differ significantly in their details, but the problem can be formally stated in an identical manner¹:


PROBLEM (OPTIMIZED RULE MINING): Find a set A1 ⊆ U such that (1) A1 → C satisfies the input constraints, and (2) there exists no set A2 ⊆ U such that A2 → C satisfies the input constraints and A1 < A2.

Any rule whose antecedent is a solution to an instance I of the optimized rule mining problem is said to be I-optimal (or just optimal if the instance is clear from the context). For simplicity, we sometimes treat rule antecedents (denoted with A and possibly some subscript) and rules (denoted with r and possibly some subscript) interchangeably, since the consequent is always fixed and clear from the context.

We now define the support and confidence values of rules. These values are often used to define rule constraints by bounding them above pre-specified values known as minsup and minconf respectively [1], and also to define total orders for optimization [12,20]. The support of a condition A is equal to the number of records in the data-set for which A evaluates to true, and this value is denoted sup(A). The support of a rule A → C, denoted similarly as sup(A → C), is equal to the number of records in the data-set for which both A and C evaluate to true.² The antecedent support of a rule is the support of its antecedent alone. The confidence of a rule is the probability with which the consequent evaluates to true given that the antecedent evaluates to true in the input data-set, computed as follows:

    conf(A → C) = sup(A → C) / sup(A)
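To make these definitions concrete, the following minimal Python sketch (our illustration, not code from the paper) computes support, rule support, and confidence; the representation of records as frozensets of attribute/value pairs and conditions as such pairs is an assumption made only for this example.

    # Sketch of the support/confidence definitions above.
    # A record is a frozenset of (attribute, value) pairs; a condition is one
    # such pair; a conjunctive antecedent is a set of conditions.

    def sup(condition_set, dataset):
        """Number of records on which every condition in condition_set holds."""
        return sum(1 for record in dataset if condition_set <= record)

    def rule_support(antecedent, consequent, dataset):
        """sup(A -> C): records satisfying both the antecedent and the consequent."""
        return sup(set(antecedent) | {consequent}, dataset)

    def confidence(antecedent, consequent, dataset):
        """conf(A -> C) = sup(A -> C) / sup(A); returns 0.0 when sup(A) = 0."""
        a = sup(set(antecedent), dataset)
        return rule_support(antecedent, consequent, dataset) / a if a else 0.0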

2.2 Previous Algorithms for the Optimized Rule Mining Problem

Many previously proposed algorithms for optimized rule mining solve specific restrictions of the optimized rule mining problem. For example, Webb [24] provides an algorithm for mining an optimized conjunction under the following restrictions:

• U contains an existence test for each attribute/value pair appearing in a categorical data-set outside a designated class column;
• ≤ orders rules according to their laplace value (defined later);
• N is empty.

Fukuda et al. [12] provide algorithms for mining an optimized disjunction where:

• U contains a membership test for each square of a grid formed by discretizing two pre-specified numerical attributes of a data-set (a record is a member of a square if its attribute values fall within the respective ranges);
• ≤ orders rules according to either confidence, antecedent support, or a notion they call gain (also defined later);
• N includes minimums on support or confidence, and includes one of several possible “geometry constraints” that restrict the allowed shape formed by the represented set of grid squares.

Rastogi and Shim [20] look at the problem of mining an optimized disjunction where:

• U includes a membership test for every possible hypercube defined by a pre-specified set of record attributes with either ordered or categorical domains;
• ≤ orders rules according to antecedent support or confidence;
• N includes minimums on antecedent support or confidence, a maximum on the number of conditions allowed in the antecedent of a rule, and a requirement that the hypercubes corresponding to the conditions of a rule are non-overlapping.

In general, the optimized rule mining problem, whether conjunctive or disjunctive, is NP-hard [17]. However, features of a specific instance of this problem can often be exploited to achieve tractability. For example, in [12], the geometry constraints are used to develop low-order polynomial time algorithms. Even in cases where tractability is not guaranteed, efficient mining in practice has been demonstrated [18,20,24]. The theoretical contributions in this paper are conjunction/disjunction neutral. However, we focus on the conjunctive case in validating the practicality of these results through empirical evaluation.

2.3 Mining Optimized Rules under Partial Orders

We have carefully phrased the optimized rule mining problem so that it may accommodate a partial order in place of a total order. With a partial order, because some rules may be incomparable, there can be several equivalence classes containing optimal rules. The previous problem statement requires an algorithm to identify only a single rule from one of these equivalence classes. However, in our application, we wish to mine at least one representative from each equivalence class that contains an optimal rule. To do so, we could simply modify the previous problem statement to find all optimal rules instead of just one. However, in practice, the equivalence classes of rules can be large, so this would be unnecessarily inefficient. The next problem statement enforces our requirements specifically:

PROBLEM (PARTIAL-ORDER OPTIMIZED RULE MINING): Find a set O of subsets of U such that:

(1) every set in O is optimal as defined by the optimized rule mining problem, and

(2) for every equivalence class of rules as defined by the partial order, if the equivalence class contains an optimal rule, then exactly one member of this equivalence class is within O.

We call a set of rules whose antecedents comprise a solution to an instance I of this problem an I-optimal set. An I-optimal rule is one that may appear in an I-optimal set.

2.4 Monotonicity

Throughout this paper, we exploit (anti-)monotonicity properties of functions. A function f(x) is said to be monotone (resp. anti-monotone) in x if x1 ≤ x2 implies that f(x1) ≤ f(x2) (resp. f(x1) ≥ f(x2)). For example, the confidence function, which is defined in terms of rule support and antecedent support, is anti-monotone in antecedent support when rule support is held fixed.

3. SC-Optimality

3.1 Definition

Consider the following partial order ≤_sc on rules. Given rules r1 and r2, r1 ≤_sc r2 if and only if:

• sup(r1) ≤ sup(r2) and conf(r1) < conf(r2), or
• sup(r1) < sup(r2) and conf(r1) ≤ conf(r2).

Additionally, r1 =_sc r2 if and only if sup(r1) = sup(r2) and conf(r1) = conf(r2).

An I_sc-optimal set, where the instance I_sc contains this partial order, is depicted in Figure 1. Intuitively, such a set of rules defines a support-confidence border above which no rule that satisfies the input constraints can fall.

Figure 1. Upper and lower support-confidence borders.
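As an illustration of the partial order (not an algorithm from the paper), the sketch below reduces a list of candidate rules, each summarized by its support and confidence, to one representative per ≤_sc-optimal equivalence class; it assumes the candidate list is small enough for pairwise comparison.

    # Sketch: keep one representative of each <=_sc-optimal equivalence class.
    # A rule is dominated when another rule has >= support and >= confidence
    # with at least one inequality strict; equivalent rules share both values.

    def sc_optimal(rules):
        """rules: list of (support, confidence, rule_id). Returns the upper border."""
        border = []
        for s1, c1, rid in rules:
            if any((s2 >= s1 and c2 > c1) or (s2 > s1 and c2 >= c1)
                   for s2, c2, _ in rules):
                continue                      # strictly dominated: not optimal
            if any(s2 == s1 and c2 == c1 for s2, c2, _ in border):
                continue                      # equivalence class already represented
            border.append((s1, c1, rid))
        return sorted(border, key=lambda r: r[0])   # ordered by support along the border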

¹ Algorithms for mining optimal disjunctions typically allow a single fixed conjunctive condition without complications, e.g. see [20]. We ignore this issue for simplicity of presentation.

² This follows the original definition of support as defined in [1]. The reader is warned that in the work of Fukuda et al. [12] and Rastogi and Shim [20] (who are careful to note the same discrepancy), the definition of support corresponds to our notion of antecedent support.


Consider also the similar partial order ≤_s¬c such that r1 ≤_s¬c r2 if and only if:

• sup(r1) ≤ sup(r2) and conf(r1) > conf(r2), or
• sup(r1) < sup(r2) and conf(r1) ≥ conf(r2).

The equivalence condition is the same as before. An optimal set of rules according to this partial order forms a lower border.

3.2 Theoretical Implications

In this section, we show that for many total orders ≤_t intended to rank rules in order of interestingness, r1 ≤_sc r2 implies r1 ≤_t r2, and r1 =_sc r2 implies r1 =_t r2. We say that any such total order is implied by ≤_sc. This property of a total order is useful due to the following fact:

LEMMA 3.1: Given the problem instance I = ⟨U, D, ≤_t, C, N⟩ such that ≤_t is implied by ≤_sc, an I-optimal rule is contained within any I_sc-optimal set where I_sc = ⟨U, D, ≤_sc, C, N⟩.

Proof: Consider any rule r1 that is not I_sc-optimal (for simplicity we will ignore the presence of constraints in N). Because r1 is non-optimal, there must exist some rule r2 that is I_sc-optimal such that r1 ≤_sc r2. But then we also have that r1 ≤_t r2, since ≤_t is implied by ≤_sc. This implies that any non-I_sc-optimal rule is either non-I-optimal, or it is equivalent to some I-optimal rule which resides in an I_sc-optimal equivalence class. At least one I_sc-optimal equivalence class must therefore contain an I-optimal rule. Further, because ≤_t is implied by ≤_sc, every rule in this equivalence class must be I-optimal. By definition, an I_sc-optimal set will contain one of these rules, and the claim follows.

Put simply, mining the upper support/confidence border identifies optimal rules according to several different interestingness metrics. We will show that these metrics include support, confidence, conviction, lift, laplace, gain, and an unnamed interest measure proposed by Piatetsky-Shapiro [19].

If we also mine the lower border, metrics such as entropy gain, gini, and chi-squared value are also included. But first, consider the following additional property of total orders implied by ≤_sc:

OBSERVATION 3.2: Given instance I = ⟨U, D, ≤_t, C, N⟩ such that ≤_t is implied by ≤_sc, and N contains a minimum support constraint n_s and/or a minimum confidence constraint n_c, an I-optimal rule is contained within any I_sc-optimal set where I_sc = ⟨U, D, ≤_sc, C, N − {n_s, n_c}⟩.

The implication of this fact is that we can mine without minimum support and confidence constraints, and the optimal rule given any setting of these constraints will remain in the mined rule set. This allows the user to witness the effects of modifying these constraints without further mining of the data -- a useful fact since the user often cannot determine accurate settings of minimum support or confidence a priori. Minimum support and confidence constraints are quite critical in some applications of optimized rule mining, particularly when the optimization metric is itself support or confidence as in [12] and [20].
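A small sketch of the interaction this observation enables (our own illustration of the workflow, not code from the paper): once the unconstrained border has been mined, the optimum under any later choice of minsup and minconf is obtained by filtering the stored border rather than re-mining the database.

    # Sketch: re-evaluating minimum support/confidence against a mined border.
    # 'border' is an sc-optimal set, each entry (support, confidence, rule_id);
    # 'score' is any interestingness metric implied by <=_sc (e.g. laplace, gain).

    def best_rule(border, score, minsup=0, minconf=0.0):
        admissible = [r for r in border if r[0] >= minsup and r[1] >= minconf]
        return max(admissible, key=lambda r: score(r[0], r[1]), default=None)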

To identify the interestingness metrics that are implied by ≤_sc, we use the following lemma.

LEMMA 3.3: The following conditions are sufficient for establishing that a total order ≤_t defined over a rule value function f(r) is implied by partial order ≤_sc:

(1) f(r) is monotone in support over rules with the same confidence, and

(2) f(r) is monotone in confidence over rules with the same support.

Proof: Suppose r1 ≤_sc r2, and consider a rule r where sup(r) = sup(r1) and conf(r) = conf(r2). Note that by definition r1 ≤_sc r and r ≤_sc r2. Now, if a total order ≤_t has the monotonicity properties from above, then r1 ≤_t r and r ≤_t r2. Since total orders are transitive, we then have that r1 ≤_t r2, which establishes the claim.

These conditions trivially hold when the rule value function is sup(r) or conf(r). So consider next the Laplace function, which is commonly used to rank rules for classification purposes [9,24]:

    laplace(A → C) = (sup(A → C) + 1) / (sup(A) + k)

The constant k is an integer greater than 1 (usually set to the number of classes when building a classification model). Note that if confidence is held fixed to some value c, then we can rewrite the Laplace value as below:

    laplace(r) = (sup(r) + 1) / (sup(r)/c + k)

It is straightforward to show that this expression is monotone in rule support since k > 1 and c ≥ 0. The Laplace function is also monotone in confidence among rules with equivalent support. To see why, note that if support is held constant, in order to raise the function value, we need to decrease the value of the denominator. This decrease can only be achieved by reducing antecedent support, which implies a larger confidence. Note that an optimal set contains the optimized Laplace rule for any valid setting of k. This means the user can witness the effect of varying k on the optimal rule without additional mining of the database.

    gain(A → C) = sup(A → C) − θ · sup(A)

The gain function of Fukuda et al. [12] is given above, where θ is a fractional constant between 0 and 1. If confidence is held fixed at c, then this function is equal to sup(r)(1 − θ/c), which is trivially monotone in support as long as c ≥ θ. We can ignore the case where c < θ if we assume there exists any rule satisfying the input constraints such that c ≥ θ. This is because for any pair of rules r1 and r2 such that conf(r1) ≥ θ and conf(r2) < θ, we know that gain(r1) ≥ gain(r2) irrespective of their support. Should this assumption be false, the gain criterion is optimized by any rule with zero support, should one exist (e.g. in the conjunctive case, one can simply add conditions to a rule until its support drops to zero).

The gain function is monotone in confidence among rules with equivalent support for reasons similar to the case of the Laplace function. If support is held constant, then an increase in gain implies a decrease in the subtractive term. The subtractive term can be decreased only by reducing antecedent support, which implies a larger confidence. Note that, like k from the Laplace function, after identifying the optimal set of rules, the user can vary θ and view its effect on the optimal rule without additional mining. Another interestingness metric that is identical to gain for a fixed value of θ was introduced by Piatetsky-Shapiro [19]:

    ps(A → C) = sup(A → C) − sup(A) · sup(C) / |D|
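For concreteness, the metrics discussed so far can be written as plain functions of rule support, antecedent support, consequent support, and data-set size. The parameterization below is our own illustrative sketch; k and theta are the constants named above.

    # Sketch: the metrics above in terms of
    #   sup_r = sup(A -> C),  sup_a = sup(A),  sup_c = sup(C),  d = |D|.

    def laplace(sup_r, sup_a, k=2):
        return (sup_r + 1.0) / (sup_a + k)

    def gain(sup_r, sup_a, theta):
        return sup_r - theta * sup_a

    def piatetsky_shapiro(sup_r, sup_a, sup_c, d):
        # identical to gain with theta = sup_c / d
        return sup_r - sup_a * sup_c / float(d)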

Consider next conviction [8], which was framed in [6] as a function of confidence:

    conviction(A → C) = (|D| − sup(C)) / (|D| · (1 − conf(A → C)))

Conviction is obviously monotone in confidence since confidence appears in a subtractive term within the denominator. It is also unaffected by variations in rule support if confidence is held constant, which implies monotonicity.

Lift, a well-known statistical measure that can be used to rank rules in IBM's Intelligent Miner [14] (it is also known as interest [8] and strength [10]), can also be framed as a function of confidence [6]:

    lift(A → C) = |D| · conf(A → C) / sup(C)

Like conviction, lift is obviously monotone in confidence and unaffected by rule support when confidence is held fixed.

The remaining interestingness metrics, entropy gain, gini, and chi-squared value, are not implied by ≤_sc. However, we show that the space of rules can be partitioned into two sets according to confidence such that when restricted to rules in one set, each metric is implied by ≤_sc, and when restricted to rules in the other set, each metric is implied by ≤_s¬c. As a consequence, the optimal rules with respect to entropy gain, gini, and chi-squared value must reside on either the upper or lower support-confidence border. This idea is formally stated by the observation below.

OBSERVATION 3.4: Given instance I = ⟨U, D, ≤_t, C, N⟩, if ≤_sc implies ≤_t over the set of rules whose confidence is greater than or equal to some value γ, and ≤_s¬c implies ≤_t over the set of rules whose confidence is less than or equal to γ, then an I-optimal rule appears in either (a) any I_sc-optimal set where I_sc = ⟨U, D, ≤_sc, C, N⟩, or (b) any I_s¬c-optimal set where I_s¬c = ⟨U, D, ≤_s¬c, C, N⟩.

To demonstrate that the entropy gain, gini, and chi-squared values satisfy the requirements put forth by this observation, we need to know when the total order defined by a rule value function is implied by ≤_s¬c. We use an analog of Lemma 3.3 for this purpose:

LEMMA 3.5: The following conditions are sufficient for establishing that a total order ≤_t defined over a rule value function f(r) is implied by partial order ≤_s¬c:

(1) f(r) is monotone in support over rules with the same confidence, and

(2) f(r) is anti-monotone in confidence over rules with the same support.

In [16], the chi-squared, entropy gain, and gini values of a rule are each defined in terms of a function f(x, y), where x = sup(A) − sup(A → C) and y = sup(A → C) given the rule A → C to which it is applied³ (definitions appear in Appendix A). These functions are further proven to be convex. Another important property of each of these functions is that they reach their minimum at any rule whose confidence is equal to the “expected” confidence [17]. More formally, f(x, y) is minimum when conf(x, y) = c, where conf(x, y) = y/(x + y) and c = sup(C)/|D|. To prove our claims, we exploit two properties of convex functions from [11]:

(1) Convexity of a function f implies that for an arbitrary dividing point (x3, y3) of the line segment between two points (x1, y1) and (x2, y2), we have max(f(x1, y1), f(x2, y2)) ≥ f(x3, y3).⁴

(2) A convex function over a given region must be continuous at every point of the region's relative interior.

The next two lemmas show that a convex function which is minimum at conf(x, y) = c has the properties required by Observation 3.4, where γ from the observation is set to the expected confidence value c.

LEMMA 3.6: For a convex function f(x, y) which is minimum at conf(x, y) = c, f(x, y) is (1) monotone in conf(x, y) for fixed y, so long as conf(x, y) ≥ c, and (2) monotone in y when conf(x, y) = A for any constant A ≥ c.

Proof: For case (1), when y is fixed to a constant Y, the set of points with conf(x, Y) ≥ c represents a horizontal line segment that extends from the point where conf(x, Y) = c leftward. The value of conf(x, Y) is clearly anti-monotone in x. Because f(x, Y) is also anti-monotone in x in this region as a consequence of its convexity, it follows that f is monotone in conf(x, Y).⁵

For case (2), assume the claim is false. This implies there exists a vector v defined by conf(x, y) = A for some constant A ≥ c along which f is not monotone in y. That is, there exist two points (x1, y1) and (x2, y2) along v where y1 < y2 yet f(x1, y1) > f(x2, y2) (see Figure 2). If f were defined and minimum at the origin, then this would contradict the fact that f is convex. But since conf as well as f are undefined at this point, another argument is required. Consider then some sufficiently small non-zero value δ such that f(x1, y1) > f(x2 − δ, y2). Because f is convex and continuous in its interior region, such a value of δ is guaranteed to exist unless x2 = 0, which is a trivial boundary case. Now, consider the line [(x1, y1), (x2 − δ, y2)]. This line must contain a point (x3, y3) such that x3 and y3 are non-negative, and one or both of x3 or y3 is non-zero. But because f is convex and minimum at conf(x, y) = c, we have that f(x1, y1) ≤ max(f(x3, y3), f(x2 − δ, y2)), which is a contradiction.

Figure 2. Illustration of case (2) from Lemma 3.6.


³ We are making some simplifications: these functions are actually defined in terms of a vector defining the class distribution after a binary split. We are restricting attention to the case where there are only two classes (C and ¬C, which correspond to y and x respectively). The binary split in our case is the segmentation of data-set D made by testing the antecedent condition of the rule.

⁴ This well-known property of convex functions is sometimes given as the definition of a convex function, e.g. [16]. While this property is necessary for convexity, it is not sufficient. The proofs of convexity for gini, entropy, and chi-squared value in [16] are nevertheless valid for the actual definition of convexity since they show that the second derivatives of these functions are always non-negative, which is necessary and sufficient [11].

⁵ We are not being completely rigorous due to the bounded nature of the convex region over which f is defined. For example, the point at which conf(x, Y) = c may not be within this bounded region since x can be no greater than |D| − sup(C). Verifying that these boundary conditions do not affect the validity of our claims is left as an exercise.


LEMMA 3.7: For a convex function f(x, y) which is minimum at conf(x, y) = c, f(x, y) is (1) anti-monotone in conf(x, y) for fixed y, so long as conf(x, y) ≤ c, and (2) monotone in y when conf(x, y) = A for any constant A ≤ c.

Proof: Similar to the previous.

The previous two lemmas and Observation 3.4 lead immediately to the following theorem, which formalizes the fact that mining the upper and lower support-confidence borders identifies the optimal rules according to metrics such as entropy gain, gini, and chi-squared value. Conveniently, an algorithm specifically optimized for mining the upper border can be used without modification to mine the lower border by simply negating the consequent of the given instance, as stated by the subsequent lemma.

THEOREM 3.8: Given instance I = ⟨U, D, ≤_t, C, N⟩, if ≤_t is defined over the values given by a convex function f(x, y) over rules A → C where:

(1) x = sup(A) − sup(A → C) and y = sup(A → C), and (2) f(x, y) is minimum at conf(x, y) = sup(C)/|D|,

then an I-optimal rule appears in either (a) any I_sc-optimal set where I_sc = ⟨U, D, ≤_sc, C, N⟩, or (b) any I_s¬c-optimal set where I_s¬c = ⟨U, D, ≤_s¬c, C, N⟩.

LEMMA 3.9: Given an instance I_s¬c = ⟨U, D, ≤_s¬c, C, N⟩, any I_sc-optimal set for I_sc = ⟨U, D, ≤_sc, ¬C, N⟩ (where ¬C evaluates to true only when C evaluates to false) is also an I_s¬c-optimal set.

Proof Idea: Note that conf(A → ¬C) = 1 − conf(A → C). Thus, maximizing the confidence of A → ¬C minimizes the confidence of A → C.

Before ending this section we consider one practical issue -- that of result visualization. Note that the support-confidence borders as displayed in Figure 1 provide an excellent means by which optimal sets of rules may be visualized. Each border clearly illustrates the trade-off between support and confidence. Additionally, one can imagine the result visualizer color-coding points along these borders that are optimal according to the various interestingness metrics, e.g. blue for Laplace value, red for chi-squared value, green for entropy gain, and so on. The result of modifying minimum support or confidence on the optimal rules could be displayed in real-time as the user drags a marker along either axis in order to specify a minimum bound.

3.3 Practical Implications for Mining Optimal Conjunctions

In this section we present and evaluate an algorithm that efficiently mines an optimal set of conjunctions according to ≤_sc (and ≤_s¬c due to Lemma 3.9) from many real-world categorical data-sets, without requiring any constraints to be specified by the user. We also demonstrate that the number of rules produced by this algorithm for a given instance is typically quite manageable -- on the order of a few hundred at most. We address the specific problem of mining optimal conjunctions within categorically valued data, where each condition in U is simply a test of whether the given input record contains a particular attribute/value pair, excluding values from a designated class column. Values from the designated class column are used as consequents. While our algorithm requires no minimums on support or confidence, if they are specified, they can be exploited for better performance. Space constraints prohibit a full explanation of the workings of this algorithm, so we highlight only the most important features here. A complete description appears in an extended draft [7].

The algorithm we use is a variant of Dense-Miner from [6], which is a constraint-based rule miner suitable for use on large and dense data-sets. In Dense-Miner, the rule mining problem is framed as a set-enumeration tree search problem [21] where each node of the tree enumerates a unique element of 2^U. Dense-Miner returns every rule that satisfies the input constraints, which include minimum support and confidence. We modified Dense-Miner to instead maintain only the set R of rules that are potentially optimal at any given point during its execution. Whenever a rule r is enumerated by a node and found to satisfy the input constraints, it is compared against every rule presently in R. If r is better than or incomparable to every rule already in R according to the partial order, then rule r is added to R. Also, any rule in R that is worse than r is removed. Given this policy, assuming the tree enumerates every subset of U, upon termination, R is an optimal set. Because an algorithm which enumerates every subset would be unacceptably inefficient, we use pruning strategies that greatly reduce the search space without compromising completeness. These strategies use Dense-Miner's pruning functions (appearing in Appendix B), which bound the confidence and support of any rule that can be enumerated by a descendent of a given node. To see how these functions are applied in our variant of the algorithm, consider a node g with support bound s and confidence bound c. To see if g can be pruned, the algorithm determines if there exists a rule r in R such that r_i ≤_sc r, where r_i is some imaginary rule with support s and confidence c. Given such a rule, if any descendent of g enumerates an optimal rule, then it must be equivalent to r. This equivalence class is already represented in R, so there is no need to enumerate these descendents, and g can be pruned.

This algorithm differs from Dense-Miner in only two additional ways. First, we allow the algorithm to perform a set-oriented best-first search of the tree instead of a purely breadth-first search. Dense-Miner uses a breadth-first search since this limits the number of database passes required to the height of the search tree. In the context of optimized rule mining, a breadth-first strategy can be inefficient because pruning improves as better rules are found, and good rules sometimes arise only at the deeper levels. A pure best-first search requires a database pass for each node in the tree, which would be unacceptable for large data-sets. Instead, we process several of the best nodes (at most 5000 in our implementation) with each database pass in order to reduce the number of database passes while still substantially reducing the search space. For this purpose, a node is better than another if the rule it enumerates has a higher confidence value.
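The sketch below illustrates the two operations just described: updating the set R of potentially optimal rules when a new rule is enumerated, and the pruning test that compares a node's support and confidence bounds against R. It is a simplification under assumed data structures, not the Dense-Miner variant itself.

    # Sketch of the optimal-set maintenance and node pruning described above.
    # Each rule is a tuple (support, confidence, antecedent).

    def dominates(s2, c2, s1, c1):
        """True when (s2, c2) is strictly better than (s1, c1) under <=_sc."""
        return (s2 >= s1 and c2 > c1) or (s2 > s1 and c2 >= c1)

    def update_candidates(R, rule):
        s, c, _ = rule
        if any(dominates(s2, c2, s, c) for s2, c2, _ in R):
            return R                                  # new rule is dominated: discard
        R = [r for r in R if not dominates(s, c, r[0], r[1])]
        R.append(rule)                                # keep better/incomparable rules
        return R

    def can_prune(R, support_bound, confidence_bound):
        """Prune a node if an imaginary rule at its bounds is dominated or equivalent."""
        return any(dominates(s2, c2, support_bound, confidence_bound) or
                   (s2 == support_bound and c2 == confidence_bound)
                   for s2, c2, _ in R)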

The remaining modification is the incorporation of inclusive pruning as proposed by Webb [23]. This pruning strategy avoids enumerating a rule when it can be determined that its antecedent can be extended with an additional condition without affecting the support of the rule. In the absence of item constraints, this optimization prunes many rules that are either non-optimal or equivalent to some other optimal rule to be enumerated. Unfortunately, when there are item constraints in N (e.g. “rules must contain fewer than k conditions”), this pruning strategy cannot be trivially applied without compromising completeness, so it must be disabled. Full details of this pruning strategy are provided in Appendix B.

We evaluated our algorithm on the larger of the categorical data-sets from the Irvine machine learning database repository⁶, including chess, mushroom, letter, connect-4, and dna. We also used the pums data-set from [6], which is compiled from census data (a similar data-set was used in [8]). For the Irvine data-sets, we used each value of the designated class column as the consequents. For the pums data-set, we used the values of the RELAT1 column (13 in all)⁷. Each of these data-sets is known to be difficult for constraint-based rule mining algorithms such as Apriori, even when specifying a strong minimum support constraint [4,6,8].

Experiments were performed on an IBM IntelliStation with a 400MHz Intel Pentium-II processor and 128 MBytes of main memory. Execution time and the number of rules returned by the algorithm appear in Table 1; characteristics of each data-set appear in Table 2. For the Irvine data-sets, with the exception of connect-4, our algorithm identified an optimal set of rules within 30 seconds in every case, with many runs requiring less than 1 second. Connect-4 was the most difficult of the data-sets for two reasons. First, it has substantially more records and more columns than many of the other data-sets. But a stronger contributor to this discrepancy was the fact that rules with high confidence within the connect-4 data-set have very low support. For example, with the “tie” class as the consequent, rules with 100% confidence have a support of at most 14 records. This property greatly reduces pruning effectiveness, resulting in almost one hour of execution time given this consequent. In cases like these, modest settings of the minimum support or confidence constraint can be used to improve runtime considerably. For example, a minimum support of 676 records (1% of the data-set) reduces execution time to 6 minutes.

The number of rules in each optimal set was on the order of a few hundred at most. Of the Irvine data-sets, connect-4 contained the most optimal rules, with 216 for win, 171 for tie, and 465 for lose. We plot the upper support-confidence border for each of these consequents in Figure 3. Rule support is normalized according to consequent support so that each border ranges from 0 to 100% along the x axis.

4. PC-Optimality

4.1 Definition

While sc-optimality is a useful concept, it tends to produce rules that primarily characterize only a specific subset of the population of interest (by the population of interest, we mean the set of records for which condition C evaluates to true). In this section, we propose another partial order with the goal of remedying this deficiency. First, the population of a rule A → C is simply the set of records from data-set D for which both A and C evaluate to true. We denote the population of a rule r as pop(r). Clearly then, |pop(r)| = sup(r). A rule r which contains some record t within its population is said to characterize t.

⁶ http://www.ics.uci.edu/~mlearn/MLRepository.html

⁷ Available in apriori binary format from http://www.almaden.ibm.com/cs/quest.

Data-set    Consequent    Time (sec)    # of Rules
chess       win           <1            60
chess       nowin         <1            41
connect-4   win           642           216
connect-4   draw          3066          171
connect-4   loss          1108          465
letter      A-Z           183           22
dna         EI            20            9
dna         IE            23            15
dna         N             <1            9
mushroom    poisonous     <1            12
mushroom    edible        <1            7

Table 1. Execution time and number of rules returned.

Figure 3. Upper support/confidence borders for connect-4.

Consider now the partial order ≤_pc on rules where r1 ≤_pc r2 if and only if:

• pop(r1) ⊆ pop(r2) and conf(r1) < conf(r2), or
• pop(r1) ⊂ pop(r2) and conf(r1) ≤ conf(r2).

Two rules are equivalent according to this partial order if their population sets are identical and their confidence values are equal. One can analogously define the partial order ≤_p¬c where r1 ≤_p¬c r2 if and only if:

• pop(r1) ⊆ pop(r2) and conf(r1) > conf(r2), or
• pop(r1) ⊂ pop(r2) and conf(r1) ≥ conf(r2).

4.2 Theoretical Implications

It is easy to see that ≤_sc is implied by ≤_pc (and ≤_s¬c by ≤_p¬c), so all the claims from Section 3 also hold with respect to pc-optimality. Note that ≤_pc results in more incomparable rule pairs than ≤_sc due to the population subsumption requirement. This implies there will be many more rules in a pc-optimal set compared to an sc-optimal set. The consequence of these additional rules, as formalized by the observation below, is that a pc-optimal set always contains a rule that is optimal with respect to any of the previous interestingness metrics, and further with respect to any constraint that requires the rules to characterize a given subset of the population of interest.

OBSERVATION 4.1: Given an instance I = ⟨U, D, ≤_t, C, N⟩ where

(1) ≤_t is implied by ≤_pc, and

(2) N has a constraint n on rules stating that P ⊆ pop(r) for a given subset P of the population of interest,

an I-optimal rule appears in any I_pc-optimal set where I_pc = ⟨U, D, ≤_pc, C, N − {n}⟩.

We note that this generalization of sc-optimality is quite broad. Given the large number of rules that can appear in a pc-optimal set (see next subsection), a more controlled generalization might be desirable. One idea is to have the rule set support any constraint that requires a rule to characterize a given member t (in place of a subset P) of the population of interest. This would still guarantee a broad characterization, but potentially with much fewer rules. This notion designates a rule r as uninteresting if there exists some set of rules R such that (1) each rule in R satisfies the input constraints, (2) every member of pop(r) is characterized by at least one rule in R, and (3) each rule in R is equal to or better than r according to ≤_sc. (Note that pc-optimality designates a rule as uninteresting only if there exists such a set containing exactly one rule.) This notion of uninterestingness cannot, to our knowledge, be expressed using a partial order on rules. Instead, we are examining how the concept of pc-optimality combined with constraints can successfully express (and exploit) this notion.

4.3 Practical Implications for Mining Optimal Conjunctions

Producing an algorithm that can efficiently mine an optimal set of conjunctions with respect to ≤_pc may seem an impossible proposition in large data-sets due to the need to verify subsumption between rule populations over many rules. However, we will show that we can verify the relationships induced by the partial order “syntactically” in many cases; that is, by checking the conditions of the rules along with support and confidence, instead of examining the set of database records that comprise their populations. Keep in mind that in this subsection we are restricting attention to mining optimal conjunctions. In typical formulations of mining optimized disjunctions (e.g. [12,20]), syntactic checks for population subsumption simply require geometrically comparing the regions represented by the base conditions.

DEFINITION (A-MAXIMAL): A rule A1 → C is a-maximal if there is no rule A2 → C such that A1 ⊂ A2 and sup(A1 → C) = sup(A2 → C).

Note that the definition of an a-maximal rule is such that it need not satisfy the input constraints. The intuition behind a-maximality (“antecedent” maximality) is that an a-maximal rule cannot have its antecedent enlarged without strictly reducing its population. Given a rule r, we denote by amax(r) an a-maximal rule that has the same population as r. Lemma 4.2a implies there is only one such rule, which means that amax() is in fact a function. The purpose of this function is to provide a concise, canonical description of a rule's population that allows for efficiently checking the subsumption condition (Lemma 4.2). In turn, this provides a syntactic method for comparing rules with ≤_pc (Theorem 4.3).

LEMMA 4.2:

A: pop(r1) = pop(r2) if and only if amax(r1) = amax(r2).

B: pop(r1) ⊂ pop(r2) if and only if amax(r2) ⊂ amax(r1).

Proof: The if direction is immediate for both cases, so we prove the only-if direction. Suppose the claims are false and consider the rule r3 which is formed by taking the union of the antecedents from amax(r1) and amax(r2). We establish each claim by contradiction.

For claim A, if pop(r1) = pop(r2), then clearly r3 has the same population as r1 and r2. Since we assume the claim is false, we have that amax(r1) is different than amax(r2). Given this, either amax(r1) ⊂ amax(r3) or amax(r2) ⊂ amax(r3). But since all three rules have the same population, this leads to the contradiction that either amax(r1) or amax(r2) is not a-maximal.

For claim B, if pop(r1) ⊂ pop(r2), then pop(r3) = pop(r1). As a consequence, we must have that amax(r1) = amax(r3) by claim A from above. Since we assume the claim is false, we must in addition have that amax(r2) = amax(r1) or amax(r2) ⊄ amax(r1). The case where they are equal is an immediate contradiction due to claim A. For the case where amax(r2) ⊄ amax(r1), note that amax(r2) ⊆ amax(r3), which contradicts the fact that amax(r1) = amax(r3).

THEOREM 4.3:

A: r1 ≤_pc r2 if and only if: (1) amax(r1) = amax(r2) and conf(r1) < conf(r2), or (2) amax(r2) ⊂ amax(r1) and conf(r1) ≤ conf(r2).

B: r1 =_pc r2 if and only if amax(r1) = amax(r2) and conf(r1) = conf(r2).

These facts cannot be trivially applied in an efficient manner since computing amax(r) requires scanning the data-set (an algorithm for this purpose appears in Figure 4). Instead, we describe pruning optimizations which can be incorporated into a rule miner to avoid generating many rules that are not pc-optimal. A post-processing phase then computes amax(r) for every rule r identified, in one final pass over the database, and this information is used to efficiently identify any remaining non-optimal rules.
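A small sketch of the syntactic check of Theorem 4.3 (our illustration, not the paper's code): once a-maximal antecedents are available, the ≤_pc comparison reduces to set and confidence comparisons. Antecedents are assumed to be frozensets of conditions.

    # Sketch: comparing two rules under <=_pc via their a-maximal antecedents
    # (Theorem 4.3), without touching their population sets.

    def pc_less(amax1, conf1, amax2, conf2):
        """r1 <=_pc r2, with amax_i the antecedent of amax(r_i) as a frozenset."""
        if amax1 == amax2 and conf1 < conf2:
            return True                             # same population, lower confidence
        return amax2 < amax1 and conf1 <= conf2     # pop(r1) strictly inside pop(r2)

    def pc_equivalent(amax1, conf1, amax2, conf2):
        return amax1 == amax2 and conf1 == conf2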

Data-set    # of Rows    # of Columns
chess       3,196        37
connect-4   67,557       43
dna         3,124        61
letter      20,000       17
mushroom    8,124        23
pums        49,046       74

Table 2. Number of rows and columns in each data-set.

Table 2. Number of rows and columns in each data-set. pc ≤r 1 pc

c ?∧?pop r 1()pop r 2()conf r 1()conf r 2()≥∧? sc ≤ pc ≤ s c ?≤ p c ?≤ pc ≤ sc

≤I U D t ≤C N ,,,,??= t ≤ pc

≤N n r P pop r ()?P I I pc I pc U D pc ≤C N n {}–,,,,??=t P r R R pop r ()R R r

sc

≤R pc

≤A 1C →A 2C →A 1A 2?sup A 1C →()sup A 2C →()=r r amax r ()amax() pc

≤pop r 1()pop r 2()= ?amax r 1()amax r 2()

=pop r 1()pop r 2()? ?amax r 1()amax r 2()

???r 3amax r 1()amax r 2()pop r 1()pop r 2()=r 3r 1r 2amax r 1()amax r 2()amax r 1()amax r 3()?amax r 2()amax r 3()?amax r 1()amax r 2()pop r 1()pop r 2()?pop r 3()pop r 1()=amax r 1()amax r 3()=amax r 1()amax r 2()?amax r 1()amax r 2()=amax r 1()amax r 2()?amax r 3()amax r 2()?r 3amax r 1()amax r 2()∪= r 1 pc

OBSERVATION 4.4: Given an instance I_pc = ⟨U, D, ≤_pc, C, N⟩, a rule A1 → C cannot be I_pc-optimal if there exists some rule A2 → C where A2 ⊂ A1, A2 → C satisfies the input constraints, and conf(A2 → C) > conf(A1 → C).

Using the terminology from [6], this observation simply states that a pc-optimal rule must have a non-negative improvement value. We can in fact require that improvement be positive since we only need one member of each equivalence class. The Dense-Miner algorithm already provides pruning functions that bound the improvement value of any rule derivable from a given node. We thus modify Dense-Miner to prune nodes whose improvement bound is 0 (see Appendix B for the simplified pruning functions that can be used for this special case). We also again exploit Webb's inclusive pruning strategy for the case where there are no item constraints in N.

We can now fully describe an algorithm for mining an I_pc-optimal set of rules without performing any explicit subsumption checks between rule populations (Figure 5). We use the above-described variant of Dense-Miner to implement step 1 of this algorithm. Step 3 applies Theorem 4.3a to identify non-optimal rules for removal. This step requires we first associate each rule with its a-maximal rule (step 2). To find the a-maximal rules, we apply the algorithm in Figure 4 for each rule in R, using a single shared database scan. For a data-set such as connect-4, our implementation of this step requires less than 3 seconds for a set with up to 10,000 rules. Finally, step 4 applies Theorem 4.3b in order to identify equivalent rules so that there is only one representative from each equivalence class in the returned result.

This algorithm could be simplified slightly when the input constraints have the property that amax(r) satisfies them whenever r does. In this case, the set R_a could be returned immediately following step 2 (since R_a is a set, we assume it contains no duplicates). However, some common rule constraints (e.g. “a rule must have fewer than k conditions”) do not have this property.

In practice, we find that the number of rules returned by this algorithm is considerably larger than that of the algorithm for mining sc-optimal sets. On most data-sets, rule constraints such as minimum support or confidence must be specified to control the size of the output as well as execution time. For the execution times reported in Table 3, the minimum support constraint was set to ensure that each rule’s population is at least 5% of the population of interest. For the pums data-set, this constraint was not sufficiently strong to control combinatorial explosion. We therefore simplified the data-set by removing values which appear in 80% or more of the records (the resulting data-set is known as pumsb*, and was used in [5]). We conclude that pc-optimal sets of rules are useful primarily on data-sets where the density is somewhat subdued, or when the user is capable of specifying strong rule constraints prior to mining.

5. Conclusions

We have defined a new optimized rule mining problem that allows a partial order in place of the typical total order on rules. We have also shown that solving this optimized rule mining problem with respect to a particular partial order ≤_sc (and in some cases its analog ≤_s¬c) is guaranteed to identify a most-interesting rule according to several interestingness metrics including support, confidence, gain, laplace value, conviction, lift, entropy gain, gini, and chi-squared value. In practice, identifying such an optimal set of conjunctive rules can be done efficiently, and the number of rules in such a set is typically small enough to be easily browsed by an end-user.

INPUT: a rule r and a data-set D.
OUTPUT: amax(r).
1. Find the first record in D that is a member of the population of r. Initialize a set A to contain those conditions of U that evaluate to true on this record, but are not already in r.
2. For each remaining record in D which is in the population of r, remove any condition in A which evaluates to false on this record.
3. Add the conditions in A to the antecedent of r and return the result.

Figure 4. Algorithm for computing amax(r).

INPUT: I_pc = ⟨U, D, ≤_pc, C, N⟩.
OUTPUT: An I_pc-optimal set.
1. Find all rules with positive improvement among those satisfying the input constraints. Call this set of rules R.
2. For each rule r in R, associate r with amax(r) and put amax(r) into a set R_a.
3. Remove any rule r1 from R if there exists some r2 in R such that r1 ≤_pc r2 according to Theorem 4.3a.
4. For each rule r in R_a, if there is more than one rule in R whose associated a-maximal rule is r, then remove all but one of them from R.
5. Return R.

Figure 5. Algorithm for mining an I_pc-optimal set.
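A direct transcription of the Figure 4 procedure into Python follows (our illustrative sketch; records are frozensets of attribute/value pairs, the antecedent is a set of such pairs, the consequent a single pair, and all_conditions stands for U).

    # Sketch of Figure 4: compute the antecedent of amax(r) in one scan of the data-set.

    def amax_antecedent(antecedent, consequent, dataset, all_conditions):
        rule_conds = set(antecedent) | {consequent}
        extra = None
        for record in dataset:
            if not rule_conds <= record:           # record is outside pop(r)
                continue
            if extra is None:                      # step 1: first population record
                extra = {c for c in all_conditions if c in record} - set(antecedent)
            else:                                  # step 2: drop conditions failing here
                extra = {c for c in extra if c in record}
        return set(antecedent) | (extra or set())  # step 3: enlarged antecedent of amax(r)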

Table 3. Execution time and number of rules returned.


We lastly generalized this concept using another partial order in order to ensure that the entire population of interest is well-characterized. This generalization defines a set of rules that contains the most interesting rule according to any of the above metrics, even if one requires this rule to characterize a specific subset of the population of interest.

These techniques can be used to facilitate interactivity in the process of mining most-interesting rules. After mining an optimal set of rules according to the first partial order, the user can examine the most-interesting rule according to any of the supported interestingness metrics without additional querying or mining of the database. Minimum support and confidence can also be modified and the effects immediately witnessed. After mining an optimal set of rules according to the second partial order, in addition to the above, the user can quickly find the most-interesting rule that characterizes any given subset of the population of interest. This extension overcomes the deficiency of most optimized rule miners where much of the population of interest may be poorly characterized or completely uncharacterized by the mined rule(s).

Acknowledgments

We are indebted to Ramakrishnan Srikant and Dimitrios Gunopulos for their helpful suggestions and assistance.

References

[1] Agrawal, R.; Imielinski, T.; and Swami, A. 1993. Mining Associations between Sets of Items in Massive Databases. In Proc. of the 1993 ACM-SIGMOD Int'l Conf. on Management of Data, 207-216.

[2] Agrawal, R.; Mannila, H.; Srikant, R.; Toivonen, H.; and Verkamo, A. I. 1996. Fast Discovery of Association Rules. In Advances in Knowledge Discovery and Data Mining, AAAI Press, 307-328.

[3] Ali, K.; Manganaris, S.; and Srikant, R. 1997. Partial Classification using Association Rules. In Proc. of the 3rd Int'l Conf. on Knowledge Discovery in Databases and Data Mining, 115-118.

[4] Bayardo, R. J. 1997. Brute-Force Mining of High-Confidence Classification Rules. In Proc. of the Third Int'l Conf. on Knowledge Discovery and Data Mining, 123-126.

[5] Bayardo, R. J. 1998. Efficiently Mining Long Patterns from Databases. In Proc. of the 1998 ACM-SIGMOD Int'l Conf. on Management of Data, 85-93.

[6] Bayardo, R. J.; Agrawal, R.; and Gunopulos, D. 1999. Constraint-Based Rule Mining in Large, Dense Databases. In Proc. of the 15th Int'l Conf. on Data Engineering, 188-197.

[7] Bayardo, R. J. and Agrawal, R. 1999. Mining the Most Interesting Rules. IBM Research Report. Available from: http://www.almaden.ibm.com/cs/quest

[8] Brin, S.; Motwani, R.; Ullman, J.; and Tsur, S. 1997. Dynamic Itemset Counting and Implication Rules for Market Basket Data. In Proc. of the 1997 ACM-SIGMOD Int'l Conf. on the Management of Data, 255-264.

[9] Clark, P. and Boswell, P. 1991. Rule Induction with CN2: Some Recent Improvements. In Machine Learning: Proc. of the Fifth European Conference, 151-163.

[10] Dhar, V. and Tuzhilin, A. 1993. Abstract-driven pattern discovery in databases. IEEE Transactions on Knowledge and Data Engineering, 5(6).

[11] Eggleston, H. G. 1963. Convexity. Cambridge Tracts in Mathematics and Mathematical Physics, no. 47. Smithies, F. and Todd, J. A. (eds.). Cambridge University Press.

[12] Fukuda, T.; Morimoto, Y.; Morishita, S.; and Tokuyama, T. 1996. Data Mining using Two-Dimensional Optimized Association Rules: Scheme, Algorithms, and Visualization. In Proc. of the 1996 ACM-SIGMOD Int'l Conf. on the Management of Data, 13-23.

[13] Goethals, B. and Van den Bussche, J. 1999. A Priori Versus A Posteriori Filtering of Association Rules. In Proc. of the 1999 ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery, paper 3.

[14] International Business Machines, 1996. IBM Intelligent Miner User's Guide, Version 1, Release 1.

[15] Mitchell, T. M. 1997. Machine Learning. McGraw-Hill, Inc.

[16] Morimoto, Y.; Fukuda, T.; Matsuzawa, H.; Tokuyama, T.; and Yoda, K. 1998. Algorithms for Mining Association Rules for Binary Segmentations of Huge Categorical Databases. In Proc. of the 24th Very Large Data Bases Conf., 380-391.

[17] Morishita, S. 1998. On Classification and Regression. In Proc. of the First Int'l Conf. on Discovery Science -- Lecture Notes in Artificial Intelligence 1532:40-57.

[18] Nakaya, A. and Morishita, S. 1999. Fast Parallel Search for Correlated Association Rules. Unpublished manuscript.

[19] Piatetsky-Shapiro, G. 1991. Discovery, Analysis, and Presentation of Strong Rules. Chapter 13 of Knowledge Discovery in Databases, AAAI/MIT Press, 1991.

[20] Rastogi, R. and Shim, K. 1998. Mining Optimized Association Rules with Categorical and Numeric Attributes. In Proc. of the 14th Int'l Conf. on Data Engineering, 503-512.

[21] Rymon, R. 1992. Search through Systematic Set Enumeration. In Proc. of Third Int'l Conf. on Principles of Knowledge Representation and Reasoning, 539-550.

[22] Srikant, R.; Vu, Q.; and Agrawal, R. 1997. Mining Association Rules with Item Constraints. In Proc. of the Third Int'l Conf. on Knowledge Discovery in Databases and Data Mining, 67-73.

[23] Webb, G. I. 1996. Inclusive Pruning: A New Class of Pruning Axiom for Unordered Search and its Application to Classification Learning. In Proc. of the 1996 Australian Computer Science Conference, 1-10.

[24] Webb, G. I. 1995. OPUS: An Efficient Admissible Algorithm for Unordered Search. Journal of Artificial Intelligence Research, 3:431-465.

Appendix A

Here we provide definitions for the gini, entropy gain, and chi-squared rule value functions. For a given condition A, we denote the fraction of records that satisfy the consequent condition among those that satisfy A as p(A), and the fraction of records that do not satisfy the consequent condition among those that satisfy A as p̄(A). Note then that:

    p(A) = sup(A → C) / sup(A)
    p̄(A) = (sup(A) − sup(A → C)) / sup(A)

The functions are then defined as follows, where ∅ denotes the empty (always true) antecedent condition:

    gini(A → C) = (1 − [p(∅)² + p̄(∅)²])
                  − (sup(A)/|D|) · (1 − [p(A)² + p̄(A)²])
                  − (sup(¬A)/|D|) · (1 − [p(¬A)² + p̄(¬A)²])

    ent(A → C) = −[p(∅)·log(p(∅)) + p̄(∅)·log(p̄(∅))]
                 − (sup(A)/|D|) · (−[p(A)·log(p(A)) + p̄(A)·log(p̄(A))])
                 − (sup(¬A)/|D|) · (−[p(¬A)·log(p(¬A)) + p̄(¬A)·log(p̄(¬A))])

    chi²(A → C) = (sup(A)·[p(A) − p(∅)]² + sup(¬A)·[p(¬A) − p(∅)]²) / p(∅)
                  + (sup(A)·[p̄(A) − p̄(∅)]² + sup(¬A)·[p̄(¬A) − p̄(∅)]²) / p̄(∅)

Since C and D are constant for a given instance of the problem, terms such as p(∅) = sup(C)/|D| are constants. Any of the above quantities can be expressed in terms of x and y, and hence so can the functions themselves. For example, sup(A) = x + y, sup(¬A) = |D| − (x + y), p(A) = y/(x + y), p̄(A) = x/(x + y), and so on.
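The definitions above can be evaluated as in the following sketch (our own illustration, expressed through x, y, sup(C), and |D|; it assumes 0 < x + y < |D| so that both sides of the split are non-empty, and uses the natural log, which only rescales entropy gain).

    # Sketch: gini, entropy gain and chi-squared value of A -> C via
    # x = sup(A) - sup(A->C), y = sup(A->C), sup_c = sup(C), d = |D|.
    from math import log

    def _split(x, y, sup_c, d):
        p0, q0 = sup_c / d, 1.0 - sup_c / d       # p(empty) and its complement
        pa, qa = y / (x + y), x / (x + y)         # p(A) and its complement
        na = d - (x + y)                          # sup(not A); assumed > 0
        pn = (sup_c - y) / na                     # p(not A)
        return p0, q0, pa, qa, na, pn, 1.0 - pn

    def gini(x, y, sup_c, d):
        p0, q0, pa, qa, na, pn, qn = _split(x, y, sup_c, d)
        imp = lambda p, q: 1.0 - (p * p + q * q)
        return imp(p0, q0) - (x + y) / d * imp(pa, qa) - na / d * imp(pn, qn)

    def entropy_gain(x, y, sup_c, d):
        p0, q0, pa, qa, na, pn, qn = _split(x, y, sup_c, d)
        h = lambda p, q: -sum(v * log(v) for v in (p, q) if v > 0)
        return h(p0, q0) - (x + y) / d * h(pa, qa) - na / d * h(pn, qn)

    def chi_squared(x, y, sup_c, d):
        p0, q0, pa, qa, na, pn, qn = _split(x, y, sup_c, d)
        return ((x + y) * (pa - p0) ** 2 + na * (pn - p0) ** 2) / p0 \
             + ((x + y) * (qa - q0) ** 2 + na * (qn - q0) ** 2) / q0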

Appendix B

Set-Enumeration Tree Search

The set-enumeration tree search framework is a systematic and complete tree expansion procedure for searching through the power set of a given set U. The idea is to first impose a total order on the elements of U. The root node of the tree enumerates the empty set. The children of a node enumerating the set N will enumerate those sets that can be formed by appending a single element of U to N, with the restriction that this single element must follow every element already in N according to the total order. For example, a fully-expanded set-enumeration tree over a set of four elements (where each element of the set is denoted by its position in the ordering) appears in Figure 6.

Figure 6. A complete set-enumeration tree over a 4 item set.

To facilitate pruning, we use a node representation called a group, where the set enumerated by a node g is called the head and denoted h(g). The set of viable elements of U which can be appended to h(g) in order to form a child is called the tail and denoted t(g). Making the tail elements explicit enables optimizations such as element reordering (for dynamic tree rearrangement) and tail pruning. Element reordering involves locally changing the total order on U at each node in the tree to maximize pruning effectiveness. Tail pruning involves removing elements from the tail if they cannot possibly be part of any solution in the sub-tree rooted at the node. This directly reduces the search space, and indirectly reduces the search space by improving the bounds we compute on values such as the confidence of any rule that can be enumerated by a descendent of the node.
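A minimal sketch of the group expansion just described (our illustration; the element ordering is taken to be the list order):

    # Sketch: expanding a group g = (head, tail) in the set-enumeration tree.
    # Each child moves one tail element into the head; only elements that follow
    # it in the ordering remain in the child's tail, which keeps the enumeration
    # complete and duplicate-free.

    def children(head, tail):
        return [(head + [u], tail[i + 1:]) for i, u in enumerate(tail)]

    # children([], [1, 2, 3, 4]) -> ([1],[2,3,4]), ([2],[3,4]), ([3],[4]), ([4],[])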

Pruning with Confidence, Support, and Improvement

We say a rule r (which we represent using only its antecedent, since the consequent is fixed) is derivable from a group g if h(g) ⊆ r and r ⊆ h(g) ∪ t(g). By definition, any rule that can be enumerated by a descendent of g in the set-enumeration tree is also derivable from g. From [6] we have that:

• The function f_c(x, y) = x / (x + y) provides an upper-bound on the confidence of any conjunctive rule derivable from a given group g, where x and y are non-negative integers such that x ≥ sup(h(g) → C) and y ≤ sup(h(g) ∪ t(g) → ¬C).

• The value of x from above provides an upper-bound on support.

• If the confidence bound given by f_c from above for some group g is less than or equal to the confidence of any rule r enumerated thus far such that r is a subset of h(g) and r satisfies the input constraints, then the maximum improvement of any derivable rule is zero.

• If the following value is equal to zero, then the maximum improvement of any derivable rule is zero:

    min over u ∈ h(g) of sup( (h(g) − {u}) ∪ {¬u} → ¬C )

We refer the reader to [6] for details on how to compute the above values economically given that the data-set is large, and how to heuristically reorder the elements of the tail in order to ensure these functions have plenty of pruning opportunities.
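The confidence and support bounds can be read off as in the sketch below (our illustration, assuming the two support counts for the group are already available; how they are computed economically is described in [6]).

    # Sketch of the bounds above, for a group g with
    #   pos_head = sup(h(g) -> C)            (upper bound on any derivable rule's support)
    #   neg_all  = sup(h(g) u t(g) -> notC)  (lower bound on the "antecedent but not C" count)

    def support_bound(pos_head):
        return pos_head

    def confidence_bound(pos_head, neg_all):
        # f_c(x, y) = x / (x + y) with x >= sup(h(g) -> C), y <= sup(h(g) u t(g) -> notC)
        total = pos_head + neg_all
        return pos_head / float(total) if total else 0.0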

Inclusive Pruning

Webb’s inclusive pruning strategy [23] moves a subset of the tail of a group into its head whenever the following fact can be established: if some solution is derivable from , then at least one of the solutions derivable from is a superset of . For example,in the case of mining optimized conjunctions according to the Laplace function (and many of the other metrics we have examined including our partial orders), suppose for some element in . Ignoring the effects of rule constraints, if some rule derivable from is optimal, then so is the rule , which is also derivable from . If one or more such elements are found in the tail of some node, instead of expanding its children, these elements can all be moved into the head to form a new node that replaces it.

Some constraints may unfortunately prohibit straightforward application of this particular inclusive pruning strategy. For example, an item constraint may disallow u from participating in a solution when combined with some other items from h(g) and t(g). Another problematic constraint is one which bounds the size of a rule’s antecedent. Luckily, work-arounds are typically possible. For example, in the case where a bound k is specified on the number of base conditions that may appear in an antecedent, the strategy can be applied safely whenever |h(g) ∪ t(g)| ≤ k.
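A sketch of this inclusive-pruning step under the Laplace-style condition above, reusing the Group sketch from earlier in this appendix (sup_C is assumed to be a callable returning sup(X → C) for an antecedent X; none of these names come from the paper):

def apply_inclusive_pruning(group, sup_C, max_antecedent_size=None):
    # Tail elements u with sup(h(g) -> C) == sup(h(g) U {u} -> C) can be moved into
    # the head: if an optimal rule is derivable from g, so is that rule with u
    # appended, and the larger rule is still derivable from g.
    if (max_antecedent_size is not None
            and len(group.head) + len(group.tail) > max_antecedent_size):
        return group       # size-bound work-around: only safe when |h(g) U t(g)| <= k
    base = sup_C(group.head)
    movable = [u for u in group.tail if sup_C(group.head | {u}) == base]
    if movable:
        return Group(head=group.head | frozenset(movable),
                     tail=[u for u in group.tail if u not in movable])
    return group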

For reference, the gini, entropy gain, and chi-squared values of a rule A→C used in Appendix A are:

gini(A→C) = (1 - [p(C)² + p(¬C)²])
            - (sup(A)/|D|) · (1 - [p(C|A)² + p(¬C|A)²])
            - (sup(¬A)/|D|) · (1 - [p(C|¬A)² + p(¬C|¬A)²])

ent(A→C) = -[p(C) log p(C) + p(¬C) log p(¬C)]
           - (sup(A)/|D|) · (-[p(C|A) log p(C|A) + p(¬C|A) log p(¬C|A)])
           - (sup(¬A)/|D|) · (-[p(C|¬A) log p(C|¬A) + p(¬C|¬A) log p(¬C|¬A)])

chi²(A→C) = ( sup(A)·[p(C|A) - p(C)]² + sup(¬A)·[p(C|¬A) - p(C)]² ) / p(C)
          + ( sup(A)·[p(¬C|A) - p(¬C)]² + sup(¬A)·[p(¬C|¬A) - p(¬C)]² ) / p(¬C)

where p(C) = sup(C)/|D|; as noted in Appendix A, all of these quantities reduce to functions of x = sup(A) - sup(A→C) and y = sup(A→C).

[Figure 6: the root enumerates {}; its children are 1, 2, 3, and 4; each node's children append a single later element, e.g. 1 → {1,2}, {1,3}, {1,4}; {1,2} → {1,2,3}, {1,2,4}; and so on down to {1,2,3,4}.]

Figure 6. A complete set-enumeration tree over a 4 item set.

