MINESWEEPER, #MINESWEEPER

Preslav Nakov, Zile Wei
{nakov,zile}@
May 14, 2003

"Hence the easiest way to ensure you always win: 100-square board, 99 mines." – Chris Mattern, Slashdot

Abstract

We address the problem of finding an optimal policy for an agent playing the game of minesweeper. The problem is solved exactly, within the reinforcement learning framework, using a modified value iteration. Although it is a Partially Observable Markov Decision Problem (MDP), we show that, when using an appropriate state space, it can be defined and solved as a fully observable MDP.

As has been shown by Kaye, checking whether a minesweeper configuration is non-contradictory is an NP-complete problem. We go a step further, as we define a corresponding counting problem, named #MINESWEEPER, and prove it is #P-complete. We show that computing both the state transition probability and the state value function requires finding the number of solutions of a system of Pseudo-Boolean equations (or 0-1 Programming), or alternatively, counting the models of a CNF formula describing a minesweeper configuration. We provide a broad review of the existing approaches to exact and approximate model counting, and show that none of them matches our requirements exactly.

Finally, we describe several ideas that cut the state space dramatically and allow us to solve the problem for boards as large as 4×4. We present the results for the optimal policy and compare it to a sophisticated probabilistic greedy one, which relies on the same model counting idea. We evaluate the impact of the free first move and provide a list of the best opening moves for boards of different sizes and with different numbers of mines, according to the optimal policy.

1 About minesweeper

Minesweeper is a simple game, whose popularity is due to a great extent to the fact that it has been regularly included in Microsoft Windows since 1991 (as implemented by Robert Donner and Curt Johnson). Why is it interesting? Because it hides in itself one of the most important problems of theoretical computer science: the question whether P=NP. P vs. NP is one of the seven Millennium Prize Problems, for each of which the Clay Mathematics Institute of Cambridge, Massachusetts offers a prize of $1,000,000 [2]. Its connection to minesweeper was revealed by Kaye, who proved minesweeper is NP-complete [36], and is further discussed by Stewart in Scientific American [48].

Some interesting resources about Minesweeper on the Web include:
♦ Richard Kaye's minesweeper pages. Focused on the theory behind the game [35].
♦ The Authoritative Minesweeper. Tips, downloads, scores, cheats, and info for human players. A world records list is also supported [12].
♦ Programmer's minesweeper. A Java interface that allows a programmer to implement a strategy for playing minesweeper and then observe how it performs.
Contains several ready strategies [6].
♦ Minesweeper strategies. Tips and tricks for human players [4].
♦ Minesweeper strategies & tactics. Strategies and tactics for the human player [3].

2 SAT, k-SAT

We start with some definitions. Literals are Boolean variables (with values from {0,1}) and their negations. A Boolean formula (or propositional formula) is an expression written using AND, OR, NOT, parentheses, and variables.

A Boolean formula is in Disjunctive Normal Form (DNF) if it is a finite disjunction (logical OR) of a set of DNF clauses. A DNF clause is a conjunction (logical AND) of a finite set of literals.

A Boolean formula is in Conjunctive Normal Form (CNF) if it is a finite conjunction (logical AND) of a set of CNF clauses. A CNF clause is a disjunction (logical OR) of a finite set of literals.

Unless otherwise stated, below we will work with Boolean formulas (or just formulas) in CNF. We will use n to refer to the number of variables, and m to the number of CNF clauses (or just clauses).

Boolean formulas are at the heart of the first known NP-complete problem: SAT (Boolean SATisfiability) [24]. Given a Boolean formula, it asks whether there exists an assignment that would make it true. The problem remains NP-complete even when restricted to CNF formulas with up to k variables per clause (a.k.a. k-CNF), provided that k ≥ 3. Some particular cases can be solved in polynomial time, e.g. 2SAT, Horn clauses, etc. It is interesting to note that DNF-SAT is also solvable in polynomial time (with respect to the number of clauses, not the variables!): it is enough to check whether there exists a clause without contradictions, i.e. one that does not contain both x and ¬x at the same time. SAT is an important problem for logic, circuit design, artificial intelligence, probability theory, etc. A major source of information about SAT is the SAT Live! site [8]. Ready implementations of different solvers, tools, evaluations, benchmarks, etc. can be obtained from SATLIB – The Satisfiability Library [9].
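To make the DNF-SAT observation concrete, here is a minimal sketch (our own illustration, not from the paper), using the common convention that a clause is a set of integer literals, with −v denoting the negation of variable v:

```python
def dnf_satisfiable(clauses):
    """A DNF formula is satisfiable iff some clause is free of contradictions,
    i.e. contains no variable together with its negation. Runs in time linear
    in the total number of literals: any contradiction-free conjunction of
    literals can be read off directly as a model.
    """
    return any(all(-lit not in clause for lit in clause) for clause in clauses)

# (x1 AND NOT x1) OR (x2 AND x3): satisfiable via its second clause.
print(dnf_satisfiable([{1, -1}, {2, 3}]))  # True
```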
3 Related Work

The most systematic study of minesweeper so far is due to Kaye. He tried to find a strategy such that no risk is taken when there is a square that can be safely uncovered, and this led him to the following decision problem definition [36]:

MINESWEEPER: Given a particular minesweeper configuration, determine whether it is consistent, i.e. whether it can have arisen from some pattern of mines.

In fact, he named the problem the minesweeper consistency problem, but we prefer to use MINESWEEPER. Kaye studied the following strategy: To determine whether a background square is free, we can mark it as a mine. If the resulting configuration is inconsistent (the MINESWEEPER answer is NO), the square is safe to uncover.

It is easy to see that MINESWEEPER is a special case of SAT, and more precisely of 8-SAT, since no clause can contain more than 8 literals. Is it NP-complete, however? This question is reasonable since MINESWEEPER is a proper subset of 8-SAT: only formulas of some special forms are possible. Namely, all clauses can have either only positive or only negative literals. In addition, for each positive-only clause there exists exactly one negative-only one, which contains the same set of variables but with all of them negated.

Kaye showed how different minesweeper configurations could be used to build logic circuits and thus to simulate digital computers. For the purpose, he presented configurations that can be used as wire, splitter, and crossover, as well as a set of logic gates: NOT, AND, OR, and XOR. The presence of AND and NOT gates (or alternatively OR and NOT gates) proves that MINESWEEPER is NP-complete [36,1]. This proof follows to a great extent the way Berlekamp et al. [19] proved that John Conway's game of life [29] is Turing-complete. In addition, Kaye observed that the game of life is played on an infinite board, so he defined an infinite variant of minesweeper and then proved it is Turing-complete as well [37].

There has also been some work on the analysis and implementation of real game playing strategies. Adamatzky [13] describes a simple cellular automaton, capable of populating an n×n board in time Ω(n), with each automaton cell having 27 states and 25 neighbors, including itself. Rhee [46] and Quartetti [45] used genetic algorithms. There have also been some attempts to use neural networks (see the term projects [18], [33]).

Castillo and Wrobel address the game of minesweeper from a machine learning perspective. Using Mio, a general-purpose system for Inductive Logic Programming (ILP), they learned a strategy which achieved a "better performance than that obtained on average by non-expert human players" [23]. In fact, the rules learned by the system were for a board of a fixed small size: 8×8. The system learned four Prolog-style rules, exploiting very local information only. Three of the rules were virtually the same as the most trivial ones as implemented in the equation strategy of PGMS [6]. While the average human beginner wins 35% of the time, they achieved 52%. This rate rose to 60% when a very simple local probability, defined separately over each equation, was employed to help guessing in case no rule is applicable, thus achieving the win rate of PGMS. In fact, all these attempts, although interesting, are inadequate, since solving MINESWEEPER requires global logical inference and model counting.

4 Pseudo-Boolean problems

Should we necessarily think of minesweeper in terms of Boolean formula satisfiability? The equations that arise there are of the form:

$$x_1+x_2+x_3+x_4+x_5+x_6+x_7+x_8=5$$

All the coefficients of the c unknowns on the left-hand side are 1, and the number k on the right-hand side is an integer, 1 ≤ k ≤ 7 (we do not allow 0 and 8, which are trivial), and in addition, k < c.

Converting this equation to CNF form requires putting positive-only clauses (e.g. $x_1\lor x_3\lor x_4\lor x_6\lor x_7$) and $\binom{8}{5}$ negative-only counterparts (e.g. $\bar{x}_1\lor\bar{x}_3\lor\bar{x}_4\lor\bar{x}_6\lor\bar{x}_7$), which is 112 in total. Do we really need to do this? Can't we just look at MINESWEEPER as a constraint satisfaction problem of a special kind and try to solve it more efficiently? Let us explore the options.

First, note that SAT (suppose a CNF formula is given) can be written as an Integer Programming problem in the following way:

$$\text{minimize } w \quad\text{subject to:}\quad w+\sum_{i\in c_j^+}x_i+\sum_{i\in c_j^-}(1-x_i)\ge 1,\ \ j=1,\dots,m;\qquad x_i\in\{0,1\},\ \ i=1,\dots,n;\qquad w\ge 0\tag{1}$$

where $x_i$ (1 ≤ i ≤ n) are the unknowns, $c_j$ (1 ≤ j ≤ m) are the CNF clauses ($c_j^-$ and $c_j^+$ are the sets of the negative and the positive literals of the clause $c_j$), and w is a new variable, added to ensure there will be a feasible solution. The formula is unsatisfiable if the optimal w > 0.

In the case of minesweeper, we can allow not only 1, but also other natural numbers on the right-hand side of the equation. In addition, we will need to put two inequalities for each equation, but we will have no combinatorial explosion of clauses:

$$x_1+x_2+x_3+x_4+x_5+x_6+x_7+x_8\le 5,\qquad x_1+x_2+x_3+x_4+x_5+x_6+x_7+x_8\ge 5\tag{2}$$
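To see the contrast concretely, the sketch below expands a single cell equation into CNF using the textbook cardinality encoding ("at least k" via all (c−k+1)-subsets, "at most k" via all (k+1)-subsets); the exact clause count depends on the encoding chosen (this minimal one gives 98 for the "5" cell above), but any such expansion is binomial in c, whereas the linear view keeps one equation, or the two inequalities of (2):

```python
from itertools import combinations

def exactly_k_cnf(variables, k):
    """CNF for 'exactly k of these variables are true', textbook expansion.

    At least k true: every (c-k+1)-subset must contain a true literal.
    At most  k true: every (k+1)-subset must contain a false literal.
    The clause count is binomial in c, so it blows up combinatorially,
    while the single linear constraint sum(x) = k stays constant-size.
    """
    c = len(variables)
    at_least = [set(sub) for sub in combinations(variables, c - k + 1)]
    at_most = [{-v for v in sub} for sub in combinations(variables, k + 1)]
    return at_least + at_most

print(len(exactly_k_cnf(list(range(1, 9)), 5)))  # 98 clauses for one "5" cell
```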
In fact, the reduction, although instructive, is not necessarily the best one. Note that in SAT we do not have a function to optimize. In addition, in MINESWEEPER we always have only equalities (for the local constraints), which means we need to solve a system of linear equations. In general this can be done in $\Theta(n^3)$ using Gaussian elimination. In the discrete case this is a problem of solving a system of Diophantine equations, which is an NP-complete problem. Note, however, that our problem is more restricted than that: we want the values to be not just integers, but 0 or 1.

So, looking back at the Integer Programming problem with inequalities, we can see it is a special instance, namely pure 0-1 programming, where the variable values are restricted to 0 and 1. It is natural to allow the inequalities to accept natural numbers other than 1 on the right-hand side. If in addition we allow the coefficients to take integer values other than 0 and 1, we obtain the Pseudo-Boolean problem.

The problem can be solved by means of pure Boolean methods (e.g. a Pseudo-Boolean version of relsat [26], randomization [44], or DPLL techniques [14]) or using Integer Programming techniques (e.g. branch-and-bound and branch-and-cut [26], [15]). In the latter case it would be more natural to name the problem 0-1 programming. A good source of information about solving pseudo-Boolean problems is PBLIB – the Pseudo-Boolean Library [5], where several useful links to sites, papers, people, implementations, tools, etc. can be found.

5 Model counting

5.1 #P, #P-complete, #SAT, #k-SAT

It has been realized that there are problems, primarily in AI, that require not satisfiability checking but rather counting the number of models µ(F) of a CNF formula F. A broad overview and several examples can be found in [47]. Some of these include: computing the number of solutions of a constraint satisfaction problem, inference in Bayesian networks, finding the degree of belief in a propositional statement s with respect to a knowledge base B (computed as µ(B+{s})/µ(B)), estimating the utility of an approximate theory A with respect to the original one O (i.e. µ(O)−µ(A)), etc.

For the purpose of studying counting problems, Valiant [49] introduced some new complexity classes: #P, #P-complete, etc. Formally, a problem is in #P if there is a nondeterministic, polynomial-time Turing machine that, for each instance of the problem, has a number of accepting computations exactly equal to the number of its distinct solutions. It is easy to see that a #P problem is necessarily at least as hard as the corresponding NP problem: once we know the number of solutions, we can just compare this number to zero and thus solve the NP problem. So, any #P problem corresponding to an NP-complete problem is NP-hard. In addition to #P, Valiant defined the class #P-complete. A problem is #P-complete if and only if it is in #P, and every other problem in #P can be reduced to it in polynomial time.

The problem of finding the number of models of a Boolean formula is known in general as #SAT. Just like SAT, #SAT has versions #2SAT, #3SAT, #k-SAT, etc., where the maximum number of literals in a clause is limited by k. #SAT has been shown to be #P-complete in general [49], and thus it is necessarily exponential in the worst case (unless the complexity hierarchy collapses). Note that while 2SAT is solvable in polynomial time, #2SAT is #P-complete (and so is #k-SAT, k ≥ 2).
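The definition of #SAT is easy to state as code, if hopelessly exponential; a minimal baseline sketch (our own, using the same integer-literal clause convention as before):

```python
from itertools import product

def count_models(clauses, n):
    """#SAT by exhaustive enumeration over all 2^n assignments: O(m * 2^n).

    bits[v-1] is the truth value of variable v; a clause (set of integer
    literals, negatives for negations) is satisfied if any literal holds.
    """
    return sum(
        all(any(bits[abs(l) - 1] == (l > 0) for l in cl) for cl in clauses)
        for bits in product([False, True], repeat=n)
    )

# (x1 or x2) and (not x1 or x2): the models are 01 and 11.
print(count_models([{1, 2}, {-1, 2}], n=2))  # 2
```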
5.2 #MINESWEEPER

While Kaye [36] addressed the problem of finding sure squares only, this is not enough to play well, since most real game configurations will have no sure cases. So, we face a problem of reasoning under uncertainty. A reasonable heuristic would be to probe the square with the lowest probability of containing a mine, thus acting greedily. This probability can be estimated as the ratio of the number of different mine distributions in which a fixed background square contains a mine to the number of all different mine distributions consistent with the current configuration. So, we need to define the #MINESWEEPER problem:

#MINESWEEPER: Given a particular minesweeper configuration, find the number of the possible mine distributions over the covered squares.

Theorem 5.1. #MINESWEEPER is #P-complete.

Proof. Reduction from #CIRCUIT-SAT. #CIRCUIT-SAT is the basic #P-complete problem and is defined as follows: given a circuit, find the number of inputs that make it output 1. Using Kaye's result that a minesweeper configuration can represent logical gates, wires, splitters, and crossovers, it follows that any program that tries to solve #MINESWEEPER should be able to solve any logical circuit, i.e. solve #CIRCUIT-SAT.

The theorem above means we cannot hope to solve #MINESWEEPER in polynomial time unless all #P-complete problems can be solved in polynomial time, i.e. unless P=NP. But we can try to approximate it.

5.2.1 Inclusion-exclusion principle

Before we present some of the most interesting approaches so far, it is interesting to say that the problem of model counting is a special case of a more general problem (again #P-complete): finding the probability of the union of m events from some probability space. More formally, let $A_1,A_2,\dots,A_m$ be events from a probability space and suppose we want to find $P(\bigcup A_i)$. This is given by the inclusion-exclusion formula:

$$P\Big(\bigcup_{i=1}^m A_i\Big)=\sum_{i=1}^m P(A_i)-\sum_{1\le i<j\le m}P(A_i\cap A_j)+\sum_{1\le i<j<k\le m}P(A_i\cap A_j\cap A_k)-\cdots+(-1)^{m-1}P(A_1\cap A_2\cap\cdots\cap A_m)$$

This formula contains $2^m-1$ terms, and thus direct evaluation would take exponential time. In addition, it cannot be calculated exactly in the absence of any of these terms. So, here comes the natural question: how well can it be approximated? It is interesting to note that this is an old question, first studied by Boole [21]. In fact, he was interested in approximating the size of the union of a family of sets, which is essentially an equivalent problem; we just have to divide by $2^n$, since:

$$\Big|\bigcup_{i=1}^m A_i\Big|=\sum_{i=1}^m|A_i|-\sum_{1\le i<j\le m}|A_i\cap A_j|+\sum_{1\le i<j<k\le m}|A_i\cap A_j\cap A_k|-\cdots+(-1)^{m-1}|A_1\cap A_2\cap\cdots\cap A_m|$$

How do we go from finding the probability of the union of probabilistic events to #SAT? Let's assume the probability space $\{0,1\}^n$ under the uniform distribution and let $A_i$ be the event that the i-th clause (1 ≤ i ≤ m) in a DNF formula is satisfied. Then $|A_i|$ is the number of models of the i-th clause and $|\bigcup_{i=1}^m A_i|$ is the number of assignments satisfying at least one clause of the formula F, i.e. the number of models µ(F) of F.

But we are interested in CNF, not in DNF. While these forms are equivalent, it is not straightforward to see how the inclusion-exclusion principle can be applied to CNF. The solution comes easily, though, if we revise the definition of $A_i$: now it will be the event that the i-th clause in the CNF formula is falsified. So, $|A_i|$ will be the number of falsifiers of the i-th clause, and $|\bigcup_{i=1}^m A_i|$ – the number of assignments falsifying at least one CNF clause of F, i.e. $2^n-\mu(F)$.
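This falsifier view yields a direct inclusion-exclusion counter (still exponential, now in m): a set of CNF clauses is simultaneously falsified exactly by the assignments setting every literal in them false, which is possible iff no variable occurs there with both signs. A minimal sketch of the idea:

```python
from itertools import combinations

def count_models_ie(clauses, n):
    """#SAT via inclusion-exclusion over the falsification events A_i.

    |A_i1 cap ... cap A_ik| = 2^(n - #distinct vars) when the chosen clauses
    have no conflicting literals, and 0 otherwise; mu(F) = 2^n - |union A_i|.
    """
    falsifying = 0
    for k in range(1, len(clauses) + 1):
        for subset in combinations(clauses, k):
            lits = set().union(*subset)
            if any(-l in lits for l in lits):
                continue  # conflicting literals: nothing falsifies all of them
            falsifying += (-1) ** (k - 1) * 2 ** (n - len(lits))
    return 2 ** n - falsifying

print(count_models_ie([{1, 2}, {-1, 2}], n=2))  # 2, agreeing with enumeration
```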
Note that the uniform distribution over $\{0,1\}^n$ for DNF (or equivalently for CNF) is in fact a special case. As we will see below, #SAT is a much easier problem than finding the probability of the union of arbitrary probabilistic events.

5.3 Approximation algorithms

5.3.1 Luby & Velickovic algorithm

Luby and Velickovic [41] try to approximate the proportion of truth assignments of the n variables that satisfy the DNF formula F, i.e. $P(F)=\mu(F)/2^n$. They propose a deterministic algorithm which, for any ε and δ (ε > 0, δ > 0), calculates an estimate Q such that $(1-\delta)P(F)-\epsilon\le Q\le P(F)+\epsilon$. The running time of this algorithm is polynomial in nm, multiplied by the term:

$$\left(\frac{\log(nm)}{\delta}\right)^{\frac{\log^2(m/\epsilon)}{\delta}}$$

If the maximum clause length is t, then for any β (β > 0) the deterministic algorithm finds an approximation satisfying $(1-\beta)P(F)\le Q\le(1+\beta)P(F)$. This time the algorithm is polynomial in nm, multiplied by:

$$\left(\frac{\max(t,\log n)}{\beta}\right)^{\frac{t^2}{\beta}}$$

5.3.2 Inclusion-exclusion algorithms

Linial and Nisan addressed the more general problem of finding the size of the union of a family of sets [38]. They showed that it could be approximated effectively by using only part of the intersections. In particular, if the sizes of all intersections of at most k distinct sets are known, and $k\ge\Omega(\sqrt{m})$, then the union size can be approximated within $e^{-\Omega(k/\sqrt{m})}$. The bound above has been shown to be non-tight, and has been improved by Kahn et al. [34] to $e^{-\tilde{\Omega}(k/\sqrt{m})}$. This resulted in a polynomial-time approximation algorithm, which is optimal in the sense that this bound is tight and, in general, it is impossible to achieve a better approximation (regardless of the complexity). Note that this is only an approximation, and there can be an error even when k = m−1.

5.4 Exact algorithms

5.4.1 Exponential algorithms

The obvious and naive implementation of exact model counting enumerates all the models and thus runs in time $O(m2^n)$. Can we do better? If we restrict the maximum number of literals contained in a single clause to r, then we can decrease the base of the exponent to $\alpha_r$, the unique positive root of the polynomial $y^r-y^{r-1}-\dots-y-1$. This gives an algorithm running in time $O(m\alpha_r^n)$. Note that this is generally better than the naive approach, since $\alpha_2\approx 1.62$, $\alpha_3\approx 1.84$, $\alpha_4\approx 1.93,\dots$, but in the limit ($r\to\infty$) we still have $\alpha_r\to 2$. This algorithm, as described, has been proposed by Dubois [27]. A similar approach (with similar complexity) has been presented by Zhang [50]. Despite the improvement, though, the complexity remains exponential (and there is no improvement in the limit).
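The constants $\alpha_r$ are easy to reproduce; a small sketch using bisection (for every r ≥ 2 the root is bracketed by (1, 2), since the polynomial is negative at 1 and equals 1 at 2):

```python
def alpha(r, iters=60):
    """Unique positive root of y^r - y^(r-1) - ... - y - 1, by bisection."""
    f = lambda y: y ** r - sum(y ** i for i in range(r))
    lo, hi = 1.0, 2.0  # f(1) = 1 - r < 0 and f(2) = 1 > 0
    for _ in range(iters):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if f(mid) < 0 else (lo, mid)
    return lo

print([round(alpha(r), 2) for r in (2, 3, 4)])  # [1.62, 1.84, 1.93]
```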
5.4.2 Lozinskii algorithm

Iwama [30] used model counting together with formula falsification to solve SAT. He achieved a polynomial average running time $O(m^{c+1})$, where c is a constant such that $\ln c\ge\ln m-p^2n$, and p is the probability that a particular literal appears in a particular clause in the target formula. Lozinskii [40] introduced a similar algorithm, but for #SAT. Under reasonable assumptions, it calculates the exact number of models in time $O(m^c n)$, where $c\in O(\log m)$.

5.4.3 CDP algorithm

Birnbaum and Lozinskii [20] proposed an algorithm named CDP (Counting by Davis-Putnam) for counting the number of models of a Boolean CNF formula exactly, based on a modification of the classic Davis-Putnam procedure [25]. There are two essential changes: first, the pure literal rule is omitted (otherwise some solutions would be missed), and second, in case the formula is satisfied by setting values to only part of the variables, the remaining i unset variables can take all possible values, and thus the value $2^i$ is returned. The average running time of CDP is $O(m^d n)$, where $d=\lceil -1/\log_2(1-p)\rceil$ and p is the probability that a particular literal appears in a particular clause in the target formula.

5.4.4 DDP algorithm

The DDP (Decomposing Davis-Putnam) algorithm has been proposed by Bayardo and Pehoushek [16]. Just like CDP [20], it is an extension of the Davis-Putnam (DP) procedure [25]. The idea is to identify groups of variables that are independent of each other, which is essentially a kind of divide-and-conquer approach: find the connected components of the constraint graph, solve the problem for each of them independently, and then combine the results. This is a fairly old idea, already exploited for SAT as well as for more general constraint satisfaction problems by Freuder and Quinn [28] and many other researchers thereafter. The idea of DDP is to apply it recursively, as opposed to just once at the beginning. During the recursive exploration of the graph, DP assigns values to some of the variables and thus breaks the graph into separate connected components, which are in turn identified and exploited, which dramatically cuts the calculations.

The algorithm is implemented as an extension of version 2.00 of the relsat algorithm [17], and the source code is available on the Internet [7]. The relsat algorithm is an extension of the Davis-Putnam-Logemann-Loveland (DPLL) method [39], and it dynamically learns nogoods (the only difference from DPLL), which can dramatically speed it up: in case the variables form a chain, SAT can be solved in quadratic time. For #SAT, though, one would also need to record the goods (this was not implemented and tested). In the case of DDP this would lead to quadratic complexity for solving #SAT even in case the variables form a tree. Anyway, in some hard cases (e.g. when m/n ≈ 4) the complexity will necessarily be exponential. Performance evaluation tests show DDP is orders of magnitude faster than CDP, both on random instances and on benchmarks like the ones in SATLIB [9].

5.4.5 Inclusion-exclusion algorithms specialized to #SAT

Looking carefully at the general problem of finding the size of the union of a family of sets (see 5.2.1 above), Kahn et al. [34] found that in the special case of #SAT it is enough to know the number of models for every subset of up to 1+log n clauses. Moreover, this has been proved to fix the number of satisfying assignments for the whole formula, and thus allows the calculation of the exact number of models, not just its approximation. Note also that this depends on the number of the variables only, and not on the number of the clauses! The latter proof of Kahn et al. was only existential, and they did not come up with an algorithm. One has since been found by Melkman and Shimony [42]. The result has been extended thereafter and specialized to #k-SAT, k ≥ 2, as Iwama and Matsuura showed that n can be changed to k and that only the clauses of some fixed size matter (thus, we should not require to know the ones of smaller size). So, for #k-SAT, knowing the number of models for all clauses of size exactly 2+log k has been proved to be both necessary and sufficient [31]. Unfortunately, the latter proof is only existential and there is no practical algorithm developed yet.
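To make the counting idea of CDP (Section 5.4.3) concrete, here is a minimal counting-DPLL sketch of our own: plain splitting, with no unit propagation, component decomposition, or learning. The $2^i$ credit for unset variables and the absence of the pure-literal rule are the essential points:

```python
def count_dp(clauses, variables):
    """Count models by branching on a variable; clauses are sets of int literals.

    When every clause is satisfied, the i remaining free variables may take
    any values, so 2^i models are credited -- the key change versus a plain
    satisfiability-oriented DPLL. The pure-literal rule must not be used,
    since it would silently discard models.
    """
    if any(not cl for cl in clauses):
        return 0                      # an empty clause: this branch is falsified
    if not clauses:
        return 2 ** len(variables)    # all satisfied: free variables are arbitrary
    v = next(iter(variables))
    rest = variables - {v}
    total = 0
    for lit in (v, -v):               # try v = True, then v = False
        reduced = [cl - {-lit} for cl in clauses if lit not in cl]
        total += count_dp(reduced, rest)
    return total

print(count_dp([{1, 2}, {-1, 2}], {1, 2}))  # 2
```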
5.5 What are we using?

Initially, we used the Boolean Constraint Solver library module [10] provided with SICStus Prolog [11]. It is an instance of the general Constraint Logic Programming scheme introduced in [32]. The internal representation of the Boolean functions is based on Binary Decision Diagrams, as described in [22], and it allows using linear equalities as constraints, which saved us from combinatorial explosions of the number of clauses needed. While we found it fairly fast for solving MINESWEEPER, it was not suitable for #MINESWEEPER: the only way to find the number of satisfying assignments was to enumerate them all, which was unacceptable. We then performed an extensive study of the literature on model counting and adopted another approach.

Currently, we use the implementation of DDP (see 5.4.4) as provided in version 2.00 of the relsat algorithm [7]. We apply the model counting for reasoning with incomplete information: when we cannot logically conclude either that there is or that there is not a mine in a particular square x, we solve the #MINESWEEPER problem twice: once with a mine put in x and once for the original configuration, and then take their ratio.

We could also use an approximate inclusion-exclusion algorithm, as specialized to #SAT by Melkman and Shimony [42], or even the one of Iwama and Matsuura [31], if a practical implementation is found in the future. In addition, we think that in the case of #MINESWEEPER it might be possible to improve the result of Iwama and Matsuura further because of the special CNF structure: all clauses are paired positive-negative, and for each one with only positive literals there is another one that contains the same variables but all negated. Looking from the Pseudo-Boolean perspective, the problem is still a proper subset of it: we do not need inequalities, and we do not have coefficients other than 0 and 1. Ideally, solving #MINESWEEPER would be done by means of model counting for Pseudo-Boolean formulas: e.g. a combination between the Boolean Constraint Solver library module of SICStus Prolog and the relsat model counter.
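In code, this probability estimate is just two counting calls; a sketch of the idea (count_models stands for any exact counter, e.g. the naive enumerator sketched earlier or a DDP-style one; asserting "there is a mine at x" is one extra unit clause):

```python
def mine_probability(clauses, n, x):
    """Estimate P(square x is a mine) = mu(F + {x}) / mu(F).

    clauses encode the current configuration; count_models is any exact
    model counter (e.g. the brute-force one sketched earlier). The unit
    clause {x} asserts a mine at square x.
    """
    total = count_models(clauses, n)
    with_mine = count_models(clauses + [{x}], n)
    return with_mine / total if total else 0.0
```

A square is provably safe exactly when this ratio is 0, and a sure mine when it is 1, so the same two calls subsume Kaye's consistency-based test.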
6 Minesweeper

Minesweeper is a simple game whose goal is to clear all squares on a board that do not contain mines. The basic elements of the game are: the playing board, the remaining-mines counter, and a timer. The rules are the following:

♦ The player can uncover an unknown square by clicking on it. If it contains a mine, the game is lost.
♦ If the square is free, a number between 0 and 8 appears on it, indicating how many mines there are in its immediate neighbor squares.
♦ A square can be marked as a mine by right-clicking on it.
♦ The game is won when all non-mine squares are uncovered.

Minesweeper is an imperfect-information computer game. The player probes on an n×n map and has to make decisions based on the current observations only, without knowing which squares contain mines and which are free. The only additional information provided is the total number of remaining mines. If the probed square contains a mine, the player loses the game; otherwise, the number of mines in the adjacent squares (up to 8) appears in the probed square. In most computer implementations (and in our interpretation in particular) the player never dies on the first probe, but, as we will see below, this doesn't change the complexity of the problem substantially.

We assign a {0/1} variable to each unknown square in the map: the value is 1 if it contains a mine, and 0 otherwise. All variables must obey the constraints imposed by the numbers in the uncovered squares, as well as the global constraint on the remaining number of mines, namely:

$$\sum_{j\in N(i)}X_j=O(i)\ \text{ for each known square } i,\qquad\quad \sum_j X_j=M\tag{3}$$

where N(i) is the set of the neighbors (up to 8) of the known square i, O(i) is the value of the uncovered square i, and M is the total number of mines in the map. An example follows (m marks a square already known to be a mine):

X1 X2 1
X3 X4 2
1  2  m

The following equations correspond to the configuration above:

$$X_2+X_4=1,\qquad X_3+X_4=1,\qquad X_1+X_2+X_3+X_4=M\tag{4}$$

When M=1 or M=3, there is a unique solution, but in case M=2 there are two: $X_2=X_3=1$ and $X_1=X_4=1$. Let $\mathcal{X}$ be the set of all solutions of (3). Then we can easily compute for each square the probability that it contains a mine, as a simple ratio:

$$P(X_i=1)=\frac{|\{X\in\mathcal{X}:X(i)=1\}|}{|\mathcal{X}|}$$

So, when M=2 we have $P(X_1=1)=1/2$. In order for a square to be a mine or free for sure, its value in all solutions should be 1 or 0, respectively.

Figure 1: Optimal policy on a 2×4 map with 2 mines

Once we have computed the probability of containing a mine for each square, we can come up with a fairly simple greedy policy that always tries the safest square. We will show this is not always optimal. Consider the 2×4 map with 2 mines shown in figure (1). There are 5 unknown squares, 2 probed free ones, and one sure mine. (The mine was not probed, of course, but was discovered as a result of logical inference.) $P(X_{1,1}=1)=P(X_{1,2}=1)=P(X_{2,1}=1)=1/3$, but $P(X_{1,3}=1)=P(X_{2,3}=1)=1/2$. Although (1,1), (1,2), and (2,1) are safer than (1,3) and (2,3), the optimal policy probes (1,3).

Let's try to analyze the probability that the optimal policy will win on this map. At the first step, probing at (1,3) will succeed with probability 1/2. There are two possible outcomes after a successful probe at (1,3). If the square (1,3) contains 3 (this will happen with probability 1/3), then the other squares containing a mine will be (1,2) and (2,3), and so the game will be won. If (1,3) contains 2 (which has probability 2/3), the square (1,2) is surely not a mine and (2,3) is surely a mine. Hence, the agent has to guess the position of the last mine by choosing between (1,1) and (2,1) only, and thus will succeed with probability 1/2. So the total probability that the optimal policy will win at this configuration is $1/2\times(1/3+1/2\times 2/3)=1/3$.

If the agent follows the greedy policy, though, then he has to choose between (1,2), (1,1), and (2,1), which look equally good. Suppose the agent probes (1,2). Then he will succeed with probability 2/3. But in that outcome he has to make two guesses, between (1,3) and (2,3), and between (1,1) and (2,1). Hence he can succeed with probability 1/4. If the agent probes (1,1) or (2,1), and then tries (1,3), as in the optimal policy, the probability that he can succeed is at most 1/2. So the expected probability that the greedy policy wins at this configuration is $1/3(2/3\times 1/4+2/3\times 1/2+2/3\times 1/2)=2/9$, which is less than for the optimal policy.
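The numbers in the worked 3×3 example above are easy to check by brute force; a throwaway sketch (ours, not the system described in the paper) that enumerates the solutions of equations (4):

```python
from itertools import product

def solutions(M):
    """All (X1, X2, X3, X4) in {0,1}^4 satisfying equations (4) with M mines."""
    return [x for x in product((0, 1), repeat=4)
            if x[1] + x[3] == 1 and x[2] + x[3] == 1 and sum(x) == M]

sols = solutions(2)
print(sols)                                 # [(0, 1, 1, 0), (1, 0, 0, 1)]
print(sum(x[0] for x in sols) / len(sols))  # P(X1=1) = 0.5, as claimed
```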
Journal of Artificial Intelligence Research 4 (1996) 419-443. Submitted 2/96; published 6/96

A Principled Approach Towards Symbolic Geometric Constraint Satisfaction

Sanjay Bhansali (BHANSALI@) School of EECS, Washington State University, Pullman, WA 99164-2752
Glenn A. Kramer (GAK@) Enterprise Integration Technologies, 800 El Camino Real, Menlo Park, CA 94025
Tim J. Hoar (TIMHOAR@) Microsoft Corporation, One Microsoft Way, 2/2069, Redmond, WA 98052

Abstract

An important problem in geometric reasoning is to find the configuration of a collection of geometric bodies so as to satisfy a set of given constraints. Recently, it has been suggested that this problem can be solved efficiently by symbolically reasoning about geometry. This approach, called degrees of freedom analysis, employs a set of specialized routines called plan fragments that specify how to change the configuration of a set of bodies to satisfy a new constraint while preserving existing constraints. A potential drawback, which limits the scalability of this approach, is concerned with the difficulty of writing plan fragments. In this paper we address this limitation by showing how these plan fragments can be automatically synthesized using first principles about geometric bodies, actions, and topology.

1. Introduction

An important problem in geometric reasoning is the following: given a collection of geometric bodies, called geoms, and a set of constraints between them, find a configuration – i.e., position, orientation, and dimension – of the geoms that satisfies all the constraints. Solving this problem is an integral task for many applications like constraint-based sketching and design, geometric modeling for computer-aided design, kinematics analysis of robots and other mechanisms (Hartenberg & Denavit, 1964), and describing mechanical assemblies.

General purpose constraint satisfaction techniques are not well suited for solving constraint problems involving complicated geometry. Such techniques represent geoms and constraints as algebraic equations, whose real solutions yield the numerical values describing the desired configuration of the geoms. Such equation sets are highly non-linear and highly coupled and in the general case require iterative numerical solution techniques. Iterative numerical techniques are not particularly efficient and can have problems with stability and robustness (Press, Flannery, Teukolsky & Vetterling, 1986). For many tasks (e.g., simulation and optimization of mechanical devices) the same equations are solved repeatedly, which makes a compiled solution desirable. In theory, symbolic manipulation of equations can often yield a non-iterative, closed form solution. Once found, such a closed-form solution can be executed very efficiently. However, the computational intractability of symbolic algebraic solution of the equations renders this approach impractical (Kramer, 1992; Liu & Popplestone, 1990).

In earlier work Kramer describes a system called GCE that uses an alternative approach called degrees of freedom analysis (1992, 1993). This approach is based on symbolic reasoning about geometry, rather than equations, and was shown to be more efficient than systems based on algebraic equation solvers. The approach uses two models. A symbolic geometric model is used to reason symbolically about how to assemble the geoms so as to satisfy the constraints incrementally.
The "assembly plan" thus developed is used to guide the solution of the complex nonlinear equations - derived from the second, numerical model - in a highly decoupled, stylized manner.The GCE system was used to analyze problems in the domain of kinematics and was shown to perform kinematics simulation of complex mechanisms (including a Stirling engine, an elevator door mechanism, and a sofa-bed mechanism) much more efficiently than pure numerical solvers (Kramer, 1992). The GCE has subsequently been integrated in a commercial system called Bravo TM by Applicon where it is used to drive the 2D sketcher (Brown-Associates, 1993). Several academic systems are currently using the degrees of freedom analysis for other applications like assembly modeling (Anantha, Kramer & Crawford, 1992), editing and animating planar linkages (Brunkhart, 1994), and feature-based design (Salomons, 1994; Shah & Rogers, 1993).GCE employs a set of specialized routines called plan fragments to create the assembly plan.A plan fragment specifies how to change the configuration of a geom using a fixed set of operators and the available degrees of freedom, so that a new constraint is satisfied while preserving all prior constraints on the geom. The assembly plan is completed when all constraints have been satisfied or the degrees of freedom is reduced to zero. This approach is canonical: the constraints may be satisfied in any order; the final status of the geom in terms of remaining degrees of freedom is the same (p. 80-81, Kramer, 1992). The algorithm for finding the assembly procedure has a time complexity of O(cg) where c is the number of constraints and g is the number of geoms (p. 139, Kramer, 1992).Since the crux of problem-solving is taken care of by the plan fragments, the success of the approach depends on one’s ability to construct a complete set of plan fragments meeting the canonical specification. The number of plan fragments needed grows geometrically as the number of geoms and constraints between them increase. Worse, the complexity of the plan fragments increases exponentially since the various constraints interact in subtle ways creating a large number of special cases that need to be individually handled. This is potentially a serious limitation in extending the degrees of freedom approach. In this paper we address this problem by showing how plan fragments can be automatically generated using first principles about geoms, actions, and topology.Our approach is based on planning. Plan fragment generation can be reduced to a planning problem by considering the various geoms and the invariants on them as describing a state. Operators are actions, such as rotate, that can change the configuration of geoms, thereby violating or achieving some constraint. An initial state is specified by the set of existing invariants on a geom and a final state by the additional constraints to be satisfied. A plan is a sequence of actions that when applied to the initial state achieves the final state.With this formulation, one could presumably use a classical planner, such as STRIPS (Fikes & Nilsson, 1971), to automatically generate a plan-fragment. However, the operators in this domain are parametric operators with a real-valued domain. Thus, the search space consists of an infinite number of states. 
Even if the real-valued domain is discretized by considering real-valued intervals, there is still a very large search space, and finding a plan that satisfies the specified constraints would be an intractable problem. Our approach uses loci information (representing a set of points that satisfy some constraints) to reason about the effects of various operators, and thus reduces the search problem to a problem in topology, involving reasoning about the intersection of various loci.

An issue to be faced in using a conventional planner is the frame problem: how to determine what properties or relationships do not change as a result of an action. A typical solution is to use the assumption: an action does not modify any property or relationship unless explicitly stated as an effect of the action. Such an approach works well if one knows a priori all possible constraints or invariants that might be of interest and relatively few constraints get affected by each action – which is not true in our case. We use a novel scheme for representing effects of actions. It is based on reifying (i.e., treating as first class objects) actions, in addition to geometric entities and invariant types. We associate, with each pair of geom and invariant, a set of actions that can be used to achieve or preserve that invariant for that geom. Whenever a new geom or invariant type is introduced, the corresponding rules for actions that can achieve/preserve the invariants have to be added. Since there are many more invariant types than actions in this domain, this scheme results in simpler rules. Borgida, Mylopoulos & Reiter (1993) propose a similar approach for reasoning with program specifications. A unique feature of our work is the use of geometry-specific matching rules to determine when two or more general actions that achieve/preserve different constraints can be reformulated to a less general action.

Another shortcoming of using a conventional planner is the difficulty of representing conditional effects of operators. In GCE an operation's effect depends on the type of geom as well as the particular geometry. For example, the action of translating a body to the intersection of two lines on a plane would normally reduce the body's translational degrees of freedom to zero; however, if the two lines happen to coincide, then the body still retains one degree of translational freedom, and if the two lines are parallel but do not coincide, then the action fails. Such situations are called degeneracies. One approach to handling degeneracies is to use a reactive planner that dynamically revises its plan at run-time. However, this could result in unacceptable performance in many real-time applications. Our approach makes it possible to pre-compile all potential degeneracies in the plan. We achieve this by dividing the planning algorithm into two phases. In the first phase a skeletal plan is generated that works in the normal case, and in the second phase this skeletal plan is refined to take care of singularities and degeneracies. The approach is similar to the idea of refining skeletal plans in MOLGEN (Friedland, 1979) and the idea of critics in HACKER (Sussman, 1975) to fix known bugs in a plan.
However, the skeletal plan refinement in MOLGEN essentially consisted of instantiating a partial plan to work for specific conditions, whereas in our method a complete plan which works for a normal case is extended to handle special conditions like degeneracies and singularities.

1.1 A Plan Fragment Example

We will use a simple example of a plan fragment specification to illustrate our approach. Domains such as mechanical CAD and computer-based sketching rely heavily on complex combinations of relatively simple geometric elements, such as points, lines, and circles, and a small collection of constraints such as coincidence, tangency, and parallelism. Figure 1 illustrates some fairly complex mechanisms (all implemented in GCE) using simple geoms and constraints.

Figure 1. Modeling complex mechanisms (e.g., a Stirling engine) using simple geoms and constraints. All the constraints needed to model the joints in the above mechanisms are solvable using the degrees of freedom approach.

Our example problem is illustrated in Figure 2 and is specified as follows:

Geom-type: circle
Name: $c
Invariants: (fixed-distance-line $c $L1 $dist1 BIAS_COUNTERCLOCKWISE)
To-be-achieved: (fixed-distance-line $c $L2 $dist2 BIAS_CLOCKWISE)

In this example, a variable-radius circle $c¹ has a prior constraint specifying that the circle is at a fixed distance $dist1 to the left of a fixed line $L1 (or alternatively, that a line drawn parallel to $L1 at a distance $dist1 from the center of $c is tangent in a counterclockwise direction to the circle). The new constraint to be satisfied is that the circle be at a fixed distance $dist2 from a second fixed line $L2.

¹ We use the following conventions: symbols preceded by $ represent constants; symbols preceded by ? represent variables; expressions of the form (>> parent subpart) denote the subpart of a compound term, parent.
The knowledge represented in the system is broadly categorized into a Geom knowledge-base that contains knowledge specific to particular geometric entities and a Geometry knowledge-base that is independent of particular geoms and can be reused for generating plan fragments for any geom.Figure 3. Architectural overview of the plan fragment generator2.1 Geom Knowledge-baseThe geom specific knowledge-base can be further decomposed into seven knowledge components.B HANSALI, K RAMER &H OAR2.1.1A CTIONSThese describe operations that can be performed on geoms. In the GCE domain, three actions suffice to change the configuration of a body to an arbitrary configuration: (translate g v) which denotes a translation of geom g by vector v; (rotate g pt ax amt) which denotes a rotation of geom g, around point pt, about an axis ax, by an angle amt; and (scale g pt amt) where g is a geom, pt is a point on the geom, and amt is a scalar. The semantics of a scale operation depends on the type of the geom; for example, for a circle, a scale indicates a change in the radius of the circle and for a line-segment it denotes a change in the line-segment’s length. Pt is the point on the geom that is fixed (e.g., the center of a circle).2.1.2 I NVARIANTSThese describe constraints to be solved for the geoms. The initial version of our system has been designed to generate plan fragments for a variable-radius circle and a variable length line-segment on a fixed workplane, with constraints on the distances between these geoms and points, lines, and other geoms on the same workplane. There are seven invariant types to represent these constraints. Examples of two such invariants are:• (Invariant-point g pt glb-coords) which specifies that the point pt of geom g is coincident with the global coordinates glb-coords, and• (Fixed-distance-point g pt dist bias) which specifies that the geom g lies at a fixed distance dist from point pt; bias can be either BIAS_INSIDE or BIAS_OUTSIDE depending on whether g lies inside or outside a circle of radius dist around point pt.2.1.3 L OCIThese represent sets of possible values for a geom parameter, such as the position of a point on a geom. The various kinds of loci can be grouped into either a 1d-locus (representable by a set of parametric equations in one parameter) or a 2d-locus (representable by a set of parametric equations in two variables). For, example a line is a 1d locus specified as (make-line-locus through-pt direc) and represents an infinite line passing through through-pt and having a direction direc. Other loci represented in the system include rays, circles, parabolas, hyperbolas, and ellipses.2.1.4 M EASUREMENTSThese are used to represent the computation of some function, object, or relationship between objects. These terms are mapped into a set of service routines which get called by the plan fragments. An example of a measurement term is: (0d-intersection 1d-locus1 1d-locus2). This represents the intersection of two 1d-loci. In the normal case, the intersection of two 1-dimensional loci is a point. However, there may be singular cases, for example, when the two loci happen to coincide; in such a case their intersection returns one of the locus instead of a point. There may also be degenerate cases, for example, when the two loci do not intersect; in such a case, the intersection is undefined. 
2.1.5 Geoms

These are the objects of interest in solving geometric constraint satisfaction problems. Examples of geoms are lines, line-segments, circles, and rigid bodies. Geoms have degrees of freedom which allow them to vary in location and size. For example, in 3D-space a circle with a variable radius has three translational, two rotational, and one dimensional degree of freedom.

The configuration variables of a geom are defined as the minimal number of real-valued parameters required to specify the geometric entity in space unambiguously. Thus, a circle has six configuration variables (three for the center, one for the radius, and two for the plane normal). In addition, the representation of each geom includes the following:

• name: a unique symbol to identify the geom;
• action-rules: a set of rules that describe how invariants on the geom can be preserved or achieved by actions (see below);
• invariants: the set of current invariants on the geom;
• invariants-to-be-achieved: the set of invariants that need to be achieved for the geom.

2.1.6 Action Rules

An action rule describes the effect of an action on an invariant. There are two facts of interest to a planner when constructing a plan: (1) how to achieve an invariant using an action, and (2) how to choose actions that preserve as many of the existing invariants as possible. In general, there are several ways to achieve an invariant and several actions that will preserve an invariant. The intersection of these two sets of actions is the set of feasible solutions. In our system, the effect of actions is represented as part of geom-specific knowledge in the form of action rules, whereas knowledge about how to compute intersections of two or more sets of actions is represented as geometry-specific knowledge (since it does not depend on the particular geom being acted on).

An action rule consists of a three-tuple (pattern, to-preserve, to-[re]achieve). Pattern is the invariant term of interest; to-preserve is a list of actions that can be taken without violating the pattern invariant; and to-[re]achieve is a list of actions that can be taken to achieve the invariant or re-achieve an existing invariant "clobbered" by an earlier action. These actions are stated in the most general form possible. The matching rules in the Geometry knowledge-base are then used to obtain the most general unifier of two or more actions. An example of an action rule, associated with variable-radius circle geoms, is:

pattern: (1d-constrained-point ?circle (>> ?circle CENTER) ?1dlocus)          (AR-1)
to-preserve: (scale ?circle (>> ?circle CENTER) ?any)
             (translate ?circle (v- (>> ?1dlocus ARBITRARY-POINT) (>> ?circle CENTER)))
to-[re]achieve: (translate ?circle (v- (>> ?1dlocus ARBITRARY-POINT) (>> ?circle CENTER)))

This action rule is used to preserve or achieve the constraint that the center of a circle geom lie on a 1d locus. There are two actions that may be performed without violating this constraint: (1) scale the circle about its center. This would change the radius of the circle, but the position of the center remains the same, and hence the 1d-constrained-point invariant is preserved. (2) Translate the circle by a vector that goes from its current center to an arbitrary point on the 1-dimensional locus ((v- a b) denotes a vector from point b to point a). To achieve this invariant only one action may be performed: translate the circle so that its center moves from its current position to an arbitrary position on the 1-dimensional locus.
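A sketch of how such rules might be represented and queried (the action names and the rule table are illustrative stand-ins; GCE's actual matcher computes intersections via the geometric unification described in Section 2.2.1):

```python
from dataclasses import dataclass

@dataclass
class ActionRule:
    pattern: str          # invariant type the rule is about
    to_preserve: set      # actions that do not violate the invariant
    to_achieve: set       # actions that [re]achieve the invariant

RULES = {
    '1d-constrained-point': ActionRule(
        '1d-constrained-point',
        to_preserve={'scale-about-center', 'translate-along-locus'},
        to_achieve={'translate-onto-locus'}),
    'fixed-radius': ActionRule(
        'fixed-radius',
        to_preserve={'translate-along-locus', 'translate-onto-locus'},
        to_achieve=set()),
}

def feasible_actions(new_invariant, existing_invariants):
    """Actions that achieve the new invariant yet preserve all existing ones:
    the intersection of one to-achieve set with several to-preserve sets."""
    feasible = set(RULES[new_invariant].to_achieve)
    for inv in existing_invariants:
        feasible &= RULES[inv].to_preserve
    return feasible

print(feasible_actions('1d-constrained-point', ['fixed-radius']))
# {'translate-onto-locus'}
```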
There are two actions that may be performed without violating this constraint: (1) scale the circle about its center; this would change the radius of the circle, but the position of the center remains the same and hence the 1d-constrained-point invariant is preserved; (2) translate the circle by a vector that goes from its current center to an arbitrary point on the 1-dimensional locus ((v- a b) denotes a vector from point b to point a). To achieve this invariant only one action may be performed: translate the circle so that its center moves from its current position to an arbitrary position on the 1-dimensional locus.

2.1.7 Signatures

For completeness, it is necessary that there exist a plan fragment for each possible combination of constraints on a geom. However, in many cases, two or more constraints describe the same situation for a geom (in terms of its degrees of freedom). For example, the constraints that ground the two end-points of a line-segment and the constraints that ground the direction, length, and one end-point of a line-segment both reduce the degrees of freedom of the line-segment to zero and hence describe the same situation. In order to minimize the number of plan fragments that need to be written, it is desirable to group sets of constraints that describe the same situation into equivalence classes and represent each equivalence class using a canonical form.

The state of a geom, in terms of the prior constraints on it, is summarized as a signature. A signature scheme for a geom is the set of canonical signatures for which plan fragments need to be written. In Kramer's earlier work (1993) the signature scheme had to be determined manually by examining each signature obtained by combining constraint types and designating one from a set of equivalent signatures to be canonical. Our approach allows us to construct the signature scheme for a geom automatically by using reformulation rules (described shortly). A reformulation rule rewrites one or more constraints into a simpler form. The signature scheme is obtained by first generating all possible combinations of constraint types to yield the set of all possible signatures. These signatures are then reduced using the reformulation rules until each signature is in its simplest form. The set of (unique) signatures that are left constitutes the signature scheme for the geom.

As an example, consider the set of constraint types on a variable-radius circle. The signature for this geom is represented as a tuple <Center, Normal, Radius, FixedPts, FixedLines> where:

• Center denotes the invariants on the center point and can be either Free (i.e., no constraint on the center point), L2 (i.e., the center point is constrained to be on a 2-dimensional locus), L1 (i.e., the center point is constrained to be on a 1-dimensional locus), or Fixed.
• Normal denotes the invariant on the normal to the plane of the circle and can be either Free, L1, or Fixed (in 2D it is always Fixed).
• Radius denotes the invariant on the radius and can be either Free or Fixed.
• FixedPts denotes the number of Fixed-distance-point invariants and can be either 0, 1, or 2.
• FixedLines denotes the number of Fixed-distance-line invariants and can be either 0, 1, or 2.

L2 and L1 denote a 2D and 1D locus respectively. If we assume a 2D geometry, the L2 invariant on the Center is redundant, and the Normal is always Fixed. There are then 3 × 1 × 2 × 3 × 3 = 54 possible signatures for the geom.
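The enumeration itself is a one-liner; a minimal Python sketch (our own encoding of the signature tuple):

from itertools import product

CENTER    = ["Free", "L1", "Fixed"]   # L2 is redundant in 2D
NORMAL    = ["Fixed"]                 # always Fixed in 2D
RADIUS    = ["Free", "Fixed"]
FIXED_PTS = [0, 1, 2]                 # number of Fixed-distance-point invariants
FIXED_LNS = [0, 1, 2]                 # number of Fixed-distance-line invariants

signatures = list(product(CENTER, NORMAL, RADIUS, FIXED_PTS, FIXED_LNS))
assert len(signatures) == 54          # 3 * 1 * 2 * 3 * 3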
However, several of these describe the same situation. For example, the signature

<Center-Free, Radius-Free, FixedPts-0, FixedLines-2>

which describes a circle constrained to be at specific distances from two fixed lines, can be rewritten to

<Center-L1, Radius-Free, FixedPts-0, FixedLines-0>

which describes a circle constrained to be on a 1-dimensional locus (in this case the angular bisector of two lines). Using reformulation rules, we can derive the signature scheme for variable-radius circles, consisting of only the 10 canonical signatures given below:

<Center-Free, Radius-Free, FixedPts-0, FixedLines-0>
<Center-Free, Radius-Free, FixedPts-0, FixedLines-1>
<Center-Free, Radius-Free, FixedPts-1, FixedLines-0>
<Center-Free, Radius-Fixed, FixedPts-0, FixedLines-0>
<Center-L1, Radius-Free, FixedPts-0, FixedLines-0>
<Center-L1, Radius-Free, FixedPts-0, FixedLines-1>
<Center-L1, Radius-Free, FixedPts-1, FixedLines-0>
<Center-L1, Radius-Fixed, FixedPts-0, FixedLines-0>
<Center-Fixed, Radius-Free, FixedPts-0, FixedLines-0>
<Center-Fixed, Radius-Fixed, FixedPts-0, FixedLines-0>

Similarly, the number of signatures for line-segments can be reduced from 108 to 19 using reformulation rules.

2.2 Geometry Specific Knowledge

The geometry-specific knowledge is organized as three different kinds of rules.

2.2.1 Matching Rules

These are used to match terms using geometric properties. The planner employs a unification algorithm to match actions and determine whether two actions have a common unifier. However, the standard unification algorithm is not sufficient for our purposes, since it is purely syntactic and does not use knowledge about geometry. To illustrate this, consider the following two actions:

(rotate $g $pt1 ?vec1 ?amt1), and
(rotate $g $pt2 ?vec2 ?amt2).

The first term denotes a rotation of a fixed geom $g, around a fixed point $pt1, about an arbitrary axis by an arbitrary amount. The second term denotes a rotation of the same geom around a different fixed point $pt2, with the rotation axis and amount being unspecified as before. Standard unification fails when applied to the above terms because no binding of variables makes the two terms syntactically equal (specifically, unification fails when it tries to unify $pt1 and $pt2). However, resorting to knowledge about geometry, we can match the two terms to yield the following term:

(rotate $g $pt1 (v- $pt2 $pt1) ?amt1)

which denotes a rotation of the geom around the axis passing through points $pt1 and $pt2. The point around which the body is rotated can be any point on the axis (here it is arbitrarily chosen to be one of the fixed points, $pt1), and the amount of rotation can be anything.

The planner applies the matching rules to match the outermost expression in a term first; if no rule applies, it tries subterms of that term, and so on. If none of the matching rules apply, then this algorithm degenerates to standard unification. The matching rules can also have conditions attached to them. The condition can be any boolean function; however, for the most part the conditions tend to be simple type checks.

2.2.2 Reformulation Rules

As mentioned earlier, there are several ways to specify constraints that restrict the same degrees of freedom of a geom. In GCE, plan fragments are indexed by signatures, which summarize the available degrees of freedom of a geom. To reduce the number of plan fragments that need to be written and indexed, it is desirable to reduce the number of allowable signatures.
This is accomplished with a set of invariant reformulation rules, which are used to rewrite pairs of invariants on a geom into an equivalent pair of simpler invariants (using a well-founded ordering). Here equivalence means that the two sets of invariants produce the same range of motions in the geom. This reduces the number of different combinations of invariants for which plan fragments need to be written. An example of invariant reformulation is the following:

(fixed-distance-line ?c ?l1 ?d1 BIAS_COUNTERCLOCKWISE)
(fixed-distance-line ?c ?l2 ?d2 BIAS_CLOCKWISE)
        ⇓                                                        (RR-1)
(1d-constrained-point ?c (>> ?c center)
    (angular-bisector (make-displaced-line ?l1 BIAS_LEFT ?d1)
                      (make-displaced-line ?l2 BIAS_RIGHT ?d2)
                      BIAS_COUNTERCLOCKWISE
                      BIAS_CLOCKWISE))

This rule takes two invariants: (1) a geom is at a fixed distance to the left of a given line, and (2) a geom is at a fixed distance to the right of a given line. The reformulation produces the invariant that the geom lies on the angular bisector of two lines that are parallel to the two given lines and at the specified distance from them. Either of the two original invariants in conjunction with the new one is equivalent to the original set of invariants.

Besides reducing the number of plan fragments, reformulation rules also help to simplify action rules. Currently all action rules (for variable-radius circles and line-segments) use only a single action to preserve or achieve an invariant. If we do not restrict the allowable signatures on a geom, it is possible to create examples where we need a sequence of (more than one) actions in the rule to achieve the invariant, or we need complex conditions that must be checked to determine rule applicability. Allowing sequences and conditionals on the rules increases the complexity of both the rules and the pattern matcher. This makes it difficult to verify the correctness of rules and reduces the efficiency of the pattern matcher.

Using invariant reformulation rules allows us to limit action rules to those that contain a single action. Unfortunately, it seems that we still need conditions to achieve certain invariants. For example, consider the following invariant on a variable-radius circle:

(fixed-distance-point ?circle ?pt ?dist BIAS_OUTSIDE)

which states that a circle ?circle be at some distance ?dist from a point ?pt and lie outside a circle around ?pt with radius ?dist. One action that may be taken to achieve this constraint is: (scale ?circle (>> ?circle center)
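The signature-scheme construction that Sections 2.1.7 and 2.2.2 describe can be summarized in a short Python sketch (hypothetical: the encoding reuses the signatures enumeration from the earlier sketch, and only a single RR-1-style rule is shown, so this fixpoint is not yet the full 10-signature scheme):

def rr_two_lines(sig):
    """RR-1-style rewrite: two fixed-distance-line invariants on a free
    center collapse into an L1 (angular-bisector) constraint on the center."""
    center, normal, radius, pts, lines = sig
    if center == "Free" and lines == 2:
        return ("L1", normal, radius, pts, 0)
    return sig

RULES = [rr_two_lines]   # the real system has one rule per reformulable pattern

def reduce_signature(sig):
    changed = True
    while changed:       # rewrite to a fixpoint; the well-founded ordering
        changed = False  # guarantees termination
        for rule in RULES:
            new = rule(sig)
            if new != sig:
                sig, changed = new, True
    return sig

scheme = sorted(set(reduce_signature(s) for s in signatures))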
Finite Model Theory and Its Applications (Springer)

1 Constraint Satisfaction

the data complexity of a fragment of existential second-order logic. We then go on in Section 1.6 and offer a logical approach, via definability in Datalog, to establishing the tractability of non-uniform constraint-satisfaction problems. In Section 1.7, we leverage the connection between Datalog and certain pebble games, and show how these pebble games offer an algorithmic approach to solving uniform constraint-satisfaction problems. In Section 1.8, we relate these pebble games to consistency properties of constraint-satisfaction instances, a well-known approach in constraint solving. Finally, in Section 1.9, we show how the same pebble games can be used to identify large "islands of tractability" in the constraint-satisfaction terrain that are based on the concept of bounded treewidth.

Much of the logical machinery used in this chapter is described in detail in Chapter ??. For a book-length treatment of constraint satisfaction from the perspective of graph homomorphism, see [44]. Two books on constraint programming and constraint processing are [3, 23].

1.2 Preliminaries

The standard terminology in AI formalizes an instance P of constraint satisfaction as a triple (V, D, C), where

1. V is a set of variables;
2. D is a set of values, referred to as the domain;
3. C is a collection of constraints C1, ..., Cq, where each constraint Ci is a pair (t, R) with t a k-tuple over V, k ≥ 1, referred to as the scope of the constraint, and R a k-ary relation on D.

A solution of such an instance is a mapping h : V → D such that, for each constraint (t, R) in C, we have that h(t) ∈ R, where h is defined on tuples component-wise, that is, if t = (a1, ..., ak), then h(t) = (h(a1), ..., h(ak)). The constraint-satisfaction problem asks whether a given instance is solvable, i.e., whether it has a solution. Note that, without loss of generality, we may assume that all constraints (t, Ri) involving the same scope t have been consolidated to a single constraint (t, R), where R is the intersection of all relations Ri constraining t. Thus, we can assume that each tuple t of variables occurs at most once in the collection C.

Consider the Boolean satisfiability problem 3-Sat: given a 3CNF-formula ϕ with variables x1, ..., xn and clauses c1, ..., cm, is ϕ satisfiable? Such an instance of 3-Sat can be thought of as the constraint-satisfaction instance in which the set of variables is V = {x1, ..., xn}, the domain is D = {0, 1}, and the constraints are determined by the clauses of ϕ. For example, a clause of the form (¬x ∨ ¬y ∨ z) gives rise to the constraint ((x, y, z), {0,1}^3 − {(1,1,0)}).
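As a minimal illustration of this formalization (a sketch of ours, not from the text), the following Python fragment decides solvability by brute force and encodes the clause (¬x ∨ ¬y ∨ z) exactly as above:

from itertools import product

def solve_csp(V, D, C):
    """Return a solution h: V -> D if one exists, else None (brute force)."""
    for values in product(D, repeat=len(V)):
        h = dict(zip(V, values))
        if all(tuple(h[v] for v in scope) in rel for scope, rel in C):
            return h
    return None

V = ["x", "y", "z"]
D = [0, 1]
R = set(product(D, repeat=3)) - {(1, 1, 0)}   # {0,1}^3 - {(1,1,0)}
C = [(("x", "y", "z"), R)]
print(solve_csp(V, D, C))                     # e.g. {'x': 0, 'y': 0, 'z': 0}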
In an analogous manner, 3-Colorability can be modelled as a constraint-satisfaction problem. Indeed, an instance G = (V, E) of 3-Colorability can be thought of as the constraint-satisfaction instance in which the set of variables is the set V of the nodes of the graph G, the domain is the set D = {r, b, g} of three colors, and the constraints are the pairs ((u, v), Q), where (u, v) ∈ E and Q = {(r,b), (b,r), (r,g), (g,r), (b,g), (g,b)} is the disequality relation on D.

Let A and B be two relational structures over the same vocabulary. A homomorphism h from A to B is a mapping h : A → B from the universe A of A to the universe B of B such that, for every relation R^A of A and every tuple (a1, ..., ak) ∈ R^A, we have that (h(a1), ..., h(ak)) ∈ R^B. The existence of a homomorphism from A to B is denoted by A → B, or by A →_h B when we want to name the homomorphism h explicitly. An important observation made in [29] is that every such constraint-satisfaction instance P = (V, D, C) can be viewed as an instance of the homomorphism problem, asking whether there is a homomorphism between two structures A_P and B_P that are obtained from P in the following way:

1. the universe of A_P is V and the universe of B_P is D;
2. the relations of B_P are the distinct relations R occurring in C;
3. the relations of A_P are defined as follows: for each distinct relation R on D occurring in C, we have the relation R^A = {t : (t, R) ∈ C}. Thus, R^A consists of all scopes associated with R.

We call (A_P, B_P) the homomorphism instance of P. Conversely, it is also clear that every instance of the homomorphism problem between two structures A and B can be viewed as a constraint-satisfaction instance CSP(A, B) by simply "breaking up" each relation R^A on A as follows: we generate a constraint (t, R^B) for each t ∈ R^A. We call CSP(A, B) the constraint-satisfaction instance of (A, B). Thus, as pointed out in [29], the constraint-satisfaction problem can be identified with the homomorphism problem.

To illustrate the passage from the constraint-satisfaction problem to the homomorphism problem, let us consider 3-Sat. A 3CNF-formula ϕ with variables x1, ..., xn and clauses c1, ..., cm gives rise to a homomorphism instance (A_ϕ, B_ϕ) defined as follows:

• A_ϕ = ({x1, ..., xn}, R^ϕ_0, R^ϕ_1, R^ϕ_2, R^ϕ_3), where R^ϕ_i is the ternary relation consisting of all triples (x, y, z) of variables that occur in a clause of ϕ with i negated literals, 0 ≤ i ≤ 3; for instance, R^ϕ_2 consists of all triples (x, y, z) of variables such that (¬x ∨ ¬y ∨ z) is a clause of ϕ (here, we assume without loss of generality that the negated literals precede the positive literals).
• B_ϕ = ({0, 1}, R_0, R_1, R_2, R_3), where R_i consists of all triples that satisfy a 3-clause in which the first i literals are negated; for instance, R_2 = {0,1}^3 − {(1,1,0)}. Note that B_ϕ does not depend on ϕ.

It is clear that ϕ is satisfiable if and only if there is a homomorphism from A_ϕ to B_ϕ (in symbols, A_ϕ → B_ϕ).

As another example, 3-Colorability is equivalent to the problem of deciding whether there is a homomorphism h from a given graph G to the complete graph K3 = ({r, b, g}, {(r,b), (b,r), (r,g), (g,r), (b,g), (g,b)}) with 3 nodes.
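A brute-force test makes the identification concrete. The following Python sketch (our encoding: a structure is a universe together with a dictionary of named relations) checks 3-colorability of a small graph as the existence of a homomorphism to K3:

from itertools import product

def homomorphism(A, B):
    """Search for h with h(R^A) contained in R^B for every relation name R."""
    univ_A, rels_A = A
    univ_B, rels_B = B
    for values in product(univ_B, repeat=len(univ_A)):
        h = dict(zip(univ_A, values))
        if all(tuple(h[a] for a in t) in rels_B[name]
               for name, tuples in rels_A.items() for t in tuples):
            return h
    return None

def sym(pairs):
    return {(a, b) for a, b in pairs} | {(b, a) for a, b in pairs}

K3 = ("rbg", {"E": sym({("r", "b"), ("r", "g"), ("b", "g")})})
G  = ([1, 2, 3], {"E": sym({(1, 2), (2, 3)})})   # a path, hence 3-colorable
print(homomorphism(G, K3))                        # e.g. {1: 'r', 2: 'b', 3: 'r'}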
More generally, k-Colorability, k ≥ 2, amounts to the existence of a homomorphism from a given graph G to the complete graph K_k with k nodes (also known as the k-clique). Numerous other important NP-complete problems can be viewed as special cases of the Homomorphism Problem (and, hence, also of the Constraint-Satisfaction Problem). For example, consider the Clique problem: given a graph G and an integer k, does G contain a clique of size k? As a homomorphism instance this is equivalent to asking if there is a homomorphism from the complete graph K_k to G. As a constraint-satisfaction instance, the set of variables is {1, 2, ..., k}, the domain is the set V of nodes of G, and the constraints are the pairs ((i, j), E) such that i ≠ j, 1 ≤ i, j ≤ k, and E is the edge relation of G. For another example, consider the Hamiltonicity Problem: given a graph G = (V, E), does it have a Hamiltonian cycle? This is equivalent to asking if there is a homomorphism from the structure (V, C_V, ≠) to the structure (V, E, ≠), where C_V is some cycle on the set V of nodes of G and ≠ is the disequality relation on V. NP-completeness of the Homomorphism Problem was pointed out explicitly in [53]. In this chapter, we use both the traditional AI formulation of constraint satisfaction and the formulation as the homomorphism problem, as each has its own advantages.

It turns out that in both formulations constraint satisfaction can be expressed as a database-theoretic problem. We start with the homomorphism formulation, which is intimately related to conjunctive-query evaluation [48]. A conjunctive query Q of arity n is a query definable by a positive existential first-order formula ϕ(X1, ..., Xn) having conjunction as its only propositional connective, that is, by a formula of the form

∃Z1 ... ∃Zm ψ(X1, ..., Xn, Z1, ..., Zm),

where ψ(X1, ..., Xn, Z1, ..., Zm) is a conjunction of (positive) atomic formulas. The free variables X1, ..., Xn of the defining formula are called the distinguished variables of Q. Such a conjunctive query is usually written as a rule, whose head is Q(X1, ..., Xn) and whose body is ψ(X1, ..., Xn, Z1, ..., Zm).
For example, the formula

∃Z1 ∃Z2 ∃Z3 (P(X1, Z1, Z2) ∧ R(Z2, Z3) ∧ R(Z3, X2))

defines a binary conjunctive query Q, which as a rule becomes

Q(X1, X2) :- P(X1, Z1, Z2), R(Z2, Z3), R(Z3, X2).

If the formula defining a conjunctive query Q has no free variables (i.e., if it is a sentence), then Q is a Boolean conjunctive query. For example, the sentence

∃Z1 ∃Z2 ∃Z3 (E(Z1, Z2) ∧ E(Z2, Z3) ∧ E(Z3, Z1))

defines the Boolean conjunctive query "is there a cycle of length 3?".

If D is a database and Q is an n-ary query, then Q(D) is the n-ary relation on D obtained by evaluating the query Q on D, that is, the collection of all n-tuples from D that satisfy the query (cf. Chapter ??). The conjunctive-query evaluation problem asks: given an n-ary query Q, a database D, and an n-tuple a from D, does a ∈ Q(D)? Let Q1 and Q2 be two n-ary queries having the same tuple of distinguished variables. We say that Q1 is contained in Q2, and write Q1 ⊆ Q2, if Q1(D) ⊆ Q2(D) for every database D. The conjunctive-query containment problem asks: given two conjunctive queries Q1 and Q2, is Q1 ⊆ Q2? These concepts can be defined for Boolean conjunctive queries in an analogous manner. In particular, if Q is a Boolean query and D is a database, then Q(D) = 1 if D satisfies Q; otherwise, Q(D) = 0. Moreover, the containment problem for Boolean queries Q1 and Q2 is equivalent to asking whether Q1 logically implies Q2.

It is well known that conjunctive-query containment can be reformulated both as a conjunctive-query evaluation problem and as a homomorphism problem. What links these problems together is the canonical database D_Q associated with Q. This database is defined as follows. Each variable occurring in Q is considered a distinct element in the universe of D_Q. Every predicate in the body of Q is a predicate of D_Q as well; moreover, for every distinguished variable X_i of Q, there is a distinct monadic predicate P_i (not occurring in Q). Every subgoal in the body of Q gives rise to a tuple in the corresponding predicate of D_Q; moreover, if X_i is a distinguished variable of Q, then P_i(X_i) is also a (monadic) tuple of D_Q. Thus, returning to the preceding example, the canonical database of the conjunctive query ∃Z1 ∃Z2 ∃Z3 (P(X1, Z1, Z2) ∧ R(Z2, Z3) ∧ R(Z3, X2)) consists of the facts P(X1, Z1, Z2), R(Z2, Z3), R(Z3, X2), P1(X1), P2(X2). The relationship between conjunctive-query containment, conjunctive-query evaluation, and homomorphisms is provided by the following classical result, due to Chandra and Merlin.

Theorem 1.1. [11] Let Q1 and Q2 be two conjunctive queries having the same tuple (X1, ..., Xn) of distinguished variables. Then the following statements are equivalent.
• Q1 ⊆ Q2.
• (X1, ..., Xn) ∈ Q2(D_Q1).
• There is a homomorphism h : D_Q2 → D_Q1.

It follows that the homomorphism problem can be viewed as a conjunctive-query evaluation problem or as a conjunctive-query containment problem. For this, with every structure A, we view the universe A = {X1, ..., Xn} of A as a set of individual variables and associate with A the Boolean conjunctive query ∃X1 ... ∃Xn ⋀ R(t), where the conjunction ranges over all relations R^A of A and all tuples t ∈ R^A; we call this query the canonical conjunctive query of A and denote it by Q_A. It is clear that A is isomorphic to the canonical database associated with Q_A.
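For Boolean conjunctive queries the monadic predicates P_i drop out (there are no distinguished variables), and Theorem 1.1 reduces containment to a homomorphism search between canonical databases. A minimal Python sketch of this reduction (our encoding of atoms; homomorphism() is the function from the earlier sketch):

def canonical_db(body):
    """body: list of (predicate, variables) atoms. The universe is the set of
    variables; every atom contributes a tuple to its predicate's relation."""
    universe = sorted({v for _, args in body for v in args})
    rels = {}
    for pred, args in body:
        rels.setdefault(pred, set()).add(tuple(args))
    return (universe, rels)

Q1 = [("E", ("u", "v")), ("E", ("v", "w")), ("E", ("w", "u"))]  # "a 3-cycle exists"
Q2 = [("E", ("a", "b"))]                                        # "an edge exists"
# Q1 is contained in Q2 iff there is a homomorphism from D_Q2 to D_Q1:
print(homomorphism(canonical_db(Q2), canonical_db(Q1)) is not None)  # True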
Corollary 1.2. Let A and B be two structures over the same vocabulary. Then the following statements are equivalent.
• A → B.
• B |= Q_A.
• Q_B ⊆ Q_A.

As an illustration, we have that a graph G is 3-colorable iff K3 |= Q_G iff Q_K3 ⊆ Q_G.

A relational join, denoted by the symbol ⋈, is a conjunctive query with no existentially quantified variables. Thus, relational-join evaluation is a special case of conjunctive-query evaluation. For example, E(Z1, Z2) ∧ E(Z2, Z3) ∧ E(Z3, Z1) is a relational join that, when evaluated on a graph G = (V, E), returns all triples of nodes forming a 3-cycle. There is a well-known connection between the traditional AI formulation of constraint satisfaction and relational-join evaluation, which we describe next. Suppose we are given a constraint-satisfaction instance (V, D, C). We can assume without loss of generality that in every constraint (t, R) ∈ C the elements in t are distinct. (Suppose to the contrary that t_i = t_j. Then we can delete from R every tuple in which the i-th and j-th entries disagree, and then project out the j-th column from t and R.) We can thus view every element of V as a relational attribute, every tuple of distinct elements of V as a relational schema, and every constraint (t, R) as a relation R over the schema t (cf. [1]). It now follows from the definition of constraint satisfaction that CSP can be viewed as a relational-join evaluation problem.

Proposition 1.3. [6, 42] A constraint-satisfaction instance (V, D, C) is solvable if and only if ⋈_{(t,R)∈C} R is nonempty.

Note that Proposition 1.3 is essentially the same as Corollary 1.2. Indeed, the condition B |= Q_A amounts to the non-emptiness of the relational join obtained from Q_A by dropping all existential quantifiers and using the relations from B as interpretations of the relational symbols in Q_A. Moreover, the homomorphisms from A to B are precisely the tuples in the relational join associated with the constraint-satisfaction instance CSP(A, B).

1.3 Computational Complexity of Constraint Satisfaction

The Constraint-Satisfaction Problem is NP-complete, because it is clearly in NP and also contains NP-hard problems as special cases, including 3-Sat, 3-Colorability, and Clique. As explained in Garey and Johnson's classic monograph [36], one of the main ways to cope with NP-completeness is to identify polynomial-time solvable cases of the problem at hand that are obtained by imposing restrictions on the possible inputs. For instance, Horn 3-Sat, the restriction of 3-Sat to Horn 3CNF-formulas, is solvable in polynomial time using a unit-propagation algorithm. Similarly, it is known that 3-Colorability restricted to graphs of bounded treewidth is solvable in polynomial time (see [26]).

In the case of constraint satisfaction, the pursuit of tractable cases has evolved over the years from the discovery of isolated cases to the discovery of large "islands of tractability" of constraint satisfaction. In what follows, we will give an account of some of the progress made in this area. Using the fact that the Constraint-Satisfaction Problem can be identified with the Homomorphism Problem, we begin by introducing some terminology and notation that will enable us to formalize the concept of an "island of tractability" of constraint satisfaction.

In general, an instance of the Homomorphism Problem consists of two relational structures A and B. Thus, all restricted cases of this problem can be obtained by imposing restrictions on the input structures A and B.

Definition 1.4. Let A, B be two classes of relational structures. We write CSP(A, B) to denote the restriction of the Homomorphism Problem to input structures from A and B.
In other words, CSP(A, B) = {(A, B) : A ∈ A, B ∈ B and A → B}.

An island of tractability of constraint satisfaction is a pair (A, B) of classes of relational structures such that CSP(A, B) is in the complexity class PTIME of all decision problems solvable in polynomial time. (A more general definition of islands of tractability of constraint satisfaction would consider classes of pairs (A, B) of structures, cf. [28]; we do not pursue this more general definition here.)

The ultimate goal in the pursuit of islands of tractability of constraint satisfaction is to identify or characterize classes A and B of relational structures such that CSP(A, B) is in PTIME. The basic starting point in this investigation is to consider the cases in which one of the two classes A, B is as small as possible, while the other is as large as possible. This amounts to considering the cases in which one of A, B is the class All of all relational structures over some arbitrary, but fixed, relational vocabulary, while the other is a singleton, consisting of some fixed structure over that vocabulary. Thus, the starting point of the investigation is to determine, for fixed relational structures A, B, the computational complexity of the decision problems CSP({A}, All) and CSP(All, {B}).

Clearly, for each fixed A, the decision problem CSP({A}, All) can be solved in polynomial time, because, given a structure B, the existence of a homomorphism from A to B can be checked by testing all functions h from the universe A of A to the universe B of B (the total number of such functions is |B|^|A|, which is a polynomial number in the size of the structure B when A is fixed). Thus, having a singleton structure "on the left" is of little interest. At the other extreme, however, the situation is quite different, since the computational complexity of CSP(All, {B}) may very well depend on the particular structure B. Indeed, CSP(All, {K3}) is NP-complete, because it is the 3-Colorability problem; in contrast, CSP(All, {K2}) is in P, because it is the 2-Colorability problem. For simplicity, in what follows, for every fixed structure B, we define CSP(B) = CSP(All, {B}) and call this the non-uniform constraint-satisfaction problem associated with B. For such problems, we refer to B as the template. Thus, the first major goal in the study of the computational complexity of constraint satisfaction is to identify those templates B for which CSP(B) is in PTIME. This goal gives rise to an important open decision problem.

The Tractability Classification Problem: Given a relational structure B, decide if CSP(B) is in PTIME.

In addition to the family of non-uniform constraint-satisfaction problems CSP(B), where B is a relational structure, we also study decision problems of the form CSP(A, All), where A is a class of structures. We refer to such problems as uniform constraint-satisfaction problems.

It is illuminating to consider the complexity of uniform and non-uniform constraint satisfaction from the perspective of query evaluation. As argued in [67] (see Chapter ??), there are three ways to measure the complexity of evaluating queries (we focus here on Boolean queries) expressible in a query language L:

• The combined complexity of L is the complexity of the following decision problem: given an L-query Q and a structure A, does A |= Q? In symbols, {(Q, A) : Q ∈ L and A |= Q}.
• The expression complexity of L is the complexity of the following decision problems, one for each fixed structure A: {Q : Q ∈ L and A |= Q}.
• The data complexity of L is the complexity of the following decision problems, one for each fixed query Q ∈ L: {A : A |= Q}.
As discussed in Chapter ??, the data complexity of first-order logic is in LOGSPACE, which means that, for each first-order query Q, the problem {A : A |= Q} is in LOGSPACE. In contrast, the combined complexity for first-order logic is PSPACE-complete. Furthermore, the expression complexity for first-order logic is also PSPACE-complete. In fact, for all but trivial structures A, the problem {Q : Q ∈ FO and A |= Q} is PSPACE-complete. This exponential gap between data complexity, on one hand, and combined and expression complexity, on the other hand, is typical [67]. For conjunctive queries, on the other hand, both combined and expression complexity are NP-complete.

Consider now the uniform constraint-satisfaction problem CSP(A, All) = {(A, B) : A ∈ A and A → B}, where A is a class of structures. By Corollary 1.2, we have that

CSP(A, All) = {(A, B) : A ∈ A, B is a structure and B |= Q_A}.

Thus, studying the complexity of uniform constraint satisfaction amounts to studying the combined complexity for a class of conjunctive queries, as, for example, in [12, 39, 62]. In contrast, consider the non-uniform constraint-satisfaction problem CSP(B) = {A : A → B}. By Corollary 1.2 we have that CSP(B) = {A : B |= Q_A}. Thus, studying the complexity of non-uniform constraint satisfaction amounts to studying the expression complexity of conjunctive queries with respect to different structures. This is a problem that has not been studied in the context of database theory.

1.4 Non-Uniform Constraint Satisfaction

The first major result in the study of non-uniform constraint-satisfaction problems was obtained by Schaefer [63], who, in effect, classified the computational complexity of all Boolean non-uniform constraint-satisfaction problems. A Boolean structure is simply a relational structure with a 2-element universe, that is, a structure of the form B = ({0,1}, R^B_1, ..., R^B_m). A Boolean non-uniform constraint-satisfaction problem is a problem of the form CSP(B) with a Boolean template B. These problems are also known as Generalized-Satisfiability Problems, because they can be viewed as variants of Boolean-satisfiability problems in which the formulas are conjunctions of generalized connectives [36]. In particular, they contain the well-known problems k-Sat, k ≥ 2, 1-in-3-Sat, Positive 1-in-3-Sat, Not-All-Equal 3-Sat, and Monotone 3-Sat as special cases. For example, as seen earlier, 3-Sat is CSP(B), where B = ({0,1}, R_0, R_1, R_2, R_3) and R_i is the set of all triples that satisfy a 3-clause in which the first i literals are negated, i = 0, 1, 2, 3 (thus, R_0 = {0,1}^3 − {(0,0,0)}). Similarly, Monotone 3-Sat is CSP(B), where B = ({0,1}, R_0, R_3).

Ladner [51] showed that if PTIME ≠ NP, then there are decision problems in NP that are neither NP-complete nor belong to PTIME. Such problems are called intermediate problems. Consequently, it is conceivable that a given family of NP-problems contains intermediate problems. Schaefer [63], however, showed that the family of all Boolean non-uniform constraint-satisfaction problems contains no intermediate problems.

Theorem 1.5. (Schaefer's Dichotomy Theorem [63])
• If B = ({0,1}, R^B_1, ..., R^B_m) is a Boolean structure, then either CSP(B) is in PTIME or CSP(B) is NP-complete.
• The Tractability Classification Problem for Boolean structures is decidable; in fact, there is a polynomial-time algorithm to decide, given a Boolean structure B, whether CSP(B) is in PTIME or is NP-complete.

Schaefer's Dichotomy Theorem can be described pictorially: the space of problems CSP(B) splits into exactly two regions, one NP-complete and one in P, with nothing in between.

Schaefer [63] actually showed that there are exactly six types of Boolean structures such that CSP(B) is in PTIME, and provided explicit descriptions of them.
Specifically, he showed that CSP(B) is in PTIME precisely when at least one of the following six conditions is satisfied:

• Every relation R^B_i, 1 ≤ i ≤ m, of B is 0-valid, that is, R^B_i contains the all-zeroes tuple (0, ..., 0).
• Every relation R^B_i, 1 ≤ i ≤ m, of B is 1-valid, that is, R^B_i contains the all-ones tuple (1, ..., 1).
• Every relation R^B_i, 1 ≤ i ≤ m, of B is bijunctive, that is, R^B_i is the set of truth assignments satisfying some 2-CNF formula.
• Every relation R^B_i, 1 ≤ i ≤ m, of B is Horn, that is, R^B_i is the set of truth assignments satisfying some Horn formula.
• Every relation R^B_i, 1 ≤ i ≤ m, of B is dual Horn, that is, R^B_i is the set of truth assignments satisfying some dual Horn formula.
• Every relation R^B_i, 1 ≤ i ≤ m, of B is affine, that is, R^B_i is the set of solutions to a system of linear equations over the two-element field.

Schaefer's Dichotomy Theorem established a dichotomy and a decidable classification of the complexity of CSP(B) for Boolean templates B. After this, Hell and Nešetřil [43] established a dichotomy theorem for CSP(B) problems in which the template B is an undirected graph: if B is bipartite, then CSP(B) is solvable in polynomial time; otherwise, CSP(B) is NP-complete. To illustrate this dichotomy theorem, let C_n, n ≥ 3, be a cycle with n elements. Then CSP(C_n) is in PTIME if n is even, and is NP-complete if n is odd.

The preceding two dichotomy results raise the challenge of classifying the computational complexity of CSP(B) for arbitrary relational templates B. Addressing this question, Feder and Vardi [29] formulated the following conjecture.

Conjecture 1.6. (Dichotomy Conjecture) [29] If B = (B, R^B_1, ..., R^B_m) is an arbitrary relational structure, then either CSP(B) is in PTIME or CSP(B) is NP-complete.

In other words, the Dichotomy Conjecture says that the picture above describes the complexity of non-uniform constraint-satisfaction problems CSP(B) for arbitrary structures B. The basis for the conjecture is not only the evidence from Boolean constraint satisfaction and undirected-graph constraint satisfaction, but also the seeming inability to carry out the diagonalization argument of [51] using the constraint-satisfaction machinery [27].

The Dichotomy Conjecture inspired intensive research efforts that significantly advanced our understanding of the complexity of non-uniform constraint satisfaction. In particular, Bulatov confirmed two important cases of this conjecture. We say that a structure B = (B, R^B_1, ..., R^B_m) is a 3-element structure if B contains at most three elements. We say that B is conservative if all possible monadic relations on the universe are included, that is, every non-empty subset of B is one of the relations R^B_i of B.

Theorem 1.7. [8, 9] If B is a 3-element structure or a conservative structure, then either CSP(B) is in PTIME or CSP(B) is NP-complete. Moreover, in both cases the Tractability Classification Problem is decidable in polynomial time.

In spite of the progress made, the Dichotomy Conjecture remains unresolved in general. The research efforts towards this conjecture, however, have also resulted in the discovery of broad sufficient conditions for tractability and intractability of non-uniform constraint satisfaction that have provided unifying explanations for numerous seemingly disparate tractability and intractability results and have also led to the discovery of new islands of tractability of CSP(B). These broad sufficient conditions are based on concepts and techniques from two different areas: universal algebra and logic.
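For Boolean templates these six conditions can be tested mechanically. The sketch below (ours) uses the standard closure characterizations, which the text does not spell out but which are well known: a Boolean relation is Horn iff it is closed under coordinatewise AND, dual Horn iff closed under coordinatewise OR, bijunctive iff closed under the ternary majority function, and affine iff closed under the ternary XOR. These closure functions are precisely polymorphisms in the sense introduced below.

from itertools import product

def closed_under(R, f, k):
    return all(tuple(f(*(t[i] for t in ts)) for i in range(len(ts[0]))) in R
               for ts in product(R, repeat=k))

maj  = lambda a, b, c: (a & b) | (a & c) | (b & c)
xor3 = lambda a, b, c: a ^ b ^ c

TESTS = [
    lambda R: (0,) * len(next(iter(R))) in R,           # 0-valid
    lambda R: (1,) * len(next(iter(R))) in R,           # 1-valid
    lambda R: closed_under(R, lambda a, b: a & b, 2),   # Horn
    lambda R: closed_under(R, lambda a, b: a | b, 2),   # dual Horn
    lambda R: closed_under(R, maj, 3),                  # bijunctive
    lambda R: closed_under(R, xor3, 3),                 # affine
]

def schaefer_tractable(relations):
    return any(all(test(R) for R in relations) for test in TESTS)

def clause_rel(i):
    """All triples satisfying a 3-clause whose first i literals are negated."""
    falsifier = tuple([1] * i + [0] * (3 - i))
    return set(product((0, 1), repeat=3)) - {falsifier}

B_3sat = [clause_rel(i) for i in range(4)]
print(schaefer_tractable(B_3sat))   # False, as expected: 3-Sat is NP-complete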
The approach via universal algebra yields sufficient conditions for tractability of CSP(B) in terms of closure properties of the relations in B under certain functions on its universe B. Let R be an n-ary relation on a set B and f : B^k → B a k-ary function. We say that R is closed under f if, whenever t_1 = (t_1^1, t_1^2, ..., t_1^n), ..., t_k = (t_k^1, t_k^2, ..., t_k^n) are k (not necessarily distinct) tuples in R, the tuple (f(t_1^1, ..., t_k^1), f(t_1^2, ..., t_k^2), ..., f(t_1^n, ..., t_k^n)) is also in R. We say that f : B^k → B is a polymorphism of a structure B = (B, R_1, ..., R_m) if each of the relations R_j, 1 ≤ j ≤ m, is closed under f. It is easy to see that f is a polymorphism of B if and only if f is a homomorphism from B^k to B, where B^k is the k-th power of B. By definition, the k-th power B^k is the structure (B^k, R′_1, ..., R′_m) over the same vocabulary as B with universe B^k and relations R′_j, 1 ≤ j ≤ m, defined as follows: if R_j is of arity n, then R′_j(s_1, ..., s_n) holds in B^k if and only if R_j(s_1^i, ..., s_n^i) holds in B for 1 ≤ i ≤ k.

We write Pol(B) for the set of all polymorphisms of B. As it turns out, the complexity of CSP(B) is intimately connected to the kinds of functions that Pol(B) contains. This connection was first unveiled in [29], and explored in depth by Jeavons and his collaborators; for a recent survey see [10]. In particular, they showed that if Pol(B_1) = Pol(B_2) for two structures B_1 and B_2 (over finite vocabularies), then CSP(B_1) and CSP(B_2) are polynomially reducible to each other. Thus, the polymorphisms of a template B characterize the complexity of CSP(B). The above-mentioned dichotomy results for 3-element and conservative constraint satisfaction are based on a rather deep analysis of the appropriate sets of polymorphisms.
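The equivalence between polymorphisms and homomorphisms from B^k to B can be checked directly. A small Python sketch (our encoding) builds the k-th power explicitly and confirms that the majority function is a polymorphism of K2, in line with the tractability of 2-Colorability:

from itertools import product

def kth_power(universe, relations, k):
    """Relations of B^k: tuples that satisfy R coordinatewise in B."""
    pow_univ = list(product(universe, repeat=k))
    pow_rels = []
    for R in relations:
        n = len(next(iter(R)))
        pow_rels.append({s for s in product(pow_univ, repeat=n)
                         if all(tuple(s[j][i] for j in range(n)) in R
                                for i in range(k))})
    return pow_univ, pow_rels

def is_hom(h, rels_src, rels_dst):
    return all(tuple(h[a] for a in t) in Rd
               for Rs, Rd in zip(rels_src, rels_dst) for t in Rs)

K2 = {(0, 1), (1, 0)}                         # disequality on {0, 1}
maj = lambda a, b, c: (a & b) | (a & c) | (b & c)
univ3, rels3 = kth_power((0, 1), [K2], 3)
h = {s: maj(*s) for s in univ3}               # maj as a map from B^3 to B
print(is_hom(h, rels3, [K2]))                 # True: maj is in Pol(K2)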
1.5 Monotone Monadic SNP and Non-Uniform Constraint Satisfaction

We discussed earlier how non-uniform constraint satisfaction is related to the study of the expression complexity of conjunctive queries. We now show that it can also be viewed as the study of the data complexity of second-order logic. This will suggest a way to identify islands of tractability via logic.

As described in Chapters ?? and ??, existential second-order logic ESO defines, by Fagin's Theorem, precisely the complexity class NP. The class SNP (for strict NP) [46, 57] is a fragment of ESO, consisting of all existential second-order sentences with a universal first-order part, namely, sentences of the form (∃S′)(∀x)Φ(x, S, S′), where Φ is a first-order quantifier-free formula. We refer to the relations over the input vocabulary S as input relations, while the relations over the quantified vocabulary S′ are referred to as existential relations.

3-Sat is an example of an SNP problem. The input structure consists of four ternary relations C_0, C_1, C_2, C_3 on the universe {0, 1}, where C_i corresponds to a clause on three variables with the first i of them negated. There is a single existential monadic relation T describing a truth assignment. The condition that must be satisfied states that for all x_1, x_2, x_3, if C_0(x_1, x_2, x_3) then T(x_1) or T(x_2) or T(x_3), and similarly for the remaining C_i by negating T(x_j) if j ≤ i. Formally, we can express 3-Sat with the SNP sentence:

(∃T)(∀x1, x2, x3)((C0(x1, x2, x3) → T(x1) ∨ T(x2) ∨ T(x3)) ∧
                  (C1(x1, x2, x3) → ¬T(x1) ∨ T(x2) ∨ T(x3)) ∧
                  (C2(x1, x2, x3) → ¬T(x1) ∨ ¬T(x2) ∨ T(x3)) ∧
                  (C3(x1, x2, x3) → ¬T(x1) ∨ ¬T(x2) ∨ ¬T(x3))).

It is easy to see that CSP(B) is in SNP for each structure B. For each element a in the universe of B, we introduce an existentially quantified monadic relation T_a; intuitively, T_a(x) indicates that a variable x has been assigned value a by the homomorphism. The sentence ϕ_B says that the sets T_a cover all elements in the universe, and that the tuples in the input relations satisfy the constraints imposed by the structure B. Thus, if R(a_1, ..., a_n) does not hold in B, then ϕ_B contains the conjunct ¬(R(x_1, ..., x_n) ∧ T_{a_1}(x_1) ∧ ... ∧ T_{a_n}(x_n)). For
A Constraint Satisfaction Model of Causal Learning and Reasoning

York Hagmayer (york.hagmayer@bio.uni-goettingen.de)
Department of Psychology, University of Göttingen
Gosslerstr. 14, 37073 Göttingen, Germany

Michael R. Waldmann (michael.waldmann@bio.uni-goettingen.de)
Department of Psychology, University of Göttingen
Gosslerstr. 14, 37073 Göttingen, Germany

Abstract

Following up on previous work by Thagard (1989, 2000) we have developed a connectionist constraint satisfaction model which aims at capturing a wide variety of tasks involving causal cognitions, including causal reasoning, learning, hypothesis testing, and prediction. We will show that this model predicts a number of recent findings, including asymmetries of blocking, and asymmetries of sensitivity to structural implications of causal models in explicit versus implicit tasks.

Introduction

Causal reasoning has been widely investigated during the last decade, which has led to a number of interesting novel findings (see Shanks, Holyoak, & Medin, 1996; Hagmayer & Waldmann, 2001, for overviews). For example, it has been shown that participants' causal judgments are sensitive to the contingency between the cause and the effect, and that people's judgments reflect the causal models underlying the observed learning events (see Hagmayer & Waldmann, 2001; Waldmann, 1996). Moreover, causal reasoning has been studied in the context of a number of different tasks, such as learning, reasoning, categorization, or hypothesis testing.

Most psychological theories and computational models of causal learning and reasoning are rooted in two traditions. They are either based on associationistic or on probabilistic or Bayesian models (see Shanks et al., 1996; Thagard, 2000). Both kinds of models have been criticized. Associationistic learning networks have proven unable to capture the fundamental semantics of causal models because they are insensitive to the differences between learning events that represent causes versus effects (see Waldmann, 1996). By contrast, Bayesian networks are perfectly capable of representing causal models with links directed from causes to effects (see Pearl, 2000). However, although the goal of these networks is to reduce the complexity of purely probabilistic reasoning, realistic Bayesian models still require fairly complex computations, and they presuppose competencies in reasoning with numerical probabilities which seem unrealistic for untutored people (see Thagard, 2000, for a detailed critique of these models).

The aim of this paper is to introduce a more qualitatively oriented, connectionist constraint satisfaction model of causal reasoning and learning. Our model is inspired by Thagard's (2000) suggestion that constraint satisfaction models may qualitatively capture many insights underlying normative Bayesian network models in spite of the fact that constraint satisfaction models use computationally far simpler, and therefore psychologically more realistic, processes.
The model differs from standard associationist learning models (e.g., Rescorla & Wagner, 1972) in that it is capable of expressing basic differences between causal models. Our model embodies a uniform mechanism of learning and reasoning, which assesses the fit between data and causal models. This architecture allows us to model a wide range of different tasks within a unified model, tasks which in the literature have so far been treated as separate, such as learning and hypothesis testing.

Constraint Satisfaction Models

Constraint satisfaction models (Thagard, 1989, 2000) aim at capturing qualitative aspects of reasoning. Their basic assumption is that people hold a set of interconnected beliefs. The beliefs pose constraints on each other; they either support each other, contradict each other, or are unrelated. Coherence between the beliefs can be achieved by processes which attempt to honor these constraints.

Within a constraint satisfaction model beliefs are represented as nodes which represent propositions (e.g., "A causes B"). The nodes are connected by symmetric relations. The numerical activation of a node indicates the strength of the belief in the proposition. A belief that is highly activated is held strongly; a belief that is negatively activated is rejected. The activation of a node depends on the activation of all other nodes with which it is connected. More precisely, the net input to a single node j from all other nodes i is defined as the weighted sum of the activations a of all related nodes (following Thagard, 1989, p. 466, eq. 5):

net_j = Σ_i w_ij a_i(t)    (1)

The weights w represent the strength of the connection of the beliefs. In our simulations, they are generally pre-set to default values which are either positive or negative and remain constant throughout the simulation. At the beginning of the simulations, the activations of the nodes representing hypotheses are set to a low default value. However, nodes representing empirical evidence are connected to a special activation node whose activation remains constant at 1.0. This architecture allows us to capture the intuition that more faith is put into empirical evidence than into theoretical hypotheses (see Thagard, 1989). To update the activations in each cycle of the simulation, first the net input net_j to each node is computed using Equation 1. Second, the activations of all nodes are updated using the following equation (Thagard, 1989, p. 446, eq. 4):

a_j(t+1) = a_j(t)(1 − θ) + net_j(max − a_j(t))   if net_j > 0
         = a_j(t)(1 − θ) + net_j(a_j(t) − min)   otherwise.    (2)

In Equation 2, θ is a decay parameter that decrements the activity of each node in every cycle, min represents the minimum activation (−1) and max the maximum activation (+1). The activations of all nodes are updated until a stable equilibrium is reached, which means that the activations of all nodes no longer change substantially. To derive quantitative predictions it would be necessary to specify rules that map the final activations to different types of responses.
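A minimal sketch of this settling process (ours; the authors implemented their model in Microsoft Excel) applies Equations 1 and 2 until equilibrium. The decay value θ = 0.05 is an assumption on our part; the paper only states that defaults follow Thagard (1989):

import numpy as np

def settle(W, evidence, theta=0.05, max_a=1.0, min_a=-1.0,
           init=0.01, eps=1e-4, max_cycles=1000):
    """W[i, j]: symmetric link weight between nodes i and j; evidence[j]:
    weight from the special activation node (clamped at 1.0) into node j."""
    a = np.full(W.shape[0], init)
    for _ in range(max_cycles):
        net = W.T @ a + evidence                            # Equation 1
        new = np.where(net > 0,
                       a * (1 - theta) + net * (max_a - a),
                       a * (1 - theta) + net * (a - min_a))  # Equation 2
        if np.max(np.abs(new - a)) < eps:                   # stable equilibrium
            return new
        a = new
    return a

# Two competing hypotheses (inhibitory link) supported by one evidence node:
W = np.array([[0.00, -0.20, 0.05],
              [-0.20, 0.00, 0.05],
              [0.05,  0.05, 0.00]])
print(settle(W, evidence=np.array([0.0, 0.0, 0.05])))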
Specifying such response rules is an important goal which should be addressed in future research. In the present article we only derive ordinal, qualitative predictions from the model.

The Model

Following causal-model theory (Waldmann, 1996) we assume that people typically enter causal tasks with initial assumptions about the causal structure they are going to observe. Even though specific knowledge about causal relations may not always be available, people often bring to bear knowledge about abstract features of the models, such as the distinction between events that refer to potential causes and events that refer to potential effects. In virtually all psychological studies this information can be gleaned from the initial instructions and the materials (see Waldmann, 1996).

Figure 1 displays an example of how the model represents a causal model. The nodes represent either causal hypotheses or observable events. The causal hypothesis node at the top represents a structural causal hypothesis (H1), in this case the hypothesis that the three events e1, e2, x form a common-effect structure with e1 and e2 as the two alternative causes and x as the common effect. The two nodes on the middle level refer to the two causal relations H2 and H3 that are part of the common-effect model with two causes and a single effect. The nodes on the lowest level refer to all patterns of events that can be observed with three events (a dot represents "and"). On the left side, the nodes represent patterns of three events, in the middle pairs, and on the right side single events. Not only the present but also the corresponding absent events are represented within this model (for example ~x). The links connecting the nodes represent belief relations. Thus, they do not represent probabilities or causal relations as in Bayesian models. There are two different kinds of connections between the nodes. Solid lines indicate excitatory links, dashed lines inhibitory links.

How are the connections defined? A connection is positive if the propositions support each other. For example, if all three events are present, the observation is in accordance with both hypotheses H2 and H3. This pattern might be observed if both e1 and e2 cause x. Therefore the evidence node e1.e2.x is positively connected to H2 and H3. In general, a hypothesis is positively connected to an evidence node if the events mentioned in the hypothesis are either all present or all absent. If this is not the case, that is, if one of the relevant events specified in the hypothesis is absent, the link is assigned the negative default value. Exploratory studies have shown that participants share a common intuition whether a certain pattern of events supports or contradicts a hypothesis (Hagmayer & Waldmann, 2001). The assigned weights mirror these general intuitions. The weights of the links remain the same throughout the simulations. Figure 1 does not display the special activation node. This node was pre-set to 1.0 and attached to event nodes describing present events in the respective experiment.

Figure 1: Constraint satisfaction model of causal learning and reasoning. See text for further explanations.

In Figure 1, the dashed line between the hypotheses H2 and H3, which signifies an inhibitory link, is of special interest.
The network represents a common-effect structure. This means that there are two causes e1 and e2 which compete in explaining the occurrence of effect x. Therefore the two hypotheses referring to the individual causal relations have to be connected by an inhibitory link (see also Thagard, 2000). However, both hypotheses H2 and H3 are positively connected to the structural hypothesis H1. By contrast, a common-cause structure is represented slightly differently. In such a structure, event x would be the common cause of the two effects e1 and e2 (i.e., H1: x → e1.e2). A model of this structure looks almost identical to the one for the common-effect structure in Figure 1. There is only one very important difference. Because there is no competition between the effects of a common cause, a common-cause model has no inhibitory link between H2 and H3. All other nodes and links in the two models are identical.

Both the common-effect and the common-cause model were implemented using Microsoft Excel. Default values were adopted from the literature if not indicated otherwise (Thagard, 1989). Initial activations were set to 0.01, inhibitory links between nodes to −0.05, and excitatory links to +0.05. The inhibitory link between H2 and H3 within the common-effect model was pre-set to a value of −0.20. The special activation node was attached to all evidence nodes. The additional activation was divided among the evidence nodes according to the relative frequency of the evidence in the learning input. This principle captures the intuition that more faith is put into evidence that is observed more frequently.

Evaluation

In order to evaluate the proposed constraint satisfaction model, different tasks and paradigms from the literature on causal learning and reasoning were modeled. One of our main goals was to show that the same architecture can be used to simulate different types of tasks. However, different tasks required different sections of the model depicted in Figure 1. We used two principles for the construction of task-specific networks. The first principle is that we only included the event nodes that corresponded to the event patterns observed in the learning phase or that corresponded to events that have to be evaluated or predicted in the test phase. For example, to model a task in which only event triples were shown, only the event nodes on the left side of the event layer in Figure 1 would be incorporated in the model. However, if the task following the learning phase required the prediction of single events, the corresponding nodes for single events would have to be added to the event layer. The second principle is that only the hypothesis nodes were included that represent hypotheses that are given or suggested to participants. These two principles ensure that for each paradigm a minimally sufficient sub-model of the complete model is instantiated.

Test 1: Asymmetries of Blocking

Blocking belongs to the central phenomena observed in associative learning which, among other findings, have motivated learning rules that embody cue competition (e.g., Rescorla & Wagner, 1972). A typical blocking experiment consists of two learning phases. In Phase 1 participants learn that two events e1 and x are either both present or absent. In Phase 2 a third event e2 is introduced. Now all three events are either present or absent. In both phases, events e1 and e2 represent cues and x the outcome to be predicted. Associative theories generally predict a blocking effect, which means that participants should be reluctant about the causal status of the redundant event e2 that has been
constantly paired with the predictive event e1 from Phase 1. This prediction has come under attack by recent findings showing that the blocking effect depends on the causal model learners bring to bear on the task (see Waldmann, 1996, 2000). If participants assume that e1 and e2 are the causes of x (common-effect structure), a blocking effect can be seen. In contrast, if participants assume that e1 and e2 are the collateral effects of the common cause x (common-cause structure), no blocking of e2 is observed. In this condition, learners tend to view both e1 and e2 as equally valid diagnostic cues of x.

To model blocking, we used a network that was extended after Phase 1. In Phase 1, the net consisted of a hypothesis node (H2) and the nodes for patterns of two events (e1, x). After Phase 1, the final activation of the hypothesis node was transferred to Phase 2. In Phase 2, the network consisted of two nodes for the two causal hypotheses (H2 and H3), and nodes that represented the patterns of three events which participants observed within the learning phase. Furthermore, the node H1 was included, which, depending on the condition, either coded a common-cause or a common-effect hypothesis. The nodes for the event pairs from Phase 1 were removed.

Figure 2 shows the activation of the two hypotheses referring to the causal relations in Phases 1 and 2. Figure 2A depicts the activations for the common-cause model and Figure 2B for the common-effect model.

Figure 2A: Simulation of a blocking paradigm (Test 1). Activation of hypothesis nodes for a common-cause model. The solid line represents the activation of H2: x → e1, the dotted line of H3: x → e2. Phase 2 started at the 101st cycle.

The model shows no blocking for event e2 in the context of the common-cause model. It quickly acquires the belief that there is a causal connection between x and e2.

Figure 2B: Simulation of a blocking paradigm (Test 1). Activation of hypothesis nodes for a common-effect structure. The upper line represents the activation of H2: e1 → x, the lower line of H3: e2 → x. Phase 2 started at the 101st cycle.

For the common-effect model the simulation shows blocking of the second cause; that is, the second hypothesis is believed to be wrong. Thus, the simulations closely correspond to the empirical finding that blocking interacts with the structure of the causal model used to interpret the learning data.

Test 2: Testing Complex Causal Hypotheses

The first test of the model used a phenomenon from the literature on causal learning. We now want to turn to a completely different paradigm, hypothesis testing. In experiments on causal learning participants are typically instructed about a causal structure, and the task is to learn about the causal relations within the structure. They are not asked whether they believe that the structure is supported by the learning data or not. In recent experiments (Hagmayer, 2001; Hagmayer & Waldmann, 2001) we gave participants the task to test a complex causal model hypothesis. For example, we asked them whether three observed events support a common-cause hypothesis or not. Normatively, this task should be solved by testing the implications of the given structural hypothesis. For example, a common-cause model implies a (spurious) correlation of the effects of the single common cause. In contrast, a common-effect structure does not imply a correlation of the different causes of the joint effect. Unless there is an additional hidden event that causes a correlation among the causes, they should be uncorrelated.
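The differing statistical footprints of the two structures are easy to demonstrate by simulation. The following Python sketch is our illustration (all probabilities are arbitrary): it samples data from a common-cause and from a common-effect structure and compares the resulting correlation between e1 and e2:

import random

random.seed(0)
N, p = 100_000, 0.5

def corr(u, v):
    """Pearson correlation for two 0/1 sequences."""
    mu, mv = sum(u) / N, sum(v) / N
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v)) / N
    return cov / ((mu * (1 - mu) * mv * (1 - mv)) ** 0.5)

# Common cause: x -> e1, x -> e2 (each causal link succeeds with prob. 0.8)
x  = [random.random() < p for _ in range(N)]
e1 = [xi and random.random() < 0.8 for xi in x]
e2 = [xi and random.random() < 0.8 for xi in x]
print(round(corr(e1, e2), 2))   # clearly positive: a spurious correlation

# Common effect: e1 -> x <- e2, with e1 and e2 generated independently
e1 = [random.random() < p for _ in range(N)]
e2 = [random.random() < p for _ in range(N)]
print(round(corr(e1, e2), 2))   # approximately 0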
In the experiment, participants were given data which either displayed a correlation between all three events (data set 1) or correlations between e1-x and e2-x only, that is, e1 and e2 were marginally independent in this data (data set 2). Data set 1 was consistent with a common-cause hypothesis, which implies correlations between all three events. In contrast, data set 2 favors the common-effect hypothesis with x as the effect and e1 and e2 as independent causes. However, in a series of experiments we found that participants were not aware of these differential structural implications when testing the two hypotheses. Instead they checked whether the individual causal relations within the complex structures held (e.g., e1-x). Thus, participants dismissed a hypothesis if one of the assumed causal links was missing. However, they proved unable to distinguish between the common-cause and the common-effect structure when both structures specified causal connections between the same events (regardless of the direction).

To model this task we used the model without the nodes for event pairs and individual events. The special activation node was connected to the patterns of three events. As before, the activation of the individual event patterns was proportional to the frequency of the respective pattern in the data. To test the model, we used three sets of data: either all three events were correlated (data set 1), or e1 and x, and e2 and x, were correlated while e1 and e2 were marginally independent (data set 2), or e1 and x, and e1 and e2, were correlated while e2 and x were uncorrelated (data set 3). As competing hypotheses we either used a common-cause model with x as the common cause, or a common-effect model with x as the common effect. Figure 3 shows the activation of the node H1, which represents the hypothesis that the respective causal model underlies the observed data. Figure 3A shows the results for the common-cause hypothesis, Figure 3B for the common-effect hypothesis. The results clearly mirror the judgments of our participants.
Whenever the two assumed causal relations within either causal model were represented in the data, the structural hypothesis was accepted (solid lines); if one link was missing, the hypothesis was rejected (dotted line). One slight deviation from our empirical findings was observed. In early cycles there seems to be an effect favoring the common-effect hypothesis with data consistent with this hypothesis. However, the difference between the hypotheses is relatively small and further decreases after 100 updating cycles. Thus, the results are consistent with participants' insensitivity to structural implications of causal models in hypothesis-testing tasks.

Figure 3A: Activation of hypothesis node H1 for a common-cause model (Test 2). The solid lines represent the activations for data sets 1 and 2, the dotted line the activations for data set 3.

Figure 3B: Activation of hypothesis node H1 for a common-effect model (Test 2). The solid lines represent the activations for data sets 1 and 2, the dashed line at the bottom the activations for data set 3.

Why does the model not differentiate between the two causal structures? The reason is that it is assumed that complex structural hypotheses are not directly linked to empirical evidence. In our model empirical evidence is connected to the hypotheses that represent individual causal links, which in turn are linked to more complex model-related hypotheses. This architecture makes it possible to model learning and hypothesis testing within the same model. It also seems to capture the empirical finding that participants can easily decide whether a certain pattern of events supports a simple causal hypothesis, but have a hard time relating event patterns to complex causal hypotheses.

Test 3: Causal Inferences

In the previous section we have mentioned studies showing insensitivity to spurious relations implied by causal models. A last test for our model is a task in which participants have to predict other events under the assumption that a certain causal model holds. Interestingly, we have empirically demonstrated sensitivity to structural implications of causal models in this more implicit task (Hagmayer & Waldmann, 2000). In this task participants do not have to evaluate the validity of a causal model in light of observed evidence but rather are instructed to use causal models when predicting individual events. In our experiments we presented participants with two learning phases in which they learned about two causal relations one at a time. Thus, in each phase participants only received information about the presence and absence of two events (x and e1, or x and e2). They never saw patterns of all three events during the experiment. The initial instructions described the two causal relations, which were identically presented across conditions, either as parts of a common-cause model with x as the cause or as parts of a common-effect model with x as the effect. After participants had learned about the two causal relations we asked them to predict whether e1 and e2 were present given that x was present. We found that participants were more likely to predict that both e1 and e2 would co-occur when x was viewed as the common cause than when it was seen as a common effect. Thus, in this more implicit task the predictions expressed knowledge about structural implications of causal models. In particular, the patterns the participants predicted embodied a spurious correlation among the effects of a common cause, whereas the causes of a common effect tended to be marginally uncorrelated in the predicted patterns. By contrast, in a
more direct task which required explicit judgments about correlations,no such sensitivity was observed,which is consistent with the results reported in the previous section.To model this experiment we eventually used the com-plete network depicted in Figure1which was successively augmented according to our two principles.In Phase1,the learning phase,patterns of two events were connected to the hypotheses H2and H3.Depending on the learning condi-tion,these two hypotheses were either linked to a common-cause or a common-effect hypothesis(H1).The activations of the hypothesis nodes at the end of Phase1were used as initial activation values in Phase2.In Phase2the model consisted of the three hypothesis nodes,the nodes for pat-terns of three events and the nodes representing single events.The single event nodes were included because the task required the prediction of individual events.The special activation node was now attached to event x.The model then predicted the other two individual events and patterns of all three events.The model quickly learned the causal relations during Phase1of the experiment.Figure4depicts the results of Phase2.Figure4A shows the predictions of the model for the condition in which participants assumed a common-cause model,Figure4B shows the results for the common-effect condition.The results of the simulations are consistent with the behavior we have observed in our participants. When the model assumes a common-cause model the pres-ence of x leads to a high positive activation of the two ef-fects e1and e2.This means that the model tends to prefer the prediction that the two effects of a common cause co-occur.In contrast,for the common-effect structure the model does not show such a preference.In this condition, both causes or either one of them equally qualify as possible explanations of the observed effect.This means that our model,similar to the one Thagard(2000)has proposed, tends to“explain away”the second cause when one of the competing causes is present.This is a consequence of the competition between the two causal hypothesis H2and H3.Figure4A:Implicit causal inferences(Test3).Activa-tion of single event nodes for the common-cause model: Event x(top),events e1and e2(bottom)Figure4B:Implicit causal inferences(Test3).Activa-tion of single event nodes for the common-effect model: Event x(top),event e1(middle),event e2(bottom)DiscussionA constraint satisfaction model of causal learning and rea-soning was presented in this paper that extends the architec-ture and scope of the model proposed by Thagard(2000). 
Thagard’s model focuses upon causal explanations of singu-lar events and belief updating.Our aim was to create a model that allows it to model both learning and reasoning within causal models.The model was successfully applied to three different tasks.It modeled people’s sensitivity to struc-tural implications of causal models in tasks involving learn-ing and predictions whereas the same model also predicted that people would fail in tasks which required explicit knowledge of the statistical implications of causal models.One question that might be raised is whether the pro-posed model really captures learning or just models causal judgment.In our view,the concept of learning does not nec-essarily imply incremental updating of associative weights Our model embodies a hypothesis testing approach to learn-ing which assumes that learners modify the strength of belief in deterministic causal hypotheses based on probabilistic learning input.This view also underlies recent Bayesian models of causality(Pearl,2000).In the model the activa-tion(i.e.,degree of belief)of the hypothesis nodes is modi-fied based on the learning input.This way the model is ca-pable of modeling trial-by-trial learning as well as learning based on summary data within the same architecture.Thus far we have pre-set the weights connecting evi-dence and hypotheses.In our view,the assigned values re-flect everyday qualitative intuitions about whether an event pattern supports or contradicts a hypothesized causal hy-pothesis.These weights remained constant throughout the simulations.Despite this restriction the model successfully predicted empirical phenomena in learning and reasoning. However,pre-setting these weights is not a necessary feature of the model.It is possible to add a learning component that acquires knowledge about the relation between event pat-terns and hypotheses based on feedback in a prior learning phase(see Wang et al.,1998,for a model adding associative learning to Echo).In summary,our constraint satisfaction model seems to offer a promising new way to model causal learning and reasoning.It is capable of modeling phenomena in a wide range of different tasks,which thus far have been treated as separate in the literature.Relative to normative Bayesian models,our connectionist model allows it to simulate a large number of different tasks and different phenomena while using fairly simple computational routines.It proved capable of capturing a number of recent phenomena that have pre-sented problems to extant models of causal cognition.More tests of the model clearly seem warranted.ReferencesHagmayer,Y.(2001).Denken mit undüber Kausalmodelle. Unpublished Doctoral Dissertation,University of Göttin-gen.Hagmayer,Y.,&Waldmann,M.R.(2000).Simulating causal models:The way to structural sensitivity.In L.R. Gleitman& A.K.Joshi(Eds.),Proceedings of the Twenty-Second Annual Conference of the Cognitive Sci-ence Society(pp.214-219).Mahwah,NJ:Erlbaum. Hagmayer,Y.,&Waldmann,M.R.(2001).Testing com-plex causal hypotheses.In M.May&U.Oestermeier (Eds.),Interdisciplinary perspectives on causation(pp.59-80).Bern:Books on Demand.Pearl,J.(2000).Causality:Models,reasoning,and infer-ence.Cambridge:Cambridge University Press. Rescorla,R.A.,&Wagner,A.R.(1972).A theory of Pav-lovian conditioning:Variations in the effectiveness of rein-forcement and non-reinforcement.In A.H.Black&W.F. 
Prokasy(Eds.),Classical conditioning II.Current re-search and theory(pp.64-99)New York:Appleton-Century-Crofts.Shanks,D.R.,Holyoak,K.J.,&Medin,D.L.(Eds.)(1996). The psychology of learning and motivation,Vol.34: Causal learning.San Diego:Academic Press. Thagard,P.(1989).Explanatory coherence.Behavioral and Brain Sciences,12,435-467.Thagard,P.(2000).Coherence in thought and action.Cam-bridge,MA:MIT Press.Waldmann,M.R.(1996).Knowledge-based causal induc-tion.In D.R.Shanks,K.J.Holyoak&D.L.Medin(Eds.), The psychology of learning and motivation,Vol.34: Causal learning(pp.47-88).San Diego:Academic Press. Waldmann,M.R.(2000).Competition among causes but not effects in predictive and diagnostic learning.Journal of Experimental Psychology:Learning,Memory,and Cognition,26,53-76.Wang,H.,Johnson,T.R.,&Zhang,J.(1998).UEcho:A model of uncertainty management in human abductive rea-soning.In M.A.Gernsbacher&S.R.Derry(Eds.),Pro-ceedings of the Twentieth Annual Conference of the Cog-nitive Science Society(pp.1113-1118).Mahwah,NJ:Erl-baum.。
A Constraint Satisfaction Approach to the Robust Spanning Tree Problem with Interval Data

Ionut Aron and Pascal Van Hentenryck
Brown University, Department of Computer Science, Providence, RI 02912
E-mail: {ia,pvh}

Abstract. Robust optimization is one of the fundamental approaches to deal with uncertainty in combinatorial optimization. This paper considers the robust spanning tree problem with interval data, which arises in a variety of telecommunication applications. It proposes a constraint satisfaction approach using a combinatorial lower bound, a pruning component that removes infeasible and suboptimal edges, as well as a search strategy exploring the most uncertain edges first. The resulting algorithm is shown to produce very dramatic improvements over the mathematical programming approach of Yaman et al. and to enlarge considerably the class of problems amenable to effective solutions.

1 Introduction

In particular, the algorithm runs several hundred times faster than the MIP approach on the instance data proposed in [Yaman et al., 2001]. It also exhibits even more significant speed-ups on other instances which have more structure. In addition, the constraint satisfaction approach significantly broadens the class of instances that are amenable to effective solutions. Observe also that the constraint satisfaction approach should apply equally well to other robust optimization problems, such as robust matching and robust shortest paths, which also arise in telecommunications and computational finance. Combinatorial lower bounds can be obtained similarly, and the challenge is mainly to find effective characterizations of suboptimal solutions. As a consequence, we believe that constraint satisfaction approaches are likely to play a fundamental role in the solving of robust optimization problems in the future. We also believe that the concept of suboptimality pruning, i.e., removing values that cannot appear in any optimal solution, is fundamental and deserves further study for optimization problems in general.

The rest of the paper is organized as follows. Sections 2 and 3 define the problem and discuss prior work. Sections 4, 5, 6, and 7 discuss the search algorithm, the lower bound, suboptimality pruning, and branching. Section 8 presents the experimental results, and Section 9 concludes the paper.

2 The Problem

Informally speaking, the robust spanning tree problem, given an undirected graph with interval edge costs, amounts to finding a tree whose cost is as close as possible to that of a minimum spanning tree under any possible assignment of costs. This section defines the problem formally and introduces the main concepts and notations used in the paper.

We are given an undirected graph $G = (V, E)$ with $|V| = n$ nodes and $|E| = m$ edges, and an interval $[\underline{c}_e, \overline{c}_e]$ for the cost of each edge $e \in E$. A scenario $s$ is a particular assignment of a cost $c_e \in [\underline{c}_e, \overline{c}_e]$ to each edge $e \in E$. We use $c^s_e$ to denote the cost of edge $e$ under a given scenario $s$. Recall that a spanning tree for $G = (V, E)$ is a set of edges $T \subseteq E$ such that the subgraph $(V, T)$ is acyclic and $\forall i \in V, \exists j \in V : (i, j) \in T$. The cost of a spanning tree $T$ for a scenario $s$, denoted by $c^s_T$, is the sum of the costs of all its edges under scenario $s$:

  $c^s_T = \sum_{e \in T} c^s_e.$

A minimum spanning tree for scenario $s$, denoted by $MST^s$, is a spanning tree with minimal cost for scenario $s$; its cost is denoted by $c_{MST^s}$.
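To make these definitions concrete, here is a minimal Python sketch (ours, not from the paper; the edge-list representation and all names are illustrative assumptions). It represents a scenario as a dictionary mapping edges to costs, and computes $c^s_T$ and a minimum spanning tree with Kruskal's algorithm. The later sketches below reuse these helpers.

    class DSU:
        """Disjoint-set union used by Kruskal's algorithm."""
        def __init__(self, n):
            self.parent = list(range(n))

        def find(self, x):
            while self.parent[x] != x:
                self.parent[x] = self.parent[self.parent[x]]  # path halving
                x = self.parent[x]
            return x

        def union(self, x, y):
            rx, ry = self.find(x), self.find(y)
            if rx == ry:
                return False  # x and y were already connected
            self.parent[rx] = ry
            return True

    def tree_cost(tree, cost):
        """c^s_T: the cost of the edge set `tree` under scenario `cost`."""
        return sum(cost[e] for e in tree)

    def mst(n, edges, cost):
        """A minimum spanning tree (as a set of edges) of the graph with
        vertices 0..n-1 and the given edge list, under scenario `cost`."""
        dsu, tree = DSU(n), set()
        for e in sorted(edges, key=lambda e: cost[e]):
            if dsu.union(e[0], e[1]):
                tree.add(e)
        return tree

For instance, with edges = [(0, 1), (1, 2), (0, 2)] and cost = {(0, 1): 1, (1, 2): 2, (0, 2): 4}, mst(3, edges, cost) returns {(0, 1), (1, 2)}, whose tree_cost under the same scenario is 3.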
Following [Yaman et al., 2001], we now define the worst case scenario for a spanning tree $T$ and the robust deviation of $T$, two fundamental concepts for this paper.

Definition 2.1 (Relative worst case scenario). Given a spanning tree $T$, a scenario $w(T)$ which maximizes the difference between the cost of $T$ and the cost of a minimum spanning tree under $w(T)$ is called a relative worst case scenario for $T$. More precisely, a relative worst case scenario $w(T)$ satisfies

  $w(T) \in \text{arg-max}_{s \in S} \, (c^s_T - c_{MST^s})$    (1)

where $S$ is the set of all possible scenarios. Note that $\text{arg-max}_{s \in S} f(s)$ denotes the set $\{ s \in S \mid f(s) = \max_{s' \in S} f(s') \}$, and arg-min is defined similarly.

Definition 2.2 (Robust Deviation). The robust deviation of a spanning tree $T$, denoted by $\Delta_T$, is the distance between the cost of $T$ and the cost of a minimum spanning tree under the relative worst case scenario of $T$:

  $\Delta_T = c^{w(T)}_T - c_{MST^{w(T)}}.$    (2)

The goal of this paper is to compute a robust spanning tree, i.e., a spanning tree whose robust deviation is minimal.

Definition 2.3 (Robust Spanning Tree). A (relative) robust spanning tree is a spanning tree $T^*$ whose robust deviation is minimal, i.e.,

  $T^* \in \text{arg-min}_{T \in \Gamma_G} \max_{s \in S} \, (c^s_T - c_{MST^s})$

where $\Gamma_G$ is the set of all spanning trees of $G$ and $S$ is the set of all possible scenarios. According to the definition of $w(T)$ (Eq. 1), this is equivalent to

  $T^* \in \text{arg-min}_{T \in \Gamma_G} \, (c^{w(T)}_T - c_{MST^{w(T)}}).$

Figure 1 depicts these concepts graphically. The horizontal axis depicts various scenarios. For each scenario $s$, it shows the cost $c^s_T$ of $T$ under $s$ and the cost $c_{MST^s}$ of an MST under $s$. In the figure, scenario $s_5$ is a worst-case scenario, since the distance $c^{s_5}_T - c_{MST^{s_5}}$ is maximal, and this distance is then the robust deviation of $T$. Such a figure can be drawn for each spanning tree, and we are interested in finding the spanning tree with the smallest robust deviation. [Kouvelis and Yu, 1997] conjectured that the robust spanning tree problem with interval edges is NP-complete.

Figure 1. The relative worst case scenario determines the robust deviation $\Delta_T$ of a given spanning tree $T$. Our goal is to find a tree $T^*$ for which this deviation (distance) is minimal.

3 Prior Work

[Yaman et al., 2001] proposed an elegant MIP formulation for the RSTPIE problem. The formulation combines the single commodity model of the minimum spanning tree with the dual of the multicommodity model for the same problem. In addition, they introduced the concept of weak and strong edges and used them for preprocessing. They showed that preprocessing significantly enhances the performance of their MIP implementation. We now summarize their relevant results.

We first introduce the concept of weak edges, i.e., the only edges that need to be considered during the search.

Definition 3.1 (Weak Tree). A tree $T \subseteq E$ is weak if there exists at least one scenario under which $T$ is a minimum spanning tree of $G$.

Definition 3.2 (Weak Edge). An edge $e \in E$ is weak if it lies on at least one weak tree.

Strong edges are edges that are necessarily part of a robust spanning tree.

Definition 3.3 (Strong Edge). An edge $e \in E$ is strong if it lies on a minimum spanning tree of $G$ for all possible scenarios.

The following propositions characterize weak and strong edges in terms of the minimum spanning trees of two extreme scenarios.

Proposition 3.4. An edge $e \in E$ is weak if and only if there exists a minimum spanning tree using edge $e$ when its cost is at its lower bound and the costs of the remaining edges are at their upper bounds.

Proposition 3.5. An edge $e \in E$ is strong if and only if there exists a minimum spanning tree using edge $e$ when its cost is at its upper bound and the costs of the remaining edges are at their lower bounds.

As a consequence, [Yaman et al., 2001] showed that a slightly modified version of Kruskal's algorithm finds all the weak and strong edges of a given graph in $O(m^2 \log m)$ time. In Section 6, we give an improved algorithm for finding weak edges, which runs in $O(m \log m)$ time on dense graphs.
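Propositions 3.4 and 3.5 reduce the weakness and strength tests to a single MST computation under an extreme scenario. The sketch below (ours, reusing the DSU helper from the previous sketch; `lo` and `hi` map each edge to its interval bounds) implements both tests. The one subtlety is that "there exists a minimum spanning tree using edge e" calls for breaking cost ties in favor of e.

    def in_some_mst(n, edges, cost, e):
        """True iff edge e lies on at least one MST under `cost`.  Kruskal
        with ties broken in favor of e suffices: e is then rejected only if
        its endpoints are already joined by strictly cheaper edges, in which
        case no MST can contain e."""
        dsu, found = DSU(n), False
        for f in sorted(edges, key=lambda f: (cost[f], f != e)):
            if dsu.union(f[0], f[1]) and f == e:
                found = True
        return found

    def is_weak(n, edges, lo, hi, e):
        """Proposition 3.4: e at its lower bound, all other edges at their
        upper bounds."""
        cost = {f: (lo[f] if f == e else hi[f]) for f in edges}
        return in_some_mst(n, edges, cost, e)

    def is_strong(n, edges, lo, hi, e):
        """Proposition 3.5: e at its upper bound, all other edges at their
        lower bounds."""
        cost = {f: (hi[f] if f == e else lo[f]) for f in edges}
        return in_some_mst(n, edges, cost, e)

Testing every edge this way costs one MST computation per edge, which corresponds to the O(m^2 log m) behavior mentioned above; Section 6 improves on this for weak edges.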
The following two propositions capture the intuition we gave earlier on weak and strong edges.

Proposition 3.6. A relative robust spanning tree $T$ is a weak tree. Thus an edge can be part of a relative robust spanning tree only if it is a weak edge.

Proposition 3.7. There exists a relative robust tree $T$ such that every strong edge in the graph lies on $T$.

The next result is also fundamental: it makes it possible to characterize precisely the worst case scenario of a spanning tree $T$.

Proposition 3.8. The scenario in which the costs of all edges in a spanning tree $T$ are at their upper bounds and the costs of all other edges are at their lower bounds is a relative worst case scenario for $T$. In other words, $w(T)$ is specified by

  $c^{w(T)}_e = \overline{c}_e$ if $e \in T$, and $c^{w(T)}_e = \underline{c}_e$ if $e \in E \setminus T$.    (3)

In other words, once we select a tree $T$, we can easily find the worst-case scenario for $T$ by assigning to the edges in $T$ their upper bounds and to the edges not in $T$ their lower bounds. We use Proposition 3.8 in our new algorithm. In the rest of the paper, we also use the notation $w(S)$ to denote the scenario where $c_e = \overline{c}_e$ if $e \in S$ and $c_e = \underline{c}_e$ otherwise, even when $S$ is not a tree.

4 The Search Algorithm

Figure 2 gives a high-level description of the search algorithm. A node in the search tree is called a configuration to avoid confusion with the nodes of the graph. A configuration is a pair ⟨S,R⟩, where S represents the set of selected edges and R represents the set of rejected edges. The algorithm receives a configuration as input. If the configuration is not a spanning tree, the algorithm prunes infeasible and suboptimal edges. Both steps remove edges from E \ (S ∪ R) and add them to R. If the lower bound of the resulting configuration is smaller than the value f* of the best solution found so far, the algorithm selects an edge and explores two subproblems recursively.

  procedure Search(⟨S,R⟩)
  begin
    if |S| < n - 1 then
      ⟨S,R⟩ = PruneInfeasible(⟨S,R⟩);
      ⟨S,R⟩ = PruneSuboptimal(⟨S,R⟩);
      if LB(⟨S,R⟩) ≤ f* then
        e = SelectEdge(E \ (S ∪ R));
        Search(⟨S, R ∪ {e}⟩);
        Search(⟨S ∪ {e}, R⟩);
    else if Δ_S < f* then
      T* = S;
      f* = Δ_S
  end

Figure 2. The Search Algorithm.

The notation used in the rest of the paper is summarized in Table 1.

Table 1. Notation.
  $c^s_T$ : the cost of tree T under scenario s; $c^s_T = \sum_{e \in T} c^s_e$
  $w(T)$ : the worst case scenario for an edge set T; $c^{w(T)}_e = \overline{c}_e$ if $e \in T$, $\underline{c}_e$ if $e \in E \setminus T$
  $\Gamma_G$ : the set of all spanning trees of graph G
  $T(S,R)$ : the set of spanning trees derived from ⟨S,R⟩; $T(S,R) = \{ T \in \Gamma_G \mid S \subseteq T \subseteq E \setminus R \}$
  $M^s(S,R)$ : the set of spanning trees from $T(S,R)$ with minimal cost under scenario s; $M^s(S,R) = \text{arg-min}_{T \in T(S,R)} c^s_T$
  $MST^s(S,R)$ : an arbitrary tree from $M^s(S,R)$
  $M^s$ : $M^s(\emptyset,\emptyset)$
  $MST^s$ : an arbitrary tree from $M^s(\emptyset,\emptyset)$
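Proposition 3.8 is what makes the objective of the search computable: evaluating the deviation of a candidate tree takes a single MST computation. Below is a sketch of this evaluation (ours, reusing tree_cost and mst from the first sketch), together with a brute-force baseline that enumerates spanning trees; the branch-and-bound of Figure 2 is designed to replace exactly this enumeration.

    from itertools import combinations

    def robust_deviation(n, edges, lo, hi, tree):
        """Delta_T via Proposition 3.8: the edges of T at their upper
        bounds, all other edges at their lower bounds."""
        w = {e: (hi[e] if e in tree else lo[e]) for e in edges}
        return tree_cost(tree, w) - tree_cost(mst(n, edges, w), w)

    def robust_tree_bruteforce(n, edges, lo, hi):
        """Exhaustive baseline for very small graphs: enumerate every
        (n-1)-edge subset, keep the spanning trees, minimize Delta_T."""
        best, best_dev = None, float("inf")
        for cand in combinations(edges, n - 1):
            t = set(cand)
            # An acyclic set of n-1 edges on n vertices is a spanning tree,
            # and Kruskal keeps all n-1 edges exactly when t is acyclic.
            if len(mst(n, list(t), {e: 0 for e in t})) == n - 1:
                dev = robust_deviation(n, edges, lo, hi, t)
                if dev < best_dev:
                    best, best_dev = t, dev
        return best, best_dev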
6 Suboptimality Pruning

Proposition 6.1. Let $s$ be a scenario and $T \in M^s$. Let $e = (u,v) \notin T$ and let $f = (x,y)$ be the edge of maximal cost on the path from $u$ to $v$ in $T$. Consider the scenario $\hat{s}$ obtained from $s$ by replacing $c^s_e$ with $c^{\hat{s}}_e$, all other costs remaining the same. Then $T \in M^{\hat{s}}$ if $c^{\hat{s}}_e \ge c^{\hat{s}}_f$.

Proof: By contradiction. Assume that there exists a tree $T'$ containing $e$ such that $c^{\hat{s}}_{T'} < c^{\hat{s}}_T$. Removing $e$ from $T'$ produces two connected components $C_1$ and $C_2$. If $x \in C_1$ and $y \in C_2$, then we can construct a tree $T'' = T' \setminus \{e\} \cup \{f\}$ and we have

  $c^{\hat{s}}_{T''} = c^{\hat{s}}_{T'} - (c^{\hat{s}}_e - c^{\hat{s}}_f) \le c^{\hat{s}}_{T'} < c^{\hat{s}}_T.$

Since $e \notin T''$ and $e \notin T$, we have $c^s_{T''} = c^{\hat{s}}_{T''} < c^{\hat{s}}_T = c^s_T$, which contradicts the fact that $T \in M^s$. If $x, y \in C_1$ (resp. $C_2$), then, since there exists a cycle in the graph containing $e$ and $f$, there exists at least one edge $g$ on the path from $u$ to $v$ in $T$ such that $g \notin T'$ (otherwise $T'$ would not be a tree). By hypothesis, $c^s_g \le c^s_f$ and hence $c^{\hat{s}}_e \ge c^{\hat{s}}_g$. We can thus apply the same construction as in the case $x \in C_1$ and $y \in C_2$, with $f$ replaced by $g$.

Proposition 6.2. Let $s$ be a scenario and $T$ be an $MST^s$. Let $e = (u,v) \notin T$ and $f = (x,y)$ be the edge of maximal cost on the path from $u$ to $v$. Then $T \setminus \{f\} \cup \{e\} \in M^s(\{e\}, \emptyset)$.

The proof of this result is similar to the proof of Proposition 6.1. We are now ready to present a new characterization of weak edges. The characterization only uses the scenario $\bar{s}$ where all costs are at their upper bounds.

Proposition 6.3. Let $\bar{s}$ be the scenario where all costs are at their upper bounds and let $T \in M^{\bar{s}}$. An edge $e = (u,v)$ is weak if and only if $e \in T$ or an edge $f$ of maximal cost on the path from $u$ to $v$ in $T$ satisfies $\underline{c}_e \le \overline{c}_f$.

Proof: Let $\hat{s}$ be the scenario $\bar{s}$ where $\overline{c}_e$ is replaced by $\underline{c}_e$. If $e \in T$, then $T \in M^{\hat{s}}$ as well and hence $e$ is weak by Proposition 3.4. If $e \notin T$ and $\underline{c}_e > \overline{c}_f$, then $T \in M^{\hat{s}}$ by Proposition 6.1. By Proposition 6.2, $T \setminus \{f\} \cup \{e\} \in M^{\hat{s}}(\{e\}, \emptyset)$ and its cost is greater than $c^{\hat{s}}_T$ since $\underline{c}_e > \overline{c}_f$; hence $e$ is not weak. If $e \notin T$ and $\underline{c}_e < \overline{c}_f$, then $T \setminus \{f\} \cup \{e\}$ is an $MST^{\hat{s}}$ and hence $e$ is weak. The same holds for the case where $e \notin T$ and $\underline{c}_e = \overline{c}_f$.

We now describe how to use Proposition 6.3 to obtain an $O(m \log m)$ algorithm on dense graphs. The key idea is to compute $MST^{\bar{s}}$, i.e., the MST when all costs are at their upper bounds. All edges in this MST are weak. In addition, for each edge $e = (u,v)$ not in the MST, we compute the largest cost of any edge on the path from $u$ to $v$. This can be done easily by upgrading Prim's algorithm slightly to associate a level and a parent with each vertex. When an edge $e = (u,v)$ with $u \in T$ and $v \notin T$ is added to $T$ in Prim's algorithm, we set

  level(v) = level(u) + 1;
  parent(v) = u;

These modifications do not affect the complexity of the algorithm. It now suffices to apply the algorithm depicted in Figure 3 to compute the cost of the maximal edge and to apply Proposition 6.3. The algorithm in Figure 3 simply follows the paths from $u$ and $v$ to their common ancestor, computing the cost of the maximal edge on the way.

  function MaxCost(u, v : vertices) : int
  begin
    if u = v then
      return 0;
    else if level(u) < level(v) then
      return max(cost(v, p(v)), MaxCost(u, p(v)));
    else if level(u) > level(v) then
      return max(cost(u, p(u)), MaxCost(p(u), v));
    else
      return max(cost(u, p(u)), cost(v, p(v)), MaxCost(p(u), p(v)));
  end

Figure 3. Finding the Largest Cost of an Edge.

Since algorithm MaxCost takes $O(n)$ in the worst case, the overall complexity becomes $O(m \log m + mn)$. However, it is easy to reduce this complexity to $O(m \log m + n^2)$ by amortizing the computation of the maximal costs. It suffices to cache the results of MaxCost in an $n \times n$ matrix $M$. For each edge $e = (u,v) \in E \setminus T$, we first examine entry $M[u,v]$ and call MaxCost(u, v) only if this entry is uninitialized. Since there are at most $n^2$ entries and each call to MaxCost costs $O(1)$ per entry it fills, the overall complexity becomes $O(m \log m + n^2)$. We have proved the following theorem.

Theorem 6.4. All weak edges of a graph can be computed in $O(m \log m + n^2)$ time.
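To illustrate Proposition 6.3 and Figure 3 together, here is a sketch of the weak-edge computation (ours). For brevity it builds the tree with the Kruskal helper from the first sketch and roots it by breadth-first search, which yields the same level and parent information the paper obtains from Prim's algorithm; it also omits the n x n memoization matrix, so it runs in O(m log m + mn) rather than O(m log m + n^2). It assumes a connected graph.

    from collections import defaultdict, deque

    def weak_edges(n, edges, lo, hi):
        """All weak edges via Proposition 6.3, for a connected graph with
        vertices 0..n-1; `lo` and `hi` map each edge to its bounds."""
        up = {e: hi[e] for e in edges}      # scenario s-bar: all upper bounds
        tree = mst(n, edges, up)            # T in M^{s-bar}; its edges are weak
        adj = defaultdict(list)
        for (u, v) in tree:
            adj[u].append(v)
            adj[v].append(u)
        parent, level = {0: 0}, {0: 0}      # root the tree at vertex 0
        queue = deque([0])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in level:
                    parent[v], level[v] = u, level[u] + 1
                    queue.append(v)

        def edge_cost(u, v):                # upper-bound cost of a tree edge
            return up[(u, v)] if (u, v) in up else up[(v, u)]

        def max_cost(u, v):
            """Figure 3: largest cost on the tree path from u to v, walking
            both endpoints up toward their common ancestor."""
            if u == v:
                return 0
            if level[u] < level[v]:
                return max(edge_cost(v, parent[v]), max_cost(u, parent[v]))
            if level[u] > level[v]:
                return max(edge_cost(u, parent[u]), max_cost(parent[u], v))
            return max(edge_cost(u, parent[u]), edge_cost(v, parent[v]),
                       max_cost(parent[u], parent[v]))

        # Proposition 6.3: e is weak iff it is in T or its lower bound does
        # not exceed the maximal upper-bound cost on the tree path u..v.
        return {e for e in edges if e in tree or lo[e] <= max_cost(e[0], e[1])}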
7 Branching Strategy

It is well known that a good branching heuristic may improve performance significantly. We now present a branching heuristic adapting the first-fail principle [Haralick and Elliot, 1980] to robust optimization. The key idea is to explore the most uncertain edges first, i.e., to branch on an edge $e$ with maximal difference $\overline{c}_e - \underline{c}_e$. Indeed, rejecting $e$ (i.e., adding $e$ to $R$) allows $MST^{w(S \cup L)}$ to select $e$ at a low cost, possibly giving a large deviation. Hence this branch is likely to fail early. However, this is only important if $MST^{w(S \cup L)}$ is likely to select $e$, which is not necessarily the case, e.g., when $\underline{c}_e$ is large. This motivates the following branching strategy.

Definition 7.1 (Branching Strategy). Let ⟨S,R⟩ be a configuration, $L = E \setminus (S \cup R)$, $L^* = L \cap MST^{w(S \cup L)}$, and $L^- = L \setminus MST^{w(S \cup L)}$. The branching strategy selects an edge

  $s \in \text{arg-max}_{e \in L^*} \, (\overline{c}_e - \underline{c}_e)$ if $L^* \neq \emptyset$, and $s \in \text{arg-max}_{e \in L^-} \, (\overline{c}_e - \underline{c}_e)$ otherwise.
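A sketch of SelectEdge following Definition 7.1 (ours; it reuses the mst helper from the first sketch, and using $\overline{c}_e - \underline{c}_e$ as the maximization objective is our reading of the most-uncertain-first principle stated above):

    def select_edge(n, edges, lo, hi, S, R):
        """Definition 7.1: among the free edges L, prefer those used by an
        MST under the scenario w(S u L); pick a most uncertain edge there.
        Assumes at least one free edge (the search only branches then)."""
        L = [e for e in edges if e not in S and e not in R]
        w = {e: (lo[e] if e in R else hi[e]) for e in edges}  # w(S u L)
        t = mst(n, edges, w)
        pool = [e for e in L if e in t] or [e for e in L if e not in t]
        return max(pool, key=lambda e: hi[e] - lo[e])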
8 Experimental Results

We now report experimental results comparing the constraint satisfaction and the MIP approaches. The comparison uses the instances of [Yaman et al., 2001], as well as some new instances which capture additional structure arising in practical applications.

The experimental setting of [Yaman et al., 2001] uses complete graphs with $n$ vertices and six classes of problems. Three of the six classes use tight intervals for the edge costs, while the other three allow larger differences between the lower and upper bounds. The edge intervals were chosen as follows; the values of $\underline{c}_e$ and $\overline{c}_e$ are uniformly distributed in the intervals

  class 1: $\underline{c}_e \in [0, 15]$, $\overline{c}_e \in [\underline{c}_e, 15]$;
  class 3: $\underline{c}_e \in [0, 10]$, $\overline{c}_e \in [\underline{c}_e, 20]$;
  class 5: $\underline{c}_e \in [0, 20]$, $\overline{c}_e \in [\underline{c}_e, 40]$.

Note that the size of the search space to explore is $O(2^{300})$ for a complete graph of 25 nodes. Of course, our constraint satisfaction algorithm will only explore a small fraction of that space. In addition to these six classes, we also generate instances (Classes 7 and 8) whose cost structure is not the same for all the edges. Class 7 contains instances which represent a two-level network. The lower level consists of clusters of 5 nodes whose edges are generated according to Class 1 above. The upper level links the clusters, and these edges have higher costs, i.e., Class 1 costs shifted by a constant which is larger than the Class 1 edges. This captures the fact that, in networks, there are often various types of edges with different costs. See Figure 4 for an instance of Class 7 with 35 nodes. Class 8 contains instances which are similar to Class 7, except that the upper-level layer is organized as a binary tree. See Figure 5 for an instance of Class 8 with 35 nodes. In general, Classes 7 and 8 are significantly harder than Classes 1 to 6, since the additional structure makes preprocessing less effective. Observe also that these instances are sparser for the same number of nodes.

Figure 4. A Class 7 Network.

Figure 5. A Class 8 Network.

Table 2 compares the efficiency of the two approaches. It reports the computation times in CPU seconds of the MIP implementation, of the constraint satisfaction algorithm with suboptimality pruning at the root node only (CSR), and of the constraint satisfaction algorithm with suboptimality pruning at every node (CSF). The times were obtained on a Sun Sparc 440 MHz processor, and each average is computed over 10 instances per class and size. We used CPLEX 6.51 for solving the MIP, after preprocessing of the weak and strong edges. Observe that the constraint satisfaction approach produces extremely dramatic speedups over the MIP approach. On Class 1, CSR runs about 134 times faster than the MIP on graphs with 15 nodes and about 217 times faster on graphs with 20 nodes. On Classes 7 and 8, the speed-ups are even more impressive: on graphs with 20 nodes, CSR runs about 3500 and 200 times faster than the MIP, respectively. (Recall that these graphs are not complete.) The MIP times are not given for graphs with more than 20 nodes, since they cannot be obtained in reasonable time. The results indicate that the constraint satisfaction approach is also better at exploiting the network structure, which is likely to be fundamental in practice. Observe also that the constraint satisfaction approach is able to tackle much larger instances in reasonable time.

Table 2. Average computation times in CPU seconds (each block of rows corresponds to one graph size, in increasing order).

  Algo.     Class 1     Class 3     Class 5     Class 7
  CSR          0.12        0.08        0.06        0.06
  CSF          0.14        0.12        0.09        0.07
  MIP          6.59        4.10        4.68        2.29

  CSR          1.82        1.09        0.86        1.76
  CSF          2.53        2.15        2.01        1.83
  MIP        245.19      136.88       91.33      730.20

  CSR         39.72        8.91        2.09        7.37
  CSF         74.86       12.66        4.69        5.28
  MIP       8620.66    14385.55    12862.29    18547.27

  CSR        121.85       68.42       12.23       91.89
  CSF        244.18      145.11       30.55       97.94

  CSR        926.65      942.38       63.85     1719.61
  CSF       2100.43     1811.88      167.58      804.78

  CSR       4639.55     2304.36      188.36    32511.77
  CSF      10771.37     4183.22      548.01    14585.73

  CSR      27206.38    15059.39     1122.57    57309.23
  CSF      29421.70    22084.00     3241.04    28432.91

Figures 6, 7, and 8 plot the execution times of the two constraint satisfaction algorithms. Observe that applying suboptimality pruning at every node is not cost-effective on Classes 1 to 6. This is due to the fact that the graphs are complete and the costs are uniformly distributed in these instances. Classes 7 and 8, which add a simple additional cost structure, clearly indicate that suboptimality pruning becomes increasingly important when the network is more heterogeneous. The benefits of suboptimality pruning appear clearly on large graphs of Class 8, where CSF is about three times as fast on average. In fact, CSF is significantly faster (e.g., 10 times faster) than CSR on some instances, while the two algorithms are comparable on the others.

Figure 8. Constraint Satisfaction on Class 8 (execution time in CPU seconds as a function of the number of edges).

9 Conclusion

Future work aims to develop an $O(m \log m)$ algorithm for detecting strong edges, to quantify the benefits of incremental MST algorithms, to investigate more sophisticated feasibility pruning, and to evaluate the approach on sparser graphs. Finally, it would be interesting to study whether suboptimality conditions can be derived for a wide variety of (robust) combinatorial optimization problems, given the ubiquity of these problems in practice. In particular, robust matching and robust shortest paths have fundamental applications in computational finance and networks and are worth investigating.

Acknowledgments

Ionut Aron is supported by NSF ITR Award ACI-0121497. Pascal Van Hentenryck is partially supported by NSF ITR Award DMI-0121495.

References

[Bertsimas and Sim, 2002] Bertsimas, D. and Sim, M. (2002). Robust discrete optimization. Working paper, Operations Research Center, MIT.

[Birge and Louveaux, 1997] Birge, J. and Louveaux, F. (1997). Introduction to Stochastic Programming. Springer Series in Operations Research. Springer Verlag, New York.

[Haralick and Elliot, 1980] Haralick, R. and Elliot, G. (1980). Increasing tree search efficiency for constraint satisfaction problems. Artificial Intelligence, 14:263-313.

[Kouvelis and Yu, 1997] Kouvelis, P. and Yu, G. (1997). Robust Discrete Optimization and Its Applications. Kluwer Academic Publishers.

[Kozina and Perepelista, 1994] Kozina, G. and Perepelista, V. (1994). Interval spanning tree problem: Solvability and computational complexity. Interval Computing, 1:42-50.

[Rauch et al., 1997] Rauch Henzinger, M. and King, V. (1997). Maintaining minimum spanning trees in dynamic graphs. In Proceedings of the 24th International Colloquium on Automata, Languages and Programming (ICALP 1997), pages 594-605.

[Walsh, 2001] Walsh, T. (2001). Stochastic constraint programming. In AAAI Fall Symposium on Using Uncertainty within Computation.

[Yaman et al., 2001] Yaman, H., Karasan, O. E., and Pinar, M. C. (2001). The robust spanning tree problem with interval data. Operations Research Letters, 29:31-40.