ABSTRACT. The association rule discovery problem consists in identifying frequent itemsets in a database and, then, forming conditional implication rules among them. The algorithmically most difficult part of this task is finding all frequent sets. There exists a wealth of algorithms, both for the problem as such and for variations, particular cases, and generalizations. Except for some recent, fully different approaches, most algorithms can be seen either as a breadth-first search or a depth-first search of the lattice of itemsets. In this paper, we propose a way of developing best-first search strategies.

KEYWORDS: data mining, frequent itemsets, association rules, breadth-first search, depth-first search.

1. Introduction

The research field of Knowledge Discovery in Databases (KDD) aims at developing methods, algorithms, and criteria for finding useful information in large masses of data. One of the most relevant tasks in KDD is the discovery of association rules, first formulated in [R.A93]. The "association rule" discovery task identifies the groups of items that most often appear along with other groups of items. The standard input to a frequent sets algorithm is a collection of itemsets, or transactions, where each transaction is a subset of a given, fixed set of items. A k-itemset is an itemset consisting of k items. The support of an itemset is the fraction of transactions in the input database that contain the itemset. An itemset is called frequent if its support exceeds a given threshold. The frequent sets problem is: given the database and the threshold, find all frequent sets of items.
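To make these definitions concrete, here is a minimal sketch (ours, not the paper's; the four-transaction database and the threshold are invented for illustration) that computes the support of an itemset and tests whether it is frequent:

```cpp
// Support of an itemset: the fraction of transactions that contain it.
#include <algorithm>
#include <iostream>
#include <set>
#include <vector>

using Itemset = std::set<int>;

double support(const Itemset& s, const std::vector<Itemset>& db) {
    int count = 0;
    for (const auto& t : db)  // std::set iterates in sorted order, so
        if (std::includes(t.begin(), t.end(), s.begin(), s.end()))
            ++count;          // std::includes tests the subset relation
    return static_cast<double>(count) / db.size();
}

int main() {
    std::vector<Itemset> db = {{1, 2, 3}, {1, 2}, {2, 3}, {1, 3}};
    Itemset s = {1, 2};  // a 2-itemset
    double threshold = 0.4;
    std::cout << "support  = " << support(s, db) << '\n';               // 0.5
    std::cout << "frequent = " << (support(s, db) > threshold) << '\n'; // 1
}
```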
For this frequent itemsets problem, there is a rich variety of algorithmic proposals. Most of them, with the only exception of some recent alternative approaches [R.A00], [J.H00a], [J.H00b] (which are based on computing projections of the input database), are based on traversing the lattice of itemsets in search of either all frequent sets or at least all maximal frequent sets. As analyzed in [J.H00c], the principal lattice-traversal algorithms can be roughly classified into two high-level frameworks: either they focus on counting the support of each itemset, usually through a breadth-first search, like Apriori [R.A96] and its variants, or on finding frequent itemsets by intersecting lists of transactions containing each itemset, usually in a depth-first-like search, like Eclat [M.J97] and its variants.

However, there are other alternatives, and we propose here to study one of them: a best-first strategy in a well-defined sense. In our best-first strategy, the exploration deepens early on those itemsets that are easy to certify as candidates: a sort of adaptive search. Currently our strategy does not seem to be truly competitive with fully different approaches like FPGrowth [J.H00b] or the TreeProjection algorithm of [R.A00]; however, we continue to study these algorithms from our perspective.

1.1. Breadth-First Algorithms

Many frequent sets algorithms traverse the lattice of all itemsets, ordered by inclusion, by seeing it as a graph. A basic, well-known breadth-first algorithm for finding frequent itemsets is the Apriori algorithm [R.A96]. It devotes one pass through the database to each itemset size. Along each pass, it determines the support of all candidate itemsets of that size, and at the end of each pass a candidate generation phase is performed; a sketch of this level-wise scheme appears below.

A variation of Apriori is the Dynamic Itemset Counting (DIC) algorithm [S.B97], which partially retains the breadth-first strategy but tries to jump one level up every M transactions, for a user-specified parameter M.
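The following condensed sketch is our own illustration of the level-wise scheme, not the authors' code: it takes an absolute support count min_sup, and it omits the antimonotonicity prune during candidate generation, which costs extra counting but not correctness.

```cpp
// Apriori, level-wise: one full database pass per itemset size, followed
// by a candidate generation phase (the structure DIC also builds on).
#include <algorithm>
#include <map>
#include <set>
#include <vector>

using Itemset = std::set<int>;
using Database = std::vector<Itemset>;

static bool contains(const Itemset& t, const Itemset& s) {
    return std::includes(t.begin(), t.end(), s.begin(), s.end());
}

std::vector<Itemset> apriori(const Database& db, std::size_t min_sup) {
    std::vector<Itemset> frequent, level;
    std::map<int, std::size_t> cnt1;          // level 1: single items
    for (const auto& t : db)
        for (int i : t) ++cnt1[i];
    for (const auto& [i, c] : cnt1)
        if (c >= min_sup) level.push_back({i});
    while (!level.empty()) {
        frequent.insert(frequent.end(), level.begin(), level.end());
        // Candidate generation: unions of two frequent k-sets of size k+1.
        std::set<Itemset> cands;
        for (std::size_t a = 0; a < level.size(); ++a)
            for (std::size_t b = a + 1; b < level.size(); ++b) {
                Itemset u = level[a];
                u.insert(level[b].begin(), level[b].end());
                if (u.size() == level[a].size() + 1) cands.insert(u);
            }
        // One counting pass over the whole database per level.
        std::vector<Itemset> next;
        for (const auto& c : cands) {
            std::size_t sup = 0;
            for (const auto& t : db)
                if (contains(t, c)) ++sup;
            if (sup >= min_sup) next.push_back(c);
        }
        level = std::move(next);
    }
    return frequent;
}
```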
1.2. Depth-First Algorithms

The itemset lattice can also be explored in a form close to depth-first search. Some algorithms of this sort were proposed in [M.J97], and clear examples appear in [J.H00c]. It has been argued that counting supports is inefficient in depth-first search, and indeed most depth-first proposals construct instead, for each frequent itemset, the whole list of transactions in which it appears, walking up through the lattice by intersecting these lists, and complementing the procedure with various heuristics like fast intersections [M.J97]. See also [J.H00c] for precise descriptions and analyses.
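For concreteness, a minimal sketch (ours; the identifiers are invented) of the tid-list intersection at the heart of this family: the support of an extended itemset is simply the length of the intersection of the sorted transaction-id lists of its parents, so no counting pass over the database is needed.

```cpp
// Each itemset carries the sorted list of ids of transactions containing it;
// extending an itemset depth-first just intersects two such lists.
#include <algorithm>
#include <iterator>
#include <vector>

using TidList = std::vector<int>;  // sorted transaction identifiers

TidList intersect(const TidList& a, const TidList& b) {
    TidList out;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::back_inserter(out));
    return out;  // out.size() is the support count of the combined itemset
}
```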
2. A Best-First Strategy

Our proposal is based on the idea of advancing as much as possible the start point for counting the support of a candidate itemset; thus, frequent itemsets detected early will lead sooner to the exploration of larger itemsets. To count the support of a candidate, one needs to fully scan the database, but the order is irrelevant; hence, if we see the input database as "cyclic", we can start counting the support of an itemset at any arbitrary transaction, wrap around to the beginning upon reaching the end, and stop one transaction earlier than the start. Thus, our concept of "best" is precisely defined by how early the algorithm can realize that an itemset is, in fact, potentially frequent.

Besides, we fully avoid the time-consuming process of generating a new round of candidates from the supports found up to that point; instead, we keep track of how many immediate subitemsets of a given itemset have already been proved frequent. At the moment of receiving the last such confirmation, the itemset is declared potentially frequent, and its support starts being counted right there, without delay. The earlier we know that all subitemsets of an itemset are frequent, the earlier we start counting its support.

2.1. Best-First Exploration Based on Apriori

Let us describe more precisely this specific case of best-first: the algorithm, which we call Ready and Go (R&G), obtained by applying our strategy to the Apriori approach. Along the exploration, our algorithm maintains two collections of itemsets:

– The candidate itemsets, whose support is being counted in the database.
– The borderline itemsets, immediate supersets of some candidate itemset that we know is frequent, and which might become frequent themselves.

An itemset enters the collection of borderline itemsets as soon as one of its immediate subsets has been found frequent. Each borderline itemset evolves in one of two forms: either the algorithm finishes counting an infrequent subitemset, and then it gets pruned as with the Apriori heuristic; or the algorithm eventually finds that all its subitemsets are frequent (possibly before finishing counting them!) and then the borderline itemset becomes a candidate and its support starts being counted.

The algorithm cycles through the database, reading a transaction at a time and processing it, and wrapping around to the beginning upon reaching the last transaction; for each transaction read, the data structure that maintains candidate itemsets is carefully updated.

A borderline itemset becomes an actual candidate whenever the algorithm is sure that it will not be pruned by the Apriori antimonotonicity property. The algorithm maintains a counter for each borderline itemset. Each frequent immediate subitemset will increment it, so that we always know how many of its immediate subitemsets have already been declared frequent. Hence, as soon as a candidate itemset of size k becomes frequent, it increments the counter of its supersets of size k+1, and all such counters reaching the value k+1 at that point indicate borderline itemsets that must immediately join the set of candidates. Support counting for them starts at that very same transaction: in this way, each individual candidate is generated as soon as possible, instead of waiting for a later candidate generation batch.

Of course, the algorithm must keep track of the transaction where counting started for each candidate, in order to know when the corresponding database pass is complete. A more precise formal description of the algorithm is as follows; a compressed implementation sketch follows the steps.

1/ Inputs: database, support threshold.
2/ Initialize the candidate itemset collection with all 1-itemsets, their supports to zero, and the borderline itemset collection to the empty set.
3/ Loop on the following:
4/ Read a new transaction from the database and increment the support of all candidates that appear in the transaction.
5/ For each candidate becoming frequent (its support reaches the threshold) do 6 through 8:
6/ Increment the subitemset counter of those borderline itemsets that are its immediate supersets.
7/ Add, as borderline itemsets, with subitemset counter initialized to 1, all immediate supersets that are not found among the current borderline itemsets.
8/ If the counter of a borderline itemset of size k reaches the value k, then remove it from the borderline itemset collection and add it to the candidate itemset collection, initializing its support to 0 or 1 according to whether it is present in the current transaction, and storing the transaction identifier with it.
9/ If a candidate has been counted through all the transactions, then:
10/ remove it from the candidate itemset collection, and
11/ if its support is at least the threshold, output it together with its support; else remove all its supersets from the set of borderline itemsets.
12/ If we are at the end of the transaction file, rewind to the beginning.
13/ Exit the loop when there are no candidates left.
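The steps above compress into the following sketch. It is our reading of the pseudocode, not the authors' implementation: the threshold min_sup is an absolute count, the universe of items is passed explicitly, and a plain std::map stands in for the prefix tree used in the experiments of Section 2.2.

```cpp
// Ready and Go (R&G): best-first candidate promotion over a cyclic database.
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <map>
#include <set>
#include <vector>

using Itemset = std::set<int>;

struct CandState {
    std::size_t sup = 0;    // support counted so far
    std::size_t start = 0;  // transaction index where counting began (step 8)
    bool fired = false;     // already triggered steps 6-8 once
};

void ready_and_go(const std::vector<Itemset>& db, std::size_t min_sup,
                  const std::set<int>& items) {
    std::map<Itemset, CandState> cand;      // candidates being counted
    std::map<Itemset, std::size_t> border;  // borderline subset counters
    for (int i : items) cand[{i}] = CandState{};  // step 2: all 1-itemsets
    const std::size_t n = db.size();
    std::size_t pos = 0;
    while (!cand.empty()) {                       // step 13
        const Itemset& t = db[pos];
        std::vector<Itemset> fresh, done;
        for (auto& [c, st] : cand) {              // step 4
            if (std::includes(t.begin(), t.end(), c.begin(), c.end()))
                ++st.sup;
            if (!st.fired && st.sup >= min_sup) { // step 5: newly frequent
                st.fired = true;
                fresh.push_back(c);
            }
            if ((pos + 1) % n == st.start)        // step 9: full cycle done
                done.push_back(c);
        }
        for (const auto& c : fresh)               // steps 6-8
            for (int i : items) {
                if (c.count(i)) continue;
                Itemset s = c;
                s.insert(i);
                if (++border[s] == s.size()) {    // all |s| subsets frequent
                    border.erase(s);
                    bool in_t = std::includes(t.begin(), t.end(),
                                              s.begin(), s.end());
                    cand[s] = CandState{in_t ? 1u : 0u, pos, false};
                }
            }
        for (const auto& c : done) {              // steps 10-11
            if (cand[c].sup >= min_sup)
                std::cout << "frequent (" << cand[c].sup << " of " << n
                          << "): size " << c.size() << '\n';
            else                                  // Apriori prune (step 11)
                for (auto it = border.begin(); it != border.end();)
                    it = std::includes(it->first.begin(), it->first.end(),
                                       c.begin(), c.end())
                             ? border.erase(it)
                             : std::next(it);
            cand.erase(c);
        }
        pos = (pos + 1) % n;                      // step 12: cyclic scan
    }
}
```

Note how a promoted borderline set starts being counted at the very transaction where its last immediate subset was confirmed frequent, wrapping cyclically until it returns to that start point; this is precisely the sense in which candidates are generated "as soon as possible" rather than in per-level batches.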
2.2. Experiments and Comparisons

It is interesting to compare our algorithm with DIC, specifically with respect to the parameter M that controls the generation of candidates. Setting it to the length of the database yields Apriori again. Setting it very small, the candidate generation function gets called too often, and this creates an overhead that makes DIC less efficient than standard Apriori. In this context, Ready and Go can be seen as an attempt at running DIC in the extreme case, that is, when the size of the interval is 1, but, of course, without incurring the overhead.

Testing was done on a Pentium II PC at 233 MHz with 256 MB of RAM, running Linux; the implementation was written in C++. For comparing with DIC, both DIC and R&G maintained candidates in a prefix tree [S.B97]; we avoided the design of sophisticated data structures for the borderline sets, to be sure that any improvement was due only to the algorithm, and simply stored them in a hash table together with the relevant counters and start transactions. We used transactions with an average size of 10 items, chosen from 100 items, produced by a synthetic database generator, and compared the differences in running time and transactions visited as a function of the variations of minimum support and length of the database.

For the first case, we used a 100000-transaction synthetic database. We tried R&G and DIC for supports varying from 0.3% to 2%. DIC was run with a value of M equal to 10% of the database size. In absolutely all cases Ready and Go was faster, by a modest but very stable factor close to 20 percent. As a function of the length of the database, we ran DIC and R&G for fixed support thresholds on databases from 250000 to 1M transactions, at the same value of M. We give here the results for a threshold of 0.6%, but the results obtained for other values were similar. Again, as in the previous cases, the savings were of about 5% to 10% in the number of transactions and 20% in running times.

We also tuned DIC experimentally to what seemed its best value of M for each of the experiments. Even in this case, R&G was better, although marginally; its advantage here being that there is no need to tune any parameter. Again the experiments correspond to decreasing support and increasing size, as before. Table 1 shows the results in number of transactions.

Table 1. Processed transactions as support decreases and as database size increases (optimal M); columns: DIC, Apriori, R&G.

3. References

[J.H00a] J. Han, J. Pei, "Mining Frequent Patterns by Pattern-Growth: Methodology and Implications", ACM SIGKDD Explorations, 2000, p. 31-36.
[J.H00b] J. Han, J. Pei, Y. Yin, "Mining Frequent Patterns Without Candidate Generation", Proc. of the ACM SIGMOD Int'l Conf. on Management of Data, 2000.
[J.H00c] J. Hipp, U. Güntzer, G. Nakhaeizadeh, "Algorithms for Association Rule Mining – A General Survey and Comparison", ACM SIGKDD Explorations, 2000, p. 58-64.
[M.J97] M.J. Zaki, S. Parthasarathy, M. Ogihara, W. Li, "New Algorithms for Fast Discovery of Association Rules", Proc. of the 3rd Int'l Conf. on KDD and Data Mining (KDD'97), 1997.
[R.A93] R. Agrawal, T. Imielinski, A. Swami, "Mining Association Rules Between Sets of Items in Large Databases", Proc. of the ACM SIGMOD Int'l Conf. on Management of Data, 1993, p. 207-216.
[R.A96] R. Agrawal, H. Mannila, R. Srikant, H. Toivonen, I. Verkamo, "Fast Discovery of Association Rules", Advances in Knowledge Discovery and Data Mining, 1996, p. 307-328.
[R.A00] R. Agrawal, C. Aggarwal, V. Prasad, "A Tree Projection Algorithm for Generation of Frequent Item Sets", Journal of Parallel and Distributed Computing, 2000, p. 350-371.
[S.B97] S. Brin, R. Motwani, J. Ullman, S. Tsur, "Dynamic Itemset Counting and Implication Rules for Market Basket Data", Proc. of the ACM SIGMOD Int'l Conf. on Management of Data, 1997, p. 255-264.
New thinking on the financial crisis

Roy E. Allen and Donald Snyder
Saint Mary's College of California, Moraga, California, USA

critical perspectives on international business, Vol. 5 No. 1/2, 2009, pp. 36-55. © Emerald Group Publishing Limited, 1742-2043. DOI 10.1108/17422040910938677

Abstract

Purpose – The purpose of this paper is to expand understanding of the current global financial crisis in light of other large-scale financial crises.

Design/methodology/approach – The phenomenon of large-scale financial crisis has not been modeled well by neo-classical general equilibrium approaches; the paper explores whether evolutionary and complex systems approaches might be more useful. Previous empirical work and current data are coalesced to identify fundamental drivers of the boom and bust phases of the current crisis.

Findings – Many features of financial crisis occur naturally in evolutionary and complex systems. The boom phase leading to this current crisis (early 1980s through 2006) and the bust phase (2007-) are associated with structural changes in institutions, technologies, and monetary processes, i.e. changing "meso structures". Increasingly, purely financial constructs and processes are dominant infrastructures within the global economy.

Research limitations/implications – Rigorous analytical predictions of financial crisis variables are at present not possible using evolutionary and complex systems approaches; however, such systems can be fruitfully studied through simulation methods and certain types of econometric modeling.

Practical implications – Common patterns in large-scale financial crises might be better anticipated and guarded against. Better money-liquidity supply decisions on the part of official institutions might help prevent economy-wide money-liquidity crises from turning into systemic solvency crises.

Originality/value – Scholars, policymakers, and practitioners might appreciate the more comprehensive evolutionary and complex systems framework and see that it suggests a new political economy of financial crisis. Despite a huge scholarly literature (organized recently as first-, second-, and third-generation models of financial crises) and a flurry of topical essays in recent months, systemic understanding has been lacking.

Keywords: Recession, Financial markets, Money, Credit
Paper type: Viewpoint

1. Introduction

The current global financial crisis has many patterns in common with other recent large-scale financial crises, including in developed countries (e.g. Finland and Sweden in 1991, Japan after 1989) and less developed countries (Latin America in 1982, Asia in 1997, Russia and Brazil in 1998, Argentina in 2002). Despite a huge historical literature on boom-bust processes, and despite a flurry of topical essays in recent months, some of these more important patterns are only now being identified. Section 2 summarizes recent taxonomies with particular attention to the current situation.

In order to describe the current crisis, first-, second-, and third-generation financial crisis models, which are found in the economics literature, are all helpful, but remain incomplete, especially in explaining "non-equilibrium" movements in exchange rates, interest rates, international investment flows, stock market and real estate values, and other key financial variables.
Thus, as elaborated in Section 3, there is a revival of interest in psychological and social constructs, including what Keynes, in The General Theory, called "animal spirits" and other behaviors that may occur when people have a limited cognitive and informational basis for fully rational decision-making and, therefore, rely on less rational social conventions, vague beliefs, and other psychological factors.

1.1 What does this paper propose to contribute to this literature?

First, in Section 4, key financial variables are, indeed, shown to be driven somewhat more by subjective, even transcendental (of observable gross domestic product – GDP – or "real" processes) psychological and social constructs than is commonly understood. Since the 1980s, structural changes in evolving financial markets, especially advances in information-processing technology and government deregulation, have allowed a greater separation of financial market processes from GDP processes. Specifically, as per the econometric research of one of the authors, the demand for money-liquidity for financial market participation has become – especially during episodes of chaotic structural change – an important source of money demand which absorbs money-liquidity away from observable GDP uses. As a related process, monetary wealth can thus be created, transferred, and destroyed across time and space more powerfully and independently of observed GDP processes than is commonly understood.

Second, as elaborated in Section 5, expanding our understanding of the current crisis can be assisted by evolutionary and complex systems approaches, especially ones that privilege the role of interactive knowledge and belief systems. The relative rise and fall of interactive institutional and technological systems, or "meso" structures in the language of evolutionary economics, are seen to play a key role in driving the current boom-bust pattern. Transitions and imbalance between meso structures can account for long-run discontinuities or instabilities beyond what can be explained by normal business cycle theory.

These conclusions are then summarized in Section 6 as they might direct us toward a new political economy of financial crisis.

2. Definitions and common patterns

A "financial crisis" is generally defined to be "a wider range of disturbances, such as sharp declines in asset prices, failures of large financial intermediaries, or disruption in foreign exchange markets" (De Bonis et al., 1999). There is a "crisis", generally speaking, because the real economy is seriously and adversely affected, including negative impacts on employment, production, and purchasing power, as well as the possibility that large numbers of households, firms, or governments are fundamentally unable to meet their obligations, i.e. "insolvency". When an organization is fundamentally solvent but temporarily unable to meet its financial obligations, the notion of "illiquidity" is often used, but in practice insolvency and illiquidity are difficult to distinguish. For example, a common pattern is that "vicious circles" start from a money-liquidity crisis at a few banks, which then extends to an international crisis of investor confidence in the financial sector, which extends to a balance of payments problem for the country and currency devaluation, which extends, therefore, to even further liquidity and, at some point, solvency crises at the banks.

Recent models of large-scale financial crises are often characterized as "first-generation", "second-generation" or, in the last few years, "third-generation" models. First-generation models, as pioneered by Krugman (1979) and others, emphasize the importance of a country's foreign exchange reserves: if government budget deficits are excessive, then ultimately a government loses the ability to maintain these reserves, and a speculative attack on its currency exchange rate is inevitable.
Second-generation models, as summarized by Rangvid (2001), arose during the 1990s when this cause-effect linkage no longer explained various currency crises. In particular, there is now a weaker relationship between economic fundamentals such as public sector deficits and the timing and severity of speculative currency attacks and related instabilities. The timing of government decisions to abandon a currency regime in favor of other political-economic goals has also proved difficult to predict. Second-generation models thus tell stories of "multiple equilibrium" values that key variables might assume and of unpredictable or irrational behavior by private investors and governments, and there has been an effort to discover new "sunspot variables" that will better explain sudden changes in markets.

Third-generation models, as per Krugman (1999) and Allen et al. (2002), introduce additional variables and feedback processes, especially the role of companies', entrepreneurs', and governments' balance sheets, and the impact of international financial flows and exchange rates on those balance sheets. During and after the 1997 Asian financial crises, the financial condition of firms weakened more than was anticipated by second-generation models, which drew attention to these processes. Furthermore, until new entrepreneurs come forward, or until balance sheets return to normal, it has been difficult for economies to return to normal growth and stability.

In most recent cases of large-scale financial crisis, a country or region initially benefits from expanded supplies of base money, new "quasi-moneys" which are created from base moneys, and credit supplies – a financial liberalization and deregulation phase. Typically the financial sector expands as it captures profit from new efficiencies and opportunities allowed by globalization. The country or region, for a time, may be favored by international investors; thus the banking system, including government, is well-capitalized and able to expand money-liquidity. Assets increase in monetary value and interest rates are low, and this wealth effect encourages consumption, borrowing, business investment, and government spending. Productive resources are more fully utilized and economic growth is well supported. There is a "boom", as measured by increased monetary wealth held by private and public sectors of an economy, such as the value of stocks, real estate, currency reserves, etc., and/or the current production of merchandise and services (GDP).

Then, typically, the supply of base money (m) times its rate of circulation or velocity for GDP purposes (v) contracts, and therefore so does the equivalent nominal GDP. The decline in nominal GDP is usually split between its two components:

(1) real GDP, which is the volume of current production measured in constant prices (q); and
(2) the GDP price level (p).

By definition, these variables are linked by the equation of exchange: m × v = p × q. When q declines for a sustained period (typically at least six months) we call it a recession, and when p declines we call it deflation.
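Restating the identity and the two bust-phase channels in display form (the notation is the paper's own):

```latex
% requires amsmath for \underbrace
\[
  \underbrace{m \times v}_{\text{effective money}}
  \;=\;
  \underbrace{p \times q}_{\text{nominal GDP}}
\]
% A sustained fall in q (roughly six months or more) is a recession;
% a fall in p is deflation. By the identity, any contraction of
% effective money m x v must show up in p, in q, or in both.
```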
After this process starts, monetary policymakers may react by rapidly expanding m, but this action may be too little, too late – individuals and institutions may have non-payable debts, banks may be failing, and international confidence in the country or region may already be damaged. In this pessimistic case, which is typical in less-developed countries with weak financial systems, the desperate increase in m may reverse the slide in p and even lead to hyper-inflation (destabilizing, rapid increases in p), but q would continue to fall. Also, a weak financial system may be unable to maintain the circulation rate of secure currencies for productive activities, especially if people are hoarding money, and thus v would decline.

The initial contraction in "effective money" (m × v) in a crisis may be caused by monetary authorities or national and international investors draining money (m) from the country or region, or there may be a decline in v for reasons having to do with the inability of the financial system to direct money toward productive activities. A contraction in effective money or a withdrawal of international investment may undermine equity markets, debt markets, bank capital, or government reserves, and monetary wealth is then revalued downwards. General economic or political uncertainty worsens the situation – the resulting austerity-mentality causes a contraction of spending and credit, and an increased "risk premium" attached to business activity scares away investment and bank lending. Interest rates rise, the demand for quasi-money and credit – i.e. the desire to hold and use the insecure "monetary float" – declines, and people try to convert the monetary float into more secure base money, Treasury securities, and other more secure assets. No reserve-currency banking system is able to cover all of its monetary float with secure bank reserves if customers try to redeem too much of the float at once, and thus "runs" on banks can destroy the banks themselves. A deteriorating banking sector may be unable to honor its deposits, bad loan problems surface, and a "lender of last resort" such as the central bank, taxpayers, or the International Monetary Fund (IMF) may need to be found.

How far do these common patterns go toward explaining the current global financial crisis? The authors agree with many other commentators, such as Reinhart and Rogoff (2007), that the current crisis fits many historical patterns, as explained next with emphasis on the US situation. In particular, Reinhart and Rogoff (2007) find that:

...the run-up in US housing and equity prices that Kaminsky and Reinhart (1999) find to be the best leading indicators of crisis in countries experiencing large capital inflows closely tracks the average of the previous 18 post-World War II banking crises in industrial countries (p. 339).

The initial financial liberalization and globalization boom phase of the current crisis was driven in the early 1980s by widespread deregulation of financial markets in the developed world, especially the Reagan administration reforms in the USA and the Thatcher administration reforms in the UK. These policies were guided by ideologies and belief systems aimed at restoring a more capitalist tradition, and they eventually prevailed across the global system. French President Mitterrand's attempt to advance a more socialist set of values and policies floundered by late 1982, and financial market deregulation and international integration spread not only in Europe and North America, but also in Asia, Latin America, Africa, and Eastern Europe. Newly unregulated financial products, entities, and markets came to play a larger role. Also, dramatic advances in information-processing technology (electronic banking systems, communication satellites, the computer revolution, etc.) facilitated international arbitrage.
The commoditization and securitization of financial products by the private sector, including through virtually unregulated no-reserve-requirement "offshore banking facilities", led to dramatic increases in international money liquidity and credit, and by the end of the 1980s global financial markets were generating a net international flow of funds of more than $3 trillion each month, i.e. the flow of funds between countries that reconciles end-of-the-month balance of payments accounts. Of that $3 trillion, $2 trillion was so-called stateless money, which was virtually beyond the control of any government or official institution, but available for use by all countries (Barron's Magazine, 1987, p. 45).

This 1980-2006 boom phase of the current crisis was driven by fundamental changes in the basic social and technical rules of the game, or "meso" structure, of the global economy (as per the evolutionary economics language of Section 5), which, among other characteristics, replaced more hierarchically organized, communitarian-disciplined, national-government-controlled rules with more free-market, technologically innovative, decentralized, and chaotically-individualistic meso structures as financial globalism prevailed.

As one of the authors has elaborated elsewhere (Allen, 1999), these structural changes associated with financial globalization supported an increasing net financial inflow from the rest of the world into the USA, from near-balance in 1980 to approximately $800 billion per year, or 6 percent of GDP, in 2006 – the peak of the long boom – which supported classic-pattern excesses in low-interest-rate debt financing and spending, monetary wealth creation processes, consumerism and financial asset inflation, and now-famous lax standards in mortgage financing and securitization vehicles such as "collateralized debt obligations" and "structured investment vehicles" that passed the rights to the mortgage payments and related credit/default risk to third-party investors. The US household personal savings rate dropped from 8 percent of disposable income to 0 percent over this 1980-2006 period, while the "net wealth of the US" (net value of business assets, real estate, consumer durables, and US government property) increased from approximately $7 trillion to $50 trillion in nominal terms.

The end of this long boom from the early 1980s to 2006, and the beginning of the crisis or bust phase, began with the bursting of the US housing bubble and a sharp rise in home foreclosures in the USA in late 2006, which spread to become a more broad-based global financial crisis within a year. The mortgage lenders that retained the risk of payment default, such as Countrywide Financial, were the first financial institutions to be affected as borrowers defaulted. Major banks and other financial institutions reported losses of approximately $100 billion by the end of 2007. By October 2007, 16 percent of subprime loans in the USA with variable interest rate features were 90 days delinquent or in foreclosure proceedings, roughly triple the rate of 2005, and by January of 2008 this number had increased to 21 percent. Losses in the money-credit pyramid then began to spread across the system, including through the collapse in June 2007 of two hedge funds owned by Bear Stearns that were invested heavily in subprime mortgages.

The lender-of-last-resort (LOLR) phase of the crisis began as the Federal Reserve took unprecedented steps to avoid a Bear Stearns bankruptcy by assuming $30 billion in its liabilities and engineering the sale of Bear Stearns to JPMorgan Chase. In August 2008 the US Treasury (and therefore the US taxpayer) joined the LOLR phase by taking over and guaranteeing the funding of Fannie Mae and Freddie Mac, the quasi-government housing market entities.
In September 2008 American International Group, because of its exposure to credit default swaps, was bailed out by the Federal Reserve in an $85 billion deal, and later that month the US taxpayer-sponsored $700 billion bailout bill was passed after Congress amended the plan to add more oversight, limits on executive pay, and the option for the government to gain equity in the companies that it bails out. As in Finland and Sweden's crises in 1991, Europe and the USA quickly moved to acquire equity stakes in, and partly nationalize, the banks.

As this essay is written in late 2008, the world thus moves quickly to regain coordinated state "social market capitalism" control of the financial system, i.e. back toward the pre-1980 "meso" rule structures that were more hierarchically organized and communitarian-disciplined (by treasuries and central banks and other official institutions), in place of the more free-market, innovative, decentralized, and individualistic rule structures that dominated between 1980 and 2008. In testimony before the US Congress on 23 October 2008, former US Federal Reserve Chairman Alan Greenspan famously said that "I made a mistake in presuming that the self-interest of organizations, specifically banks and others, was such that they were best capable of protecting their own shareholders".

3. The role of psychological and social factors

Second- and now third-generation models of financial crisis, while simulating many common patterns, do not yet explain sudden movements in exchange rates, interest rates, international investment flows, stock market and real estate values, and other key variables to levels beyond normal fluctuations. Thus, there is a revival of interest in what Keynes, in The General Theory, called "animal spirits", such as "spontaneous optimism" among entrepreneurs and others (Marchionatti, 1999). Essentially, Keynes argued that people may have a limited cognitive and informational basis for fully rational decision-making, and therefore they may rely on less rational social conventions, vague beliefs, and other psychological factors. One implication is that:

...the market will be subject to waves of optimistic and pessimistic sentiment, which are unreasoning and yet in a sense legitimate where no solid basis exists for a reasonable calculation (Keynes, 1936, p. 154).

As authoritatively summarized by Kindleberger (1989) in Manias, Panics, and Crashes: A History of Financial Crises, over the long history of market capitalism the start of an unsustainable financial boom or "mania" is always linked to a sudden increase in money liquidity and lending. Unstable and exaggerated expectations, which are quite subjective, play a role:

The heart of this book is that the Keynesian theory is incomplete [in explaining economic instabilities and crises], and not merely because it ignores the money supply. Monetarism is incomplete, too. A synthesis of Keynesianism and monetarism, such as the Hansen-Hicks IS-LM curves that bring together the investment-saving (IS) and liquidity-money (LM) relationships, remains incomplete, even when it brings in production and prices (as does the most up-to-date macroeconomic analysis), if it leaves out the instability of expectations, speculation, and credit and the role of leveraged speculation in various assets (Kindleberger, 1989, p. 25).
Given the increased importance, across a larger and more dynamic global political economy, of subjective expectations, leveraged investment as supported by new electronic and derivative money and credit forms, unregulated no-reserve-requirement offshore financial markets, etc., some current research is consistent with the innovative notion that monetary wealth, or what Marx called "unproductive finance capital" (as opposed to physical capital or capital goods such as machines and factories), may be a "driver" of economic instabilities. As elaborated by Philip Cerny among others, the new global financial markets may even be an "infrastructure of the infrastructure". Cerny's initial position, elaborated in debates that began in the early 1990s, was that "a country without efficient and profitable financial markets and institutions will suffer multiple disadvantages in a more open world [...] [and will] attempt to free-ride on financial globalization through increasing market liberalization" (Cerny, 1993, p. 338). Helleiner (1995) restrained this position – of a determinist, autonomous, technology-driven financial globalization – by demonstrating that states, especially the USA and the UK, have fostered and guided the entire process.

Scholarly journals have been launched in recent decades to respond to these issues. For example, in the first edition of Review of International Political Economy, the editors state:

The creation of a global economic order has come to represent the defining feature of our age, as a major force shaping economies and livelihoods in all areas of the world. Globalization, of course, has many aspects [...] The first of these is the emergence of a truly global financial market [...] and the resulting increase in the power of finance over production (Review of International Political Economy, 1994, p. 3).

Reversing the causality of Karl Marx's (and many others') philosophical materialism, it may increasingly be true that autonomous, invisible financial processes can drive changes in the physical relations of production, as well as vice versa. As part of this process, central banks and other financial market participants can (usually haphazardly) increase or reduce wealth independently of any initial changes in the production of GDP or other "real" economic prospects. The Chairman of the US Federal Reserve, Alan Greenspan, began allowing for this possibility in the late 1990s:

Today's central banks have the capability of creating or destroying unlimited supplies of money and credit [...]. It is probably fair to say that the very efficiency of global financial markets, engendered by the rapid proliferation of financial products, also has the capability of transmitting mistakes at a far faster pace throughout the financial system in ways that were unknown a generation ago, and not even remotely imagined in the 19th century [...]. Clearly, not only has the productivity of global finance increased markedly, but so, obviously, has the ability to generate losses at a previously inconceivable rate (Greenspan, 1998).

The authors would emphasize from Greenspan's quote that "the capability of creating or destroying unlimited supplies of money and credit" is equivalent to "the capability of creating or destroying monetary wealth". Money and credit are "stores of value", as determined by social consensus within nations and between nations. Transcendental notions of value applied to monetary assets need not reflect, or even be compatible with, the observed empirical world. For example, the purely transcendental "law of compound interest" is a social agreement, which may not correlate with the way that the physical economy grows.
Growth in the physical economy is subject to thermodynamics, biological growth processes and carrying capacity, endowments of resources, sunlight and rain, etc. Perhaps debtors as a group, who are required to pay exponentially increasing interest under this transcendental law, can generate goods and services, and therefore economic revenues, only in arithmetically increasing increments over time. Therefore, perhaps some debtors have to fail, and yield their economic resources to the others, so that the others can meet their obligations.
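A stylized way to state this disjuncture (our illustration, not the authors'): let debt obligations compound at rate r while revenues grow only linearly at rate g,

```latex
\[
  D_t = D_0\,(1+r)^t
  \qquad\text{vs.}\qquad
  Y_t = Y_0 + g\,t .
\]
% For any r > 0, D_t eventually outgrows Y_t no matter how large g is,
% so some debtors must fail and yield resources to the rest -- the
% compound-interest convention outpacing the physical economy's growth.
```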
Invisible belief systems, including those of money-gamblers and optimistic market capitalists, have supported a money economy based on exponential interest payments and an easing of lending restrictions. The acceptance and growth of offshore finance in the 1980s and 1990s, without reserve requirements or other significant regulations, is an example of how belief systems – in this case market capitalist ideology – drive institutional change. Offshore market institutions, such as the Bangkok International Banking Facility, encouraged unsustainable over-lending, excessive unprofitable construction of real estate, etc., and ultimately contributed to the risk of crisis, recession, and misery in the Asian financial crisis of 1997. Belief systems and their supporting institutions can thus drive empirical changes in human populations and their physical environments. Much of the economic wealth that was initially created through deposit-lending, as it exercised itself in production power, consumption power, and power to change institutions, was destroyed in the recent Asian crisis, but it nevertheless did exist as a broad social agreement.

Whether or not financial crises arising from over-indebtedness and deflation – as per Irving Fisher's classic work (Fisher, 1933) – should be systemically expected in modern capitalism has been debated. Mainstream economic thinking has until recently been generally confident that "hard-to-qualify" lending restrictions, wherein all parties are conscious of systemic risk, can avoid over-lending and debt-failure. In contrast, Marxists (see Clarke, 1994) and others across various disciplines, such as Frederick Soddy (1926), are convinced that these crises are endemic to capitalism. Most recently, as summarized by Reinhart and Rogoff (2008), a new data set covering eight centuries and 68 countries shows "a perennial problem of serial default [...] in this respect the 2007-08 US sub-prime financial crisis is hardly exceptional" (p. 2).

Given that massive debt-repudiation crises continue to happen in the world system, we have not yet been able to avoid over-lending and periodic disjuncture-crises between, on the one hand, the belief system which includes mathematical compound interest and, on the other hand, the ability to generate money from tangible "real world" processes. Some borrowers and lenders are well-informed but reckless risk-takers who know that periodic failures are required in "casino capitalism" (John Maynard Keynes's phrase), whereas other borrowers and lenders underestimate systemic risk and allow over-lending based upon a mistaken ideology regarding the stability of the system. Thus, on both accounts, the literature generally concludes that debt crises are likely to remain with us.

4. New thinking

4.1 What do the authors propose to contribute to this literature?

First, in the authors' view, key financial variables can be driven somewhat more by subjective, even transcendental (of observable GDP or "real sector" processes) psychological and social constructs than is commonly understood. Consequently, there has been an even greater separation of financial market processes from GDP processes than is commonly understood. And, monetary wealth has thus been created, transferred, and destroyed more powerfully and independently of observed GDP processes than is commonly understood.
1 亚当·斯密的“有效需求”"Effectual Demand", in Adam Smith2 自回归综合移动平均模型ARIMA Models3 不在地主Absentee4 绝对地租Absolute Rent5 绝对的和可交换的价值Absolute and Exchangeable value6 国际收支的开支吸收分析法Absorption Approach to the Balance of Payments7 吸收能力Absorptive Capacity8 节欲Abstinence9 抽象劳动与具体劳动Abstract and Concrete Labour10 加速原理Acceleration Principle11 会计学与经济学Accounting and Economics12 私人和社会会计Accounting, Private and Social13 资本的积累Accumulation of Capital14 非循环性Acyclicity15 适应性预期Adaptice Expectation16 总额相符问题Adding-up Problem17 调整的成本Adjustment Cost18 调整过程与稳定性Adjustment Processes and Stability19 有管理的价格Administered Prices20 预付Advances21 逆选择Adverse Selection22 广告Advertising23 顾问Advisers24 人口老化Ageing Populations25 代理费Agency Costs26 生产要素Agents of Production27 总需求理论Aggregate Demand Theory28 总需求和总供给分析Aggregate Demand and Supply Analysis29 总供给函数Aggregate Supply Function30 加总问题Aggregation Problem31 经济关系的总和Aggregation of Economic Relations32 农业经济学Agricultural Economics33 农业增长和人口变化Agricultural Growth and Population Change34 农产品供给Agricultural Supply35 农业与经济发展Agriculture and Economic Development36 农业与土地Agriculture and Land37 异化Alienation38 阿莱悖论Allais Paradox39 阿尔蒙滞后Almon Lag40 利他主义Altruism41 美国经济协会American Economic Association42 摊销Amortization43 类比Analogy44 无政府主义Anarchism45 反托拉斯政策Antitrust Policy46 适用技术Appropriate Technology47 套利Arbitrage48 套利定价理论Arbitrage Pricing Theory49 仲裁Arbitration50 军备竞赛Arms Races51 阿罗定理Arrow''s Theorem52 阿罗-德布勒一般均衡模型Arrow-Debren Model of General Equilibrium53 资产定价Asset Pricing54 资产与负债Assets and Liabilities55 指派问题Assignment Problems56 非对称信息Asymmetric Information57 原子状竞争Atomistic Competition58 拍卖者Auctioneer59 拍卖Auctions60 奥地利经济学派Austrian School of Economics61 自给自足Autarky62 自发支出Autonomous Expenditures63 自回归和移动平均时间序列过程Autoregressive and Moving-average Time-series Processes64 平均成本定价Average Cost Pricing65 阿弗奇一约翰逊效应Averch-Johnson effect66 公理化理论Axiomatic Theories67 交割延期费Backwardation68 落后性Backwardness69 贸易差额理论史Balance of Trade, History of The Theory70 平衡预算乘数Balanced Budget Maltiptier71 平衡增长Balanced Growth72 中央银行利率Bank Rate73 银行学派,通货学派,自由银行学派Banking School, Currency School, Free Banking School74 讨价还价(议价) Bargaining75 物物交换Barter76 物物交换和交易Barter and Exchange77 基本品和非基本品Basics and Non-Basics78 基点计价制Basing Point System79 杂牌凯恩斯主义Bastard Keynesianism80 贝叶斯推断Bayesian Inference81 以邻为整Beggar-the-neighbor82 行为经济学Behavioral Economics83 有偏和无偏的技术进步Biased and Unbiased technological Change84 出价Bidding85 双边垄断Bilateral Monopoly86 复本位制Bimetallism87 生物经济学Bioeconomics88 经济学在生物学中的应用Biological Applications of Economics89 伯明翰学派Birmingham School90 生死过程Birth-and-death Processes91 债券Bonds92 有限理性论Bounded Rationality93 资产阶级Bourgeoisie94 贿赂Bribery95 泡沫状态Bubbles96 预算政策Budgetary Policy97 缓冲存货Buffer Stocks98 内在稳定器Built-in Stabilizers99 金银本位主义的争论Bullionist Controversy100 束状图Bunch Maps101 公债负担Burden of The Debt102 官僚制度Bureaucracy103 经济周期Business Cycles104 不变替代弹性生产函数CES Production Function105 变分法Calculus of Variations106 官房经济学派Cameralism107 资本资产定价模型Capital Asset Pricing Model108 资本预算的编制Capital Budgeting109 资本外逃Capital Flight110 资本的收益与损失Capital Gains and Losses111 资本品Capital Goods112 资本的反常现象Capital Perversity113 资本理论Capital Theory114 资本的理论:争论Capital Theory: Debates115 资本理论:悖论Capital Theory: Paradoxes116 固定资本利用程度Capital Utilization117 作为一种生产要素的资本Capital as A Factor of Production118 作为一种社会关系的资本Capital as a Social Relation119 资本、信贷和货币市场Capital, Credit and Money Markets120 资本主义Capitalism121 资本主义的与非资本主义的生产Capitalistic and Acapitalistic Production 122 卡特尔Cartel123 交易学Catallactics124 突变论Catastrophe Theory125 赶超Catching-up126 因果推理Causal Inference127 经济模型中的因果关系Causality 
in Economic Models128 删截数据模型Censored Data Models129 中央银行业务Central Banking130 中心地区理论Central Place Theory131 中央计划Central Planning132 波动重心Centre of Gravitation133 确定性等价Certainty Equivalent134 如果其他条件不变Ceteris Paribus135 偏好的改变Changes in Tastes136 宪章运动:宪章的条款Chantism: the point of the Charter 137 物品特性Characteristics138 宪章运动Chartism139 低息借款Cheap Money140 芝加哥学派Chicago School141 技术选择与利润率Choice of Technique and the Rate of Profit 142 牟利学(理财) Chrematistics143 基督教社会主义Christian Socialism144 循环流动Circular Flow145 流通资本Circulating Capital146 阶级Class147 古典经济学Classical Economics148 古典增长模型Classical Growth Models149 古典货币理论Classical Theory of Money150 历史计量学Cliometrics151 社团Clubs152 合作社Co-operatives153 科斯定理Coase Theorem154 柯布-道格拉斯函数Cobb-Douglas Function155 蛛网定理Cobweb Theorem156 共同决定和利润分享Codetermination and Profit-sharing157 同族学科Cognate Displines158 柯尔培尔主义Colbertism159 集体行动Collective Action160 集体农业Collective Agriculture161 劳资集体谈判Collective bargaining162 合谋Collusion163 殖民主义Colonialism164 殖民地Colonies165 联合Combination166 组合论Combinatorics167 命令经济Command Economy168 商品拜物教Commodity Fetishism169 商品货币Commodity Money170 商品储备货币Commodity Reserve Currency171 公共土地Common Land172 习惯法Common Law173 公共财产权Common Property Rights174 通讯Communications175 共产主义Communism176 社会(公共)无差异曲线Community Indifference Curves177 比较利益Comparative Advantage178 比较静态学Comparative Statics179 补偿需求Compensated Demand180 补偿Compensation181 补偿原理Compensation Principle182 竞争Competition183 竞争政策Competition Policy184 竞争与效率Competition and Efficiency185 竞争与选择Competition and Selection186 国际贸易竞争Competition in International Trade187 奥地利学派的竞争理论Competition: Austrian Conceptions188 古典竞争理论Competition: Classical Conceptions189 马克思学派的竞争理论Competition: Marxian Conceptions190 竞争性市场过程Competitive Market Processes191 一般均衡的计算Computation of General Equlibria192 集中比率Concentration Ratios193 冲突与解决Conflict and Settlement194 冲突与战争Conflict and War195 拥挤Congestion196 综合性大企业Conglomerates197 推测均衡Conjectural Equilibria198 炫耀性消费Conspicuous Consumption199 不变资本和可变资本Constant and Variable Capital200 制度经济学Constitutional Economics201 耐用消费品Consumer Durables202 消费者剩余Consumer Surplus203 消费者支出Consumers, Expenditure204 消费函数Consumption Function205 消费集Consumption Sets206 消费税Consumption Taxation207 消费与生产Consumption and Production208 可竞争市场Contestable Markets209 或有商品Contingent Commodities210 经济历史的连续性Continuity in Economic History211 连续和离散时间模型Continuous and Discrete Time Models212 连续-时间随机模型Continuous-time Stochastic Model213 连续时间随机过程Continuous-time Stochastic Processes214 矛盾Contradiction215 资本主义的矛盾Contradictions of Capitalism216 经济活动的控制与协调Control and Coordination of Economic Activity 217 趋向性假说Convergence Hypothesis218 凸规划Convex Programming219 凸性Convexity220 合作均衡Cooperative Equilibrium221 合作对策Cooperative Games222 核心Cores223 谷物法Corn Laws224 谷物模型Corn Model225 公司经济Corporate Economy226 公司Corporations227 社团主义Corporatism228 对应原理Correspondence Principle229 对应Correspondences230 成本函数Cost Functions231 成本最小化和效用最大化Cost Minimization and Utility Maximization 232 成本和供给曲线Cost and Supply Curves233 生产成本Cost of Production234 成本-效益分析Cost-benefit Analysis235 成本推动型通货膨胀Cost-push Inflation236 反向贸易Counter Trade237 反设事实Counterfactuals238 抗衡力量Countervailing Power239 蠕动钉住汇率Crawling Peg240 创造性破坏Creative Destruction241 信贷Credit242 信贷周期Credit Cycle243 信贷配给Credit Rationing244 犯罪与处罚Crime and Punishment245 危机Crises246 关键路径分析Critical Path Analysis247 挤出效应Crowding Out248 累积的因果关系Cumulative Causation249 累积过程Cumulative Processes250 通货Currencies251 通货委员会Currency Boards252 关税同盟Customs Unions253 周期Cycles254 社会主义经济的周期Cycles in Socialist Economies255 
技能退化De-skilling256 高息借款Dear Money257 销路理论Debouches, Theorie des258 分权Decentralization259 决策理论Decision Theory260 衰落产业Declining Industries261 人口下降Declining Population262 国防经济学Defence Economics263 赤字财政Deficit Financing264 赤字支出Deficit Spending265 垄断程度Degree of Monopoly266 效用程度Degree of utility267 需求管理Demand Management268 需求价格Demand Price269 需求理论Demand Theory270 货币需求:经验研究Demand for Money: Empirical Studies271 货币需求:理论研究Demand for Money: Theoretical Studies272 需求拉动型通货膨胀Demand-pull Inflation273 人口转变Demographic Transition274 人口统计学Demography275 依附Dependency276 折耗Depletion277 折旧Depreciation278 萧条Depressions279 派生需求Derived Demand280 决定论Determinism281 发展Development282 发展经济学Development Economics283 发展计划Development Planning284 辩证唯物主义Dialectical Materialism285 辩证推理Dialectical Reasoning286 微分对策Differential Games287 获得的困难Difficulty of Attainment288 生产的难易程度Difficulty or Facility of Production289 技术扩散Diffusion of Technology290 经济量的维数Dimension of Economic Quantities291 直接税Direct Taxes292 直接非生产性寻利活动Directly Unproductive Profit-seeking (DUP) Activities 293 离散的选择模型Discrete Choice Models294 歧视性垄断Discriminating Monopoly295 歧视Discrimination296 非均衡分析Disequilibrium Analysis297 隐蔽性失业Disguised Unemployment298 反中介行动Disintermediation299 扭曲Distortions300 分配Distribution301 占典分配理论Distribution Theories: Classical302 凯恩斯主义的分配理论Distribution Theories: Keynesian303 马克思主义的分配理论Distribution Theories: Marxian304 新古典分配理论Distribution Theories: Neoclassical305 分配伦理Distribution, Ethics of306 分配规律Distribution, Law of307 分配公平Distributive Justice308 多样化经营Diversification of activities309 分段的总体和随机模型Divided Populations and Stochastic Models310 股息政策Dividend Policy311 迪维西亚指数Divisia Index312 劳动分工Division of Labour313 经济学说Doctrines314 土地调查清册Domesday Book315 家务劳动Domestic Labour316 复式簿记Double-entry Bookkeeping317 二元经济Dual Economies318 二元性Duality319 虚拟变量Dummy Variables320 倾销Dumping321 双头垄断Duopoly322 动态规划和马尔可夫决策过程Dynamic Programming and Markov Decision Process 323 经济增长和发展的动力学Dynamics, Growth and Development324 东西方经济关系East-west Economic Relations325 伊斯特林假说Easterlin Hypothesis326 经济计量学Econometrics327 经济人类学Economic Anthropology328 社会主义经济的经济计算Economic Calculation in Socialist Economies329 经济自由Economic Freedom330 经济增长Economic Growth331 经济和谐Economic Harmony332 经济史Economic History333 经济一体化Economic Integration334 历史的经济学解释Economic Interpretation of History335 经济法则Economic Laws336 经济人Economic Man337 经济组织Economic Organization338 经济组织与交易成本Economic Organization and Transaction Costs339 经济科学与经济学Economic Science and Economics340 经济剩余与等边际原理Economic Surplus and the Equimarginal Principle341 经济理论与理性假说Economic Theory and The Hypothesis of Rationality342 国家的经济理论Economic Theory of the State343 经济战Economic War344 经济和社会人类学Economic and Social Anthropology345 经济和社会史Economic and Social History346 经济学图书馆与文献的使用Economics Libraries and Documentation347 规模经济与规模不经济Economies and Diseconomies ofScale348 经济计量学Economitrics349 有效需求Effective Demand350 实际保护Effective Protection351 有效配置Efficient Allocation352 有效率市场假说Efficient Market Hypothesis353 国际收支的弹性分析方法Elasticities Approach to the Balance of Payments354 弹性Elasticity355 替代弹性Elasticity of Substitution356 就业理论Employment, Theories of357 空匣Empty Boxes358 内生性与外生性Endogencity and Exoyeneity359 内生货币与外生货币Endogenous and Exogenous Money360 能源经济学Energy Economics361 强制执行Enforcement362 恩格尔曲线Engel Curve363 恩格尔定律Engel''s Law364 英国历史学派English Historical School365 权利Entitlements366 企业家Entrepreneur367 熵Entropy368 进入与市场结构Entry and Market structure369 包络定理Envelope Theorem370 环境经济学Environmental Economics371 妒忌Envy372 国民历代大事记或民族精神编年史Ephemerides du Citoyen ou 
Conflict and Deterrence under Strategic Risk*

Sylvain Chassang (Princeton University, chassang@) and Gerard Padró i Miquel (London School of Economics and NBER, g.padro@)

January 2010

Abstract. We examine the determinants of cooperation and the effectiveness of deterrence when fear is a motive for conflict. We contrast results obtained in a complete information setting, where coordination is easy, to those obtained in a setting with strategic risk, where players have different information about their environment. These two strategic settings allow us to identify and distinguish the role of predatory and preemptive incentives as determinants of cooperation and conflict. We show that while weapons unambiguously facilitate peace under complete information, this does not hold anymore under strategic risk. Rather, we find that increases in weapon stocks can have a non-monotonic effect on the sustainability of cooperation. We also show that under strategic risk, inequality in military strength can actually facilitate peace and that anticipated peace-keeping interventions may improve incentives for peaceful behavior.

Keywords: cooperation, deterrence, strategic risk, global games, conflict, intervention, exit games.
JEL classification codes: D74, C72, C73

* We are grateful to Sandeep Baliga, Micael Castanheira, Jon Eguia, Joan Esteban, Christian Hellwig, Matt Jackson, Patrick Legros, David Miller, Andrea Prat, Jesse Shapiro, Joel Sobel and Flavio Toxvaerd for comments and conversations. The paper also benefited from feedback by seminar participants at Sussex University, Université Libre de Bruxelles, Rochester, UC Berkeley, UCSD, Cambridge University, and conference participants at the 2008 AEA meetings, the Polarization and Conflict Group Meeting at LSE, the UC Berkeley Conference on Endogenous Institutions and Conflict, as well as the ESSET meeting at Gerzensee.
Andrew Robinson provided excellent research assistance. All remaining errors are, of course, our own.

1 Introduction

The usual rationale for deterrence is closely related to the rationale behind grim trigger punishment in a repeated prisoners' dilemma. Imagine two neighboring groups that repeatedly decide whether to be peaceful (i.e. to cooperate) or to launch a surprise attack on each other. A peaceful equilibrium can only be sustained if the short-run gains from a surprise attack are counterbalanced by the long-run costs of triggering conflict. In this context, if both groups accumulate weapons, the cost of conflict increases, thereby improving incentives for peaceful behavior. This is the logic of deterrence, which reflects the idea frequently highlighted in the literature on repeated games that harsher punishments should improve incentives for cooperation.¹ The symmetric accumulation of weapons, insofar as it generates higher costs of war, should facilitate peace.

This paper examines the limits of this argument by contrasting the mechanics of cooperation and deterrence under complete information and under strategic risk, i.e. when players do not share a common understanding of their environment. While the complete information model suggests unambiguous predictions about the effect of weapons on peace, and about the impact of inequality on cooperation, these predictions need to be considerably nuanced once strategic risk is taken into account. We develop these points in detail and emphasize the importance of both predatory and preemptive incentives in determining the sustainability of cooperation under strategic risk.

We model conflict as a very stylized dynamic exit game, keeping grim-trigger strategies in a repeated game as a benchmark. In each period, players decide whether to be peaceful or attack. When both players choose to be peaceful, they enjoy the economic benefits of peace and the game moves on to the next period. However, if one of the players attacks, conflict begins and players obtain exogenous continuation values.² Our model of strategic risk follows the global games literature.³ More precisely, we consider a situation in which payoffs upon peace depend on an uncertain state of the world about which players obtain very informative but noisy signals. Because players do not have the same assessment of the state of the world, this creates strategic uncertainty in equilibrium. At a state around which behavior switches, there will be a high probability that one player will choose peace while the other one attacks. This causes the players to second guess each other's move, and significantly affects the sustainability of peace. These effects remain even as the players' information becomes arbitrarily precise and we approach the complete information case.

Footnote 1: See for instance Abreu (1988) on penal codes. Garfinkel (1990) makes a similar point in the context of conflict and armament.
Footnote 2: Because the players' payoffs upon conflict are exogenously specified, this game is not a repeated game. However, trigger strategies of a repeated game are naturally mapped into an exit game in which continuation values upon conflict are those that players obtain from repeatedly playing (Attack, Attack). Therefore, this exit framework encompasses the insights we obtain from a repeated prisoners' dilemma. See Chassang and Takahashi (2009) for a full-fledged analysis of repeated games under related incomplete information perturbations.
Footnote 3: See for instance Carlsson and van Damme (1993) and Morris and Shin (1998) for seminal work on global games, and Morris and Shin (2003) for a review.
Throughout the paper we compare and contrast the conditions under which cooperation is sustainable in environments with and without strategic uncertainty.

To understand the difference that strategic risk makes, it is important to distinguish between the two motives for conflict that exist in this game. First, one may be tempted to attack an otherwise peaceful opponent: this is the predatory motive for conflict. Second, one may attack to avoid suffering a surprise strike from an opponent who is expected to be aggressive: this is the preemptive motive for conflict. Under complete information, it is easy for players to coordinate and only predatory motives matter. Under strategic uncertainty however, the sustainability of peace depends significantly on both predatory and preemptive incentives. Because weapon stocks can affect preemptive and predatory incentives differently, many comparative statics that were unambiguous under complete information become much more nuanced under strategic risk.

Our first set of results considers symmetric increases in weapon stocks. Under complete information, increased weapon stocks facilitate peace by diminishing payoffs upon conflict. Under strategic risk however, the symmetric accumulation of weapons may very well be destabilizing. Indeed, while weapons diminish predatory incentives, they may increase preemptive incentives if being the victim of a surprise attack is particularly weakening. It follows that under general conditions the impact of weapons on peace will be non-monotonic. In particular, very large stocks of weapons (e.g. nuclear stocks sufficiently large to guarantee mutually assured destruction) will foster peace, whereas intermediate stocks of weapons (e.g. a few nuclear warheads that could be destroyed by a surprise strike) may be destabilizing.

Our second set of results explores how inequality in military strength affects stability. It is easy to show that unequal military power is always destabilizing under complete information. This is because inequality increases the predatory temptation of the stronger player. However, inequality reduces the preemptive motive for conflict for two reasons. First, the stronger player knows she has little to fear from the weaker one and hence she has smaller preemptive needs. Second, when the strong player is overwhelmingly dominant, the weaker player can only gain very little by launching a preemptive attack. As a consequence, under strategic risk, peace might be possible between unequal contenders in circumstances under which equally armed opponents would fight. This result, however, should not be interpreted as making a case for complete monopoly of violence. Indeed, while inequality can help, peace is only sustainable if the weaker player keeps enough weapons to limit the stronger player's predatory incentives. This suggests that restrained superiority may sustain the greatest level of peace.

Finally, we examine the impact of peace-enforcing interventions on peace and conflict.
We first highlight that under complete information, unless intervention is immediate and war is prevented altogether, intervention will always have a destabilizing impact. Indeed, as in the familiar case of grim trigger strategies, it is precisely the prospect of a long and painful conflict that deters players from attacking in the first place. This conclusion, however, is not robust to strategic risk. By alleviating the potential costs of being the victim of a surprise attack, intervention reduces preemptive incentives. In that setting we show that the promise of intervention may promote peace even if it can only happen with delay.

This paper focuses entirely on the impact of strategic risk on the mechanics of deterrence and peace. As a result, the paper abstracts from a number of other realistic dimensions of conflict already emphasized in the literature. These include several frictions that induce bargaining failures, such as imperfect information (see Fearon (1995) or Powell (1999)), leader bias (see Jackson and Morelli (2007)), and commitment problems (as in Powell (2004) or Yared (2009)). Also, we do not consider the question of endogenous investment in weapons and the guns vs butter trade-off (see for instance Grossman (1991), Skaperdas (1992), Esteban and Ray (2008), as well as Jackson and Morelli (2009), who examine a model based on this trade-off that exhibits deterrence). Rather, our purpose here is to revisit a more primitive question: how does the accumulation of weapons affect the stability of peace?

While our contribution here is mostly applied, this paper also belongs to the recent theoretical literature on dynamic global games.⁴ It is closely related to the work of Steiner (2008), Chassang (2009), Giannitsarou and Toxvaerd (2009), or Ordoñez (2009), all of which use a simple dynamic programming approach to simplify the analysis of large global games.
In these papers, as well as in ours, payoff shocks are independent across periods and the focus is on how incomplete information affects the provision of incentives, rather than on how players may learn the underlying state of the world. A complementary literature focuses on such learning by considering dynamic global games in which the state is constant or follows a random walk. See for instance Chamley (1999), Angeletos, Hellwig and Pavan (2007), Dasgupta (2007) or Dasgupta, Steiner and Stewart (2008).

Footnote 4: It is also useful to relate this paper to some of our other applied work on conflict. In a small extension of the current paper (Chassang and Padró i Miquel (2009a)), we use the framework developed here to discuss the relative merit of defensive weapons and defensive alliances as means to sustain peace. In another recent paper (Chassang and Padró i Miquel (2009b)) we use a complete information model to discuss the impact of wealth on conflict in a context where wealth is expropriable. We highlight that it is temporary changes in wealth, rather than the level of wealth, that determine conflict. We note that in contrast to the current paper, considerations of strategic risk do not change the intuitions obtained in the complete information setting.

Because the exit game we consider can be thought of as a reduced form for trigger strategies in a repeated game, the basic insights of the paper can be applied in other environments usually modeled using repeated games. Whenever predatory and preemptive incentives move in different directions, taking strategic risk seriously will significantly affect comparative statics. One possible application is the model of price wars during booms of Rotemberg and Saloner (1986), which shows that collusion is hardest to sustain during times of temporary high demand since this is when predatory incentives are maximized. To the extent that preemptive incentives might be highest when demand is low (failing to react might put a firm out of business), introducing strategic risk may alter comparative statics. Similarly, the relational contracting literature (see for instance Shapiro and Stiglitz (1985), Bull (1987), Baker, Gibbons and Murphy (1994, 2002), or Levin (2003)) often makes the point that reducing the players' outside option facilitates cooperation. This need not hold anymore in a model with strategic risk if reducing the players' outside option increases their incentives to preempt.

The paper is organized as follows. Section 2 describes the framework and provides necessary and sufficient conditions for the sustainability of peace under complete and incomplete information. Section 3 contrasts the mechanics of deterrence with and without strategic risk.
Section 4 studies how inequality in military strength affects conflict. Section 5 explores the impact of intervention on peace. Section 6 concludes. Proofs are contained in Appendix A.

2 Framework

2.1 A Simple Class of Cooperation Games

We consider two groups i ∈ {1, 2} that play an infinite horizon trust game, with discrete time t ∈ N, and share a common discount factor δ. Each period t, players simultaneously decide whether to be peaceful (P) or attack (A). If both players are peaceful at time t, they obtain a flow payoff π and the game moves on to period t + 1. When either of the players attacks, the game enters a conflict mode. Players receive an exogenously specified stream of payoffs and strategic interaction per se ends. When player i attacks while −i is peaceful, she is a first mover and gets a stream of payoffs (f_{i,n})_{n≥0}, where n denotes the number of periods elapsed since conflict began.⁵ If the opposite happens, player i is a second mover and gets a stream of payoffs (s_{i,n})_{n≥0}. If both players attack at the same time, simultaneous war begins and player i gets a stream of payoffs (w_{i,n})_{n≥0}. We define F_i, S_i and W_i as the present discounted values of starting conflict as a first, second or simultaneous mover. More specifically, we define

F_i = \sum_{n=0}^{+\infty} \delta^n f_{i,n};  S_i = \sum_{n=0}^{+\infty} \delta^n s_{i,n};  W_i = \sum_{n=0}^{+\infty} \delta^n w_{i,n}.⁶

Footnote 5: i.e. if conflict started at time t, the flow payoff obtained by a first mover i at time t + n is f_{i,n}.

Throughout the paper, F_i, S_i and W_i will depend on the respective stocks of weapons k_i and k_{−i} of each player. More specifically, there are functions F, S and W such that

F_i = F(k_i, k_{−i}),  S_i = S(k_i, k_{−i}),  W_i = W(k_i, k_{−i}).

Whenever k_i = k_{−i} = k, we use the notation F_i = F(k), S_i = S(k) and W_i = W(k). We maintain the following assumption.

Assumption 1. Payoffs F_i, S_i and W_i are increasing in k_i and decreasing in k_{−i}. Furthermore, F(k), S(k) and W(k) are all decreasing in k.

This is a fairly natural assumption: conditional on conflict, player i's payoff is increasing in her own stock of weapons and decreasing in her opponent's stock of weapons. Moreover, a symmetric increase in the amount of weapons makes conflict more painful on all sides. Throughout the paper, we discuss how weapon stocks k_i and k_{−i} affect the sustainability of peace under different informational environments.

In any period t, given continuation values (V_i)_{i∈{1,2}} upon joint cooperation, players can be thought of as facing the one-shot game

            P             A
    P   π + δV_i        S_i
    A   F_i             W_i

where payoffs are given for row player i.⁷

Footnote 6: Note that trigger strategies in a repeated game are naturally mapped into this framework. Consider for instance the Prisoners' Dilemma, with stage game payoffs given by

            P        A
    P       π       −c
    A       b        0

where π < b and b − c < 2π, so that peace is efficient. Trigger strategies correspond to payoffs upon conflict f_{i,0} = b, s_{i,0} = −c, w_{i,0} = 0 and f_{i,n} = s_{i,n} = w_{i,n} = 0 for n > 0.

This representation of payoffs allows us to identify two distinct motives for conflict. The payoff difference F_i − π − δV_i corresponds to player i's predatory incentives, that is, how much player i would gain from attacking a consistently peaceful opponent. When players expect permanent peace upon continuation, predatory incentives take the form F_i − π/(1−δ). The payoff difference W_i − S_i corresponds to the preemptive incentives of player i, that is, how much player i would gain from attacking an opponent that is expected to attack. We make the following assumption.

Assumption 2 (early mover advantage). For all i ∈ {1, 2}, F_i > W_i > S_i.

Assumption 2 simply states that if conflict occurs, there is an advantage to attacking early.
This assumption is natural in many instances of conflict, including military conflict, conflict between firms, or even conflict between individuals, as the first mover benefits from additional time to prepare her moves.

Throughout the paper, we contrast a situation in which the flow benefits of peace π are common knowledge, and a situation in which players make noisy but precise private assessments of the value of π. In the first case, common knowledge of payoffs allows players to coordinate their actions effectively and only predatory incentives matter for the sustainability of peace. Under incomplete information however, coordination becomes difficult as players attempt to second guess one another's value for peace. In that case the sustainability of peace depends significantly on both predatory and preemptive incentives.

Note that while we emphasize the players' uncertainty over the common returns to peace π, our results would be identical if we considered uncertainty over the returns F from a surprise attack.⁸ Indeed, it is uncertainty over predatory incentives F − π − δV_i as a whole that drives our results. Note in addition that unfavorable economic shocks are in fact a major driver of conflict (see Miguel et al. (2004) or Ciccone (2008)).

Footnote 7: We look at a situation where the benefits of cooperation π are symmetric for the purpose of simplicity. Extending the model to a setting with asymmetric benefits presents no conceptual difficulty and simply adds to the notational burden.
Footnote 8: See Chassang (2009) for a general framework in which perturbations can affect all entries of the payoff matrix.

2.2 The Complete Information Benchmark

In the benchmark complete information setting, the payoff π is fixed and common knowledge among players. We denote by Γ_CI the corresponding dynamic game.

Proposition 1 (cooperation under complete information). Peace is (permanently) sustainable in an equilibrium of Γ_CI if and only if

\forall i \in \{1,2\}, \quad F_i - \frac{1}{1-\delta}\,\pi \le 0. \qquad (1)

This means that under complete information, the sustainability of peace depends only on the magnitude of predatory incentives. Preemptive incentives play no role, as neither S_i nor W_i enter condition (1). Note that this condition is analogous to the condition for cooperation in a Prisoners' Dilemma under grim trigger strategies. We denote by π_CI the smallest value of π such that inequality (1) holds.

Let us turn to the case of strategic risk.
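Before doing so, the bookkeeping above can be made concrete with a minimal numerical sketch. All numbers below (payoff streams, discount factor, flow payoff) are illustrative choices of ours, not values from the paper; the streams follow the trigger-strategy mapping of footnote 6 with b = 2, c = 1.

```python
# Minimal sketch of the Section 2.1 objects and the Proposition 1 test
# (symmetric case). The infinite sums are truncated where delta^n is tiny.

DELTA = 0.9  # common discount factor (illustrative)

def present_value(stream, delta=DELTA, horizon=500):
    """Discounted value sum_{n>=0} delta^n * stream(n)."""
    return sum(delta**n * stream(n) for n in range(horizon))

# Hypothetical conflict payoff streams: a first mover grabs a transient
# prize, a second mover suffers a transient loss, simultaneous war pays 0.
f = lambda n: 2.0 if n == 0 else 0.0   # first-mover stream (f_{i,n})
s = lambda n: -1.0 if n == 0 else 0.0  # second-mover stream (s_{i,n})
w = lambda n: 0.0                      # simultaneous-war stream (w_{i,n})

F, S, W = present_value(f), present_value(s), present_value(w)
pi = 0.6  # flow payoff of joint peace (illustrative, satisfies pi < b, b - c < 2*pi)

# Predatory incentives under expected permanent peace: F - pi/(1-delta).
predatory = F - pi / (1 - DELTA)
# Preemptive incentives: W - S.
preemptive = W - S

print(f"F={F:.2f}, S={S:.2f}, W={W:.2f}")
print(f"predatory={predatory:.2f}, preemptive={preemptive:.2f}")
# Proposition 1: peace sustainable under complete information iff
# predatory incentives are non-positive for both players.
print("peace sustainable (complete info):", predatory <= 0)
```

With these numbers F = 2, S = −1, W = 0 and π/(1−δ) = 6, so predatory incentives are negative and peace is sustainable in the complete information benchmark.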
For simplicity we assume that˜πt is observable in period t+1via theflow payoffs.Let us denote this game byΓσ,g.To perform a robustness check on the complete information environment we are interested in the sustainability of peace inΓσ,g asfirst,σgoes to0,and second,g approaches a point mass atπ.10This corresponds to an environment where players have approximately complete information about the state of the world,but remain uncertain about whether they are more or less optimistic than the other player.Analysis is facilitated by the fact that given a distribution g,asσbecomes small,gameΓσ,g admits a most peaceful equilibrium s Hσ,gwhichsustains the highest equilibrium values V Hσ,g .Equilibrium s Hσ,galso takes a simple thresholdform,i.e.,there exists(x Hi,σ,g)i∈{1,2}∈R2such that player i plays peace whenever she gets asignal x i,t≥x Hi,σ,gand attacks otherwise.119For a related model of reciprocal fears see Baliga and Sjostr¨o m(2004).10Note that the order of limits we take is important.By takingσto0first,we insure that the players always care about their private information,so that there is indeed second guessing and strategic risk.When we take the other order of limits,the players have such strong priors that they regard their private signals as completely noisy and we are essentially back in the complete information setting.11See the appendix for more formal statements and proofs.It is important to note that we do not restrict10Before characterizing this most peaceful equilibrium in the limit case where players have very precise information,it is useful to delineate why a small amount of incomplete infor-mation can radically affect equilibrium behavior.For this purpose let us focus on the case where payoffs and signalling structures are symmetric.In that setting,the most peaceful equilibrium is symmetric with both players using the same threshold x H.Withσsmall,σ,ghas little uncertainty about her a player that gets a signal well below or well above x Hσ,gopponent’s behavior.The likelihood of surprise attacks is small.However,when a player,then there is roughly probability a half that her opponent gets as a signal the threshold x Hσ,ggot a higher signal and probability a half that her opponent got a lower signal.This means that an equilibrium threshold must be such that at that state of the world,a player is willing to be peaceful even though there is probability roughly a half that her opponent will launch a surprise attack.Note that in aggregate,the overall probability of a surprise attack maybe vanishing.What matters is that conditional on being at an equilibrium threshold,there is a high likelihood of an attack.This is why a small amount of incomplete information can significantly affect the way players interact even though,in aggregate,surprise attacks are quite rare.We now characterize explicitly when peace can be sustained under strategic risk.For this purpose we introduce some notation.Given any pair V=(V i,V−i)of continuation values, we consider the following2×2game G(V)P APπ+δV i S iA F i W i,where payoffs are given for row player i.Following Harsanyi and Selten(1988),we say that attention to threshold-form strategies.Rather,we prove that gameΓσ,g admits a most peaceful equilibrium, and that this equilibrium is necessarily in threshold-form strategies.11(P eace,P eace)is risk-dominant in game G(V)if and only ifi∈{1,2}(π+δV i−F i)+>i∈{1,2}(W i−S i).Inversely,we say that(Attack,Attack)is risk-dominant if the opposite strict inequality holds. 
We also denote byV≡11−δπthe value of permanent peace.We can now state the main result of this section,which we use throughout the paper.Recall that V Hσ,gdenotes the highest equilibrium pair of values in gameΓσ,g.It is supported by the most cooperative equilibrium.Proposition2(cooperation under strategic risk)For any sequence{g n}n∈N such that for all n∈N,g n has support(−∞,+∞)and{g n}n∈N converges in mean to the unit mass atπ,the following hold:(i)Whenever(P eace,P eace)is risk-dominant in game G(V,V),then permanentpeace is sustainable under strategic risk,in the sense thatlim n→∞limσ→0V Hσ,g n=V,V.(ii)Inversely,whenever(Attack,Attack)is risk-dominant in game G(V,V),then peace is unsustainable under strategic risk,in the sense thatlim n→∞limσ→0V Hσ,g n=(W i,W−i).Proposition2provides a convenient criterion to check whether peace is sustainable under strategic risk.Point(i)shows that when(P eace,P eace)is risk-dominant in G(V,V),then the highest sustainable equilibrium value inΓσ,g converges to the value of permanent peace V,which implies that the most cooperative equilibrium ofΓσ,g sustains approximately per-12manent peace.Inversely,point(ii)shows that when(Attack,Attack)is risk dominant,then the values associated with the most cooperative equilibrium ofΓσ,g converge to the value of immediate conflict.This implies that permanent conflict is the only equilibrium sustainable under strategic risk.Altogether,points(i)and(ii)imply that peace is robust to strategicrisk if and only ifi∈{1,2}11−δπ−F i+>i∈{1,2}(W i−S i)(2)where(z)+≡max{0,z}.Let us denote byπSR the smallest value ofπsuch that(2)holds.12 Condition(2)shows that just as under complete information,it is necessary that both players’predatory incentives(F i−1π)be negative to sustain peace.13In addition,condition (2)emphasizes the role of preemptive incentives(W i−S i).The larger preemptive incentives are,the harder it is to sustain peace.When payoffs are symmetric,peace is sustainable under strategic risk if and only if F−11−δπ+W−S<0,i.e.peace is sustainable if and only if the sum of predatory and preemptive incentives is negative.As Sections3,4and5show,there will often be a conflict between minimizing predatory incentives and minimizing preemptive incentives.As a consequence,taking strategic risk seriously can refine in important ways our understanding of cooperation and conflict.From a modeling perspective,we consider the limit where the distribution g of returns from peace˜πt becomes concentrated around a given valueπboth for the purpose of tractabil-ity and because it allows us to focus exclusively on the role of preemptive incentives in determining the players’ability to cooperate.A drawback of taking this limit is that in our model,conflict either begins in thefirst period(if(P eace,P eace)is not risk-dominant in G(V,V)),or the likelihood of conflict infinite time is zero(if(P eace,P eace)is risk-dominant in G(V,V)).It is not difficult to resolve this problem since our analysis extends 12Note that the equilibrium in which players always attack is always robust to strategic risk.In fact,“attacking always”is an equilibrium ofΓσ,g for allσand all g.As Chassang(2009)notes,in games with an infinite horizon,the global games perturbation cannot be used as a trick to select a unique equilibrium. 
From a modeling perspective, we consider the limit where the distribution g of returns from peace \tilde{\pi}_t becomes concentrated around a given value π, both for the purpose of tractability and because it allows us to focus exclusively on the role of preemptive incentives in determining the players' ability to cooperate. A drawback of taking this limit is that in our model, conflict either begins in the first period (if (Peace, Peace) is not risk-dominant in G(\overline{V}, \overline{V})), or the likelihood of conflict in finite time is zero (if (Peace, Peace) is risk-dominant in G(\overline{V}, \overline{V})). It is not difficult to resolve this problem, since our analysis extends easily to circumstances where information is precise but the distribution g is not degenerate (see Chassang (2009)). In that case, the most peaceful equilibrium will still be in threshold strategies, but there will be some probability of conflict in each period depending on whether the realized return to peace \tilde{\pi}_t is above the threshold or not. Most importantly, the equilibrium threshold will still be determined by risk-dominance concerns and the broad qualitative points we make in the paper would be unchanged. However, because payoffs upon conflict would now enter continuation values upon peace and change the potential surplus available in the game, the analysis would become richer and obscure the role played by preemptive incentives.

3 Deterrence with Symmetric Weapon Stocks

3.1 General Results

This section investigates how a symmetric increase in weapon stocks affects the sustainability of peace by studying the comparative statics of the thresholds π_CI and π_SR. These thresholds correspond respectively to the minimum flow returns to peace π necessary for peace to be sustainable under complete information and under strategic risk. This implies that the lower π_CI and π_SR are, the easier it is to sustain peace. We say that weapons are deterrent if and only if the symmetric accumulation of weapons reduces the minimum value of π required to sustain peace.

The following proposition describes how the deterrent effect of weapons may differ across strategic settings. Recall that payoffs upon conflict F_i, S_i and W_i depend on the players' respective weapon stocks k_i and k_{−i}. In addition, when weapon stocks are symmetric, i.e. k_i = k_{−i} = k, all payoffs upon conflict are decreasing in k.

Proposition 3 (deterrence under complete and incomplete information). Consider a situation in which k_i = k_{−i} = k. We have that (i) π_CI is always strictly decreasing in k.
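The contrast the proposition begins to draw can be seen in a toy parametrization. The functional forms below are our own illustrative choices (picked to satisfy Assumptions 1 and 2, not taken from the paper): π_CI(k) = (1−δ)F(k) falls monotonically in k, while π_SR(k) = (1−δ)(F(k) + W(k) − S(k)) rises and then falls, because the preemptive term W(k) − S(k) peaks at intermediate stocks.

```python
# Toy comparative statics of pi_CI and pi_SR in symmetric stocks k.
# F, W, S below are hypothetical decreasing-in-k payoff functions with
# F(k) > W(k) > S(k); the second mover is hurt most at middling k.
delta = 0.9

def F(k): return 2.0 / (1 + k)                        # first mover
def W(k): return -0.5 * k                             # simultaneous war
def S(k): return -0.5 * k - 3 * k / (1 + k**2) - 0.2  # second mover

for k in (0.0, 0.5, 1.0, 2.0, 5.0):
    pi_ci = (1 - delta) * F(k)
    pi_sr = (1 - delta) * (F(k) + W(k) - S(k))
    print(f"k={k:>4}: pi_CI={pi_ci:.3f}  pi_SR={pi_sr:.3f}")
```

In this parametrization π_CI decreases from 0.2 towards 0 while π_SR moves from 0.22 up to about 0.27 before falling: intermediate stocks are destabilizing and only large stocks deter, matching the non-monotonicity announced in the introduction.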
Complexity Results for Triangular Sets

Éric Schost
Laboratoire GAGE, École polytechnique, 91128 Palaiseau Cedex, France

Abstract

We study the representation of the solutions of a polynomial system by triangular sets, and concentrate on the positive-dimensional case. We reduce to dimension zero by placing the free variables in the base field, so the solutions can be represented by triangular sets with coefficients in a rational function field. We give intrinsic-type bounds on the degree of the coefficients in such a triangular set, and on the degree of an associated degeneracy hypersurface. Then we show how to apply lifting techniques in this context, and point out the role played by the evaluation properties of the input system. Our algorithms are implemented in Magma; we present three applications, relevant to geometry and number theory.

Key words: triangular sets, complexity, symbolic Newton operator

Email address: Eric.Schost@polytechnique.fr (Éric Schost).
Preprint submitted to Elsevier Science, 27 March 2003

1 Introduction

This article studies the triangular representation of the solutions of a polynomial system. Our first focus is on complexity results and algorithms; we also present a series of applications that were treated with these techniques. To make things clear, let us first display a concrete example of a triangular set.

An example in Q[X1, X2]. Consider the polynomial system in Q[X1, X2]:

F1 = −X1³X2 + 2X1² − 4X1X2² + 2X1X2 − 2,
F2 = X1²X2 − X1 + 4X2² − 2X2.

It admits the following Gröbner basis for the lexicographic order X1 < X2:

T1 = X1² − 2,
T2 = X2² − (1/4) X1.

Since T1 is in Q[X1] and T2 in Q[X1, X2], we say that (T1, T2) form a triangular set. In particular, T1 describes the projection of the zero-set of (F1, F2) on the X1-axis.

From the field-theoretic point of view, the system (F1, F2) generates a prime zero-dimensional ideal, so Q → B := Q[X1, X2]/(F1, F2) defines a field extension. We let x1, x2 be the images of X1, X2 in B; then T1 is the minimal polynomial of x1 in Q → B and T2, seen in Q(x1)[X2], is the minimal polynomial of x2 in Q(x1) → B.
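This example is easy to check by machine. The paper's own computations are done in Magma; the following sympy sketch reproduces the lexicographic Gröbner basis for convenience.

```python
# Recompute the introductory triangular set with sympy.
from sympy import symbols, groebner

X1, X2 = symbols('X1 X2')
F1 = -X1**3*X2 + 2*X1**2 - 4*X1*X2**2 + 2*X1*X2 - 2
F2 = X1**2*X2 - X1 + 4*X2**2 - 2*X2

# The lexicographic order X1 < X2 means X2 is the largest variable, so the
# generators are listed as (X2, X1).
G = groebner([F1, F2], X2, X1, order='lex')
print(G.exprs)  # expected: [X2**2 - X1/4, X1**2 - 2]
```

The reduced basis is monic in each main variable, and its element lying in Q[X1] alone is exactly T1, the minimal polynomial of x1.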
Generalization and first complexity considerations. Consider now an arbitrary field K, K̄ its algebraic closure, and a zero-dimensional variety W ⊂ A^n(K̄) defined over K. For simplicity, we take W irreducible over K; then just as above, the ideal defining W admits the following Gröbner basis for the lexicographic order X1 < ··· < Xn:

T1(X1), T2(X1, X2), ..., Tn(X1, ..., Xn),

with Tk in K[X1, ..., Xk], and monic in Xk, for k ≤ n. We will use this as an intuitive definition of a triangular set for the rest of this informal introduction. Note that if W is not irreducible, its defining ideal might not have such a triangular family of generators: several triangular sets may be necessary.

For k ≤ n, the family T1, ..., Tk describes the projection of W on the affine subspace of coordinates X1, ..., Xk. In particular, as above, T1 is the minimal polynomial of X1 modulo the ideal defining W. This close link between projections and triangular representations is central in what follows.

Let us turn to complexity considerations. The product \prod_{k\le n} \deg_{X_k} T_k of the degrees of the polynomials Tk in their "main variable" equals the number of points in W, and bounds the total degree of each polynomial Tk. Thus, in terms of degrees in the variables X1, ..., Xn, there is not much more to say.

New questions arise when the base field K is endowed with a "size" function: if K is a rational function field, we may consider the degree of its elements; if K is a number field, we can talk about the height of its elements. In this context, it becomes natural to ask how the size of the coefficients in T1, ..., Tn relates to some invariants measuring the "complexity" of the variety W. In view of the above remarks, a more accurate question is actually, for k ≤ n, the relation between the size of the coefficients in T1, ..., Tk and the complexity of the projection of W on the subspace of coordinates X1, ..., Xk.

In this article, we focus on this question in the function field case. Here is the concrete situation from where the question originates.

Polynomial systems with parameters. A variety of problems can be described by polynomial systems involving free variables, or parameters. In such situations, we also often know that there are only finitely many solutions for a generic choice of the parameters. In other words, we are considering systems that are zero-dimensional over the field of rational functions on some parameter space; triangular sets with rational function coefficients can then be used to represent their solutions. The following applications motivated this approach; they are detailed in Section 8.

• Modular equations. In Gaudry and Schost [2002], we propose a definition of modular equations for hyperelliptic curves, with a view towards point-counting applications. For a given curve, these equations come from the resolution of zero-dimensional polynomial systems, as the minimal polynomial of one of the unknowns. Thus, they can be obtained from a triangular set computation, as in the introductory example. An interesting question is that of modular equations for a curve with generic coefficients, which can be precomputed and stored in a database. This was already done in the elliptic case, and is now done for a first hyperelliptic modular equation in the Magma package CrvHyp. This naturally raises the question of triangular sets with coefficients in a rational function field.

• Curves with split Jacobian. Curves of genus 2 with (2,2)-split Jacobian are of interest in number theory: over Q, torsion, rank and cardinality records are obtained for such curves, see Kulesz [1995, 1999], Howe et al. [2000].
Roughly speaking, these curves are characterized by the presence of elliptic quotients of degree 2 of their Jacobian. We studied such curves in Gaudry and Schost [2001], and showed that the elliptic quotients can be read off triangular sets coming from the resolution of a suitable polynomial system. Classification questions require treating this question for curves with generic coefficients, which leads again to the problem of computing triangular sets over a rational function field.

• Implicitization. Finally, we will show that the implicit equation of a parametrized surface in R³ can be obtained using the triangular representation.
itself.Note nevertheless that our complexity results comprise an additional fac-tor which is exponential in n,inherent to computations with triangular sets.Following the series of articles Giusti et al.[1995,1997,1998],Heintz et al.[2000],Giusti et al.[2001],Heintz et al.[2001],our algorithms rely on symbolic Newton lifting techniques and the Straight-Line Program repre-sentation of polynomials.Their practical behavior matches their good com-plexity,as they enabled to solve problems that were otherwise out-of-reach. Comparison with primitive elements techniques.This work is in the continuation of Schost[2003],which focuses on a representation by primitive element techniques,the geometric resolution,in a similar context.Caution must be taken when comparing the two approaches.They answer different questions;as such,their complexities cannot be compared directly,since they are stated in terms of different quantities.We use again the above notation:the geometric object of interest is a variety W defined by polynomials in K[P1,...,P m,X1,...,X n],and for k≤n,W k is its projection on the space of coordinates P1,...,P m,X1,...,X k.The degree bound of the coefficients in a geometric resolution is linear in the degree of W.This is to be compared with the results for the triangular representation,which are not polynomial in this degree.On the other hand, triangular sets take into account the degrees of the successive projections W k, which cannot be reached using a primitive element.These degrees can be arbitrarily smaller than the degree of W,making the interest of the triangular representation.Consider now the algorithmic aspect.The algorithm in Schost[2003]computes a parametric geometric resolution with a complexity that depends on the de-gree of W.The algorithms proposed here compute k polynomials T1,...,T k, for any given k≤n;their complexity depends on the degree of the correspond-ing projection W k of W on the space of coordinates(P1,...,P m,X1,...,X k), but not on the degree of W.Again,this suggests that triangular sets are of interest for problems with a structure,where projections might induce degree drops.We refer to Section8for a practical confirmation for several applica-tions.Related work.In dimension zero,a landmark paper for the triangular rep-resentation is Lazard[1992].Our definition of triangular sets is inspired by5the one given there,as is the treatment of more technical questions such as splitting and combining triangular sets.In arbitrary dimension,several notions of triangular sets and algorithms ex-ist,see Lazard[1991],Kalkbrener[1991],Maza[1997],Aubry[1999],Delli`e re [1999],Szanto[1999].For a comparison of some of these approaches,see Aubry et al.[1999];we also refer to the detailed survey of Hubert.Our choice to re-duce the question to dimension zero over afield of rational functions yields algorithms with good complexity,and easy to implement.Yet,our output is not as strong as for instance that of Lazard[1991],Maza[1997],Delli`e re[1999]: ours is only generically valid.Upper bounds on the degrees of the polynomials in a triangular set were given in Gallo and Mishra[1990]and Szanto[1999];we recall these results in the next section.In particular,the approach of Gallo and Mishra[1990] inspired Theorem1below.We also use results from Schost[2003],which follow notably Sabia and Solern´o[1996].Lifting techniques for polynomial systems were introduced in Trinks[1985], Winkler[1988].They were used again in the series of articles by Giusti,Heintz, Pardo and collaborators,Giusti et 
The conjoint use of the Straight-Line Program representation led there to algorithms with the best known complexity for primitive element representations. The present work is in the continuation of the above; see also the survey of Pardo [1995] for a historical presentation of the use of Straight-Line Programs in elimination theory. Finally, let us mention the results of Lecerf [2002], which extend lifting techniques to situations with multiplicities.

We note that the article Heintz et al. [2000] precedes Schost [2003] and the present work, and considers similar questions of parametric systems. Nevertheless, we noted in Schost [2003] that the geometric hypotheses made in that article are not satisfied in many "real life" applications, and this is again the case for the applications treated here.

It should be noted that our complexity statements are of an arithmetic nature, that is, we only estimate the number of base field operations. When the base field is the rational field, the notion of binary complexity will give a better description of the expected computation time. We have not developed this aspect, which requires arithmetic-geometric considerations. We refer to Krick and Pardo [1996], Giusti et al. [1997], Krick et al. [2001] where such ideas are presented.

This work is based on a shorter version published in Schost [2002]. The degree bounds given here are sharper. The whole analysis of the degeneracy locus and the subsequent error probability analyses for the algorithms are new.
triangular sets T or {T 1,...,T J }and the number of7points in the variety W.The basic geometric setting.We now turn to more geometric considera-tions.All along this article,wefix afield K,K its algebraic closure,and work in the affine space A m+n(K).We denote by P=P1,...,P m thefirst m coor-dinates in A m+n(K)and by X=X1,...,X n the last n coordinates.We use the notion of geometric degree of an arbitrary affine variety(not necessarily irreducible,nor even equidimensional),introduced in Heintz[1983].In what follows,the affine space A m+n(K)is endowed with two families of projections.For k≤n,we defineµk andπk as follows;hereafter,p denotes a point in A m(K).µk:A m+n(K)→A m+k(K)πk:A m+k(K)→A m(K) (p,x1,...,x n)→(p,x1,...,x k)(p,x1,...,x k)→p.Note in particular thatπn maps the whole space A m+n(K)to A m(K).The main geometric object is a m-dimensional variety W⊂A m+n(K).Our first results are of an intrinsic nature,so we do not need an explicit reference to a defining polynomial system.The assumptions on W follow the description made in the introduction:Assumption1Let{W j}j≤J denote the irreducible components of W.We assume that for j≤J:(1)the imageπn(W j)is dense in A m(K).(2)the extension K(P1,...,P m)→K(W j)is separable.Assumption1.1implies that thefibers of the restriction ofπn to each compo-nent of W are genericallyfinite;this justifies treating thefirst m coordinates as distinguished variables and calling them parameters.Assumption1.2is of a more technical nature,and will help to avoid many difficulties;it is always satisfied in characteristic zero.Under Assumption1,we can define the generic solutions of the variety W. Let J⊂K[P,X]be the radical ideal defining W and J P its extension in K(P)[X].We call generic solutions of W the roots of J P,which are infinite number.We now refer to the previous paragraph,taking K=K(P),and for W the finite set of generic ing Assumption1.2,the ideal J P remains radical in K[X],so the generic solutions are indeed defined over K=K(P). Thus,they can be represented by a family of triangular sets in K(P)[X];our8purpose in this article is to study their complexity properties,and provide algorithms to compute with them.Let us immediately note some particular cases:•If W is irreducible,a single triangular set is enough to represent its generic solutions.•If W is defined over K,it can be written W=∪j≤J W j,where for all j, W j is defined over K,and the defining ideal of W j is prime in K[P,X]. 
Then the generic solutions of each W j are represented by a triangular set in K(P)[X];the generic solutions of W are represented by their reunion.Projections of W.Before presenting the main results,we introduce some notation related to W and its successive projections.Let k be in1,...,n.First of all,we denote by X≤k thefirst k variables X1,...,X k;if T is a triangular set,T≤k is the sub-family T1,...,T k.We denote by W k⊂A m+k(K)the closure ofµk(W),so in particular W n coincides with W.It is a routine check that for all k,W k satisfies Assumption1 as well.Let J k⊂K[P,X≤k]be the ideal defining W k,and J P,k its extension in K(P)[X≤k].Under Assumption1.1,J P,k coincides with J P∩K(P)[X≤k].Thus if the generic solutions of W are defined by a triangular set T,J P,k is generated by T≤k.For p in A m(K),we denote by W k(p)thefiberπ−1k (p)∩W k and by D k thegeneric cardinality of thefibers W k(p).Finally,let B k be the quotient K(P)[X≤k]/J P,k;by Assumption1.2,the ex-tension K(P)→B k is a product of separablefield ing the separability,B k has dimension D k,by Proposition1in Heintz[1983]. Degree bounds.With this notation,we now present our main results.We assume that the generic solutions of W are represented by a triangular set T=(T1,...,T n)in K(P)[X].In view of the above remarks,this is not a strong limitation:if this assumption is not satisfied,as soon as W is defined over K,the following upper bounds apply to all the K-defined irreducible components of W.As mentioned in the preamble,the degree bounds of T in the X variables areeasily dealt with:for all k≤n,the productΠi≤k deg Xi T i is the dimension ofB k over K(P),that is,the generic cardinality D k of thefibers W k(p).9We will thus concentrate on the dependence with respect to the P variables. For k≤n,the polynomial T k depends only on the variables X1,...,X k,and has coefficients in K(P)=K(P1,...,P m).It is then natural to relate the degrees of these coefficients to the degree of the projection of W on the space of coordinates P1,...,P m,X1,...,X k,that is,W k.This is the object of ourfirst theorem.In all that follows,we call degree of a rational function the maximum of the degrees of its numerator and denomi-nator.Theorem1Let W be a variety satisfying Assumption1,and suppose that the generic solutions of W are represented by a triangular set T in K(P)[X].For k≤n,all coefficients in T k have degree bounded by(2k2+2)k(deg W k)2k+1.This result improves those of Gallo and Mishra[1990]and Szanto[1999]for re-spectively Ritt-Wu’s and Kalkbrener’s unmixed representations.If W is given as the zero-set of a system of n equations of degree d,then Gallo-Mishra’s bound is2n(8n)2n d(d+1)4n2and Szanto’s is d O(n2).With this notation,the B´e zout inequality(Theorem1in Heintz[1983])implies that the degree of W k is at most d n for all k.Thus according to Theorem1, for k≤n,in a worst-case scenario the coefficients in the polynomial T k have degree bounded by(2k2+2)k d2kn+n.Hence the estimate is better for low indices k than for higher indices;this contrasts with the previous results,which gave the same bounds for all T k.For the worst case k=n,our estimates are within the class d2n2+o(n2),to be compared with Gallo and Mishra’s bound of d4n2+o(n2).Any of these bounds are polynomial in d n2;we do not know if this is sharp.More importantly,Theorem1reveals that the degrees of the coefficients of T are controlled by the intrinsic geometric quantities deg W k,rather than by the degrees of a defining polynomial system.For instance,this indicates a good behavior 
More importantly, Theorem 1 reveals that the degrees of the coefficients of T are controlled by the intrinsic geometric quantities deg W_k, rather than by the degrees of a defining polynomial system. For instance, this indicates a good behavior with respect to decomposition, e.g. into irreducible components. Also, these degrees may be bounded a priori: in the example presented in Subsection 8.3, the Bézout bound is 1024, but an estimate based on the semantics of the problem gives deg W_k ≤ 80.

Degree of the degeneracy locus. We still assume that the generic solutions of W are represented by a triangular set T = (T_1, ..., T_n) in K(P)[X]. Since the coefficients of T are rational functions, there exists an open subset of the parameter space where they can be specialized, and give a description of the fibers of π_n. Theorem 2 below gives an upper bound on the degree of a hypersurface where this specialization fails.

Theorem 2. Let W be a variety satisfying Assumption 1, and suppose that the generic solutions of W are represented by a triangular set T in K(P)[X]. There exists a polynomial ∆_W ∈ K[P] of degree at most

$$(3n \deg W + n^2) \deg W$$

such that, if p ∈ A^m(K̄) does not cancel ∆_W:
(1) p cancels no denominator in the coefficients of (T_1, ..., T_n). We denote by (t_1, ..., t_n) ⊂ K̄[X] these polynomials with coefficients specialized at p.
(2) (t_1, ..., t_n) is a radical ideal. Let Z_n ⊂ A^n(K̄) be the zero-set of the polynomials (t_1, ..., t_n); then the fiber W_n(p) is {p} × Z_n ⊂ A^{m+n}(K̄).

Just as Theorem 1, this result is of an intrinsic nature, since it depends only on geometric quantities. Nevertheless, in strong contrast with the previous result, these bounds are polynomial in the geometric degree of W.

In particular, Theorem 2 shows that the union of the zero-sets of all denominators of the coefficients of T is contained in a hypersurface of degree bounded polynomially in terms of the degree of W. Thus, the zero-set of any such denominator has degree bounded by the same quantity. Theorem 1 does not give such a polynomial bound for the degrees of the denominators. Were the upper bounds of Theorem 1 to be sharp, this would indicate that these denominators are (high) powers of polynomials of moderate degree.

Algorithms. The above results are purely geometric, and independent of any system of generators. For algorithmic considerations, we now assume that W is given as the zero-set of a polynomial system F = F_1, ..., F_n in K[P, X].
We make the additional assumption that the Jacobian determinant with respect to X is invertible on a dense subset of W. Then Assumption 1 is satisfied, and we consider the problem of computing triangular sets that represent the generic solutions of W.

The underlying paradigm is that solving a zero-dimensional system over K by means of triangular sets is a well-solved task. Thus, the basic idea is first to specialize the indeterminates P in the system F, and solve the corresponding system in the remaining variables X, by means of triangular sets in K[X]. A lifting process then produces triangular sets with coefficients in a formal power series ring, from which we can recover the required information.

Our first contribution treats the case when W is irreducible: its generic solutions are then represented by a single triangular set T = (T_1, ..., T_n), and we propose a probabilistic algorithm that computes T_1, ..., T_k for any k. If W is not irreducible, we compute the minimal polynomial of X_1 modulo the extended ideal (F_1, ..., F_n) in K(P)[X], using similar techniques. We do not treat the general question of computing a whole family of triangular sets when W is not irreducible. From the practical point of view, this might not be a strong restriction: our results cover all the applications that we had to treat.

We use the following complexity notations:

• We suppose that F is given by a Straight-Line Program of size L, and that F_1, ..., F_n have degree bounded by d.
• We say that f is in O_log(g) if there exists a constant a such that f is in O(g log(g)^a); this is sometimes also expressed by the notation f ∈ Õ(g).
• M(D) denotes the cost of the multiplication of univariate polynomials of degree D, in terms of operations in the base ring. M(D) can be taken in O(D log D log log D), using the algorithm of Schönhage and Strassen [1971]. We denote by C_0 a universal constant such that for any ring R, any integer D and any monic polynomial T in R[X] of degree D, all operations (+, ×) in R[X]/(T) can be done in at most C_0 M(D) operations; see Chapter 9 in [von zur Gathen and Gerhard, 1999]. We assume that there exist constants C_1 and α such that M(D)M(D') ≤ C_1 M(DD') log(DD')^α holds for all D, D'. This assumption is satisfied for all commonly used multiplication algorithms.
• M_s(D, M) denotes the cost of M-variate series multiplication at precision D. This can be taken less than M((2D+1)^M) using Kronecker's substitution. If the base field has characteristic zero, this complexity becomes linear in the size of the series, up to logarithmic factors; see [Lecerf and Schost, 2003, Theorem 1]. We assume that there exists a constant C_2 < 1 such that M_s(D, M) ≤ C_2 M_s(2D, M) holds for all D and M. This is the case for all commonly used estimates, for instance for the ones mentioned above.

Apart from the above constants, the complexities below are stated in terms of the degrees D_k of the rational functions that appear in the output, and the number D_n. This number was defined earlier as the generic cardinality of the fibers W_n(p); it is thus the generic number of solutions of the parametric system F.

Theorem 3. Assume that W is irreducible. Let p, p' be in K^m; assume that a description of the zeros of the systems F(p, X), F(p', X) by triangular sets is known. For k ≤ n, let D_k be the maximum of the degrees of the coefficients of T_1, ..., T_k. Then T_1, ..., T_k can be computed within

$$O_{\log}\Big( (nL + n^3)(C_0 C_1)^n \, M(D_n) \, M_s(4D_k, m) \;+\; k m^2 D_n \, M(D_k) \, M_s(4D_k, m-1) \Big)$$

operations in K. The algorithm chooses 3m − 1 values in K, including the coordinates of p and p'.
If Γ is a subset of K, and these values are chosen in Γ^{3m−1}, then the algorithm fails for at most $50n(k^2+2)^{3k} d^{6kn+4n} \, |\Gamma|^{3m-2}$ choices.
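As a concrete, deliberately tiny illustration of the objects above, the following Python/SymPy sketch builds a hypothetical triangular set for the toy system X_1² = P_1, P_1·X_2 = X_1 (one parameter block P = P_1, two variables), reads off the coefficient denominator that must not vanish (a visible factor of a ∆_W in the spirit of Theorem 2), and specializes at a parameter value. The example system and all names are our own; this is not the paper's algorithm.

```python
import sympy as sp

p1, x1, x2 = sp.symbols('p1 x1 x2')

# A hypothetical triangular set T = (T1, T2) in K(P1)[X1, X2] for the
# generic solutions of W = { X1^2 = P1,  P1*X2 = X1 }:
T1 = x1**2 - p1          # T1 lies in K(P1)[X1]
T2 = x2 - x1 / p1        # T2 has the rational coefficient 1/P1

# The product of coefficient denominators is an obvious factor of the
# degeneracy polynomial Delta_W of Theorem 2 (here simply P1).
delta = sp.denom(sp.together(T2))
print("visible factor of Delta_W:", delta)          # -> p1

# Specialize the parameter at p1 = 4, where delta does not vanish ...
t1, t2 = T1.subs(p1, 4), T2.subs(p1, 4)
for root in sp.solve(t1, x1):                        # roots of t1 give x1
    print(root, sp.solve(t2.subs(x1, root), x2))     # then t2 gives x2, triangularly

# ... whereas p1 = 0 cancels the denominator of T2: the specialization fails,
# i.e. the point lies on the degeneracy hypersurface.
```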
Reliability Engineering and System Safety 91 (2006) 992–1007

Multi-objective optimization using genetic algorithms: A tutorial

Abdullah Konak (a), David W. Coit (b), Alice E. Smith (c)
(a) Information Sciences and Technology, Penn State Berks, USA
(b) Department of Industrial and Systems Engineering, Rutgers University
(c) Department of Industrial and Systems Engineering, Auburn University

Available online 9 January 2006

Abstract

Multi-objective formulations are realistic models for many complex engineering optimization problems. In many real-life problems, objectives under consideration conflict with each other, and optimizing a particular solution with respect to a single objective can result in unacceptable results with respect to the other objectives. A reasonable solution to a multi-objective problem is to investigate a set of solutions, each of which satisfies the objectives at an acceptable level without being dominated by any other solution. In this paper, an overview and tutorial is presented describing genetic algorithms (GA) developed specifically for problems with multiple objectives. They differ primarily from traditional GA by using specialized fitness functions and introducing methods to promote solution diversity.
© 2005 Elsevier Ltd. All rights reserved.

1. Introduction

The objective of this paper is to present an overview and tutorial of multiple-objective optimization methods using genetic algorithms (GA). For multiple-objective problems, the objectives are generally conflicting, preventing simultaneous optimization of each objective. Many, or even most, real engineering problems actually do have multiple objectives, i.e., minimize cost, maximize performance, maximize reliability, etc. These are difficult but realistic problems. GA are a popular meta-heuristic that is particularly well-suited for this class of problems. Traditional GA are customized to accommodate multi-objective problems by using specialized fitness functions and introducing methods to promote solution diversity.

There are two general approaches to multiple-objective optimization. One is to combine the individual objective functions into a single composite function, or move all but one objective to the constraint set. In the former case, determination of a single objective is possible with methods such as utility theory, the weighted sum method, etc., but the problem lies in the proper selection of the weights or utility functions to characterize the decision-maker's preferences. In practice, it can be very difficult to precisely and accurately select these weights, even for someone familiar with the problem domain. Compounding this drawback is that scaling amongst objectives is needed, and small perturbations in the weights can sometimes lead to quite different solutions. In the latter case, the problem is that to move objectives to the constraint set, a constraining value must be established for each of these former objectives. This can be rather arbitrary. In both cases, an optimization method would return a single solution rather than a set of solutions that can be examined for trade-offs. For this reason, decision-makers often prefer a set of good solutions considering the multiple objectives.

The second general approach is to determine an entire Pareto optimal solution set or a representative subset. A Pareto optimal set is a set of solutions that are nondominated with respect to each other. While moving from one Pareto solution to another, there is always a certain amount of sacrifice in one objective(s) to achieve a certain amount of gain in the other(s). Pareto optimal solution sets are often
preferred to single solutions because they can be practical when considering real-life problems, since the final solution of the decision-maker is always a trade-off. Pareto optimal sets can be of varied sizes, but the size of the Pareto set usually increases with the increase in the number of objectives.

2. Multi-objective optimization formulation

Consider a decision-maker who wishes to optimize K objectives such that the objectives are non-commensurable and the decision-maker has no clear preference of the objectives relative to each other. Without loss of generality, all objectives are of the minimization type; a minimization-type objective can be converted to a maximization type by multiplying it by negative one. A minimization multi-objective decision problem with K objectives is defined as follows: given an n-dimensional decision variable vector x = {x_1, ..., x_n} in the solution space X, find a vector x* that minimizes a given set of K objective functions z(x*) = {z_1(x*), ..., z_K(x*)}. The solution space X is generally restricted by a series of constraints, such as g_j(x*) = b_j for j = 1, ..., m, and bounds on the decision variables.

In many real-life problems, objectives under consideration conflict with each other. Hence, optimizing x with respect to a single objective often results in unacceptable results with respect to the other objectives. Therefore, a perfect multi-objective solution that simultaneously optimizes each objective function is almost impossible. A reasonable solution to a multi-objective problem is to investigate a set of solutions, each of which satisfies the objectives at an acceptable level without being dominated by any other solution.

If all objective functions are for minimization, a feasible solution x is said to dominate another feasible solution y (written x ≻ y) if and only if z_i(x) ≤ z_i(y) for i = 1, ..., K and z_j(x) < z_j(y) for at least one objective function j. A solution is said to be Pareto optimal if it is not dominated by any other solution in the solution space. A Pareto optimal solution cannot be improved with respect to any objective without worsening at least one other objective. The set of all feasible non-dominated solutions in X is referred to as the Pareto optimal set, and for a given Pareto optimal set, the corresponding objective function values in the objective space are called the Pareto front.
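A minimal Python sketch of these two definitions, the dominance test and the filtering of a non-dominated set, might look as follows; the objective vectors at the bottom are an invented example.

```python
from typing import Sequence

def dominates(zx: Sequence[float], zy: Sequence[float]) -> bool:
    """x dominates y: no worse in every objective, strictly better in at least one."""
    return (all(a <= b for a, b in zip(zx, zy))
            and any(a < b for a, b in zip(zx, zy)))

def non_dominated(z: Sequence[Sequence[float]]) -> list:
    """Indices of non-dominated vectors (a best-known Pareto set), all minimized."""
    return [i for i in range(len(z))
            if not any(dominates(z[j], z[i]) for j in range(len(z)) if j != i)]

z = [(1.0, 5.0), (2.0, 2.0), (4.0, 1.0), (3.0, 3.0)]  # (3,3) is dominated by (2,2)
print(non_dominated(z))  # -> [0, 1, 2]
```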
For many problems, the number of Pareto optimal solutions is enormous (perhaps infinite). The ultimate goal of a multi-objective optimization algorithm is to identify solutions in the Pareto optimal set. However, identifying the entire Pareto optimal set is, for many multi-objective problems, practically impossible due to its size. In addition, for many problems, especially for combinatorial optimization problems, proof of solution optimality is computationally infeasible. Therefore, a practical approach to multi-objective optimization is to investigate a set of solutions (the best-known Pareto set) that represent the Pareto optimal set as well as possible. With these concerns in mind, a multi-objective optimization approach should achieve the following three conflicting goals [1]:

1. The best-known Pareto front should be as close as possible to the true Pareto front. Ideally, the best-known Pareto set should be a subset of the Pareto optimal set.
2. Solutions in the best-known Pareto set should be uniformly distributed and diverse over the Pareto front in order to provide the decision-maker a true picture of trade-offs.
3. The best-known Pareto front should capture the whole spectrum of the Pareto front. This requires investigating solutions at the extreme ends of the objective function space.

For a given computational time limit, the first goal is best served by focusing (intensifying) the search on a particular region of the Pareto front. On the contrary, the second goal demands the search effort to be uniformly distributed over the Pareto front. The third goal aims at extending the Pareto front at both ends, exploring new extreme solutions. This paper presents common approaches used in multi-objective GA to attain these three conflicting goals while solving a multi-objective optimization problem.

3. Genetic algorithms

The concept of GA was developed by Holland and his colleagues in the 1960s and 1970s [2]. GA are inspired by the evolutionist theory explaining the origin of species. In nature, weak and unfit species within their environment are faced with extinction by natural selection. The strong ones have greater opportunity to pass their genes to future generations via reproduction. In the long run, species carrying the correct combination in their genes become dominant in their population. Sometimes, during the slow process of evolution, random changes may occur in genes. If these changes provide additional advantages in the challenge for survival, new species evolve from the old ones. Unsuccessful changes are eliminated by natural selection.

In GA terminology, a solution vector x ∈ X is called an individual or a chromosome. Chromosomes are made of discrete units called genes. Each gene controls one or more features of the chromosome. In the original implementation of GA by Holland, genes are assumed to be binary digits. In later implementations, more varied gene types have been introduced. Normally, a chromosome corresponds to a unique solution x in the solution space. This requires a mapping mechanism between the solution space and the chromosomes. This mapping is called an encoding.
In fact, GA work on the encoding of a problem, not on the problem itself.

GA operate with a collection of chromosomes, called a population. The population is normally randomly initialized. As the search evolves, the population includes fitter and fitter solutions, and eventually it converges, meaning that it is dominated by a single solution. Holland also presented a proof of convergence (the schema theorem) to the global optimum where chromosomes are binary vectors.

GA use two operators to generate new solutions from existing ones: crossover and mutation. The crossover operator is the most important operator of GA. In crossover, generally two chromosomes, called parents, are combined together to form new chromosomes, called offspring. The parents are selected among existing chromosomes in the population with preference towards fitness, so that offspring are expected to inherit good genes which make the parents fitter. By iteratively applying the crossover operator, genes of good chromosomes are expected to appear more frequently in the population, eventually leading to convergence to an overall good solution.

The mutation operator introduces random changes into characteristics of chromosomes. Mutation is generally applied at the gene level. In typical GA implementations, the mutation rate (probability of changing the properties of a gene) is very small and depends on the length of the chromosome. Therefore, the new chromosome produced by mutation will not be very different from the original one. Mutation plays a critical role in GA. As discussed earlier, crossover leads the population to converge by making the chromosomes in the population alike. Mutation reintroduces genetic diversity back into the population and assists the search in escaping from local optima.

Reproduction involves selection of chromosomes for the next generation. In the most general case, the fitness of an individual determines the probability of its survival for the next generation. There are different selection procedures in GA depending on how the fitness values are used.
Proportional selection, ranking, and tournament selection are the most popular selection procedures. The procedure of a generic GA [3] is given as follows:

Step 1: Set t = 1. Randomly generate N solutions to form the first population, P_1. Evaluate the fitness of solutions in P_1.
Step 2: Crossover: generate an offspring population Q_t as follows:
  2.1. Choose two solutions x and y from P_t based on the fitness values.
  2.2. Using a crossover operator, generate offspring and add them to Q_t.
Step 3: Mutation: mutate each solution x ∈ Q_t with a predefined mutation rate.
Step 4: Fitness assignment: evaluate and assign a fitness value to each solution x ∈ Q_t based on its objective function value and infeasibility.
Step 5: Selection: select N solutions from Q_t based on their fitness and copy them to P_{t+1}.
Step 6: If the stopping criterion is satisfied, terminate the search and return the current population; else, set t = t + 1 and go to Step 2.
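A runnable sketch of this generic loop is given below. The text does not fix the encoding or the operators, so the binary chromosomes, one-point crossover, bit-flip mutation, roulette selection and toy objective used here are illustrative assumptions.

```python
import random

def generic_ga(fitness, n_genes=20, N=40, generations=60,
               p_mut=0.05, rng=random.Random(0)):
    # Step 1: random initial population.
    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(N)]
    for t in range(generations):
        fits = [fitness(x) for x in pop]
        total = float(sum(fits))

        def pick():
            # Fitness-proportional (roulette) choice of a parent (Step 2.1).
            r, acc = rng.uniform(0, total), 0.0
            for x, f in zip(pop, fits):
                acc += f
                if acc >= r:
                    return x
            return pop[-1]

        # Step 2.2: one-point crossover fills the offspring population Q_t.
        offspring = []
        while len(offspring) < N:
            a, b = pick(), pick()
            cut = rng.randrange(1, n_genes)
            offspring.append(a[:cut] + b[cut:])
        # Step 3: bit-flip mutation at a small per-gene rate.
        for x in offspring:
            for i in range(n_genes):
                if rng.random() < p_mut:
                    x[i] = 1 - x[i]
        # Steps 4-5: evaluate Q_t and keep the N fittest as P_{t+1}.
        offspring.sort(key=fitness, reverse=True)
        pop = offspring[:N]
    # Step 6: stopping criterion (a fixed generation budget) reached.
    return max(pop, key=fitness)

best = generic_ga(fitness=sum)   # toy objective: maximize the number of ones
print(sum(best), best)
```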
4. Multi-objective GA

Being a population-based approach, GA are well suited to solve multi-objective optimization problems. A generic single-objective GA can be modified to find a set of multiple non-dominated solutions in a single run. The ability of GA to simultaneously search different regions of a solution space makes it possible to find a diverse set of solutions for difficult problems with non-convex, discontinuous, and multi-modal solution spaces. The crossover operator of GA may exploit structures of good solutions with respect to different objectives to create new non-dominated solutions in unexplored parts of the Pareto front. In addition, most multi-objective GA do not require the user to prioritize, scale, or weigh objectives. Therefore, GA have been the most popular heuristic approach to multi-objective design and optimization problems. Jones et al. [4] reported that 90% of the approaches to multi-objective optimization aimed to approximate the true Pareto front for the underlying problem. A majority of these used a meta-heuristic technique, and 70% of all meta-heuristic approaches were based on evolutionary approaches.

The first multi-objective GA, called vector evaluated GA (or VEGA), was proposed by Schaffer [5]. Afterwards, several multi-objective evolutionary algorithms were developed, including Multi-objective Genetic Algorithm (MOGA) [6], Niched Pareto Genetic Algorithm (NPGA) [7], Weight-based Genetic Algorithm (WBGA) [8], Random Weighted Genetic Algorithm (RWGA) [9], Nondominated Sorting Genetic Algorithm (NSGA) [10], Strength Pareto Evolutionary Algorithm (SPEA) [11], improved SPEA (SPEA2) [12], Pareto-Archived Evolution Strategy (PAES) [13], Pareto Envelope-based Selection Algorithm (PESA) [14], Region-based Selection in Evolutionary Multiobjective Optimization (PESA-II) [15], Fast Non-dominated Sorting Genetic Algorithm (NSGA-II) [16], Multi-objective Evolutionary Algorithm (MEA) [17], Micro-GA [18], Rank-Density Based Genetic Algorithm (RDGA) [19], and Dynamic Multi-objective Evolutionary Algorithm (DMOEA) [20]. Note that although there are many variations of multi-objective GA in the literature, these cited GA are well-known and credible algorithms that have been used in many applications, and their performances were tested in several comparative studies.

Several survey papers [1,11,21-27] have been published on evolutionary multi-objective optimization. Coello lists more than 2000 references on his website [28]. Generally, multi-objective GA differ based on their fitness assignment procedure, elitism, or diversification approaches. In Table 1, highlights of the well-known multi-objective GA with their advantages and disadvantages are given. Most survey papers on multi-objective evolutionary approaches introduce and compare different algorithms. This paper takes a different course and focuses on important issues while designing a multi-objective GA, and describes common techniques used in multi-objective GA to attain the three goals in multi-objective optimization. This approach is also taken in the survey paper by Zitzler et al. [1]. However, the discussion in this paper is aimed at introducing the components of multi-objective GA to researchers and practitioners without a background on multi-objective GA. It is also important to note that although several state-of-the-art algorithms exist as cited above, many researchers that applied multi-objective GA to their problems have preferred to design their own customized algorithms by adapting strategies from various multi-objective GA. This observation is another motivation for introducing the components of multi-objective GA rather than focusing on several algorithms. However, the pseudo-code for some of the well-known multi-objective GA is also provided in order to demonstrate how these procedures are incorporated within a multi-objective GA.

Table 1. A list of well-known multi-objective GA

VEGA [5]. Fitness assignment: each subpopulation is evaluated with respect to a different objective. Diversity mechanism: none. Elitism: no. External population: no. Advantages: first MOGA; straightforward implementation. Disadvantages: tends to converge to the extreme of each objective.

MOGA [6]. Fitness assignment: Pareto ranking. Diversity mechanism: fitness sharing by niching. Elitism: no. External population: no. Advantages: simple extension of a single-objective GA. Disadvantages: usually slow convergence; problems related to the niche size parameter.

WBGA [8]. Fitness assignment: weighted average of normalized objectives. Diversity mechanism: niching. Elitism: no. External population: no. Advantages: simple extension of a single-objective GA. Disadvantages: difficulties in nonconvex objective function spaces; predefined weights.

NPGA [7]. Fitness assignment: no fitness assignment, tournament selection. Diversity mechanism: niche count as tie-breaker in tournament selection. Elitism: no. External population: no. Advantages: very simple selection process with tournament selection. Disadvantages: problems related to the niche size parameter; extra parameter for tournament selection.

RWGA [9]. Fitness assignment: weighted average of normalized objectives. Diversity mechanism: randomly assigned weights. Elitism: yes. External population: yes. Advantages: efficient and easy to implement. Disadvantages: difficulties in nonconvex objective function spaces.

PESA [14]. Fitness assignment: none. Diversity mechanism: cell-based density. Elitism: pure elitist. External population: yes. Advantages: easy to implement; computationally efficient. Disadvantages: performance depends on cell sizes; prior information needed about the objective space.

PAES [29]. Fitness assignment: Pareto dominance is used to replace a parent if offspring dominates. Diversity mechanism: cell-based density as tie-breaker between offspring and parent. Elitism: yes. External population: yes. Advantages: random mutation hill-climbing strategy; easy to implement; computationally efficient. Disadvantages: not a population-based approach; performance depends on cell sizes.

NSGA [10]. Fitness assignment: ranking based on non-domination sorting. Diversity mechanism: fitness sharing by niching. Elitism: no. External population: no. Advantages: fast convergence. Disadvantages: problems related to the niche size parameter.

NSGA-II [30]. Fitness assignment: ranking based on non-domination sorting. Diversity mechanism: crowding distance. Elitism: yes. External population: no. Advantages: single parameter (N); well tested; efficient. Disadvantages: crowding distance works in objective space only.

SPEA [11]. Fitness assignment: ranking based on the external archive of non-dominated solutions. Diversity mechanism: clustering to truncate the external population. Elitism: yes. External population: yes. Advantages: well tested; no parameter for clustering. Disadvantages: complex clustering algorithm.

SPEA-2 [12]. Fitness assignment: strength of dominators. Diversity mechanism: density based on the k-th nearest neighbor. Elitism: yes. External population: yes. Advantages: improved SPEA; makes sure extreme points are preserved. Disadvantages: computationally expensive fitness and density calculation.

RDGA [19]. Fitness assignment: the problem is reduced to a bi-objective problem with solution rank and density as objectives. Diversity mechanism: forbidden region cell-based density. Elitism: yes. External population: yes. Advantages: dynamic cell update; robust with respect to the number of objectives. Disadvantages: more difficult to implement than others.

DMOEA [20]. Fitness assignment: cell-based ranking. Diversity mechanism: adaptive cell-based density. Elitism: yes (implicitly). External population: no. Advantages: includes efficient techniques to update cell densities; adaptive approaches to set GA parameters. Disadvantages: more difficult to implement than others.
5. Design issues and components of multi-objective GA

5.1. Fitness functions

5.1.1. Weighted sum approaches

The classical approach to solve a multi-objective optimization problem is to assign a weight w_i to each normalized objective function z_i'(x), so that the problem is converted to a single-objective problem with a scalar objective function as follows:

$$\min\ z = w_1 z_1'(x) + w_2 z_2'(x) + \cdots + w_k z_k'(x), \qquad (1)$$

where z_i'(x) is the normalized objective function z_i(x) and Σ w_i = 1. This approach is called the a priori approach, since the user is expected to provide the weights. Solving a problem with the objective function (1) for a given weight vector w = {w_1, w_2, ..., w_k} yields a single solution, and if multiple solutions are desired, the problem must be solved multiple times with different weight combinations. The main difficulty with this approach is selecting a weight vector for each run.

To automate this process, Hajela and Lin [8] proposed the WBGA for multi-objective optimization (WBGA-MO). In the WBGA-MO, each solution x_i in the population uses a different weight vector w_i = {w_1, w_2, ..., w_k} in the calculation of the summed objective function (1). The weight vector w_i is embedded within the chromosome of solution x_i. Therefore, multiple solutions can be simultaneously searched in a single run. In addition, weight vectors can be adjusted to promote diversity of the population.

Other researchers [9,31] have proposed a MOGA based on a weighted sum of multiple objective functions where a normalized weight vector w_i is randomly generated for each solution x_i during the selection phase at each generation. This approach aims to stipulate multiple search directions in a single run without using additional parameters. The general procedure of the RWGA using random weights is given as follows [31]:

Procedure RWGA (E = external archive to store non-dominated solutions found during the search so far; n_E = number of elitist solutions immigrating from E to P in each generation):
Step 1: Generate a random population.
Step 2: Assign a fitness value to each solution x ∈ P_t by performing the following steps:
  Step 2.1: Generate a random number u_k in [0, 1] for each objective k, k = 1, ..., K.
  Step 2.2: Calculate the random weight of each objective k as w_k = u_k / Σ_{i=1}^{K} u_i.
  Step 2.3: Calculate the fitness of the solution as f(x) = Σ_{k=1}^{K} w_k z_k(x).
Step 3: Calculate the selection probability of each solution x ∈ P_t as p(x) = (f(x) − f_min) / Σ_{y∈P_t} (f(y) − f_min), where f_min = min{f(x) | x ∈ P_t}.
Step 4: Select parents using the selection probabilities calculated in Step 3. Apply crossover on the selected parent pairs to create N offspring. Mutate offspring with a predefined mutation rate. Copy all offspring to P_{t+1}. Update E if necessary.
Step 5: Randomly remove n_E solutions from P_{t+1} and add the same number of solutions from E to P_{t+1}.
Step 6: If the stopping condition is not satisfied, set t = t + 1 and go to Step 2. Otherwise, return E.
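The weight-randomization of Steps 2.1-2.3 can be sketched in a few lines; the selection weighting at the bottom is one common choice for a minimized fitness and is our assumption, not a quote of the procedure above.

```python
import random

def rwga_fitness(z_values, rng=random.Random(1)):
    # Steps 2.1-2.3: each solution gets its own random normalized weight
    # vector, so many search directions are pursued in a single generation.
    K = len(z_values[0])
    fits = []
    for z in z_values:
        u = [rng.uniform(0.0, 1.0) for _ in range(K)]       # Step 2.1
        w = [uk / sum(u) for uk in u]                       # Step 2.2 (weights sum to 1)
        fits.append(sum(wk * zk for wk, zk in zip(w, z)))   # Step 2.3
    return fits

z_values = [(1.0, 5.0), (2.0, 2.0), (4.0, 1.0)]   # invented objective vectors
fits = rwga_fitness(z_values)

# Turn the (minimized) fitness values into selection probabilities:
# smaller fitness -> larger probability (an illustrative convention).
worst = max(fits)
scores = [worst - f + 1e-9 for f in fits]
print([s / sum(scores) for s in scores])
```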
The main advantage of the weighted sum approach is a straightforward implementation. Since a single objective is used in fitness assignment, a single-objective GA can be used with minimum modifications. In addition, this approach is computationally efficient. The main disadvantage of this approach is that not all Pareto-optimal solutions can be investigated when the true Pareto front is non-convex. Therefore, multi-objective GA based on the weighted sum approach have difficulty in finding solutions uniformly distributed over a non-convex trade-off surface [1].

5.1.2. Altering objective functions

As mentioned earlier, VEGA [5] is the first GA used to approximate the Pareto-optimal set by a set of non-dominated solutions. In VEGA, population P_t is randomly divided into K equal-sized sub-populations: P_1, P_2, ..., P_K. Then, each solution in subpopulation P_i is assigned a fitness value based on objective function z_i. Solutions are selected from these subpopulations using proportional selection for crossover and mutation. Crossover and mutation are performed on the new population in the same way as for a single-objective GA.

Procedure VEGA (N_S = subpopulation size, N_S = N/K):
Step 1: Start with a random initial population P_0. Set t = 0.
Step 2: If the stopping criterion is satisfied, return P_t.
Step 3: Randomly sort population P_t.
Step 4: For each objective k, k = 1, ..., K, perform the following steps:
  Step 4.1: For i = 1 + (k−1)N_S, ..., kN_S, assign fitness value f(x_i) = z_k(x_i) to the i-th solution in the sorted population.
  Step 4.2: Based on the fitness values assigned in Step 4.1, select N_S solutions between the (1+(k−1)N_S)-th and (kN_S)-th solutions of the sorted population to create subpopulation P_k.
Step 5: Combine all subpopulations P_1, ..., P_K and apply crossover and mutation on the combined population to create P_{t+1} of size N. Set t = t + 1, go to Step 2.
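A compact sketch of VEGA's shuffle-slice-select mechanics (Steps 3-4) follows; the text leaves the selection details open, so weighted sampling for minimized objectives is an illustrative choice of ours.

```python
import random

def vega_mating_pool(pop, objectives, rng=random.Random(2)):
    K = len(objectives)
    Ns = len(pop) // K                       # subpopulation size N_S = N/K
    rng.shuffle(pop)                         # Step 3: randomly sort P_t
    pool = []
    for k in range(K):
        block = pop[k * Ns:(k + 1) * Ns]     # solutions 1+(k-1)N_S .. kN_S
        fvals = [objectives[k](s) for s in block]        # Step 4.1: f(x_i) = z_k(x_i)
        worst = max(fvals)
        weights = [worst - f + 1e-9 for f in fvals]      # minimization: smaller z_k, larger weight
        pool.extend(rng.choices(block, weights=weights, k=Ns))  # Step 4.2
    return pool                              # Step 5 then applies crossover/mutation

pop = [(random.random(), random.random()) for _ in range(12)]
pool = vega_mating_pool(pop, objectives=[lambda s: s[0], lambda s: s[1]])
print(len(pool))   # 12 = N, ready for crossover and mutation
```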
P.NSGA[10]also classifies the population into non-dominated fronts using an algorithm similar to that given above.Then a dummyfitness value is assigned to each front using afitness sharing function such that the worst fitness value assigned to F i is better than the bestfitness value assigned to F i+1.NSGA-II[16],a more efficient algorithm,named the fast non-dominated-sort algorithm, was developed to form non-dominated fronts.Fonseca and Fleming[6]used a slightly different rank assignment approach than the ranking based on non-dominated-fronts as follows:r2ðx;tÞ¼1þnqðx;tÞ;(2) where nq(x,t)is the number of solutions dominating solution x at generation t.This ranking method penalizes solutions located in the regions of the objective function space which are dominated(covered)by densely populated sections of the Pareto front.For example,in Fig.1b solution i is dominated by solutions c,d and e.Therefore,it is assigned a rank of4although it is in the same front with solutions f,g and h which are dominated by only a single solution.SPEA[11]uses a ranking procedure to assign better fitness values to non-dominated solutions at underrepre-sented regions of the objective space.In SPEA,an external list E of afixed size stores non-dominated solutions that have been investigated thus far during the search.For each solution y A E,a strength value is defined assðy;tÞ¼npðy;tÞN Pþ1,where npðy;tÞis the number solutions that y dominates in P.The rank r(y,t)of a solution y A E is assigned as r3ðy;tÞ¼sðy;tÞand the rank of a solution x A P is calculated asr3ðx;tÞ¼1þXy2E;y1xsðy;tÞ.Fig.1c illustrates an example of the SPEA ranking method.In the former two methods,all non-dominated solutions are assigned a rank of1.This method,however, favors solution a(in thefigure)over the other non-dominated solutions since it covers the least number of solutions in the objective function space.Therefore,a wide, uniformly distributed set of non-dominated solutions is encouraged.Accumulated ranking density strategy[19]also aims to penalize redundancy in the population due to overrepre-sentation.This ranking method is given asr4ðx;tÞ¼1þXy2P;y1xrðy;tÞ.To calculate the rank of a solution x,the rank of the solutions dominating this solution must be calculatedfirst. Fig.1d shows an example of this ranking method(based on r2).Using ranking method r4,solutions i,l and n are ranked higher than their counterparts at the same non-dominated front since the portion of the trade-off surface covering them is crowded by three nearby solutions c,d and e. 
Although some of the ranking approaches described in this section can be used directly to assign fitness values to individual solutions, they are usually combined with various fitness sharing techniques to achieve the second goal in multi-objective optimization: finding a diverse and uniform Pareto front.

5.2. Diversity: fitness assignment, fitness sharing, and niching

Maintaining a diverse population is an important consideration in multi-objective GA to obtain solutions uniformly distributed over the Pareto front. Without taking preventive measures, the population tends to form relatively few clusters in multi-objective GA. This phenomenon is called genetic drift, and several approaches have been devised to prevent it, as follows.

5.2.1. Fitness sharing

Fitness sharing encourages the search in unexplored sections of a Pareto front by artificially reducing fitness of solutions in densely populated areas. To achieve this goal, densely populated areas are identified and a penalty is applied to the fitness of the solutions located in them.
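A sketch of one standard sharing scheme follows; the triangular sharing kernel, the σ_share value and the division-by-niche-count convention (which suits maximized fitness) are illustrative assumptions, not a quote of any particular algorithm above.

```python
import math

def shared_fitness(fits, positions, sigma_share=0.5):
    # Triangular sharing function: full penalty at distance 0, none beyond sigma_share.
    def sh(d):
        return 1.0 - d / sigma_share if d < sigma_share else 0.0
    out = []
    for f, pi in zip(fits, positions):
        niche = sum(sh(math.dist(pi, pj)) for pj in positions)  # niche count m_i >= 1
        out.append(f / niche)     # fitness is degraded in crowded niches
    return out

fits = [10.0, 10.0, 10.0]
pts = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]   # two crowded points, one isolated
print(shared_fitness(fits, pts))             # the isolated point keeps more fitness
```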
Macroeconomic Dynamics and Credit Risk: A Global Perspective [∗]

M. Hashem Pesaran (University of Cambridge and USC)
Til Schuermann [†] (Federal Reserve Bank of New York and Wharton Financial Institutions Center)
Björn-Jakob Treutler [‡] (Mercer Oliver Wyman and Otto Beisheim Graduate School of Management, WHU)
Scott M. Weiner (Balliol College, University of Oxford)

April 12, 2005

Abstract

This paper presents a new approach to modeling conditional credit loss distributions. Asset value changes of firms in a credit portfolio are linked to a dynamic global macroeconometric model, allowing macro effects to be isolated from idiosyncratic shocks from the perspective of default (and hence loss). Default probabilities are driven primarily by how firms are tied to business cycles, both domestic and foreign, and how business cycles are linked across countries. We allow for firm-specific business cycle effects and the heterogeneity of firm default thresholds using credit ratings. The model can be used, for example, to compute the effects of a hypothetical negative equity price shock in South East Asia on the loss distribution of a credit portfolio with global exposures over one or more quarters. We show that the effects of such shocks on losses are asymmetric and non-proportional, reflecting the highly non-linear nature of the credit risk model.

Keywords: Risk management, economic interlinkages, loss forecasting, default correlation
JEL Codes: C32, E17, G20

[∗] In preparing this version we have benefited greatly from comments and suggestions by Carlo Favero, Mark Flannery, Monika Piazzesi and three anonymous referees. We would also like to acknowledge helpful comments from Roland Demmel, Joshua Rosenberg, Jan-Hendrik Schmidt and Jim Wilcox, as well as participants at the C.R.E.D.I.T. conference, Venice, September 2003, the IGIER-PIER conference, Milan, October 2003, and seminar participants at the Judge Institute of Management, Cambridge University, University of Southern California, the Federal Reserve Bank of San Francisco, Rutgers University and Baruch College. We would also like to thank Sam Hanson for his excellent research assistance with estimating the default thresholds.
[†] Any views expressed represent those of the author only and not necessarily those of the Federal Reserve Bank of New York or the Federal Reserve System.
[‡] Any views expressed represent those of the author only and not necessarily those of Mercer Oliver Wyman.

1 Introduction

Risk management in general, and credit risk analysis in particular, has been the focus of extensive research in the past several years. Credit risk is the dominant source of risk for banks and the subject of strict regulatory oversight and policy debate. More recently, the proposal by the Bank for International Settlements (BIS) to reform the regulation of bank capital for credit risk (known as the New Basel Accord, or BIS2) has initiated debates on a number of issues in the literature.[1]

One of the issues under discussion centers on the effects of business cycles, especially of severe economic downturns, on bank risk and value-at-risk capital requirements (Carpenter, Whitesell and Zakrajšek 2001, Carey 2002, Allen and Saunders 2004). However, this discussion has been taking place largely without the benefit of an explicit model that links the loss distribution of a bank's credit portfolio to the evolution of directly observable macroeconomic factors at national and global levels. Given the increasing interdependencies in the global economy, risk managers of commercial and central banks alike may well be interested in questions like "What would be the impact on the credit
loss distribution of a given bank (or banks) in a given region if there were large unfavorable shocks to equity prices, GDP or interest rates in that or other regions?"

The purpose of this paper is to show how global macroeconometric models can be linked to firm-specific return processes, which are an integral part of Merton-type credit risk models, so that quantitative answers to such questions can be obtained. We propose a combined model of credit losses contingent on the macroeconomy that is able to distinguish between default (and loss) due to systematic versus idiosyncratic (or firm-specific) shocks, providing an explicit channel for modeling default correlations. This enables us to conduct simulation experiments on the effect of changes in observable macroeconomic dynamics on credit risk.

In providing such a linkage, the main conceptual challenge is to allow for firm-specific business cycle effects and the heterogeneity of default probabilities across firms. Standard credit risk models, pioneered by Vasicek (1987) and elaborated in Vasicek (1991, 2002) and Gordy (2003), adapt the option-based approach of Merton (1974) and allow for business cycle effects generally via one or more unobserved systematic risk factors. They assume that the processes generating asset values and the default thresholds are homogeneous across firms. The parameters of the loss distribution are then identified by fixing the cross-firm correlation of asset returns and the mean default rate of the credit portfolio. Operational versions of the Vasicek model, e.g. by KMV, allow for firm heterogeneity by making use of balance sheets, income statements and other similar reports issued by the firm. This process inevitably involves a certain degree of subjective evaluation, however, and the outcome is generally proprietary information.[2]

Examples of credit risk portfolio models in the professional literature include Gupton, Finger and Bhatia's (1997) CreditMetrics, KMV's PortfolioManager, and the actuarial approach employed by CSFB's CreditRisk+ (Credit Suisse First Boston 1997), where the key risk driver is the variable mean default rate in the economy. Wilson's (1997a,b) model (CreditPortfolioView) is an exception. He allows for the macroeconomic variables to influence a firm's probability of default using a pooled logit specification. However, because the defaults are grouped, typically by industry, and modeled at the (single country) national level, any firm-specific heterogeneity is lost in the estimation process.

[1] For details of BIS2 see BIS (2001, 2004), and for an account of the debates see, for example, Jones and Mingo (1998), and Altman, Bharath and Saunders (2002).
[2] Credit portfolio models also differ in the way they model changes to the firms' value. Some models operate on a mark-to-market basis by looking at the change of market value of credit assets based on credit migration and the term structure of credit spreads (CreditMetrics). Others focus on predicting default losses (so-called default mode models such as CSFB's CreditRisk+). Yet there are other approaches that allow for both (e.g. KMV's PortfolioManager, Wilson's CreditPortfolioView).
For detailed comparisons, see Koyluoglu and Hickman (1998), Crouhy, Galai and Mark (2000), Gordy (2000) and Saunders and Allen (2002).

In this paper we depart from the literature in two important respects. First, we model individual firm returns (taken as proxies for changes in asset values) in terms of a number of directly observable contemporaneous risk factors, such as changes in equity indices, interest rates, inflation, real money balances, oil prices and output, both domestic and foreign. In this way we allow for the possible differential impacts of macroeconomic factors on the evolution of firms' asset values, and as such their default probabilities. Second, using historical observations on mean returns, volatility and default frequencies of firms for a particular credit rating, we compute firm-specific default threshold-equity ratios under the assumption that two firms with identical credit ratings are likely to have similar default threshold-equity ratios. Thus we are able to provide an empirical implementation of the Merton model using only two pieces of publicly available information for each firm, namely market returns and credit ratings, in a multi-country setting.

The problem of obtaining accurate information about the health of a firm, while not new, is particularly relevant for modeling firms' bankruptcy or default. Our approach has the advantage that it does not rely on firm-specific accounting data, which are at best noisy and at worst biased due to the information asymmetries between company managers (agents) and share/debt holders (principals). Rating agencies are likely to have access to private information about the firm's past performance and its current management, in addition to public information from balance sheets and company reports, in arriving at their firm-specific credit ratings. In the pursuit of better ratings, companies have more of an incentive to reveal (some of) their private information to the credit rating agencies than to their debt-holders, very much in the same spirit that second-hand car dealers have the incentive to reveal information about the cars they offer for sale by the duration of the guarantees and other after-sales services that they provide. Moreover, basing the analysis strictly on accounting data would make it difficult to harmonize information across different accounting standards and bankruptcy codes from different countries, a source of heterogeneity presumably addressed by rating agencies.

In short, in our framework the portfolio loss distribution is driven by firms' credit ratings and how their returns are tied to business cycles, both domestic and foreign, and how business cycles are linked across countries. This is in contrast to credit risk analyses that explicitly focus on modeling of individual firm defaults, using panel probit or logit specifications (Altman and Saunders 1997, Lennox 1999). Since defaults are rare, to obtain sensible estimates these applications tend to impose strong homogeneity assumptions on the parameters, which could bias the estimates. For instance, it is impossible to allow for any firm-specific (e.g. fixed) effects. The probit/logit approach is also difficult to adapt for the analysis of multi-period credit loss distributions, whilst our approach can be readily extended for such purposes.[3]
To link the firm-specific returns to business cycle factors, we shall make use of the global vector autoregressive (GVAR) macroeconometric model recently developed by Pesaran, Schuermann and Weiner (2004), hereafter PSW. This model is composed of vector error-correcting models (VECM) estimated for individual countries (or regions), which are then combined into a global model that takes account of both intra- and inter-country/regional interactions. The model uses domestic macroeconomic variables such as GDP, inflation, the level of short-term interest rates, the exchange rate, equity prices (when applicable) and real money balances. These are related to corresponding foreign variables constructed exclusively to match the international trade pattern of the country under consideration. Because of the global nature of the model, we can analyze how a shock to one specific macroeconomic variable affects other macroeconomic variables, even (and especially) across countries, as well as shocks to risk factors, e.g. oil prices, affecting all regions.

We examine the credit risk of a fictitious corporate loan portfolio and its exposure to a wide range of observable risk factors in the global economy. We model a firm's probability of default as a function of those risk factors, but assume for simplicity that loss given default is an exogenously given random variable whose specific parameterization can vary by rating. Using the firm-specific return regressions and the GVAR model, single- and multi-period credit loss distributions of a given portfolio are then obtained through Monte Carlo simulations.

Our baseline expected losses are quite reasonable when compared with actual industry loan charge-offs. For example, expected loss over the course of four quarters is about 58 bp (basis points) of exposure, compared with 89 bp, the average net charge-offs (loans charged off less amount recovered, over total loans) for the U.S. banking industry from 1987 to 2003. When compared with the actual industry charge-offs matched by our forecast horizon, namely 2000Q1, the difference is even smaller: those were 56 bp (at an annual rate). Much of the fat-tailedness of our loss distribution is, however, due to the relatively small number of firms (119 in our portfolio), which entails a substantial degree of diversifiable idiosyncratic risk. Once this is controlled for (by including 'copies' of the existing firms within the portfolio), the EL to VaR multiples are in line with those obtained by others (e.g. Carey 2002, who has about 500 exposures).

[3] A recent exception is Duffie and Wang (2003) who forecast default intensities over multiple periods.
the business cycle.To do this we need three different tools:(i)a model of the systematic (macroeconomic)risk factors,(ii)firm returns and how they are linked to those factors,and(iii)firm default thresholds.The GVAR satisfies thefirst requirement,and the link tofirms is done throughfirm-level return regressions by allowing the loadings on the macro-variables to befirm specific.The default thresholds are identified by assuming that they are the same within a rating category.4Clearly,other modeling strategies and identification schemes can be adopted.The present paper demonstrates that such an approach to credit risk modeling is in fact feasible.Our model is particularly suited for an international and multi-factor interpretation of the standard corporatefinance view offirm risk:total risk is the sum of systematic and idiosyncratic (i.e.firm-specific)risk.The GVAR is ostensibly a global model of systematic risk and its dynamics. Having a model of those factor dynamics can go a long way to understandingfirm risk(and return)characteristics and to address specific risk management related questions.One which we find particularly valuable is the ability to rank-order possible shock scenarios.Given a particular portfolio of credit exposures,is a1σshock(one standard error shock)to Japanese money supply more damaging(or beneficial,depending on the sign of the shock)than a1σshock to South East Asian or U.S.equity markets?What will the portfolio loss distribution look like one year from now?What if the portfolio changes?Such counterfactual questions are central to policy analysis, be it by commercial or central bankers who might wish to investigate the impact on a representative bank portfolio in their country of various economic shocks in other countries.If the model is not compact enough,it cannot be practically used in this repetitive fashion.The remainder of the paper is as follows:Section2provides an overview of the alternative 4The bankruptcy models of Altman(1968),Lennox(1999)and Shumway(2001)generatefirm specific default forecasts,as does the industry model by KMV(Kealhofer and Kurbat(2002)).However,all of these studies impose more significant parameter homogeneity than we do,and they focus on just one country at a time(the U.S.and U.K in this list),and thus do not address the formidable challenges of point in time bankruptcy forecasting with a multi-country portfolio.approaches to credit portfolio modeling and puts forward our proposed approach.Section3briefly discusses the GVAR model and shows how it is linked to the credit risk model.Mathematical expressions for the conditional one-period and multi-period loss distributions of a given credit portfolio under various shock scenarios are also obtained.Section4provides an empirical analysis of the impact of the different types of shocks(to output,money supply,equity and oil prices)on the loss distribution.Section5offers some concluding remarks.2Credit Portfolio ModelingCredit risk modeling is concerned with the tail properties of the loss distribution for a given portfolio of credit assets such as loans or bonds,and attempts to provide quantitative analysis of the extent to which the loss distribution varies with changes tofirm/industry-specific,national and global risk factors.It can be approached from the perspective of the individual loans that make up the portfolio,or it could be addressed by considering the return on the loan portfolio directly.In this paper we follow the former approach and simulate the portfolio loss distribution from 
The remainder of the paper is as follows: Section 2 provides an overview of the alternative approaches to credit portfolio modeling and puts forward our proposed approach. Section 3 briefly discusses the GVAR model and shows how it is linked to the credit risk model. Mathematical expressions for the conditional one-period and multi-period loss distributions of a given credit portfolio under various shock scenarios are also obtained. Section 4 provides an empirical analysis of the impact of the different types of shocks (to output, money supply, equity and oil prices) on the loss distribution. Section 5 offers some concluding remarks.

[4] The bankruptcy models of Altman (1968), Lennox (1999) and Shumway (2001) generate firm-specific default forecasts, as does the industry model by KMV (Kealhofer and Kurbat (2002)). However, all of these studies impose more significant parameter homogeneity than we do, and they focus on just one country at a time (the U.S. and U.K. in this list), and thus do not address the formidable challenges of point-in-time bankruptcy forecasting with a multi-country portfolio.

2 Credit Portfolio Modeling

Credit risk modeling is concerned with the tail properties of the loss distribution for a given portfolio of credit assets such as loans or bonds, and attempts to provide quantitative analysis of the extent to which the loss distribution varies with changes to firm/industry-specific, national and global risk factors. It can be approached from the perspective of the individual loans that make up the portfolio, or it could be addressed by considering the return on the loan portfolio directly. In this paper we follow the former approach and simulate the portfolio loss distribution from the bottom up, by considering how individual firms default.

Broadly speaking, there are two important variables describing asset/firm level credit risk: the probability of default (PD) and the loss given default (LGD).[5] Most of the work on PD and LGD has been done without explicit conditioning on business cycle variables; exceptions include Carey (1998), Frye (2000) and Altman, Brady, Resti and Sironi (2002). These studies find, perhaps not surprisingly, that losses are indeed worse in recessions. Tapping into information contained in equity returns (as opposed to credit spreads from debt instruments), Vassalou and Xing (2004) show that default risk varies with the business cycle.[6] Carey (2002), using re-sampling techniques, shows that mean losses during a recession such as 1990/91 in the U.S. are about the same as losses in the 0.5% tail during an expansion. Bangia et al. (2002), using a regime-switching approach, find that capital held by banks over a one-year horizon needs to be 25-30% higher in a recession than in an expansion.

In this paper we shall consider the loss distribution of the credit portfolio of a financial institution, such as a bank, by conditioning on observable macroeconomic variables or factors. The conditional loss distribution allows for the effect of business cycle variations and captures such effects at a global level, by explicitly taking account of the heterogeneous interconnections and interdependencies that exist between national and international factors.

[5] The New Basel Accord explicitly mentions two additional variables: exposure at default and maturity. As these affect credit risk only moderately (and are often taken to be non-stochastic), our discussion will focus on the PD and LGD, which are the two dominant determinants of the credit loss distribution.
[6] See also the survey by Allen and Saunders (2004).

2.1 A Merton-Based Model of Default

Following Merton (1974), a firm is expected to default when the value of its assets falls below a threshold value determined by its callable liabilities. The lender is effectively writing a put option on the assets of the borrowing firm. If the value of the firm falls below a certain threshold, the owners will put the firm to the debt-holders.[7] Default, as considered by the rating agencies and banks, typically constitutes non-payment of interest or a coupon.[8] Thus there are three aspects which require modeling: (i) the evolution of firm value, (ii) the default threshold, and, in a portfolio context, (iii) return correlations across firms in the portfolio.
We discuss the first two aspects in this section, while modeling of return correlations is treated in Section 3.2. In Merton-type portfolio models, such as KMV, asset value and asset volatility are typically derived from balance sheet data as well as observable equity returns and (estimated) return volatility (see Kealhofer and Kurbat 2002). The default threshold in these models is typically taken to be short-term debt plus a proportion of long-term debt. Asset value, asset volatility and the default threshold are then used to determine the distance from default. In what follows we advance an alternative approach where, instead of using balance sheet data, we make use of firm credit ratings.

Consider a firm j in country or region i, having asset values V_{ji,t} at time t and an outstanding stock of debt, D_{ji,t}. Under the Merton (1974) model, default occurs at the maturity date of the debt, t+H, if the firm's assets, V_{ji,t+H}, are less than the face value of the debt at that time, D_{ji,t+H}. This is in contrast with the first-passage models, where default would occur the first time that V_{ji,t} falls below a default boundary (or threshold) over the period t to t+H.[9] The default probabilities are computed with respect to the probability distribution of asset values at the terminal date, t+H, in the case of the Merton model, and over the period from t to t+H in the case of the first-passage model. The Merton approach may be thought of as a European option and the first-passage approach as an American option. Our approach can be adapted to both of these models, but in what follows we focus on Merton's specification.

[7] For a discussion of the power of Merton default prediction models see Falkenstein and Boral (2001) and Gemmill (2002), who find that the Merton model generally does well in predicting default (Falkenstein and Boral) and credit spreads (Gemmill). Duffee (1999) points out that due to the continuous-time diffusion processes underlying the Black-Scholes formula, short-term default probabilities may be underestimated.
[8] A similar default condition is used by regulators, e.g. in the New Basel Accord. See Section III.F, §146 in BIS (2001).
[9] The first-passage approach is discussed in Black and Cox (1976). For a review see, for example, Duffie and Singleton (2003, Section 3.2). More recent modeling approaches also allow for strategic default considerations, as in Mella-Barral and Perraudin (1997).

The value of the firm at time t is the sum of debt and equity, namely

$$V_{ji,t} = D_{ji,t} + E_{ji,t}, \quad \text{with } D_{ji,t} > 0, \qquad (1)$$

or alternatively,

$$\frac{V_{ji,t}}{D_{ji,t}} = 1 + \frac{E_{ji,t}}{D_{ji,t}}. \qquad (2)$$

Conditional on time t information, default will take place at time t+H if V_{ji,t+H} ≤ D_{ji,t+H}, or, using (2), if

$$\frac{E_{ji,t+H}}{D_{ji,t+H}} \le 0. \qquad (3)$$

Equation (3) is restrictive in that it requires equity values to be negative before default occurs. Aside from non-trivial practical considerations having to do with arriving at an independent estimate of V_{ji,t}, there are several reasons for relaxing this condition. Because default is costly and violations of the absolute priority rule in bankruptcy proceedings are so common, in practice shareholders have an incentive to put the firm into receivership even before the equity value of the firm hits zero.[10] In fact, several authors have found that in 65% to 80% of bankruptcies, even shareholders receive something, without debt-holders necessarily having been fully paid off (see, for instance, Eberhart and Weiss 1998, and references therein). Moreover, we see in practice that equity values remain positive for insolvent firms. Similarly, the bank might also have an incentive to force the firm to default once the firm's equity
falls below a non-zero threshold, as well as an incentive to bypass the costly proceedings by agreeing to terms that yield positive value to the shareholders themselves.[11] The value of equity incorporates not only the asset values, but also an option value reflecting the possibility that a firm in default may in fact recover before creditors take control of its assets. Finally, default in a credit relationship is typically a weaker condition than outright bankruptcy: an obligor may meet the technical default condition, e.g. a missed coupon payment, without subsequently going into bankruptcy. This distinction is particularly relevant in the bank-borrower relationship we seek to characterize.[12]

In what follows we assume that default takes place if

$0 < E_{ji,t+H} < C_{ji,t+H}$,  (4)

where $C_{ji,t+H}$ is a positive default threshold which could vary over time and with the firm's particular characteristics (such as region or industry sector). Natural candidates include quantitative factors such as leverage, profitability, firm age and perhaps size (most of which appear in models of firm default), as well as more qualitative factors such as management quality.[13] Obviously, some of these factors will be easier to observe and measure than others. The observable accounting-based factors are at best noisy and at worst could be biased, highlighting the information asymmetry between managers (agents) and share/debt-holders (principals).[14]

[10] See, for instance, Leland and Toft (1996), who develop a model where default is determined endogenously without imposing a positive net worth condition.
[11] For a treatment of this scenario, see Garbade (2001).
[12] An excellent example of the joint borrower-lender decision process is given by Lawrence and Arshadi (1995).
[13] For models of bankruptcy and default at the firm level see, for instance, Altman (1968), Lennox (1999), Shumway (2001), Chava and Jarrow (2004), Hillegeist, Keating, Cram and Lundstedt (2004), as well as the survey by Altman and Saunders (1997).

Although our objective is not to build a default model per se, we face the same measurement difficulties and information asymmetries. To overcome them, we make use of the credit rating of a firm, which we denote by $R$.[15] This will help us specifically in estimating the default thresholds needed in the determination of the default probabilities. Naturally, rating agencies have access to, and presumably make use of, private information about the firm in arriving at their firm-specific credit rating, in addition to incorporating public information such as balance sheet data and, of course, equity returns. Thus we make the assumption that rating agencies benchmark their ratings on the past returns and volatilities of all firms that have been rated $R$ in the past.

Consider then a particular $R$-rated firm at time $t$, and assume that in arriving at their rating the credit rating agency uses the following standard geometric random walk model of equity values:

$\ln(E_{R,t+1}) = \ln(E_{R,t}) + \mu_R + \sigma_R \eta_{R,t+1}$, $\eta_{R,t+1} \sim \text{IIDN}(0,1)$,  (5)

with a non-zero drift, $\mu_R$, and idiosyncratic Gaussian innovations with zero mean and fixed volatility, $\sigma_R$.[16] We assume that, conditional on data at time $t$, a firm's rating does not change over the horizon $(t, t+H)$, namely

$\ln(E_{R,t+H}) = \ln(E_{R,t}) + H\mu_R + \sigma_R \sum_{s=1}^{H} \eta_{R,t+s}$,

and by (4) default occurs if

$\ln(E_{R,t+H}) = \ln(E_{R,t}) + H\mu_R + \sigma_R \sum_{s=1}^{H} \eta_{R,t+s} < \ln(C_{R,t+H})$,  (6)

or if the $H$-period change in equity value, or return, falls below the log threshold-equity ratio:

$\ln\left(E_{R,t+H}/E_{R,t}\right) < \ln\left(C_{R,t+H}/E_{R,t}\right)$.  (7)

Equation (7) tells us that the relative (rather than absolute) decline in firm value must be large enough over the horizon $H$ to result in default, meaning it is
independent of the size of the firm. Firm size is instead an input to the credit rating determination; a small firm would need a larger equity cushion to withstand a given shock than a large firm with the same rating.

[14] With this in mind, Duffie and Lando (2001) allow for the possibility of imperfect information about the firm's assets and default threshold in the context of a first-passage model.
[15] $R$ may take on values such as 'Aaa', 'Aa', 'Baa', ..., 'Caa' in Moody's terminology, or 'AAA', 'AA', 'BBB', ..., 'CCC' in S&P's terminology.
[16] Clearly, non-Gaussian innovations can also be considered, but for the quarterly data that we shall be working with, Gaussian innovations seem a good first approximation.

Under the assumption that the evolution of firm equity value follows (5), $\ln(E_{R,t+H}/E_{R,t})$ may be approximated by the cumulative returns, so that (7) can be re-written as

$H\mu_R + \sigma_R \sum_{s=1}^{H} \eta_{R,t+s} < \ln\left(C_{R,t+H}/E_{R,t}\right)$.

Therefore, the default probability for the $R$-rated firms at the terminal date $t+H$ is given by

$\pi_R(t,H) = \Phi\left(\dfrac{\ln\left(C_{R,t+H}/E_{R,t}\right) - H\mu_R}{\sigma_R\sqrt{H}}\right)$,  (8)

where $\Phi(\cdot)$ is the standard normal cumulative distribution function. Denote the $H$-period forward log threshold-equity ratio by $\lambda_R(t,H) = \ln(C_{R,t+H}/E_{R,t})$, so that

$\lambda_R(t,H) = H\mu_R + Q_R(t,H)\,\sigma_R\sqrt{H}$,

where $Q_R(t,H) = \Phi^{-1}[\pi_R(t,H)]$ is the quantile associated with the default probability $\pi_R(t,H)$.

An estimate of $\lambda_R(t,H)$ can now be obtained using past observations on returns, $r_{R,t} = \ln(E_{R,t}/E_{R,t-1})$, and the empirical default frequencies, $\hat{\pi}_R(t,H)$, of $R$-rated firms over a given period of, say, $t = 1, 2, \ldots, T$.[17] Denoting the estimates of $\mu_R$ and $\sigma_R$ by $\hat{\mu}_R$ and $\hat{\sigma}_R$, respectively, we have

$\hat{\lambda}_R(t,H) = H\hat{\mu}_R + \hat{Q}_R(t,H)\,\hat{\sigma}_R\sqrt{H}$,  (9)

where $\hat{\mu}_R$ and $\hat{\sigma}_R$ are the mean and standard deviation of returns of firms with rating $R$ over the sample period, and[18]

$\hat{Q}_R(t,H) = \Phi^{-1}[\hat{\pi}_R(t,H)]$.  (10)

The estimates $\hat{\mu}_R$ and $\hat{\sigma}_R$ can also be updated using a rolling window of size 7-8 years (the average length of the business cycle).

In practice, $\hat{\pi}_R(t,H)$ might not provide a reliable estimate of $\pi_R(t,H)$, as it is likely to be based on very few defaults over any particular period $(t, t+H)$. One possibility would be to use an average estimate of $\lambda_R(t,H)$ obtained over a reasonably long period of 10 to 20 years (on a rolling basis). For example, based on the sample observations $t = 1, 2, \ldots, T$, we would have

$\hat{\lambda}_R(H) = H\hat{\mu}_R + \hat{Q}_R(H)\,\hat{\sigma}_R\sqrt{H}$,  (11)

[17] An important source of heterogeneity is likely the large variation in bankruptcy laws and regulation across countries. However, by using rating agency default data, we adopt their homogeneous definition of default and are thus not subject to these heterogeneities.
[18] In practice, where there are many $R$-rated firms in a given period, average returns across all $R$-rated firms can be used to estimate $\hat{\mu}_R$. The computation of $\hat{\sigma}^2_R$ is more involved and is described in a note available from the authors on request.
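To make the estimation steps in (9)-(11) concrete, the sketch below computes $\hat{\mu}_R$, $\hat{\sigma}_R$, $\hat{Q}_R(H)$ and $\hat{\lambda}_R(H)$ from a sample of $R$-rated equity returns and an empirical default frequency. The return data, the default frequency and the helper name estimate_lambda are all hypothetical, and $\hat{\sigma}_R$ is taken as a plain sample moment, which is simpler than the computation referred to in footnote 18.

```python
# Sketch of the rating-based threshold estimate in equations (9)-(11).
# Sample returns and the empirical default frequency are hypothetical;
# sigma_hat is a simple sample moment (cf. footnote 18 for caveats).
from math import sqrt
from statistics import NormalDist, mean, stdev

def estimate_lambda(returns: list[float], pd_hat: float, H: int) -> float:
    """lambda_hat_R(H) = H*mu_hat + Q_hat(H)*sigma_hat*sqrt(H),
    with Q_hat(H) = Phi^{-1}(pi_hat_R(H)), as in (9)-(11)."""
    mu_hat = mean(returns)                # mu_hat_R: mean log return
    sigma_hat = stdev(returns)            # sigma_hat_R: sample std. deviation
    q_hat = NormalDist().inv_cdf(pd_hat)  # Q_hat_R(H) = Phi^{-1}(pi_hat_R(H))
    return H * mu_hat + q_hat * sigma_hat * sqrt(H)

# Quarterly log returns of (hypothetical) R-rated firms and a 1% empirical
# default frequency over a 4-quarter horizon (H = 4).
r = [0.021, -0.034, 0.015, 0.042, -0.011, 0.007, 0.030, -0.025]
lam = estimate_lambda(r, pd_hat=0.01, H=4)
print(lam)  # lambda_hat_R(H): the log threshold-equity ratio ln(C/E)

# Sanity check: plugging lambda_hat back into (8) recovers the input PD.
pd_back = NormalDist().cdf((lam - 4 * mean(r)) / (stdev(r) * sqrt(4)))
print(round(pd_back, 4))  # 0.01
```

Note that the threshold estimate is negative here, indicating that equity would have to fall roughly 10% below its current value before default is triggered, which is consistent with the relative-decline interpretation of (7).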