The inverse of the cumulative standard normal probability function
Effects of longevity and dependency rates on saving and growth: Evidence from a panel of cross countries∗

Hongbin Li, Chinese University of Hong Kong
Jie Zhang†, National University of Singapore and University of Queensland
Junsen Zhang, Chinese University of Hong Kong

20 October 2006

Abstract

While earlier empirical studies found a negative saving effect of old-age dependency rates without considering longevity, recent studies have found that longevity has a positive effect on growth without considering old-age dependency rates. In this paper, we first justify the related yet independent roles of longevity and old-age dependency rates in determining saving and growth by using a growth model that encompasses both neoclassical and endogenous growth models as special cases. Using panel data from a recent World Bank data set, we then find that the longevity effect is positive and the dependency effect is negative in savings and investment regressions. The estimates indicate that the differences in the demographic variables across countries or over time can well explain the differences in aggregate savings rates. We also find that both population age structure and life expectancy are important contributing factors to growth.

JEL classification: E21; J10; O16
Keywords: Longevity; Old-age dependency; Saving; Growth

∗We would like to thank Lant Pritchett (editor) and two anonymous referees for very helpful comments.
†Department of Economics, National University of Singapore, Singapore 117570. Email: ecszj@.sg; Fax: (65) 6775-2646

1. Introduction

The last century has seen dramatic increases in life expectancy around the world (e.g. Lee and Tuljapurkar, 1997). This has been accompanied by low fertility in industrial countries, causing serious population aging and higher old-age dependency rates. Both individuals and governments are increasingly concerned about the effects of aging, though their concerns differ. Individuals are more concerned about increased longevity because it affects their own financial and labor market plans (Hurd, 1997), whereas governments are more concerned about old-age dependency as an aspect of population aging (Weil, 1997). Yet it is important to realize that individual aging leads to population aging (holding other things constant). The significance of these demographic changes has attracted a great deal of attention among economists (see, e.g., Bos and von Weizsacker, 1989; Cutler, Poterba, Sheiner and Summers, 1990; Lee and Skinner, 1999; Lagerlöf, 2003).

One of the issues at the heart of the related empirical investigations is the effect of demographic changes on national savings, investment, and economic growth. There are two separate sets of related studies in the literature. One set is concerned with the effect of population dependency rates on aggregate savings, following the seminal work of Leff (1969) that finds a significant negative effect of the old-age dependency rate on the aggregate saving rate. For example, Edwards (1996) finds evidence that supports Leff's early results. However, Adams (1971), Gupta (1971), Goldberger (1973), and Ram (1982, 1984) present cases in which the dependency effect on savings may be insignificant or even positive.1 This debate has concentrated on one important aspect of the issue: the relationship between age and savings. However, the entire debate and related empirical work have ignored another important aspect of the issue: the relationship between savings and expected life span. What happens to savings regressions if life expectancy and old-age dependency are jointly considered? In contrast to the studies on aggregate saving rates, the
other set of related studies has paid little attention to dependency rates and has instead focused on the effect of longevity in investment and growth regressions. It has typically found a positive effect of longevity on investment and growth (e.g. Ehrlich and Lui, 1991; Barro and Sala-i-Martin, 1995, Ch. 12).2 The strong positive effect of longevity on growth is interpreted by Barro and Sala-i-Martin as a reflection of growth-enhancing factors (in addition to good health itself) such as good work habits and high levels of skill. We will offer a different interpretation by giving a more direct role to longevity in life-cycle optimization. In general, this set of studies has omitted dependency rates in growth regressions. Due to this omission, it cannot capture an important aspect of the issue: removing a group of workers from production and adding them to the dependent population will obviously change both the level and growth rate of output per capita.

1 See the replies from Leff (1971, 1973, 1984), and the more recent work of Mason (1988), Brander and Dowrick (1994), Weil (1994), Kelley and Schmidt (1995), Higgins and Williamson (1997), and Lee, Mason and Miller (2000).

At a theoretical level, the longevity versus dependency effects we have discussed here simply reflect two aspects of the life-cycle hypothesis. On the one hand, individuals save more when they expect to live longer. On the other hand, when the population becomes older, dissavers increase in number relative to savers. Intuitively, both middle-aged and old-aged individuals contribute to aggregate savings. With increasing life expectancy, at a given point in time, middle-aged individuals save more and raise aggregate savings. However, increasing life expectancy also implies more old people who dissave and reduce aggregate savings. Hence, the effect of rising longevity can only be estimated with the population age structure held constant, and vice versa.

In this paper, we first construct a simple growth model with overlapping generations that encompasses both neoclassical and endogenous growth models as special cases, to shed light on the explicit and separate roles of life expectancy, dependency rates, and fertility (or population growth).
We show that rising longevity (life expectancy) raises the saving rate at both the household level and the aggregate level, and that it raises the growth rate of output per capita. However, a rising old-age dependency rate lowers the aggregate saving rate, whereas a rising total dependency rate reduces the growth rate of per capita output. Although demographers have long noticed the importance of both fertility and longevity to the dependency ratio, economists have largely ignored one of these factors in their analysis.2 Thus, we contribute to the economic literature by taking into account these realistic demographic factors.

2 Ehrlich and Lui (1991) used mortality, which is similar to the use of life expectancy. Recent theoretical studies have investigated the effect of life expectancy (or mortality) on capital accumulation and growth. See, for example, Skinner (1985), Ehrlich and Lui (1991), and de la Croix and Licandro (1999).

We then carry out a panel study using the World Bank's World Development Indicators 2005 dataset, which contains more substantial information for over two hundred countries from 1960 to 2004 than previously used data sets.3 Consistent with our intuitive and theoretical predictions, our fixed-effects estimations show that the longevity effect is positive and the dependency effect is negative in the saving, investment, and growth regressions. The findings are generally robust when we add other determinants of savings, investment, and growth.

3 Other popularly used macro data sets do not have observations for the saving rates, such as those extended from Summers and Heston (1991), which adjust for cross-country differences in the cost of living by using observed prices of goods and services. Barro and Sala-i-Martin (1995) found that growth regressions from both the World Bank data and the Summers-Heston data gave very similar results.

The remainder of this paper is organized as follows. Section 2 justifies the roles of life expectancy, population growth, fertility, and the population age structure in the determination of aggregate savings and per capita income growth in a simple growth model with overlapping generations. Section 3 describes the data and the empirical specifications. Section 4 reports the regression results. The final section provides concluding remarks.

2. A simple theoretical model and testable implications

Following the life-cycle hypothesis, we use a simple overlapping generations model that links savings, investment, and growth to longevity, population growth, and population age structures. The simple model is constructed only to capture our main points and is not meant to be comprehensive. In this model, agents live for three periods: making no choices in childhood, working in middle age, and living in retirement in old age. The middle-age population has a mass L_t in period t. Fertility is exogenous at a rate of N_t = L_{t+1}/L_t. All workers have a probability P_t of surviving to old age and a death rate of 1 − P_t at the end of their middle age. The size of the old population at time t is L_{t−1}P_{t−1}, and the size of the total population is L_t(1 + N_t) + L_{t−1}P_{t−1}.

A middle-aged agent inelastically supplies one unit of labor endowment to work and earns a wage income of W_t. The wage income is divided between middle-age consumption c_t and savings s_t for old-age consumption z_{t+1}. There is a perfect annuity market that channels the savings of workers to firms and distributes the savings plus interest of those who die to old-aged survivors.
Hence, every worker expects to receive an annuity income of R_{t+1}L_t s_t/(L_t P_t) in old age (where R is the interest factor) that is conditional on survival. Thus, the lifetime budget constraint of a worker is c_t = W_t − s_t and z_{t+1} = R_{t+1}s_t/P_t. The preference is assumed to be

U_t = log c_t + P_t β log z_{t+1},

where β is the subjective discount factor. This simple version of preferences in a standard overlapping generations model will yield rich specifications of the saving rate and growth equations, and avoid unnecessary complexity. The solution to this simple problem is

c_t = Γ_{c,t} W_t,  Γ_{c,t} = 1/(1 + P_t β),   (1)
s_t = Γ_{s,t} W_t,  Γ_{s,t} = P_t β/(1 + P_t β),   (2)
z_{t+1} = β R_{t+1} c_t.   (3)

As implied by the life-cycle hypothesis, a higher rate of survival raises the fraction of income for savings, Γ_s, but reduces the fraction of income for middle-age consumption, Γ_c. In the Appendix we will discuss whether this positive correlation between the saving rate and the rate of survival remains when survival depends on current consumption, as in Gersovitz (1983).

Note that Γ_c + Γ_s = 1. In this household problem, we focus on savers (middle-aged agents) and dissavers (old-aged agents) and thus ignore child consumption and child mortality. Including child consumption and child mortality usually gives a negative role to the number of children in the saving equation. In the later empirical analysis, a lagged-fertility-rate variable will be used to partly control for any child effects.

The production function is Y_t = A K_t^α (L_t k̃_t^θ)^{1−α}, with A > 0, 0 < α < 1, and 0 ≤ θ ≤ 1, where K is capital, L labor, and k̃ the average level of capital per worker in the economy. When θ = 0, the model is neoclassical without sustainable growth in the long run, whereas when θ = 1 (or close to 1) it is an AK model with sustainable long-run growth in the spirit of Romer (1986). In equilibrium, k_t = k̃_t = [Y_t/(A L_t)]^{1/[α+θ(1−α)]}. For simplicity, we assume a 100% depreciation of capital within one period in production. Factors are paid their marginal products: W_t = (1−α)A k_t^{α+θ(1−α)} and R_t = αA k_t^{(1−α)(θ−1)}. Output is equal to the sum of labor income and capital income: Y_t = L_t W_t + K_t R_t. As capital income goes to old-age survivors, we can express aggregate old-age consumption as L_{t−1}P_{t−1}z_t = K_t R_t.

The aggregate net saving, denoted by S_t, is defined as aggregate output less aggregate consumption, that is, S_t = Y_t − L_t c_t − L_{t−1}P_{t−1}z_t. Substituting young-age consumption c_t = W_t − s_t and aggregate old-age consumption L_{t−1}P_{t−1}z_t = K_t R_t into the aggregate saving equation, we have S_t = Y_t − L_t(W_t − s_t) − K_t R_t = L_t s_t, as Y_t = L_t W_t + K_t R_t. Thus, we obtain the result that the aggregate net saving is equal to the sum of savings by workers.4 However, output is equal to aggregate spending: Y_t = L_t c_t + L_{t−1}P_{t−1}z_t + K_{t+1}, or K_{t+1} = Y_t − L_t c_t − L_{t−1}P_{t−1}z_t. Combining this accounting identity on the spending side with the aggregate saving equation in a closed economy, we have S_t = K_{t+1} = L_t s_t, or k_{t+1} = K_{t+1}/L_{t+1} = s_t/N_t, where capital per worker in the next period depends positively on savings but negatively on fertility today, as in a standard neoclassical growth model. In equilibrium, capital per worker evolves according to

k_{t+1} = A(1−α)Γ_{s,t} (1/N_t) k_t^{α+θ(1−α)}.   (4)

Clearly, when θ = 1 (close to 1), the level of capital per worker will never (will take many periods to) converge, and thus there will be long-run growth as in Romer (1986). However, regardless of the value of θ, the rate of capital accumulation (and hence growth) on the transitional equilibrium path is determined by both the saving rate and the exogenous fertility rate. As a higher rate of survival raises the fraction of income for savings, Γ_{s,t} in (2), it will promote capital accumulation per worker according to (4). By accelerating capital accumulation, the increase in the rate of survival will also reduce the interest rate in the neoclassical model of growth with θ ∈ (0,1), according to the interest-rate equation R_{t+1} = αA k_{t+1}^{−(1−α)(1−θ)}.

4 One might mistakenly define the aggregate net saving as S_t = L_t s_t − L_{t−1}P_{t−1}z_t when z_t is regarded as dissaving by the elderly. This alternative definition would change S_t from 'total output less total consumption' to 'total wage income less total consumption', because from L_t s_t = L_t W_t − L_t c_t, we would have S_t = L_t W_t − L_t c_t − L_{t−1}P_{t−1}z_t. It is important to realize that dissaving in old age (z) is already financed by capital income according to L_{t−1}P_{t−1}z_t = K_t R_t, or z_t = R_t s_{t−1}/P_{t−1}, and hence the net saving in old age (i.e., capital income less old-age consumption) is zero for every old survivor. As a result, the aggregate saving S comes only from workers' savings in such an overlapping generations framework. However, the aggregate saving that is determined by workers only is an end result. During the process of reaching this end, the aggregate saving is also affected by the ratio of the old to the worker population, through the mechanism that the amount of capital stock accumulated by the elderly in the past determines output and savings together with the number of workers today.

The ratio of aggregate savings to output (the aggregate saving rate hereafter) is Γ̃_{s,t} ≡ S_t/Y_t = (L_t y_t − L_t c_t − L_{t−1}P_{t−1}z_t)/(L_t y_t) = 1 − (1−α)Γ_{c,t} − (L_{t−1}P_{t−1}/L_t)(z_t/y_t), where y_t = Y_t/L_t is output per worker. The first part, 1 − (1−α)Γ_{c,t} = α + (1−α)Γ_{s,t}, is a positive function of the rate of survival that reflects the life-cycle hypothesis. That is, life expectancy contributes to the aggregate saving rate through influencing each individual's saving rate. The last term, (L_{t−1}P_{t−1}/L_t)(z_t/y_t), has two factors that reduce the aggregate saving rate through raising aggregate old-age consumption relative to output: the ratio of old to middle-age population, and the ratio of consumption per old-aged agent to output per worker. Intuitively, holding consumption per old-aged agent constant, an increase in the old-age dependency rate increases the fraction of output for old-age consumption and hence reduces the aggregate saving rate, as an implication of the life-cycle hypothesis that was the focus of previous saving regressions. However, holding the dependency rate constant, an increase in the ratio of consumption per old-aged agent to output per worker has a similar effect on the aggregate saving rate. Furthermore, the ratio z_t/y_t can be expressed as αN_{t−1}/P_{t−1};5 namely, the ratio of consumption per old-aged agent to output per worker is higher if previous fertility is higher (i.e., lower labor productivity today) or if the previous rate of survival is lower (i.e., a higher rate of return on savings today). Based on the above discussion, the ratio of aggregate savings to output becomes

Γ̃_{s,t} = α + (1−α) P_t β/(1 + P_t β) − α (N_{t−1}/P_{t−1}) (L_{t−1}P_{t−1}/L_t).   (5)

Thus, the aggregate saving rate is an increasing function of the rate of adult survival (P_t), but is a decreasing function of both previous fertility (N_{t−1}) and the old-age dependency rate (L_{t−1}P_{t−1}/L_t).

5 From (1) and (3), z_t/y_t = βR_t Γ_{c,t−1} W_{t−1}/y_t. Expressing R, W, and y as functions of capital per worker k and using (4), we have z_t/y_t = Aβα(1−α)Γ_{c,t−1} k_{t−1}^{α+θ(1−α)}/k_t = α(βΓ_{c,t−1}/Γ_{s,t−1})(L_t/L_{t−1}), where L_t/L_{t−1} = N_{t−1}. The ratio βΓ_{c,t−1}/Γ_{s,t−1} is simply equal to 1/P_{t−1} by (1) and (2).
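Equation (5) is easy to explore numerically. The sketch below, in Python with illustrative parameter values (not calibrated to the paper), shows the two channels separately: raising the survival rate P_t increases the aggregate saving rate, while raising the old-age dependency ratio lowers it.

# Comparative statics of the aggregate saving rate in equation (5):
#   rate = alpha + (1 - alpha) * P*beta/(1 + P*beta)
#          - alpha * (N_prev / P_prev) * old_dep
# All parameter values below are illustrative, not calibrated.

def aggregate_saving_rate(P, old_dep, N_prev, P_prev, alpha=0.3, beta=0.9):
    """Aggregate saving rate from equation (5)."""
    life_cycle_part = alpha + (1 - alpha) * P * beta / (1 + P * beta)
    dissaving_part = alpha * (N_prev / P_prev) * old_dep
    return life_cycle_part - dissaving_part

base = aggregate_saving_rate(P=0.6, old_dep=0.10, N_prev=1.2, P_prev=0.6)
hi_longevity = aggregate_saving_rate(P=0.8, old_dep=0.10, N_prev=1.2, P_prev=0.6)
hi_dependency = aggregate_saving_rate(P=0.6, old_dep=0.15, N_prev=1.2, P_prev=0.6)

print(f"baseline:          {base:.3f}")
print(f"higher longevity:  {hi_longevity:.3f}")   # rises with P
print(f"higher dependency: {hi_dependency:.3f}")  # falls with old-age dependency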
How do these demographic factors affect per capita income growth? In this model, the growth rate of per capita income is equal to

g_t ≡ {Y_t/[L_t(1+N_t) + L_{t−1}P_{t−1}]} / {Y_{t−1}/[L_{t−1}(1+N_{t−1}) + L_{t−2}P_{t−2}]} − 1,

and the growth rate of the total population is equal to

n_t ≡ [L_t(1+N_t) + L_{t−1}P_{t−1}] / [L_{t−1}(1+N_{t−1}) + L_{t−2}P_{t−2}] − 1.

It is easy to see that g_t = (k_t/k_{t−1})^{α+θ(1−α)} (L_t/L_{t−1})[1/(1+n_t)] − 1, where k_t/k_{t−1} = A(1−α)Γ_{s,t−1}(L_{t−1}/L_t) k_{t−1}^{−(1−α)(1−θ)} by (4). Note that k_{t−1} = [Y_{t−1}/(A L_{t−1})]^{1/[α+θ(1−α)]}. Moreover, define the ratio of the middle-age to total population by φ_t = L_t/[L_t(1+N_t) + L_{t−1}P_{t−1}], whose complement 1 − φ_t is the total dependency rate (i.e., the sum of children and old-aged agents relative to the total population). Using these definitions, the log version of the growth equation is given by:

log(1+g_t) = Λ + [α+θ(1−α)][log Γ_{s,t−1} − log(1+n_t)] + (1−α)(1−θ) log φ_t
           − (1−α)(1−θ) log {Y_{t−1}/[L_{t−1}(1+N_{t−1}) + L_{t−2}P_{t−2}]},   (6)

where Λ ≡ log A + [α+θ(1−α)] log(1−α) is a constant.

From (6), the growth rate of per capita income is positively related to the previous saving rate Γ_{s,t−1} = P_{t−1}β/(1+P_{t−1}β) (or previous life expectancy in a reduced form), and to the ratio of middle-age to total population. However, it is negatively related to population growth and previous per capita income. Among the three demographic factors, high previous life expectancy means high previous savings and investment. Hence, there is a high capital stock per worker and high output per worker today, holding other factors constant. Rapid population growth means a large current population relative to the past population, which leads to lower current output per capita if the population age structure remains the same for a given level of aggregate capital stock at the beginning of the current period. Given the size of the total population, if children and old agents account for a large portion of the total population (i.e., high total dependency) today relative to the past, then output per capita should be low today relative to the past.

The relationships in equations (5) and (6) capture the entire dynamic path of the model, and are valid for all times. Moreover, these relationships are derived from the decisions of both households and firms, as in recent studies on savings and growth. Although our simple theoretical model is very much in line with the recent new growth literature on savings and growth,6 our explicit joint consideration of longevity, the age structure, and fertility in affecting savings and growth seems to be novel and is itself an important contribution to the literature.

3. Empirical specifications and data

The empirical specifications for the saving or investment and growth equations are based on (5) and (6), which illuminate the explicit roles of life expectancy, population growth, fertility, the population age structure, and previous income per capita. As life expectancy at time t reflects the up-to-date information on the rate of survival, it will replace P_t in (5) and (6). Consequently, the saving rate equation is specified as

(aggregate saving/output)_it = a_0 + a_1(life expectancy)_it + a_2(old to work-age population ratio)_it + a_3(fertility)_i,t−1 + u_it,   (7)

where a linear functional form is assumed; subscript i refers to a country and subscript t refers to a period. According to the theoretical discussion, we expect that a_1 > 0, a_2 < 0, and a_3 < 0, i.e. a positive longevity effect, a negative old-age dependency effect, and a negative fertility effect.
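For concreteness, here is a minimal Python sketch of how a fixed-effects version of equation (7) could be run. The DataFrame and all column names are hypothetical stand-ins (a small synthetic panel is generated so the code runs); the paper's actual estimation details may differ.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic long-format panel standing in for the real data (names hypothetical).
rng = np.random.default_rng(0)
rows = [(c, t) for c in range(30) for t in range(7)]
df = pd.DataFrame(rows, columns=["country", "period"])
df["life_exp"] = rng.uniform(45, 80, len(df))
df["old_dep"] = rng.uniform(0.02, 0.28, len(df))
df["fertility_lag"] = rng.uniform(1.5, 6.5, len(df))
df["saving_rate"] = (0.004 * df["life_exp"] - 0.5 * df["old_dep"]
                     - 0.01 * df["fertility_lag"] + rng.normal(0, 0.03, len(df)))

formula = ("saving_rate ~ life_exp + old_dep + fertility_lag"
           " + C(country) + C(period)")        # country and time fixed effects
result = smf.ols(formula, data=df).fit(
    cov_type="cluster", cov_kwds={"groups": df["country"]})
print(result.params[["life_exp", "old_dep", "fertility_lag"]])
# Theory predicts: life_exp > 0, old_dep < 0, fertility_lag < 0.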
The investment equation shares the same specification as the saving equation in a closed-economy model; in open economies, this saving rate specification may apply equally well to the investment equation if savings and investment are closely related. The growth equation is based on (6):

log(1+growth rate)_it = b_0 + b_1 log(GDP per capita)_i,t−1 + b_2 log(1+population growth)_it + b_3 log(work-age to total population ratio)_it + b_4 log(life expectancy)_i,t−1 + u_it.   (8)

6 An alternative approach in the literature seems to emphasize how exogenous growth and age structures affect the saving rate (see, e.g., Mason, 1988, Higgins and Williamson, 1997, and Lee et al., 2000). They clearly showed that exogenous technological progress would affect saving rates. In this paper, we follow the more recent new growth literature of Romer, Lucas, Ehrlich, and Barro, in which exogenous factors determine growth determinants, such as saving rates, fertility, and human capital, that in turn affect growth. The former approach has put the emphasis on savings, whereas the latter approach has focused on sustainable growth as the ultimate variable to be determined in the model. Our major focus is on the separate effects of several demographic factors on saving and growth rates when they are controlled for jointly.

We expect that b_1 < 0, as in the literature on growth regressions, and, according to our theoretical predictions, b_2 < 0, b_3 > 0, and b_4 > 0, i.e. a negative population growth effect, a negative total dependency effect, and a positive longevity effect. In this growth equation, the log of previous life expectancy has replaced the log of the previous saving rate that is chosen by households (not the aggregate saving rate), which is present in the theoretical growth equation (6).

In comparison with previous studies, a distinctive feature of our specifications of the saving, investment, and growth equations is that we consider the population age structure, population growth (or fertility), and life expectancy jointly. To facilitate the comparison of our results with those in previous work, we will also consider specifications in which one of the life expectancy and population age structure variables is excluded from the saving, investment, or growth equation. To control for unobservable country and cyclical effects, we employ the country and time fixed-effects model for both (7) and (8).7

7 We have also conducted GMM estimations. However, the instrumental variables (IVs) (the lagged values and differences of the lagged independent variables in Tables 2-4) in the GMM estimations do not pass the Hansen over-identification restriction test. Generally speaking, it is difficult to find valid IVs for cross-country growth regressions.

The data used here are based on the World Bank's World Development Indicators 2005 for the period from 1960 to 2004, which cover over 200 countries. The dataset contains a wide range of variables, such as the ratio of investment to GDP, the growth rate, school enrollment, and demographic statistics. Regarding the population age structure, data are available for the proportions of the population under age 15, between 15 and 64, and aged 65 or above. The age group between 15 and 64 is chosen to be the middle-age population, as in the literature on savings and dependency rates. The old-age dependency rate refers to those over middle age relative to the middle-age population, and the total dependency rate is the sum of those under age 15 and those over middle age relative to the total population. In addition to these variables, the World Bank data set also has information on the ratio of aggregate savings to output, which is of particular interest here.

The World Bank data are annual observations with missing information for some variables that are used here. For some demographic variables, the observations are usually updated less frequently than those from National Accounts. We construct a panel data set by taking a five-year
average from the annual observations in the World Bank's data set.8 The advantage of five-year averages is to reduce the short-term cyclical influence on growth, savings, and investment. To make each period five years, we use the data from 1963 to 2003, comprising eight 5-year periods. Due to missing values, we use 149 countries and 779 observations for the savings regressions, 149 countries and 775 observations for the investment regressions, and 150 countries and 793 observations for the growth regressions. As the equations for savings, investment, and growth include one-period lagged variables, all regressions actually use seven periods of the averaged data.

8 In this paper, all economic variables are first normalized to the local 2000 price level, and then converted into the 2000 US dollar using the 2000 exchange rate. There is a second way of conversion: economic variables are first converted into the US dollar at the current exchange rate, and then normalized to the 2000 US price level. We use the first way of conversion to avoid the complication of exchange rate variation. We also run regressions using the second way of conversion (not reported in tables), and find the results are qualitatively similar.

In the full sample, the mean values (standard deviations) of life expectancy, the old-age dependency rate, the aggregate saving rate, and the annual growth rate are 62 years (11.63), 0.10 (0.06), 0.165 (0.15), and 0.015 (0.038), respectively. There was a great deal of variation across countries. Life expectancy during the period from 1993 to 1998 was below 50 years in 24 countries, between 50 and 70 years in 67 countries, and above 70 years in 66 countries. The old-age dependency rates ranged from 0.02 to 0.28; the mean values of the old-age dependency rates were 0.05 in the 41 countries with low life expectancy, 0.08 in the 19 countries with mid life expectancy, and 0.15 in the 31 countries with high life expectancy. The mean values of the saving rates and the growth rates in the countries with low life expectancy (0.06 and 0.015, respectively) were much lower than in the countries with high life expectancy (0.21 and 0.024, respectively). Fertility rates were much higher in low life expectancy countries than in high life expectancy countries: 5.89, 3.73, and 2.12 in countries with low, mid, and high life expectancy, respectively.

Variation over time was relatively moderate, apart from a sharp decline in fertility. From the 1963-68 period to the 1998-2003 period, life expectancy increased by about 10 years, the old-age dependency rate increased by about 2 percentage points, fertility declined by more than 40%, and the saving (growth) rates declined from 18% to 16% (0.027 to 0.018).9 Summary statistics are provided in Table 1.

9 There was remarkably high growth in the 1960s, in contrast to the slow growth in the following three decades.
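As a rough pandas sketch of the five-year averaging step (the DataFrame and its column names are hypothetical, and a small synthetic frame is generated so the code runs; the paper's exact period boundaries, 1963-68 through 1998-2003, may be drawn slightly differently):

import numpy as np
import pandas as pd

# Synthetic annual data standing in for the WDI extract (names hypothetical).
rng = np.random.default_rng(0)
annual = pd.DataFrame({
    "country": np.repeat(["A", "B"], 40),
    "year": np.tile(np.arange(1963, 2003), 2),
    "saving_rate": rng.uniform(0.0, 0.4, 80),
})

annual["period"] = (annual["year"] - 1963) // 5   # eight 5-year bins
panel = (annual.groupby(["country", "period"], as_index=False)
               .mean(numeric_only=True))          # five-year averages
print(panel.head())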
4. Estimation results

4.1. Saving regressions

The saving regressions are reported in Table 2. We begin with a saving regression on two explanatory variables: previous fertility and life expectancy.10 These two variables were used in previous studies on investment (e.g., Barro and Sala-i-Martin, 1995). The result is reported in Regression (1) in Table 2: previous fertility has a negative effect on the aggregate saving rate, whereas life expectancy has a significant positive effect. In Regression (2), we use previous fertility and the old to middle-age population ratio but drop life expectancy, as in Leff (1969) and Ram (1982). It shows that the estimated coefficient on previous fertility is about the same in magnitude as that in Regression (1). The estimated coefficient on the old-age dependency rate is not significantly different from zero, which may lead to the conclusion that old-age dependency does not affect the aggregate saving rate, as in Ram (1982, 1984).

10 In the literature on empirical growth, most studies use one-period lagged fertility as an independent variable (see, e.g., Levine and Renelt (1992), Bloom and Williamson (1998), Islam (1995), and Li and Zhang (2006)). We have also experimented with 10-year and 15-year lagged fertility variables. The primary results do not change with these further lags.

In Regression (3), which follows our correct specification, we include both life expectancy and old-age dependency. The estimated coefficient on life expectancy is nearly the same as in Regression (1), where the old-age dependency variable is absent. The old-age dependency ratio continues to have a negative yet insignificant effect on savings.

In Regression (4), we add another important determinant of savings: the lagged log per capita GDP. It appears that the saving rate increases with per capita GDP, which has a highly significant coefficient, as suggested in the model in the Appendix whereby survival increases with consumption. Interestingly, when controlling for per capita GDP, the effect of fertility becomes much smaller. This suggests that much of the effect of fertility is actually the income effect. In other words, countries with high income levels have lower fertility. An even more remarkably different result occurs with the coefficient on the old-age dependency rate: it is significantly negative in Regression (4), as in Leff (1969). The absolute value of this coefficient in Regression (4) more than doubles that in
Statistics Toolbox: wblcdf

Weibull cumulative distribution function (cdf)

Syntax

P = wblcdf(X, A, B)
[P, PLO, PUP] = wblcdf(X, A, B, PCOV, alpha)

Description

P = wblcdf(X, A, B) computes the cdf of the Weibull distribution with scale parameter A and shape parameter B, at each of the values in X. X, A, and B can be vectors, matrices, or multidimensional arrays that all have the same size. A scalar input is expanded to a constant array of the same size as the other inputs. The default values for A and B are both 1. The parameters A and B must be positive.

[P, PLO, PUP] = wblcdf(X, A, B, PCOV, alpha) returns confidence bounds for P when the input parameters A and B are estimates. PCOV is the 2-by-2 covariance matrix of the estimated parameters. alpha has a default value of 0.05, and specifies 100(1 - alpha)% confidence bounds. PLO and PUP are arrays of the same size as P containing the lower and upper confidence bounds.

The function wblcdf computes the confidence bounds for P using a normal approximation to the distribution of the estimate, and then transforms those bounds to the scale of the output P. The computed bounds give approximately the desired confidence level when you estimate mu, sigma, and PCOV from large samples, but in smaller samples other methods of computing the confidence bounds might be more accurate.

The Weibull cdf is

p = F(x | a, b) = 1 - e^(-(x/a)^b),  x >= 0.

Examples

What is the probability that a value from a Weibull distribution with parameters a = 0.15 and b = 0.8 is less than 0.5?

probability = wblcdf(0.5, 0.15, 0.8)

probability =
    0.9272

How sensitive is this result to small changes in the parameters?

[A, B] = meshgrid(0.1:0.05:0.2, 0.2:0.05:0.3);
probability = wblcdf(0.5, A, B)

(The output is a 3-by-3 array of probabilities, one for each scale/shape pair.)

Statistics Toolbox: wblinv

Inverse of the Weibull cumulative distribution function

Syntax

X = wblinv(P, A, B)
[X, XLO, XUP] = wblinv(P, A, B, PCOV, alpha)

Description

X = wblinv(P, A, B) returns the inverse cumulative distribution function (cdf) for a Weibull distribution with scale parameter A and shape parameter B, evaluated at the values in P. P, A, and B can be vectors, matrices, or multidimensional arrays that all have the same size. A scalar input is expanded to a constant array of the same size as the other inputs. The default values for A and B are both 1.

[X, XLO, XUP] = wblinv(P, A, B, PCOV, alpha) returns confidence bounds for X when the input parameters A and B are estimates. PCOV is a 2-by-2 matrix containing the covariance matrix of the estimated parameters. alpha has a default value of 0.05, and specifies 100(1 - alpha)% confidence bounds. XLO and XUP are arrays of the same size as X containing the lower and upper confidence bounds.

The function wblinv computes the confidence bounds for X using a normal approximation to the distribution of the estimate a*q^(1/b), where q is the Pth quantile from a Weibull distribution with scale and shape parameters both equal to 1. The computed bounds give approximately the desired confidence level when you estimate mu, sigma, and PCOV from large samples, but in smaller samples other methods of computing the confidence bounds might be more accurate.

The inverse of the Weibull cdf is

x = F^(-1)(p | a, b) = a * (-log(1 - p))^(1/b).

Examples

The lifetimes (in hours) of a batch of light bulbs have a Weibull distribution with parameters a = 200 and b = 6.
What is the median lifetime of the bulbs?

life = wblinv(0.5, 200, 6)

life =
    188.1486

What is the 90th percentile?

life = wblinv(0.9, 200, 6)

life =
    229.8261
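For readers without MATLAB, the same computations can be cross-checked in Python with scipy.stats.weibull_min, whose parameterization (shape c, scale) matches the toolbox's B and A:

from scipy.stats import weibull_min

# wblcdf(0.5, 0.15, 0.8): P(X < 0.5) for scale A = 0.15, shape B = 0.8
print(weibull_min.cdf(0.5, c=0.8, scale=0.15))   # ~0.9272

# wblinv(0.5, 200, 6): median lifetime for scale A = 200, shape B = 6
print(weibull_min.ppf(0.5, c=6, scale=200))      # ~188.1486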
Statistical Questions

0. What is the standardization method? What is its significance? What conditions must the data meet for the direct method to be used?

The standardization method is used when two or more total (crude) rates are compared: to eliminate the effect of differences in internal composition (e.g., age structure), the groups are adjusted to a uniform standard, and the adjusted rates are then compared. The direct method can be used when the number of observed cases in each group is large enough and the group-specific numbers of positive cases are known.

1. What is the meaning of a medical reference value range, and how is it determined?

A reference range is the range within which the measurements of most normal individuals fluctuate, determined with a given probability; it serves as a reference for judging normal versus abnormal in the clinic. One method is the normal-distribution method, with steps: (1) test the data for normality; (2) if normal, compute the mean x̄ and standard deviation S; (3) estimate the reference range as x̄ ± u_{α/2}·S.

2. How is a hypothesis test carried out on fourfold (2×2) table data?

1. Set up the hypotheses and the test level:
H0: the two population rates are the same
H1: the two population rates are different
α = 0.05
2. Compute the test statistic:
a. If n ≥ 40 and all theoretical (expected) frequencies T ≥ 5, use the basic chi-square test.
b. If n ≥ 40 but some 1 ≤ T < 5, the chi-square value requires a continuity correction.
c. If n < 40, or some T < 1, use the exact probability method.
3. Determine the P value and draw the inference.

3. Discuss your understanding of the P value in hypothesis testing.

The P value is the probability, computed assuming H0 is true, of obtaining by random sampling a test statistic as extreme as, or more extreme than, the one observed in the present sample. If P ≤ α, the conclusion is to reject H0 and accept H1, and at test level α the difference is "statistically significant". Conversely, if P > α, H0 is not rejected at level α and the conclusion is "no statistically significant difference" - but this is not the conclusion "no difference"; it only means that, on the basis of this test, we cannot yet conclude that a difference exists, because failing to reject H0 is not the same as accepting H0.

4. Briefly describe the differences and connections between type I and type II errors, and the practical significance of the two.

A type I error is rejecting H0 when it is actually true ("rejecting the truth"); its probability is α. A type II error is failing to reject H0 when it is actually false ("accepting the false"); its probability is β. For a fixed sample size n, the smaller α is, the larger β becomes; conversely, the larger α is, the smaller β. If the application should focus on reducing type I errors, take α = 0.05; if the focus is on reducing β, take α = 0.10 or 0.20, or even higher.
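As an illustration of the decision rules in question 2 above, the following Python sketch (with made-up counts) picks among the basic chi-square test, the continuity-corrected test, and the exact probability method:

# Illustration of the fourfold-table rules in question 2; counts are made up.
import numpy as np
from scipy.stats import chi2_contingency, fisher_exact

table = np.array([[18, 12],
                  [ 8, 22]])            # hypothetical 2x2 counts, n = 60

n = table.sum()
expected = chi2_contingency(table)[3]   # theoretical frequencies T

if n >= 40 and expected.min() >= 5:
    chi2, p, _, _ = chi2_contingency(table, correction=False)
elif n >= 40 and expected.min() >= 1:
    chi2, p, _, _ = chi2_contingency(table, correction=True)  # continuity correction
else:
    _, p = fisher_exact(table)          # exact probability method
print(p)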
5. Briefly describe the relations among the normal, binomial, and Poisson distributions.

Normal distribution: describes the frequency distribution of continuous random variables that are normally (or approximately normally) distributed.
Binomial distribution: each trial has only two possible, mutually exclusive outcomes, and the trials are independent of one another.
Poisson distribution: a discrete distribution with a single parameter, representing the average number of events per unit of time or space.
When n is large and π is not close to 0 or 1, the binomial distribution can be regarded as approximately normal. The Poisson distribution can be regarded as a limit of the binomial distribution: when π is small and n tends to infinity, the binomial distribution approximates the Poisson; and when μ ≥ 20, the Poisson distribution can itself be approximated by the normal distribution.

6. What are the indicators describing central tendency? What are the similarities and differences in their scopes of application?

Mean: normal or approximately normal distributions.
Geometric mean: data in geometric progression, or log-normally distributed data.
Median: skewed distributions; irregular distributions; data with indeterminate values at one or both ends (open-ended data).

7. What is hypothesis testing? Illustrate with an example.

Hypothesis testing uses the reductio ad absurdum idea of small probabilities: first make two opposing statistical hypotheses about the population characteristics; then compute, under the condition that H0 is true, the probability P of obtaining a test statistic at least as extreme as the observed value; and compare P with the pre-specified test level α to draw a statistical inference about whether H0 holds. For example:
1. Set up the hypotheses and the test level:
H0: the two population rates are the same
H1: the two population rates are different
α = 0.05
2. Compute the test statistic.
3. Determine the P value and draw the inference.

8. Discuss the conclusions of hypothesis tests.

The conclusion of a hypothesis test rests on the principle that a small-probability event should not occur in a single actual trial. Therefore, when the tested hypothesis is rejected, a type I error may be committed; when the tested hypothesis is not rejected, a type II error may be committed.

9. Compare the similarities and differences between the standard deviation and the standard error.

The standard deviation S describes the size of individual variation; as n increases it stabilizes around the population value, and combined with the mean it can be used to construct a reference range. The standard error describes the size of the sampling error of the mean; as n increases it tends to 0, and together with the sample mean it can be used to compute a confidence interval for the population mean.

10. How can sampling error be controlled or reduced in a sampling study?

Use a reasonable sampling design and increase the sample size.

11. What is sampling error? Why is sampling error inevitable in sampling studies?

Sampling error is the difference, caused by sampling, between a sample statistic and the population parameter, and among the statistics of different samples. Because individual variation exists objectively and the object of study is only a part of the population, the difference between the results from this part and the results for the whole population is inevitable.
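A small simulation makes the points in questions 9-11 concrete: sample means fluctuate around the population mean, with a spread close to σ/√n (the parameter values below are illustrative):

# Simulating sampling error: 1000 samples of size n from N(mu, sigma^2).
import numpy as np

rng = np.random.default_rng(1)
mu, sigma, n = 100.0, 15.0, 25
means = rng.normal(mu, sigma, size=(1000, n)).mean(axis=1)

print(means.mean())        # close to mu = 100
print(means.std(ddof=1))   # close to sigma / sqrt(n) = 3.0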
12. Can we say that the smaller the P value of a hypothesis test is, the greater the difference between the two population indices? Why?

No. The size of the P value does not correspond exactly to the size of the difference between the population indices. Besides being related to the size of the population difference, the P value is also related to the size of the sampling error: for the same population difference, different sampling errors give different P values. In practical work, the sampling error is mainly reflected in the sample size.

13. In the rank sum test, why must tied values occurring in different groups be given average ranks (midranks), while average ranks are unnecessary for tied values within the same group?

The basic idea of the rank sum test is: if H0 holds, then once n1 and n2 are determined, the rank sum T of the sample of size n1 should not deviate greatly from its expected value; if the disparity is large, the two population distributions are judged to differ. For tied values within the same group, not computing average ranks does not affect that group's rank sum; but when the same value appears in different groups, failing to compute average ranks would affect the value of T.

14. Characteristics of the t distribution.

1. The t distribution is a unimodal distribution centered at 0 and symmetric about it.
2. The t distribution curve is a family of curves whose shape changes with the degrees of freedom v. When v is small, the t values are more dispersed and the curve is flatter; as v increases, the t distribution gradually approaches the standard normal distribution. When v = ∞, the t distribution becomes the standard normal distribution.

15. Characteristics of the Poisson distribution.

1. The Poisson distribution is a single-parameter discrete distribution; its parameter μ represents the average number of events occurring per unit of time or space, and is also called the intensity parameter.
2. The variance σ² of the Poisson distribution equals its mean μ, i.e., σ² = μ.
3. The Poisson distribution is asymmetric: it is skewed when μ is not large, and it rapidly approaches the normal distribution as μ increases. Generally, when μ ≥ 20 it can be considered approximately normal, and Poisson data can then be handled with normal-distribution methods.
4. The cumulative probabilities of the Poisson distribution are of two kinds, left-cumulative and right-cumulative, for the number of events occurring in a unit of time or space.
5. The shape of the Poisson distribution depends on the size of μ: when μ is small the distribution is skewed; as μ increases it becomes more symmetric; when μ ≥ 20 the distribution is close to normal, and when μ ≥ 50 the Poisson distribution can be regarded as the normal distribution N(μ, μ) and handled accordingly.

16. What are the characteristics of the sampling distribution of the sample mean?

Across, say, 100 samples, the individual sample means differ, but they fluctuate around the population mean. The distribution curve of the sample means is high in the middle, low on both sides, symmetric, and approximately normal. The standard deviation of the sample means (the standard error) is markedly smaller than the population standard deviation.

17. Compared with the u (standard normal) distribution, what are the characteristics of the t distribution?

1. Both are unimodal distributions; the t distribution curve is centered at t = 0 and symmetric about it.
2. The peak of the t distribution curve is lower and its tails are higher, indicating that relatively more observations lie far from 0; the smaller the degrees of freedom, the more obvious this is.
3. As the degrees of freedom increase, the t distribution approaches the standard normal distribution; when v = ∞, the limiting distribution of the t distribution is the standard normal distribution.
It shows that the individual of far t value is relatively more, and the smaller the degree of freedom is,the more obvious the situation is3. as the degree of freedom increases, the t distribution is more and more close to the standard normal distribution. When the v= is infinite, the limit distribution of the t distribution is the standard normal distribution18, why do we assume that the conclusion of the test can not be absolute?The hypothesis test, when P refused to accept alpha Ho, H1 Ho, this is not completely true only existing sample information does not support the Ho; when the O> alpha, to Ho, but not that of Ho fully established. In a word, whether we reject Ho or not, we make mistakes. Therefore, the conclusion of statistics is probabilistic and can not be absolute.19 what is the difference between a completely randomized design and a randomized block design?A completely randomized design with randomized method to control the error variance, that by randomization, variation between subjects distributed in each processing level is random, each level was tested in the treated before is exactly the same, can be attributed to the differences between the experimental results do not affect the same treatment. A completely randomized design hypothesis can balance the differences between subjects by randomization, but in fact the experimental results often include individual differences, and when we can be eliminated, the results tend to be more accurate.The random block design uses the method of group grouping to separate the variance caused by irrelevant variables. The randomized block design should be as homogeneous as possible within the group, making the difference of experimental results better attributed to the influence of different treatments.IxJ 20 factorial design with different randomized design: IxJ is applied to pick up the object of two kinds of processing factors, which belongs to the multi factor experimental design, analysis of effect of different factors on test effect by variance, interaction exists; randomized design was applied to the subjects of a region and processing factors. Group factors are often subjects certain characteristics of the object itself, is to improve the balance between the two groups and the establishment of the non treatment factors, is the group condition, using variance analysis to compare the differences among groups onlyThe type of chi square test and its application conditions are discussedOne. Four chi square data chi square test: two rates or two constituent ratio comparison. 1) completely random design four lattice table, data chi square test 2) paired design, four grid table, data chi square testTwo. Row X list data chi square test: comparison of multiple rates or constituent ratios. 1) comparison of multiple sample rates 2) comparison of two groups or groups of constituent ratios 3) data association testsThree. Multiple comparisons of rates: ask for a difference between the overall rates, using 22 comparisons between ratesFour. Frequency distribution goodness of fit test: the sample content n is large enough, and the theoretical frequency is not less than 5; if the degree of freedom is =1, continuity correction should be doneFive. 
22. What is the difference between parametric and nonparametric tests, and what are their respective advantages and disadvantages?

A parametric test assumes that the random sample comes from a population with a known form of distribution, and tests whether two or more population parameters are the same. In practice, many data sets do not satisfy the requirements of parametric statistics, in which case a parametric test cannot be used.
A nonparametric test does not explicitly restrict the parameters of the population distribution in the tested hypothesis, nor does it depend on the form of the population distribution of the samples. Its advantages are a wide range of application, fewer restrictive conditions, and robustness.

23. Briefly describe the applicability of nonparametric tests.

They can be used not only for the comparison and analysis of ordinal (graded) data, but also for exploratory studies with extreme skewness, small samples from populations whose variances differ or whose distribution form is unknown, and data that cannot be represented by precise values (open-ended data). However, if the data are suitable for a parametric test, using a nonparametric test instead loses information.

24. Briefly describe the differences and relations between linear correlation and rank correlation.

Differences: they differ in the requirements on the data, in the purpose of application, in the statistics used, in the calculation, in the range of values, and in the units involved.
Relations: (1) the two have the same theoretical basis, and the estimates are obtained according to the least-squares principle; (2) for the same data, the signs of the regression coefficient b and the correlation coefficient r are consistent; (3) the hypothesis test for the regression coefficient b is equivalent to that for the correlation coefficient r; (4) correlation can be explained by means of regression.

25. What problems should we pay attention to in linear regression analysis?

The four LINE conditions: Linearity, Independence, Normality, and Equal variance.

26. Briefly describe the differences and relations between linear regression and linear correlation.

Linear regression analysis describes how the response variable Y changes with the explanatory variable X, but it does not describe how close the relationship between the two is. Correlation analysis makes no distinction between dependent and independent variables; it only studies the degree and direction of the association between any two variables (or between one variable and several variables), and cannot be used to predict or control the change of one variable from another.

27. What are the indicators describing central tendency? What are the differences in their scopes of application?

Mean: normal or approximately normal distributions.
Geometric mean: data in geometric progression, or log-normally distributed data.
Median: skewed distributions; irregular distributions; data with indeterminate values at one or both ends (open-ended data).

28. What is hypothesis testing?
An example can illustrate it. First, a hypothesis about the population is set up; then, under this assumption, the probability of obtaining by random sampling a result as extreme as, or more extreme than, the observed one is computed. If this probability is small, the hypothesis is rejected; if it is not a small probability, the hypothesis is not rejected. This process is hypothesis testing.

29. Discuss the conclusions of hypothesis tests.

The conclusion of a hypothesis test rests on the principle that a small-probability event should not occur in a single actual trial. Therefore, when the tested hypothesis is rejected, a type I error may be committed; when the tested hypothesis is not rejected, a type II error may be committed.

30. How can sampling error be controlled or reduced in a sampling study?

Use a reasonable sampling design and increase the sample size.

31. What is sampling error? Why is sampling error inevitable in sampling studies?

Sampling error is the difference, caused by sampling, among sample statistics and between a sample statistic and the population parameter. Because individual differences exist objectively and the object of study is only a part of the population, it is inevitable that the results from this part differ from the results for the whole population.

32. Can we say that the smaller the P value of a hypothesis test is, the greater the difference between the two population indices? Why?

No. The size of the P value does not correspond exactly to the size of the difference between the population indices. Besides being related to the size of the population difference, the P value is also related to the size of the sampling error: for the same population difference, different sampling errors give different P values. In practical work, the sampling error is mainly reflected in the sample size.

33. In the rank sum test, why are tied values in different groups given average ranks, while tied values within the same group need not be?

Ties within the same group do not affect the calculation of the rank sums of the two groups: assigning consecutive ranks or average ranks gives the same group rank sums, and so does not bias the test. Ties across groups, however, would bias the rank sums if average ranks were not used.
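As a small illustration of the midrank rule in questions 13 and 33, scipy's rankdata assigns tied observations the average of the ranks they would otherwise occupy, and mannwhitneyu applies the corresponding tie correction internally (the data are made up):

# Midranks for ties across groups (questions 13 and 33); data are made up.
from scipy.stats import rankdata, mannwhitneyu

group1 = [3, 5, 5, 8]
group2 = [5, 6, 9, 9]
combined = group1 + group2

print(rankdata(combined))   # the three 5s each get rank (2+3+4)/3 = 3.0
# mannwhitneyu handles the tie correction when computing the p-value.
stat, p = mannwhitneyu(group1, group2, alternative="two-sided")
print(stat, p)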
A Local Generalized Method of Moments Estimator

Arthur Lewbel∗
Boston College

June 2006

Abstract

A local Generalized Method of Moments Estimator is proposed for nonparametrically estimating unknown functions that are defined by conditional moment restrictions.

JEL Codes: C14, C13.
Keywords: Generalized Method of Moments, Nonparametric Estimation, Semiparametric Estimation, Conditional Moments.

∗Department of Economics, Boston College, 140 Commonwealth Avenue, Chestnut Hill, MA 02467 USA. Tel: (617)-552-3678; fax: (617)-552-2308; email: lewbel@ url: /~lewbel

Assume we observe a sample of observations of a random vector W. Let Z and X be random subvectors of W, so each contains some or all of the elements of W. Hansen's (1982) Generalized Method of Moments (GMM) estimates a vector of parameters θ from unconditional moments E[g(θ0, W)] = 0, given a known vector-valued function g. Asymptotically efficient estimates of θ given conditional moments E[g(θ, W) | Z = z] = 0 can be obtained using variants of GMM or empirical likelihood such as Newey (1993), Donald, Imbens, and Newey (2003), Otsu (2003), or Dominguez and Lobato (2004).

Suppose that, in place of a parameter vector θ, the goal is estimation of a vector of unknown functions q(X) where E[g(q0(X), W) | Z = z] = 0. Resulting nonparametric conditional moment estimators include Carrasco and Florens (2000), Newey and Powell (2003), and Ai and Chen (2003). These estimators require verification of relatively elaborate sets of regularity conditions.

This paper focuses on the special case of the general nonparametric conditional moment problem where X and Z are the same, so we wish to estimate unknown functions q(Z) where

E[g(q0(Z), W) | Z = z] = 0.   (1)

While estimators such as Ai and Chen (2003) and others can be applied in this case, the local GMM estimator proposed here exploits the assumption that the unknown functions q depend on the same variables as the conditional moments, resulting in a short, simple set of regularity conditions. This local GMM estimator is straightforward to implement and is a natural extension of ordinary GMM estimation of the same model when Z is discrete.

An example application is the nonparametric probit model W = I[q0(Z) + e ≥ 0], where I() denotes the indicator function that equals one if its argument is true and zero otherwise, q0(z) is an unknown function to be estimated, and e is a standard normal independent of Z, or has some other known distribution. Then estimation could be based on equation (1) where the function g is g(q, w) = w − F_e(q), where F_e is the known cumulative distribution function of −e. Nonparametric censored or truncated regression would have a similar form. The local GMM estimator could also be used for estimation of Euler equations, which are mean zero conditional on information in a given time period, and may have parameters q0(z), such as preference parameters, that are unknown functions of observables. Another application is Lewbel (2006), who provides examples of a treatment model that generates conditional moments in the form of equation (1).

The local GMM estimator provided here is very similar to the local nonlinear least squares estimator of Gozalo and Linton (2000), and in fact an early draft of their paper included a GMM variant. In their applications a local version of a parametric model is estimated, which converges at rate root n if the model is correct, but remains consistent as a nonparametric regression if the parametric model is misspecified (analogous to local polynomial regression, which is the special case of their estimator in which the parametric model is a polynomial). In the present paper no parametric model is proposed. Instead, the intended estimand from the outset is the unknown vector of functions q0(z).
To motivate the estimator, consider first the case where Z is discretely distributed, or more specifically, that Z has one or more mass points and we only wish to estimate q0(z) at those points. Let θ_{z0} = q0(z). If the distribution of Z has a mass point with positive probability at z, then

E[g(θ_z, W) I(Z = z)] = E[g(θ_z, W) | Z = z] Pr(Z = z),

so equation (1) holds if and only if E[g(θ_{z0}, W) I(Z = z)] = 0. It therefore follows that if θ_{z0} is identified from the moment conditions (1), then we may, under standard regularity conditions, estimate θ_{z0} = q0(z) by the ordinary GMM estimator

θ̂_z = arg min_{θ_z} [Σ_{i=1}^n g(θ_z, W_i) I(Z_i = z)]′ Ω_n [Σ_{i=1}^n g(θ_z, W_i) I(Z_i = z)]   (2)

for some sequence of positive definite Ω_n. If Ω_n is a consistent estimator of Ω_{z0} = E[g(θ_{z0}, W) g(θ_{z0}, W)′ I(Z = z)]^{−1}, then standard efficient GMM gives

√n (θ̂_z − θ_{z0}) →d N(0, (E[∂g(θ_{z0}, W) I(Z = z)/∂θ_z′]′ Ω_{z0} E[∂g(θ_{z0}, W) I(Z = z)/∂θ_z′])^{−1}).

Continue to assume that equation (1) holds and identifies q0(z), where g is known and q0 is unknown, but now assume that Z is continuously distributed. The proposed local GMM estimator consists of applying equation (2) to this case of continuous Z by replacing averaging over just the observations with Z_i = z with local averaging over observations Z_i in the neighborhood of z.

Assumption A1. Let Z_i, W_i, i = 1, ..., n, be an independently, identically distributed random sample of observations of the random vectors Z, W. The d-vector Z is continuously distributed with density function f(Z). For a given point z in the interior of supp(Z) having f(z) > 0 and a given vector-valued function g(q, w), where g(q(z), w) is twice differentiable in the vector q(z) for all q(z) in some compact set Θ(z), there exists a unique q0(z) ∈ Θ(z) such that E[g(q0(z), W) | Z = z] = 0. Let Ω_n be a finite positive definite matrix for all n, as is Ω = plim_{n→∞} Ω_n.

Assumption A1 provides the required moment condition structure and identification for the model, and Assumption A2 below provides conditions for local averaging. Define ε[q(z), W], V(z), and R(z) by

ε[q(z), W] = g(q(z), W) f(Z) − E[g(q(z), W) f(Z) | Z = z],
V(z) = E[ε(q0(z), W) ε(q0(z), W)′ | Z = z],
R(z) = E[{∂g[q0(z), W]/∂q0(z)′} f(Z) | Z = z].

Assumption A2. Let η be some constant greater than 2. Let K be a nonnegative symmetric kernel function satisfying ∫K(u)du = 1 and ∫||K(u)||^η du finite. For all q(z) ∈ Θ(z), E[||g(q(z), W) f(Z)||^η | Z = z], V(z), R(z), and Var[{∂g(q(z), W)/∂q(z)} f(Z) | Z = z] are finite and continuous at z, and E[g(q(z), W) f(Z) | Z = z] is finite and twice continuously differentiable at z.

Define

S_n(q(z)) = [1/(n b^d)] Σ_{i=1}^n g[q(z), W_i] K((z − Z_i)/b),

where b = b(n) is a bandwidth parameter. The proposed local GMM estimator is

q̂(z) = arg inf_{q(z)∈Θ(z)} S_n(q(z))′ Ω_n S_n(q(z)).   (3)

THEOREM 1: Given Assumptions A1 and A2, if the bandwidth b satisfies n b^{d+4} → 0 and n b^d → ∞, then q̂(z) is a consistent estimator of q0(z) with limiting distribution

(n b^d)^{1/2} [q̂(z) − q0(z)] →d N(0, (R(z)′ Ω R(z))^{−1} R(z)′ Ω V(z) Ω R(z) (R(z)′ Ω R(z))^{−1} ∫K(u)² du).

Theorem 1 assumes a bandwidth rate that makes bias shrink faster than variance, and so is not mean-square optimal. One could instead choose the mean-square optimal rate where n b^{d+4} goes to a constant, but the resulting bias term would then have a complicated form that depends on the kernel regression biases in both S_n(q0(z)) and its derivative with respect to q0(z), among other terms.

Applying the standard two-step GMM procedure, we may first estimate q̃(z) = arg inf_{q(z)∈Θ(z)} S_n(q(z))′ S_n(q(z)), then let Ω_n be the inverse of the sample variance of S_n(q̃(z)) to get Ω = V(z)^{−1}, making

(n b^d)^{1/2} [q̂(z) − q0(z)] →d N(0, (R(z)′ V(z)^{−1} R(z))^{−1} ∫K(u)² du),

where R(z) can be estimated using

R_n(z) = [1/(n b^d)] Σ_{i=1}^n {∂g[q̂(z), W_i]/∂q(z)} K((z − Z_i)/b).
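To make the estimator concrete, here is a minimal simulation sketch in Python of (3) for the nonparametric probit example, with scalar z and q, identity weighting, a Gaussian kernel, and a fixed (untuned) bandwidth. These are illustrative choices, not the paper's implementation.

# Local GMM for the nonparametric probit model W = I[q0(Z) + e >= 0],
# with moment g(q, w) = w - Phi(q); scalar case, identity weighting.
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
n = 2000
Z = rng.uniform(-2, 2, n)
q0 = lambda z: 0.8 * z                             # true function to recover
W = (q0(Z) + rng.standard_normal(n) >= 0).astype(float)

def local_gmm(z, b=0.3):
    K = norm.pdf((z - Z) / b)                      # Gaussian kernel weights
    S = lambda q: np.mean(K * (W - norm.cdf(q)))   # S_n(q(z)) up to constants
    return minimize_scalar(lambda q: S(q) ** 2,    # minimize S_n' S_n
                           bounds=(-5, 5), method="bounded").x

print(local_gmm(0.5), q0(0.5))   # estimate vs. true value at z = 0.5

Since the 1/(n b^d) normalization of S_n does not change the minimizer, the code drops it; with vector-valued g one would minimize the full quadratic form S_n′ Ω_n S_n instead of a squared scalar.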
At the expense of some additional notation, the two estimators (2) and (3) can be combined to handle Z containing both discrete and continuous elements, by replacing the kernel function in S_n with the product of a kernel over the continuous elements and an indicator function for the discrete elements.

PROOF OF THEOREM 1: Define

  S̄_n(q(z)) = ∂S_n(q(z))/∂q(z)' = (1/(n b^d)) Σ_{i=1}^n (∂g[q(z), W_i]/∂q(z)') K((z − Z_i)/b),
  Q_n(q(z)) = S_n(q(z))' Ω_n S_n(q(z)).

Let S_0(q(z)) = plim_{n→∞} S_n(q(z)), and similarly for S̄_n and Q_n. Assumptions A1 and A2 give sufficient conditions for consistency of these kernel estimators, so these probability limits exist and

  S_0(q(z)) = E[g(q(z), W) f(Z) | Z = z],
  Q_0(q(z)) = S_0(q(z))' Ω S_0(q(z)).

Now consider consistency of q̂(z). We have pointwise convergence of S_n(q(z)) to S_0(q(z)) and compactness of Θ(z). It is also the case that |S̄_n(q(z))| = O_p(1), since S̄_n(q(z)) is a kernel estimator, and standard conditions have been provided for its consistency, that is, plim |S̄_n(q(z))| = E[|∂g(q(z), W)/∂q(z)| f(Z) | Z = z]. This suffices for stochastic equicontinuity, and therefore we have the uniform convergence

  plim sup_{q(z) ∈ Θ(z)} |S_n(q(z)) − S_0(q(z))| = 0.

It follows that Q_n(q(z)) also converges uniformly to Q_0(q(z)). The assumptions provide compactness of Θ(z) and imply continuity of Q_0(q). The quadratic form Q_0 is uniquely minimized at S_0(q_0(z)) = 0 and hence at q(z) = q_0(z), so the standard conditions for consistency, plim q̂(z) = q_0(z), are satisfied.

For the limiting distribution, Taylor expanding the first order conditions as in standard GMM gives

  S̄_n(q̂(z))' Ω_n [ S_n(q_0(z)) + S̄_n(q̄(z)) (q̂(z) − q_0(z)) ] = 0,

where q̄(z) lies between q̂(z) and q_0(z). By consistency of q̂, the uniform convergence of S̄_n, and using R(z) = S̄_0(q_0(z)), this simplifies to

  R(z)' Ω [ S_n(q_0(z)) + R(z) (q̂(z) − q_0(z)) ] = o_p(1).

Solving for q̂(z) − q_0(z) and multiplying by (n b^d)^{1/2} gives

  (n b^d)^{1/2} (q̂(z) − q_0(z)) = −[R(z)' Ω R(z)]^{-1} R(z)' Ω (n b^d)^{1/2} S_n(q_0(z)) + o_p(1).

Now S_0(q_0(z)) = 0 and standard kernel regression limiting distribution theory gives

  (n b^d)^{1/2} S_n(q_0(z)) →_d N[ 0, V(z) ∫ K(u)^2 du ],

and the theorem follows.

References

Ai, C. and X. Chen (2003), "Efficient Estimation of Models With Conditional Moment Restrictions Containing Unknown Functions," Econometrica, 71, 1795-1844.
Carrasco, M. and J. P. Florens (2000), "Generalization of GMM to a Continuum of Moment Conditions," Econometric Theory, 16, 797-834.
Dominguez, M. and I. Lobato (2004), "Consistent Estimation of Models Defined by Conditional Moment Restrictions," Econometrica, 72, 1601-1615.
Donald, S. G., G. W. Imbens, and W. K. Newey (2003), "Empirical Likelihood Estimation and Consistent Tests With Conditional Moment Restrictions," Journal of Econometrics, 117, 55-93.
Florens, J.-P. and L. Malavolti (2003), "Instrumental Regression with Discrete Variables," unpublished manuscript.
Hansen, L. (1982), "Large Sample Properties of Generalized Method of Moments Estimators," Econometrica, 50, 1029-1054.
Lewbel, A. (2006), "Estimation of Average Treatment Effects With Misclassification," Econometrica, forthcoming.
Newey, W. K. (1993), "Efficient Estimation of Models With Conditional Moment Restrictions," in Handbook of Statistics, vol. 11, ed. by G. S. Maddala, C. R. Rao, and H. D. Vinod, Amsterdam: North Holland, chapter 16.
Newey, W. K. and J. L. Powell (2003), "Instrumental Variable Estimation of Nonparametric Models," Econometrica, 71, 1565-1578.
Otsu, T. (2003), "Penalized Empirical Likelihood Estimation of Conditional Moment Restriction Models with Unknown Functions," unpublished manuscript.
Copula's Conditional Dependence Measures for Portfolio Management and Value at Risk

Dean Fantazzini
Department of Economics, Chair for Economics and Econometrics, University of Konstanz (Germany)
Universitatsstrasse 10, Box D124, Zip code 78457
Email: Dean.Fantazzini@uni-konstanz.de

Abstract
Traditional portfolio theory based on the multivariate normal distribution assumes that investors can benefit from diversification by investing in assets with lower correlations. However, this is not what happens in reality, since it is quite easy to see financial markets with different correlations but almost the same number of market crashes (if we define a market crash as occurring when returns are in their lowest q-th percentile). In a similar fashion, recent empirical studies show that in volatile periods financial markets tend to be characterized by different levels of dependence than occur in quiet periods. In order to take this reality into account, we resort to copula theory and its conditional dependence measures, namely Kendall's tau and tail dependence. The former satisfies most of the desired properties that a dependence measure must have, and it can detect non-linear association that correlation cannot see. Tail dependence refers to the dependence that arises between random variables from extreme observations. We consider a portfolio made up of the five most important futures contracts currently traded in American markets, and we take into consideration the most volatile period of the last decade, that is, between March 13th, 2000 and June 9th, 2000. We show how these conditional dependence measures can easily be implemented both in the traditional mean-variance framework and in multivariate VaR estimation, with a significant improvement over traditional multivariate correlation analysis.

1 Introduction

Traditional portfolio theory based on the multivariate normal distribution assumes that investors can benefit from diversification by investing in assets with lower correlations. However, this is not what happens in reality, since it is quite easy to see financial markets with different correlations but almost the same number of market crashes (if we define a market crash as an event when returns are in their lowest q-th percentile). Correlation is a good measure of dependence in multivariate normal distributions, but it has several shortcomings: a) The variances of the random variables must be finite for the correlation to exist, and for fat-tailed distributions this need not be the case (a bivariate t-distribution with 2 degrees of freedom, for example); b) Independence between two random variables implies that linear correlation is zero, but the converse is true only for a multivariate normal distribution. This does not hold when only the marginals are Gaussian while the joint distribution is not normal, because correlation reflects linear association and not non-linear dependency; c) Correlation is not invariant to strictly monotone transformations. This is because it depends not only on the joint distribution but also on the marginal distributions of the considered variables, so that changes of scale or other transformations of the marginals have an effect on correlation.

In order to overcome these problems we can resort to copula theory, since copulae capture those properties of the joint distribution which are invariant under strictly increasing transformations. A common dependence measure that can be expressed as a function of copula parameters and is scale invariant is Kendall's tau. It satisfies most of the desired properties that a dependence measure must have (see Nelsen, 1999)
and it measures concordance between two random variables: concordance arises if large values of one variable are associated with large values of the other, and small ones occur with small values of the other; if this is not true, the two variables are said to be discordant. It is for this reason that concordance can detect nonlinear association that correlation cannot see.

As asset log return distributions are not normally distributed, minimization of the portfolio's variance does not minimize portfolio risk and produces the wrong capital allocation. New risk measures have been proposed to obtain better capital allocations, but at the cost of simplicity and computational tractability: this is why most applied professionals skip them and prefer to rely on previous methods, as happens in other financial fields (just think back to the Black & Scholes pricing formula and GARCH(1,1), which are still by far the most used models for option pricing and volatility forecasting). In order to satisfy this demand for understandable models, we propose here to use the Kendall's tau dependence measure within the traditional mean-variance framework, in place of the correlation coefficients: this solution has the advantage of keeping the model tractable while at the same time considering the non-linear dependency among the considered variables.

In a similar fashion, recent empirical studies show that in volatile periods financial markets tend to be characterized by different levels of dependence than occur in quiet periods. In order to take this reality into account, we propose to use the concept of tail dependence, which refers to the dependence that arises between random variables from extreme observations. An important feature of copulae is that they allow for different degrees of tail dependence: upper tail dependence exists when there is a positive probability of positive outliers occurring jointly, while lower tail dependence is symmetrically defined as the probability of negative outliers occurring jointly. What we propose is a direct consideration of this concept in VaR models by means of copula theory, as tail dependence coefficients can be calculated as simple functions of copula parameters: if we follow the well-known RiskMetrics multiple positions VaR model, tail dependence coefficients can be used in place of linear correlation coefficients. The same approach can be followed with Kendall's tau, too.

What we do in this work is to analyze the five most important futures contracts currently traded in American markets (SP500, Dow Jones, Nasdaq100, Euro Dollar, T Bond Note) with high frequency data sampled at 5-minute frequency, taking into consideration the most volatile period of the last decade, that is, between March 13th, 2000 and June 9th, 2000. This period of time witnessed the fall of world financial markets following the burst of the high-tech bubble, with large intraday drawdowns. Thus, this sample is perfectly suited to highlight the importance of the copula-based dependence approach compared to traditional correlation analysis. We build up a portfolio and a multiple VaR position following both approaches, using the initial part of the sample to estimate the assets' weights and the 95% (99%) VaR, and the remaining part to compare the out-of-sample performances of the two approaches, both in terms of risk-return measures and the number of VaR exceedances of the effective portfolio losses. We show that our approach outperforms the correlation-based one both in terms of portfolio results and VaR back-testing.
The rest of the paper is organized as follows. In Section 2 we provide an outline of copula theory, while in Section 3 we present its conditional dependence measures, that is, Kendall's tau and tail dependence; in Section 4 we introduce the main methods for copula estimation, while in Section 5 we show how to use copula dependence measures within portfolio management and multivariate VaR estimation. Section 6 presents the empirical results on the asset allocation problem and VaR estimation for a portfolio of five futures contracts currently traded in American markets. We conclude in Section 7.

2 Copula Theory

2.1 Unconditional Copulas

An n-dimensional copula is basically a multivariate cumulative distribution function with uniformly distributed margins on [0,1]. Consider X_1, ..., X_n to be random variables, W the conditioning variable and H the joint distribution function, and further assume that the distribution function H has all the required derivatives. We have:

Definition 1 (Unconditional copula): The unconditional copula of (X_1, ..., X_n), where X_1 ~ F_1, ..., X_n ~ F_n, and F_1, ..., F_n are continuous, is the joint distribution function of U_1 = F_1(X_1), ..., U_n = F_n(X_n).

The variables U_1, ..., U_n are the 'probability integral transforms' of X_1, ..., X_n, which follow a Uniform(0,1) distribution regardless of the original distribution F. Thus a copula is a joint distribution of Unif(0,1) random variables.

Proposition 1 (Properties of an unconditional copula): A copula is a function C of n variables on the unit n-cube [0,1]^n with the following properties:
1. The range of C(u_1, u_2, ..., u_n) is the unit interval [0,1];
2. C(u_1, u_2, ..., u_n) = 0 if any u_i = 0, for i = 1, 2, ..., n;
3. C(1, ..., 1, u_i, 1, ..., 1) = u_i, for all u_i in [0,1];
4. C(u_1, u_2, ..., u_n) is n-increasing in the sense that for every a ≤ b in [0,1]^n the measure ΔC_a^b assigned by C to the n-box [a, b] = [a_1, b_1] × ... × [a_n, b_n] is nonnegative, that is,

  ΔC_a^b = Σ_{(ε_1,...,ε_n) ∈ {0,1}^n} (−1)^{Σ_{i=1}^n ε_i} C(ε_1 a_1 + (1−ε_1) b_1, ..., ε_n a_n + (1−ε_n) b_n) ≥ 0.

This definition shows that C is a multivariate distribution function with uniformly distributed margins. Copulae have many useful properties, such as uniform continuity and (almost everywhere) existence of all partial derivatives, just to mention a few (see Nelsen (1999), Theorem 2.2.4 and Theorem 2.2.7). Moreover, it can be shown that every copula is bounded by the so-called Fréchet-Hoeffding bounds,

  max(u_1 + ... + u_n − n + 1, 0) ≤ C(u_1, u_2, ..., u_n) ≤ min(u_1, ..., u_n),

which are commonly denoted by W and M. In two dimensions, both of the Fréchet-Hoeffding bounds are copulas themselves, but as soon as the dimension increases, the Fréchet-Hoeffding lower bound W is no longer n-increasing. However, the inequality on the left-hand side cannot be improved, since for any u from the unit n-cube there exists a copula C_u such that W(u) = C_u(u) (see Nelsen (1999), Theorem 2.10.12).

We now present Sklar's theorem, which justifies the role of copulas as dependence functions:

Theorem 1 (Sklar's theorem): Let H denote an n-dimensional distribution function with margins F_1, ..., F_n. Then there exists an n-copula C such that for all real (x_1, ..., x_n),

  H(x_1, ..., x_n) = C(F_1(x_1), ..., F_n(x_n)).

If all the margins are continuous, then the copula is unique. Moreover, the converse of the above statement is also true, and it is the most interesting for multivariate density modeling, since it implies that we may link together any n ≥ 2 univariate distributions, of any type (not necessarily from the same family), with any copula in order to get a valid bivariate or multivariate distribution.
Corollary 1: Let F_1^{(−1)}, ..., F_n^{(−1)} denote the generalized inverses of the marginal distribution functions. Then for every (u_1, ..., u_n) in the unit n-cube there exists a unique copula C: [0,1] × ... × [0,1] → [0,1] such that

  C(u_1, ..., u_n) = H(F_1^{(−1)}(u_1), ..., F_n^{(−1)}(u_n)).

See Nelsen (1999) for a proof, Theorem 2.10.9 and the references given therein. From this corollary we know that given any two marginal distributions and any copula we have a joint distribution. A copula is thus a function that, when applied to univariate marginals, results in a proper multivariate pdf: since this pdf embodies all the information about the random vector, it contains all the information about the dependence structure of its components. Using copulas in this way splits the distribution of a random vector into individual components (marginals) with a dependence structure (the copula) among them without losing any information. It is important to highlight that this theorem does not require F_1 and F_n to be identical or even to belong to the same distribution family.

By applying Sklar's theorem and using the relation between the distribution and the density function, we can derive the multivariate copula density c(F_1(x_1), ..., F_n(x_n)) associated with a copula function C(F_1(x_1), ..., F_n(x_n)):

  f(x_1, ..., x_n) = (∂^n [C(F_1(x_1), ..., F_n(x_n))] / (∂F_1(x_1) ... ∂F_n(x_n))) · Π_{i=1}^n f_i(x_i) = c(F_1(x_1), ..., F_n(x_n)) · Π_{i=1}^n f_i(x_i),

where

  c(F_1(x_1), ..., F_n(x_n)) = f(x_1, ..., x_n) / Π_{i=1}^n f_i(x_i).   (1)

The copula density will later be used to fit the copula parameters to real market data.

2.2 Conditional Copulas

Time series analysis is often interested in random variables conditioned on some variables; this is why we now discuss how the existing results in copula theory may be extended to allow for conditioning variables. We consider here the bivariate case, since it will later be used for the empirical analysis. We will furthermore assume that the dimension of the conditioning variable, W, is 1.

Definition 2 (Conditional copula): The conditional copula of (X_1, X_2) | W, where X_1 | W ~ F_1 and X_2 | W ~ F_2, is the conditional joint distribution function of U_1 = F_1(X_1 | W) and U_2 = F_2(X_2 | W) given W.

A two-dimensional conditional copula is derived from any distribution function such that the conditional joint distribution of the first two variables given the remaining variables is a copula for all values of the conditioning variables. It is simple to show that it has the following properties:

Proposition 1 (Properties of a conditional copula): A two-dimensional conditional copula has the following properties:
1. It is a function C: [0,1] × [0,1] × W → [0,1];
2. C(u_1, 0 | w) = C(0, u_2 | w) = 0, for every u_1, u_2 in [0,1] and each w in W;
3. C(u_1, 1 | w) = u_1 and C(1, u_2 | w) = u_2, for every u_1, u_2 in [0,1] and each w in W;
4. C(u_1, u_2 | w) is grounded and n-increasing.

For the proof see Patton (2003). It is easy to observe that in this case a two-dimensional conditional copula is the conditional joint distribution of two conditionally Uniform(0,1) random variables.

Theorem 2 (Sklar's theorem for continuous conditional distributions): Let F_1 be the conditional distribution of X_1 | W, F_2 be the conditional distribution of X_2 | W, and H be the joint conditional distribution of (X_1, X_2) | W. Assume that F_1 and F_2 are continuous in x_1 and x_2. Then there exists a unique conditional copula C such that

  H(x_1, x_2 | w) = C(F_1(x_1 | w), F_2(x_2 | w) | w).

Conversely, if we let F_1 be the conditional distribution of X_1 | W, F_2 be the conditional distribution of X_2 | W, and C be a conditional copula, then the function H defined by the equation above is a conditional bivariate distribution function with conditional marginal distributions F_1 and F_2 (Patton, 2003). Sklar's theorem for conditional distributions implies that the conditioning variable(s) W must be the same for both marginal distributions and the copula: if we do not use the same conditioning variable for F_1, F_2 and C, the function H will not be, in general, a joint conditional distribution function.
The only case when H will be the joint distribution of (X_1, X_2) | (W_1, W_2) is when F_1(x_1 | W_1) = F_1(x_1 | W_1, W_2) and F_2(x_2 | W_2) = F_2(x_2 | W_1, W_2), that is, when some variables affect the conditional distribution of one variable but not the other. It is straightforward to see that we can derive an equivalent to Corollary 1 for the conditional case, so that it is possible to extract the implied conditional copula from any bivariate conditional distribution.

2.3 Elliptical Copulas

The class of elliptical distributions provides useful examples of multivariate distributions, because they share many of the tractable properties of the multivariate normal distribution. Furthermore, they allow one to model multivariate extreme events and forms of non-normal dependency. Elliptical copulas are simply the copulas of elliptical distributions. Following Fang, Kotz and Ng (1987), we have the following definition:

Definition 3 (Elliptical distribution): Let X be an n-dimensional vector of random variables and, for some μ in R^n and some n × n nonnegative definite symmetric matrix Σ, let the characteristic function φ_{X−μ}(t) be a function of the quadratic form t'Σt. Then X has an elliptical distribution with parameters (μ, Σ, φ), and we write X ~ E_n(μ, Σ, φ).

We now present two copulae belonging to the elliptical family that will later be used in the empirical applications: the Gaussian and the t-copula.

2.3.1 Gaussian copula

The Gaussian copula is the copula of the multivariate normal distribution. In fact, the random vector X = (X_1, ..., X_n) is multivariate normal if the univariate margins F_1, ..., F_n are Gaussian and the dependence structure among the margins is described by the following copula function:

  C(u_1, ..., u_n; Σ) = Φ_n(Φ^{-1}(u_1), ..., Φ^{-1}(u_n); Σ),   (2)

where Φ_n is the standard multivariate normal distribution function with linear correlation matrix Σ and Φ^{-1} is the inverse of the standard univariate Gaussian distribution function. When n = 2, expression (2) can be written as

  C(u_1, u_2; ρ) = ∫_{−∞}^{Φ^{-1}(u_1)} ∫_{−∞}^{Φ^{-1}(u_2)} (1 / (2π √(1−ρ^2))) exp( −(r^2 − 2ρrs + s^2) / (2(1−ρ^2)) ) dr ds,

where ρ is the linear correlation coefficient between the two random variables. The copula density is derived by applying equation (1):

  c(Φ(x_1), ..., Φ(x_n)) = f^{Gaussian}(x_1, ..., x_n) / Π_{i=1}^n f_i^{Gaussian}(x_i)
    = [ (1 / ((2π)^{n/2} |Σ|^{1/2})) exp(−(1/2) x'Σ^{-1}x) ] / [ Π_{i=1}^n (1/√(2π)) exp(−(1/2) x_i^2) ]
    = (1 / |Σ|^{1/2}) exp( −(1/2) ζ'(Σ^{-1} − I)ζ ),

where ζ = (Φ^{-1}(u_1), ..., Φ^{-1}(u_n))' is the vector of the Gaussian univariate inverse distribution functions, and u_i = Φ(x_i).

2.3.2 T-copula

The copula of the multivariate standardized t-Student distribution is the t-Student copula, defined as follows:

  C(u_1, ..., u_n; Σ, ν) = T_{Σ,ν}(t_ν^{-1}(u_1), ..., t_ν^{-1}(u_n)),   (3)

where T_{Σ,ν} is the standardized multivariate Student's t distribution function, Σ is the correlation matrix, ν is the degrees of freedom, and t_ν^{-1}(u) denotes the inverse of the Student's t cumulative distribution function. When n = 2, expression (3) can be written as

  C^{Student}(u, v; ρ) = ∫_{−∞}^{t_ν^{-1}(u)} ∫_{−∞}^{t_ν^{-1}(v)} (1 / (2π √(1−ρ^2))) ( 1 + (r^2 − 2ρrs + s^2) / (ν(1−ρ^2)) )^{−(ν+2)/2} dr ds,

where ρ is the linear correlation coefficient between the two random variables. The copula density is again derived by applying equation (1):

  c(t_ν(x_1), ..., t_ν(x_n)) = f^{Student}(x_1, ..., x_n) / Π_{i=1}^n f_i^{Student}(x_i)
    = |Σ|^{-1/2} (Γ((ν+n)/2) / Γ(ν/2)) (Γ(ν/2) / Γ((ν+1)/2))^n (1 + ζ'Σ^{-1}ζ/ν)^{−(ν+n)/2} / Π_{i=1}^n (1 + ζ_i^2/ν)^{−(ν+1)/2},   (4)

where ζ = (t_ν^{-1}(u_1), ..., t_ν^{-1}(u_n))' is the vector of the Student's t univariate inverse distribution functions.
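The closed form just derived makes the Gaussian copula density trivial to evaluate numerically. The following is a minimal Python sketch of that formula; the correlation matrix and evaluation point are arbitrary illustrations.

import numpy as np
from scipy.stats import norm

def gaussian_copula_density(u, Sigma):
    """Gaussian copula density c(u) = |Sigma|^{-1/2} exp(-1/2 z'(Sigma^{-1} - I)z),
    where z = (Phi^{-1}(u_1), ..., Phi^{-1}(u_n)) are the inverse-normal transforms."""
    z = norm.ppf(u)                                  # probability integral transforms
    Sinv = np.linalg.inv(Sigma)
    quad = z @ (Sinv - np.eye(len(u))) @ z
    return np.linalg.det(Sigma) ** -0.5 * np.exp(-0.5 * quad)

# bivariate example with correlation 0.6
Sigma = np.array([[1.0, 0.6], [0.6, 1.0]])
print(gaussian_copula_density(np.array([0.3, 0.7]), Sigma))

By equation (1), multiplying this density by arbitrary marginal densities f_i(x_i) evaluated at x_i = F_i^{-1}(u_i) yields a valid joint density with those margins, which is exactly the modeling freedom Sklar's theorem provides.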
2.4 Archimedean Copulas

Archimedean copulae provide analytical tractability and a large spectrum of different dependence measures. These copulae can be used in a wide range of applications for the following reasons: a) the ease with which they can be constructed; b) the many parametric families of copulas belonging to this class; c) the great variety of different dependence structures; d) the nice properties possessed by the members of this class. An Archimedean copula can be defined as follows:

Definition 4 (Archimedean copula): Consider a function φ: [0,1] → [0,∞] which is continuous, strictly decreasing (φ'(u) < 0), convex (φ''(u) > 0), and for which φ(1) = 0. We then define the pseudo-inverse φ^{[−1]}: [0,∞] → [0,1] such that

  φ^{[−1]}(t) = φ^{-1}(t) for 0 ≤ t ≤ φ(0), and φ^{[−1]}(t) = 0 for φ(0) ≤ t ≤ ∞.

As φ is convex, the function C: [0,1]^2 → [0,1] defined as

  C(u_1, u_2) = φ^{[−1]}[φ(u_1) + φ(u_2)]   (5)

is an Archimedean copula, and φ is called the "generator" of the copula. Moreover, if φ(0) = ∞, the pseudo-inverse is an ordinary inverse function (that is, φ^{[−1]} = φ^{-1}) and we call φ and C a strict generator and a strict Archimedean copula, respectively. The multivariate extension can be found in Embrechts, Lindskog and McNeil (2001) as well as in Joe (1997): for all n ≥ 2, the function C: [0,1]^n → [0,1] defined as C(u_1, ..., u_n) = φ^{-1}[φ(u_1) + ... + φ(u_n)] is an n-dimensional Archimedean copula if and only if φ^{-1} is completely monotone on [0,∞). We present now some of the most important multivariate Archimedean copulas.

1) Clayton (or Cook-Johnson) copula: it corresponds to family B4 of Joe (1997). Consider the generator φ(t) = (t^{−α} − 1)/α, with α in [−1,∞)\{0}, and inverse φ^{-1}(t) = (1 + αt)^{−1/α}. By using (5) we get

  C(u_1, ..., u_n) = [ max( Σ_{j=1}^n u_j^{−α} − n + 1, 0 ) ]^{−1/α}.

However, if α > 0 then we have φ(0) = ∞, and the above expression becomes

  C(u_1, ..., u_n) = ( Σ_{j=1}^n u_j^{−α} − n + 1 )^{−1/α}.

2) Gumbel copula: it corresponds to family B6 of Joe (1997). The generator is φ(t) = (−ln t)^α, with α ≥ 1, and the inverse is φ^{-1}(t) = exp{−t^{1/α}}. We have

  C(u_1, ..., u_n) = exp{ −[ (−ln u_1)^α + ... + (−ln u_n)^α ]^{1/α} }.

3) Survival (or rotated) Gumbel copula: the rotation allows us to take a copula exhibiting greater dependence in the negative (positive) quadrant and create one with greater dependence in the positive (negative) quadrant. The distribution function in this case is

  C(u_1, ..., u_n) = Σ_{j=1}^n u_j − n + 1 + exp{ −[ (−ln(1−u_1))^α + ... + (−ln(1−u_n))^α ]^{1/α} }.
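The generator construction makes these families a few lines of code each. A minimal Python sketch of the Clayton and Gumbel copula distribution functions built directly from equation (5) with strict generators (α > 0 for Clayton, α ≥ 1 for Gumbel); the parameter values and evaluation point are illustrative.

import numpy as np

# Archimedean copula via its generator: C(u) = phi_inv(phi(u_1) + ... + phi(u_n))
def clayton_phi(t, a):     return (t ** -a - 1.0) / a
def clayton_phi_inv(s, a): return (1.0 + a * s) ** (-1.0 / a)
def gumbel_phi(t, a):      return (-np.log(t)) ** a
def gumbel_phi_inv(s, a):  return np.exp(-s ** (1.0 / a))

def archimedean_cdf(u, phi, phi_inv, a):
    """Evaluate C(u_1, ..., u_n) by summing the generator, eq. (5)."""
    return phi_inv(np.sum(phi(np.asarray(u), a)), a)

u = [0.4, 0.7, 0.9]
print(archimedean_cdf(u, clayton_phi, clayton_phi_inv, a=2.0))  # Clayton, alpha > 0
print(archimedean_cdf(u, gumbel_phi, gumbel_phi_inv, a=1.5))    # Gumbel, alpha >= 1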
3 Dependence Measures

Traditional portfolio theory based on the multivariate normal distribution assumes that investors can benefit from diversification by investing in assets with lower correlations; however, correlation is a good measure of dependence in multivariate normal distributions but it presents several shortcomings: a) The variances of X_1 and X_2 must be finite for the correlation to exist, and for fat-tailed distributions this need not be the case (a bivariate t-distribution with 2 degrees of freedom, for example); b) Independence between two random variables implies that linear correlation is zero, but the converse is true only for a multivariate normal distribution. This does not hold when only the marginals are Gaussian while the joint distribution is not normal, because correlation reflects linear association and not non-linear dependency; c) Correlation is not invariant to strictly monotone transformations. This is because it depends not only on the joint distribution but also on the marginal distributions of the considered variables, so that changes of scale or other transformations of the marginals have an effect on correlation; d) Given the marginal distributions F_1 and F_2 for two random variables X and Y, not all linear correlations between −1 and +1 can be attained through suitable specification of the joint distribution F (see Hoeffding, 1940). Moreover, there are also statistical problems with correlation, as a single observation can have an arbitrarily high influence on the linear correlation. Thus linear correlation is not a robust measure.

3.1 Kendall's Tau

In order to avoid the above problems, we have to turn to rank correlation; however, we first need to define the concepts of concordance and discordance:

Definition 5 (Concordance): Observations (x_i, y_i) and (x_j, y_j) are concordant if x_i < x_j and y_i < y_j, or if x_i > x_j and y_i > y_j. Analogously, (x_i, y_i) and (x_j, y_j) are discordant if x_i < x_j and y_i > y_j, or if x_i > x_j and y_i < y_j. Alternatively, (x_i, y_i) and (x_j, y_j) are concordant if (x_i − x_j)(y_i − y_j) > 0 and discordant if (x_i − x_j)(y_i − y_j) < 0.

Concordance arises if large values of one variable are associated with large values of the other, and small ones occur with small values of the other; if this is not true, the two variables are said to be discordant. It is for this reason that concordance can detect nonlinear association that correlation cannot see. If we have (X_1, Y_1) and (X_2, Y_2) independent and identically distributed random vectors, the population version of Kendall's tau rank correlation, τ(X, Y), is defined by

  τ(X, Y) = P[(X_1 − X_2)(Y_1 − Y_2) > 0] − P[(X_1 − X_2)(Y_1 − Y_2) < 0],

that is, τ(X, Y) is the probability of concordance minus the probability of discordance. The sample version of the measure of dependence known as Kendall's tau is defined in terms of concordance as follows:

Definition 6 (Kendall's tau, sample version): Let us denote a random sample of n observations {(X_1, Y_1), (X_2, Y_2), ..., (X_n, Y_n)} from a vector (X, Y) of continuous random variables. There are n(n−1)/2 distinct pairs (X_i, Y_i) and (X_j, Y_j) of observations in the sample, and each pair is either concordant or discordant; if we call c the number of concordant pairs and d the number of discordant pairs, then an estimate of Kendall's rank correlation for the sample is given by

  τ̂ = (c − d)/(c + d) = (c − d)/(n(n−1)/2),

which can alternatively be expressed as

  τ̂ = (n(n−1)/2)^{-1} Σ_{i<j} sign[(X_i − X_j)(Y_i − Y_j)].

τ(X, Y) can be considered a measure of the degree of monotonic dependence between X and Y, whereas linear correlation measures the degree of linear dependence only. The generalisation of Kendall's tau to n > 2 dimensions is analogous to the procedure for linear correlation, where we have an n × n matrix of pairwise correlations. The main properties of this dependence measure are reported here:

Theorem 3 (Kendall's tau properties): Let X and Y be random variables with continuous distributions F_1 and F_2, joint distribution H and copula C. The following are true:
(1) τ(X, Y) = τ(Y, X);
(2) if X and Y are independent then τ(X, Y) = 0;
(3) −1 ≤ τ(X, Y) ≤ +1;
(4) for T: R → R strictly monotonic on the range of X, τ(X, Y) satisfies τ(T(X), Y) = τ(X, Y) if T is increasing, and τ(T(X), Y) = −τ(X, Y) if T is decreasing.

Proof: (1) to (3) follow from the fact that τ(X, Y) is a measure of concordance, while for (4) see Embrechts, McNeil and Straumann (1999) and references therein.

Kendall's tau can be expressed in terms of copula functions, thus simplifying calculations:

Proposition 2 (Kendall's tau for copulae):

  τ(X, Y) = 4 ∫_0^1 ∫_0^1 C(u, v) dC(u, v) − 1.

Proof: See Nelsen (1999).

Evaluating Kendall's tau requires the evaluation of a double integral, and for elliptical distributions this is not an easy task. This problem was solved by Lindskog, McNeil, and Schmock (2002), who proved the following theorem:

Proposition 3 (Kendall's tau for elliptical distributions): Let X ~ E_n(μ, Σ, φ). If i, j in {1, ..., n} satisfy P{X_i = μ_i} < 1 and P{X_j = μ_j} < 1, then

  τ(X_i, X_j) = ( 1 − Σ_{x ∈ R} (P{X_i = x})^2 ) (2/π) arcsin ρ_{ij},   (6)

where the sum extends over all atoms of the distribution of X_i and ρ_{ij} is the correlation coefficient. If in addition rank(Σ) ≥ 2, then (6) simplifies to

  τ(X_i, X_j) = ( 1 − (P{X_i = μ_i})^2 ) (2/π) arcsin ρ_{ij},

which further simplifies, if P{X_i = μ_i} = 0, to

  τ(X, Y) = (2/π) arcsin ρ_{xy}.   (7)

Proof: See Lindskog, McNeil, and Schmock (2002).
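Relation (7) suggests a simple moment-matching calibration: estimate τ from the data and set ρ = sin(πτ̂/2). The following minimal Python sketch checks the relation on simulated bivariate normal data and then inverts it; the correlation value and sample size are arbitrary illustrations.

import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(1)
rho = 0.6
X = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=20000)

tau_hat, _ = kendalltau(X[:, 0], X[:, 1])    # sample Kendall's tau
tau_theory = 2 / np.pi * np.arcsin(rho)      # eq. (7) for elliptical distributions
rho_implied = np.sin(np.pi * tau_hat / 2)    # invert (7) to recover the copula rho
print(tau_hat, tau_theory, rho_implied)

Because τ is invariant to strictly increasing transformations, the same inversion remains valid whatever the margins, which is precisely what makes it preferable to linear correlation for copula calibration.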
For an Archimedean copula the situation is simpler, in that τ(X, Y) can be evaluated directly from the generator of the copula. Archimedean copulas are easy to work with because expressions with a function of only one argument (the generator) can often be employed rather than expressions with a function of two arguments (the copula).

Proposition 4 (Kendall's tau for Archimedean copulae): Let X and Y be random variables with an Archimedean copula C having generator φ. The population version of τ(X, Y) for X and Y is given by

  τ(X, Y) = 1 + 4 ∫_0^1 (φ(t)/φ'(t)) dt.

Proof: See Genest and MacKay (1986).

Straightforward calculations show that for the Clayton and Gumbel copulae we have the following results:

  Clayton: τ(X, Y) = α/(α + 2);  Gumbel = rotated Gumbel: τ(X, Y) = 1 − α^{-1}.

3.2 Tail Dependence

As we previously saw, linear correlation presents many problems, mainly when we work with heavy-tailed distributions. Copula functions have the nice property of being able to model tail dependence: this measure refers to the dependence that arises between random variables from extreme observations. It is easy to see that in volatile periods financial markets tend to be characterized by different levels of dependence than occur in quiet periods. An important feature of copulae is that they allow for different degrees of tail dependence: upper tail dependence exists when there is a positive probability of positive outliers occurring jointly, while lower tail dependence is symmetrically defined as the probability of negative outliers occurring jointly.

Definition 7 (Upper tail dependence): Let (X_1, X_2) be a bivariate vector of continuous random variables with marginal distribution functions F_1 and F_2. The coefficient of upper tail dependence of X_1 and X_2 is

  λ_U = lim_{u→1} Pr[X_2 > F_2^{-1}(u) | X_1 > F_1^{-1}(u)] = lim_{u→1} Pr[X_1 > F_1^{-1}(u) | X_2 > F_2^{-1}(u)]
    = lim_{u→1} Pr[X_2 > F_2^{-1}(u), X_1 > F_1^{-1}(u)] / Pr[X_1 > F_1^{-1}(u)]
    = lim_{u→1} (1 − Pr[X_1 ≤ F_1^{-1}(u)] − Pr[X_2 ≤ F_2^{-1}(u)] + Pr[X_2 ≤ F_2^{-1}(u), X_1 ≤ F_1^{-1}(u)]) / (1 − Pr[X_1 ≤ F_1^{-1}(u)])
    = lim_{u→1} (1 − 2u + C(u, u)) / (1 − u),

provided a limit λ_U in [0,1] exists. Here, F^{-1}(u) = inf{x | F(x) ≥ u}, u in (0,1). X_1 and X_2 are said to be asymptotically dependent in the upper tail if λ_U is in (0,1] and asymptotically independent if λ_U = 0. Upper tail dependence exists when there is a positive probability of positive outliers occurring jointly.

Definition 8 (Lower tail dependence): Let (X_1, X_2) be a bivariate vector of continuous random variables with marginal distribution functions F_1 and F_2. The coefficient of lower tail dependence of X_1 and X_2 is

  λ_L = lim_{u→0} Pr[X_2 ≤ F_2^{-1}(u) | X_1 ≤ F_1^{-1}(u)] = lim_{u→0} Pr[X_1 ≤ F_1^{-1}(u) | X_2 ≤ F_2^{-1}(u)] = lim_{u→0} C(u, u)/u,

provided a limit λ_L in [0,1] exists. X_1 and X_2 are said to be asymptotically dependent in the lower tail if λ_L is in (0,1] and asymptotically independent if λ_L = 0. Lower tail dependence exists when there is a positive probability of negative outliers occurring jointly.

3.2.1 Tail dependence for elliptical copulae

When dealing with copulae without closed-form expressions, such as the Gaussian and the Student's t copula, an alternative version is provided by Embrechts, Lindskog and McNeil (2003). First, by applying De l'Hopital's rule, we obtain

  λ_U = lim_{u→1} (1 − 2u + C(u, u))/(1 − u) = −lim_{u→1} [ −2 + (∂/∂x_1) C(x_1, x_2)|_{x_1=x_2=u} + (∂/∂x_2) C(x_1, x_2)|_{x_1=x_2=u} ].

Then, by the definition of the copula function and by the expression of the copula density, we have

  P(V ≤ v | U = u) = (∂/∂u) C(u, v),  P(U ≤ u | V = v) = (∂/∂v) C(u, v),
  P(V > v | U = u) = 1 − (∂/∂u) C(u, v),  P(U > u | V = v) = 1 − (∂/∂v) C(u, v).

Rearranging the expression above, we obtain

  λ_U = lim_{u→1} [P(V > u | U = u)] + lim_{u→1} [P(U > u | V = u)].

In the case of exchangeable copulas (i.e. C(u, v) = C(v, u)) we can simplify as
follows:

  λ_U = 2 lim_{u→1} [P(V > u | U = u)].
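For copulae with closed forms, the limit in Definition 7 can be checked numerically. A minimal Python sketch for the bivariate Gumbel copula compares the finite-u ratio (1 − 2u + C(u, u))/(1 − u) with the textbook closed form λ_U = 2 − 2^{1/α} (which is not derived above; we state it here as a standard result). The α value is an arbitrary illustration.

import numpy as np

def gumbel_cdf(u, v, a):
    """Bivariate Gumbel copula C(u, v) = exp(-[(-ln u)^a + (-ln v)^a]^(1/a))."""
    return np.exp(-((-np.log(u)) ** a + (-np.log(v)) ** a) ** (1.0 / a))

a = 1.5
for u in [0.99, 0.999, 0.9999]:
    lam = (1 - 2 * u + gumbel_cdf(u, u, a)) / (1 - u)   # (1 - 2u + C(u,u))/(1 - u)
    print(u, lam)
print("closed form:", 2 - 2 ** (1.0 / a))               # lambda_U = 2 - 2^(1/a)

As u approaches 1 the ratio converges to the closed form, confirming that the Gumbel copula is upper-tail dependent for α > 1; the Gaussian copula, by contrast, would drive the same ratio to zero.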
4 Variables

The word variable is derived from the Latin variare, to vary. A variable gives us a method for associating a concept like income with a set of data. Variables are used together with operators in the syntactical portion of our specification language (e.g., death*birth).

In older statistical graphics systems, a variable refers to a column in a rectangular cases-by-variables file. Many statistical packages assume rows are cases or observations that comprise instances or samples of a variable represented by the column. There is nothing in our definition of a variable which requires it to represent a row or column, however. The only requirement is that the variable mapping function return a single value in the range for every index. This generality is especially important for graphing geometric and spatial data (Cressie, 1991). But it affects the way we approach other data as well. In a landmark book, now out of print and seldom read by statisticians, Coombs (1964) examined the relationship between structural models and patterns of data. Like Guttman (1971, 1977), Coombs believed that the prevalent practice of modeling based on cases-by-variables data layouts often prevents researchers from considering more parsimonious structural theories and keeps them from noticing meaningful patterns in their data.

In computer languages, a variable is a symbol for a data structure or container. The data contained in the structure are assumed to vary from some state of a program to another. The index function for a variable in a computer language is often called an address. Some computer languages type variables (logicals, strings, integers, reals, etc.) by confining their ranges to one of these or some other data types. Typing can prevent undefined or nonsensical results with some operations (adding apples and oranges), but there is nothing in our definition of a variable to require typing based on data classes. Some operations on variables do not require type compatibility. We do require that scales (see Chapter 6) map sets of variables to a common range, however. Otherwise, some coordinate transformations covered in Chapter 9 would not be possible.

4.1 Transforms

Transforms are transformations on variables. One purpose of transforms is to make statistical operations on variables appropriate and meaningful. Another is to create new variables, aggregates, or other types of summaries.

Table 4.1 shows several variable transforms. This sample list is intended to cover the examples in this book and to be a template for designing the signature and behavior of new functions. We omit ordinary algebraic operators and standard functions here, but will use them elsewhere in the book.

Table 4.1 Variable Transforms

The mathematical functions return values that are case-by-case transformations of the values in x, where x is a varset. The log function returns natural logarithms. The exponential and trigonometric functions are standard. The sign() function returns 1 if a value of x is positive and -1 if it is negative. The pow() function transforms x to the p-th power of itself.

The statistical functions compute basic statistics. The mean, median, and mode fill each value in the column with the corresponding column summary statistic. The residual function returns the residuals of a regression of y on x. It has several methods for regression type, with linear being the default. The sort function orders all variables in a variable set according to the sorted order of x.
The rank function fills a column with the ranks of x, assigning fractional ranks when there are ties. The prank function fills each case in the column with the value (i - .5)/n for i = 1, ..., n cases, assuming (as if) the column were sorted on x. The cut function cuts a sorted column into k groups, replacing the value of the argument with an integer from 1 to k. The zinv function replaces a value with the inverse normal cumulative distribution function value. The lag function replaces value x_i with x_{i-1}, setting the first value in the column to missing. It has an optional second argument, a positive or negative integer specifying the amount and direction of shifting. The grpfn function is a routine that evaluates the function f named between the quotes separately for each of the groups specified by the g argument in the parameter list.

The multivariate functions use more than one variable to construct a new variable. The sum function fills a column with the sum of values of its arguments. The diff function does the same for differences. The prod and quotient functions compute products and ratios. The influence function computes an influence function (see Figure 10.34). The miss function imputes missing values assuming a function named f (see Figure 20.1).

4.2 Examples

The following examples have been constructed to show that transforms are more powerful than simple recodings of variables. Together with data functions, transforms can help us create graphics that employ structures unavailable to ordinary graphics systems.

4.2.1 Sorting

We once met a statistician who worked for the FBI. He had helped uncover the Chicago Machine's voting fraud in the 1970's. We were curious about the methods the Federal team had used to expose the fraud, so we asked him about discriminant analysis, logistic regression, and other techniques the statisticians might have used. He replied, "We sorted the voter tape and looked for duplicate names and addresses." This statistical methodology may have been inspired by the fabled Chicago Machine slogan, "Vote early and often."

Sorting is one of the most elementary and powerful methods of statistical and graphical analyses. A sort is a one-to-one transformation. When we use position to represent the values of a variable, we are implicitly sorting those values. Sorting variables displayed by position not only reveals patterns but also makes it easier to make comparisons and locate subsets. Sorting categorical variables according to the values of associated numerical variables can itself constitute a graphical method.

Figure 4.1 shows an example. The data are numbers of arrests by sex reported to the FBI in 1985 for selected crimes. The graphic displays the differences in proportions of each crime committed by males and females. To create the graphic, we must standardize the data within crime category (dividing by the row totals). The first two TRANS statements accomplish this. The next TRANS statement creates an mf varset consisting of the difference in crime proportions. We sort this varset and then plot it against the crime categories. The pattern of the dots indicates that males predominate in almost every crime category except vice and runaways. The largest biases are in the violent crimes. Rape, not surprisingly, is almost exclusively male. For brevity, we have omitted the GPL for the bipolar arrow annotation at the bottom.

TRANS: total = sum(male, female)
TRANS: m = quotient(male, total)
TRANS: f = quotient(female, total)
TRANS: mf = diff(m, f)
TRANS: mf = sort(mf)
ELEMENT: point(position(mf*crime))

Figure 4.1 Gender differences in crime patterns (the bipolar arrow annotation runs from "more female" to "more male")
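The TRANS pipeline above maps directly onto ordinary data-frame operations. Here is a minimal Python/pandas sketch of the same sequence of transforms; the counts are invented stand-ins for the FBI arrest data, not the book's actual values.

import pandas as pd

# hypothetical arrest counts, for illustration only
df = pd.DataFrame({"crime":  ["murder", "rape", "robbery", "vice", "runaway"],
                   "male":   [9000, 4100, 10500, 3000, 2500],
                   "female": [1200,   60,  1100, 3400, 3800]})

total = df["male"] + df["female"]       # TRANS: total = sum(male, female)
m = df["male"] / total                  # TRANS: m = quotient(male, total)
f = df["female"] / total                # TRANS: f = quotient(female, total)
df["mf"] = m - f                        # TRANS: mf = diff(m, f)
df = df.sort_values("mf")               # TRANS: mf = sort(mf)
print(df[["crime", "mf"]])              # pairs for point(position(mf*crime))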
4.2.2 Probability Plots

Probability plots compare the n ordered values of a variable to the n-tiles of a specified probability distribution. If we have 100 values, for example, we base a normal probability plot on the pairs (x_1, z_1), (x_2, z_2), ..., (x_99, z_99), where x_i are the data values ordered smallest to largest and z_i is the lower 100α percentage point of the standard normal probability distribution. To fit all 100 values, we change the computation of α from i/100 to (i - .5)/100. If a probability plot follows roughly a diagonal straight line, then we infer that the shape of our sample distribution is approximately normal.

Figure 4.2 shows a probability plot of military expenditures from the countries data. Our first transformation is prank(), a proportional-rank that computes the value (i - .5)/n, i = 1, ..., n, corresponding to the values x_i after an ascending sort. Instead of reordering the data, however, prank() returns the values in the original data sequence. Next, the zinv() function computes the values of the cumulative standard normal distribution corresponding to the points produced by prank(). In other words, we apply the zinv function to the prank() values to get the theoretical normal variables we plot the data against (students usually have difficulty with this two-fold indirection). The left panel of Figure 4.2 shows the raw military values and the right panel shows the plot on a log10 scale. Logging straightens out the plot, which is another way of saying that it transforms the distribution from a positively skewed shape to a more normal one.

A probability plot can also be produced through a probability scale transformation. We will show an example in Figure 6.13. We compute prank() and then rescale the α values through a probability scale transformation. The method in this chapter produces a new variable zinv(), while the scale method in Chapter 6 changes only the scale on which prank() is plotted.

TRANS: alpha = prank(military)
TRANS: z = zinv(alpha)
ELEMENT: point(position(military*z))

TRANS: alpha = prank(military)
TRANS: z = zinv(alpha)
SCALE: log(dim(1), base(10))
ELEMENT: point(position(military*z))

Figure 4.2 Probability plots of military expenditures
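The prank()/zinv() two-fold indirection is easier to see in a few lines of code. A minimal Python sketch with invented expenditure values in place of the countries data:

import numpy as np
from scipy.stats import norm

x = np.array([120.0, 3.5, 40.2, 7.8, 260.9, 15.1, 1.2, 88.4])  # hypothetical values

n = len(x)
alpha = (x.argsort().argsort() + 0.5) / n   # prank(): (i - .5)/n in original order
z = norm.ppf(alpha)                         # zinv(): theoretical normal quantiles
logx = np.log10(x)                          # log10 scale, as in the right panel

for xi, zi in sorted(zip(logx, z)):         # pairs to plot: point(military*z)
    print(f"{xi:6.2f}  {zi:6.2f}")

Note that prank() never reorders the data; the double argsort computes the sorted-position fraction for each case in place, which is exactly what lets z line up case-by-case with the original variable.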
4.2.3 Aggregating Variables

Figure 4.3 shows a box plot (see Section 8.1.1.6) of the birth rate data. This plot divides a variable into a set of fractiles and then displays the variation in each fractile against the median value of the variable within that fractile. It can be used to determine whether we need a transformation for heteroscedastic data, in which the spread increases (or decreases) proportionally to some power of the location. See Section 15.3.1.1 for a similar plot.

The grpfun() function computes a function (median) on a variable (birth) within each separate group defined by a grouping variable (quartile). Notice that the frame crosses the same variable (birth) with itself, but the position() function for the box graph selects the values of birthquart to fix the location of each box.

TRANS: quartile = cut(birth, 4)
TRANS: birthquart = grpfun(birth, quartile, "median")
ELEMENT: schema(shape(shape.box), position(birthquart*birth))

Figure 4.3 Box plot of a variable against its quartile medians

4.2.4 Regression Residuals

Figure 4.4 shows a linear regression residuals plot. The independent variable in the regression is the birth rate variable from the countries data and the dependent variable is death rate. The residuals are calculated from this regression, standardized, and then used as the dependent variable in the frame model for Figure 4.4. The U-shaped pattern of residuals suggests that the linear model is not a good representation of the relationship between death rates and birth rates for the countries.

One might ask whether statistical procedures have a useful role within a graphical system. Two alternatives are traditional graphics packages, which receive pre-calculated data from spreadsheets and statistics packages, and traditional statistics packages, which send their calculations to basic graphics sub-systems. Both of these are compromises, however. Obviously, a graphics system should not incorporate the entire range of procedures within a statistical package; this would reduce its focus. However, there are analyses that can be performed only when graphics and statistics are intimately tied. Exploratory statistical packages such as Data Desk implement such a model. Embedding statistical procedures such as regression and correlation in a graphics system provides functions that are unavailable in other software. In the end, the grammar of graphics has much to do with the grammar of statistics. We will discuss this further at the end of Chapter 7.

TRANS: residual = residual.linear.student(birth, death)
ELEMENT: point(position(birth*residual))

Figure 4.4 Studentized residual plot

4.3 Sequel

Now that we have variables to describe relationships through links to data, we need a system for expressing those relationships. The next chapter covers the formal definitions for varset algebra.
Facing the Inverted Yield Curve
How does the world view America's inverted yield curve, and what impact will it have on financial markets?

Ahead of the Curve
By John Rubino
Translated from CFA Magazine, March-April 2007

On the way to the global recession, we discovered something interesting (in fact, we discovered many, many interesting things). Although short-term interest rates in Europe and the United States have risen sharply and U.S. Treasuries show a fully inverted yield curve, the global economy remains booming (see Table 1). Stock prices are rising, merger and acquisition activity keeps setting records, and the bond market is also expanding at great speed. Hot money keeps moving from one surprising investment channel to another. It creates wealth, and it creates headaches, though mostly wealth. For investors in any field, today's world is one big candy store.

Yet, historically, none of this should be happening. According to research by Federal Reserve economists Arturo Estrella and Frederic Mishkin, a fully inverted U.S. yield curve (that is, three-month Treasury yields above ten-year Treasury yields) should signal recession, not boom. As the two economists noted in a 2005 paper: "Since 1950, the yield curve has correctly called virtually every recession, with only one 'false' signal, which preceded the credit crunch and productivity slowdown of 1967."

The logic behind the yield curve is quite simple: banks are the traditional source of monetary liquidity in a modern economy. They operate by borrowing from the public through short-term time deposits and certificates of deposit, and then lending the money back out as longer-term car loans and home mortgages. So the wider the spread, that is, the gap between long-term and short-term rates, the more money banks make and the greater their incentive to lend. But invert the yield curve and that spread disappears; banks find that the interest they pay on deposits exceeds the profit they earn on loans. They therefore become reluctant to lend, and economic growth eventually comes to a halt. For the global economy, an inverted U.S. yield curve is especially unfavorable, because U.S. consumption is the key driver of growth for export-oriented economies in regions such as Asia, Latin America, and the Middle East. A U.S. credit crunch would be painful for everyone around the world.
Aabscissa 横坐标absence rate 缺勤率absolute number 绝对数absolute value 绝对值accident error 偶然误差accumulated frequency 累积频数alternative hypothesis 备择假设analysis of data 分析资料analysis of variance(ANOV A) 方差分析arith-log paper 算术对数纸arithmetic mean 算术均数assumed mean 假定均数arithmetic weighted mean 加权算术均数asymmetry coefficient 偏度系数average 平均数average deviation 平均差Bbar chart 直条图、条图bias 偏性binomial distribution 二项分布biometrics 生物统计学bivariate normal population 双变量正态总体Ccartogram 统计图case fatality rate(or case mortality) 病死率census 普查chi-sguare(X2) test 卡方检验central tendency 集中趋势class interval 组距classification 分组、分类cluster sampling 整群抽样coefficient of correlation 相关系数coefficient of regression 回归系数coefficient of variability(or coefficieut of variation) 变异系数collection of data 收集资料column 列(栏)combinative table 组合表combined standard deviation 合并标准差combined variance(or poolled variance) 合并方差complete survey 全面调查completely correlation 完全相关completely random design 完全随机设计confidence interval 可信区间,置信区间confidence level 可信水平,置信水平confidence limit 可信限,置信限constituent ratio 构成比,结构相对数continuity 连续性control 对照control group 对照组coordinate 坐标correction for continuity 连续性校正correction for grouping 归组校正correction number 校正数correction value 校正值correlation 相关,联系correlation analysis 相关分析correlation coefficient 相关系数critical value 临界值cumulative frequency 累积频率Ddata 资料degree of confidence 可信度,置信度degree of dispersion 离散程度degree of freedom 自由度degree of variation 变异度dependent variable 应变量design of experiment 实验设计deviation from the mean 离均差diagnose accordance rate 诊断符合率difference with significance 差别不显著difference with significance 差别显著discrete variable 离散变量dispersion tendency 离中趋势distribution 分布、分配Eeffective rate 有效率eigenvalue 特征值enumeration data 计数资料equation of linear regression 线性回归方程error 误差error of replication 重复误差error of type II Ⅱ型错误,第二类误差error of type I Ⅰ型错误,第一类误差estimate value 估计值event 事件experiment design 实验设计experiment error 实验误差experimental group 实验组extreme value 极值Ffatality rate 病死率field survey 现场调查fourfold table 四格表freguency频数freguency distribution 频数分布GGaussian curve 高斯曲线geometric mean 几何均数grouped data 分组资料Hhistogram直方图homogeneity of variance 方差齐性homogeneity test of variances 方差齐性检验hypothesis test 假设检验hypothetical universe 假设总体Iincidence rate 发病率incomplete survey 非全面调检indepindent variable 自变量indivedual difference 个体差异infection rate 感染率inferior limit 下限initial data 原始数据inspection of data 检查资料intercept 截距interpolation method 内插法interval estimation 区间估计inverse correlation 负相关Kkurtosis coefficient 峰度系数Llatin sguare design 拉丁方设计least significant difference 最小显著差数least square method 最小平方法,最小乘法leptokurtic distribution 尖峭态分布leptokurtosis 峰态,峭度linear chart 线图linear correlation 直线相关linear regression 直线回归linear regression eguation 直线回归方程link relative 环比logarithmic normal distribution 对数正态分布logarithmic scale 对数尺度lognormal distribution 对数正态分布lower limit 下限Mmatched pair design 配对设计mathematical statistics 数理统计(学)maximum value 极大值mean 均值mean of population 总体均数mean square 均方mean variance 均方,方差measurement data 讲量资料median 中位数medical statistics 医学统计学mesokurtosis 正态峰method of least squares 最小平方法,最小乘法method of grouping 分组法method of percentiles 百分位数法mid-value of class 组中值minimum value 极小值mode 众数moment 动差,矩morbidity 患病率mortality 死亡率Nnatality 出生率natural logarithm 自然对数negative correlation 负相关negative skewness 负偏志no correlation 无相关non-linear correlation 非线性相关non-parametric statistics 非参数统计normal curve 正态曲线normal deviate 正态离差normal distribution 正态分布normal population 正态总体normal probability curve 正态概率曲线normal range 正常范围normal value 正常值normal kurtosis 正态峰normality test 
正态性检验nosometry 患病率null hypothesis 无效假设,检验假设Oobserved unit 观察单位observed value 观察值one-sided test 单测检验one-tailed test 单尾检验order statistic 顺序统计量ordinal number 秩号ordinate 纵坐标Ppairing data 配对资料parameter参数percent 百分率percentage 百分数,百分率percentage bar chart 百分条图percentile 百分位数pie diagram 园图placebo 安慰剂planning of survey 调查计划point estimation 点估计population 总体,人口population mean 总体均数population rate 总体率population variance 总体方差positive correlation 正相关positive skewness 正偏态power of a test 把握度,检验效能prevalence rate 患病率probability 概率,机率probability error 偶然误差proportion 比,比率prospective study 前瞻研究prospective survey 前瞻调查public health statistics 卫生统计学Qquality eontrol 质量控制quartile 四分位数Rrandom 随机random digits 随机数字random error 随机误差random numbers table 随机数目表random sample 随机样本random sampling 随机抽样random variable 随机变量randomization 随机化randomized blocks 随机区组,随机单位组randomized blocks analysis of variance 随机单位组方差分析randomized blocks design 随机单位组设计randomness 随机性range 极差、全距range of normal values 正常值范围rank 秩,秩次,等级rank correlation 等级相关rank correlation coefficent 等级相关系数rank-sum test 秩和检验rank test 秩(和)检验ranked data 等级资料rate 率ratio 比recovery rate 治愈率registration 登记regression 回归regression analysis 回归分析regression coefficient 回归系数regression eguation 回归方程relative number 相对数relative ratio 比较相对数relative ratio with fixed base 定基比remainder error 剩余误差replication 重复retrospective survey 回顾调查Ridit analysis 参照单位分析Ridit value 参照单位值Ssample 样本sample average 样本均数sample size 样本含量sampling 抽样sampling error 抽样误差sampling statistics 样本统计量sampling survay 抽样调查scaller diagram 散点图schedule of survey 调查表semi-logarithmic chart 半对数线图semi-measursement data 半计量资料semi-guartile range 四分位数间距sensitivity 灵敏度sex ratio 性比例sign test 符号检验significance 显著性,意义significance level 显著性水平significance test 显著性检验significant difference 差别显著simple random sampling 单纯随机抽样simple table 简单表size of sample 样本含量skewness 偏态slope 斜率sorting data 整理资料sorting table 整理表sources of variation 变异来源square deviation 方差standard deviation(SD) 标准差standard error (SE) 标准误standard error of estimate 标准估计误差standard error of the mean 均数的标准误standardization 标准化standardized rate 标化率standardized normal distribution 标准正态分布statistic 统计量statistics 统计学statistical induction 统计图statistical inference 统计归纳statistical map 统计推断statistical method 统计地图statistical survey 统计方法statistical table 统计调查statistical test 统计表statistical treatment 统计检验stratified sampling 统计处理stochastic variable 分层抽样sum of cross products of 随机变量deviation from mean 离均差积和sum of ranks 秩和sum of sguares of deviation from mean 离均差平方和superior limit 上限survival rate 生存率symmetry对称(性)systematic error 系统误差systematic sampling 机械抽样Tt-distribution t分布t-testt检验tabulation method 划记法test of normality 正态性检验test of one-sided 单侧检验test of one-tailed 单尾检验test of significance 显著性检验test of two-sided 双侧检验test of two-tailed 双尾检验theoretical frequency 理论频数theoretical number 理论数treatment 处理treatment factor 处理因素treatment of date 数据处理two-factor analysis of variance 双因素方差分析two-sided test 双侧检验two-tailed test 双尾检验type I error 第一类误差type II error 第二类误差typical survey 典型调查Uu test u检验universe 总体,全域ungrouped data 未分组资料upper limit 上限Vvariable 变量variance 方差,均方variance analysis 方差分析variance ratio 方差比variate 变量variation coefficient 变异系数velocity of development 发展速度velocity of increase 增长速度Wweight 权数weighted mean 加权均数Zzero correlation 零相关unit onedivision of labor 劳动分工commodity money 商品货币legal tender 法定货币fiat money 法定通货a medium of exchange交换媒介legal sanction法律制裁face value面值liquid assets流动资产illiquidl assets非流动资产the liquidity scale 流动性指标real estate 不动产checking accounts,demand deposit,checkable deposit 活期存款time 
deposit 定期存款 negotiable order of withdrawal accounts 大额可转让提款单 money market mutual funds 货币市场互助基金 repurchase agreements 回购协议 certificate of deposits 存单 bond 债券 stock 股票 travelers' checks 旅行支票 small-denomination time deposits 小额定期存款 large-denomination time deposits 大额定期存款 bank overnight repurchase agreements 银行隔夜回购协议 bank long-term repurchase agreements 银行长期回购协议 thrift institutions 存款机构 financial institution 金融机构 commercial banks 商业银行 a means of payment 支付手段 a store of value 储藏手段 a standard of value 价值标准
unit two
reserve 储备 note 票据 discount 贴现 circulate 流通 central bank 中央银行 the Federal Reserve System 联邦储备系统 credit union 信用合作社 paper currency 纸币 credit creation 信用创造 branch banking 银行分行制 unit banking 单一银行制 out of circulation 退出流通 capital stock 股本 at par 以票面价值计 electronic banking 电子银行 banking holding company 公司银行 the gold standard 金本位 the Federal Reserve Board 联邦储备委员会 the stock market crash 股市风暴 reserve ratio 准备金比率
unit three
deficit 亏损 roll 展期 wholesale 批发 default 不履约 auction 拍卖 collateralize 担保 markup 价格的涨幅 dealer 交易员 broker 经纪人 pension funds 养老基金 face amount 面值 commercial paper 商业票据 banker's acceptance 银行承兑汇票 Fed fund 联邦基金 eurodollar 欧洲美元 treasury bills 国库券 floating-rate 浮动比率 fixed-rate 固定比率 default risk 拖欠风险 credit rating 信誉级别 tax collection 税收 money market 货币市场 capital market 资本市场 original maturity 原始到期期限 surplus funds 过剩基金
unit four
premium 升水 discount 贴水 par 平价 deficit 赤字 future 期货 capital movements 资本流动 foreign exchange dealings 外汇交易 balance of payment 国际收支 eurodollar market 欧洲美元市场 spot rate 即期汇率 forward rate 远期汇率 cross rate 交叉汇率 arbitrage transaction 套汇交易 space arbitrage 地点套汇 time arbitrage 时间套汇 interest arbitrage 套利 direct quotation 直接标价法 indirect quotation 间接标价法 decimal system 十进制 long position 多头 short position 空头 swedish kronor 瑞典克郎 Sfr 瑞士法郎 DM 德国马克 FFr 法国法郎 Dkr 丹麦克郎 Nkr 挪威克郎 Yen 日元 Can$ 加拿大元 £ 英镑 Lit 意大利里拉 Aus$ 澳大利亚元 DG 荷兰盾 BF 比利时法郎
unit five
the Society for Worldwide Interbank Financial Telecommunication (SWIFT) 环球银行金融电讯协会 the Clearing House Interbank Payments System (CHIPS) 纽约银行同业清算系统 over-the-counter market 场外交易市场 invoice 发票,发货单 portfolio 债务,投资组合 turnover 总成交额 not-for-profit cooperative 非盈利性组织 triangular arbitrage 三角套汇
unit six
quota 配额 guarantee 保函 fixed exchange rate 固定汇率 balance of payment deficit 国际收支逆差 international reserve 国际储备 credit tranche drawing 信贷份额借款 credit tranche 信贷份额 credit tranche facilities 信贷份额贷款便利 international payment 国际收支 buffer stock 缓冲存货 extended facilities 补偿信贷便利 government borrowing 国债;政府借款 price fluctuation 价格波动,价格涨落 export earning 出口收益 enlarged access policy 延期进入政策 credit policy 信用政策组,债务调整 Bretton Woods Agreement 布雷顿森林协议 International Monetary Fund 国际货币基金组织 International Bank for Reconstruction and Development (IBRD) 国际复兴与开发银行 International Development Association (I.D.A.) 国际开发协会 International Finance Corporation (I.F.C.) 国际金融公司 financial intermediary 金融中介 concessional terms 特惠条件 trade credit 商业信贷 earning capacity 收益能力 Bank for International Settlements (B.I.S.) 国际清算银行 financial settlement 财务清算
unit seven
syndication 辛迪加 underwrite 包销,认购 hedge 对冲买卖、套期保值 innovation 到期交易 spread 利差 principal 本金 swap 掉期交易 eurobond market 欧洲债券市场 euronote 欧洲票据 Federal Reserve Bank (FRB) 联邦储备银行 unsecured credit 无担保贷款 定期支付存款 lead bank 牵头银行 negotiable time deposit 议付定期存款 inter-bank money market 银行同业货币市场 medium term loan 中期贷款 syndicated credit 银团贷款 merchant bank 商业银行 portfolio management 有价债券管理 lease financing 租赁融资 note issuance facility 票据发行安排 bearer note 不记名票价 underwriting facility 包销安排 floating-rate note 浮动利率票据 bond holder 债券持有者 London Interbank Offered Rate (LIBOR) 伦敦同业优惠利率 back-up credit line 备用信贷额 promissory note (P.N., p/n) 本票 revolving credit 循环信用证, 即 revolving letter of credit non interest-bearing reserves 无息储备金 interest rate controls 利率管制 interest rate ceiling 利率上限 interest rate floor 利率下限 deposit
insurance 存款保险。
European Journal of Scientific Research
ISSN 1450-216X Vol. 27 No. 4 (2009), pp. 548-553
© EuroJournals Publishing, Inc. 2009
/ejsr.htm

The Comparison Logit and Probit Regression Analyses in Estimating the Strength of Gear Teeth

A.A. Shariff
Centre For Foundation Studies In Science, University of Malaya, 50603 Kuala Lumpur, Malaysia
E-mail: asma@.my

A. Zaharim
Faculty of Engineering and Built Environment

K. Sopian
SERI, University Kebangsaan Malaysia, Bangi, Selangor, Malaysia

Abstract
Logit and probit are two regression methods which are categorised under Generalized Linear Models. Both models can be used when the response variables in the analyses are categorical in nature. For the case of the strength of gear teeth data, the response can be in terms of counted proportions, such as r teeth failing out of n teeth tested. In this paper, the two models, logit and probit, are discussed and the methods of analysis are compared for simulated data sets obtained from an experimental procedure called the staircase design (SCD) experiment. For the analysis, the response variable is the proportion failing and the explanatory variable is the corresponding load. The analysis is also compared with the explanatory variable taken as the logarithm of load. The population distributions of strengths considered are the normal and Weibull distributions, and 1000 SCD experiments are simulated. The sampling distributions of the various estimators are then compared for bias, standard deviation, and mean squared error for the two contrasting population distributions of strength. It is found that a regression of the logit on the logarithm of load seems to be the most robust approach if normality of strengths is in doubt.

Keywords: Logit, probit, regression analysis, counted proportion, gear teeth, staircase design.

1. Introduction

For ordinary linear regression, the response variable is always quantitative and continuous in nature. When the response variables are categorical and in particular binary, that is, they can assume only two values (a 'yes-no' or 'fail-survive'), or in terms of counted proportions (r fail out of n tested), we are led to consider some other models which are more appropriate than ordinary linear regression. An important characteristic of data in which the response variables are binary is that the response variables must lie between 0 and 1. Therefore fitting these data using ordinary linear regression can give predicted proportions above one or below zero, which would be meaningless. On the other hand, what we actually need in this situation is a regression model which will predict the proportion of occurrences, p (let us call them p instead of y), at certain levels of x.

For this type of data, in particular when the response variables are in terms of counted proportions, the relationship between response variables p and explanatory variables x is a non-linear curved relationship: an S-shaped curve which is usually called a sigmoid. The S-shaped behaviour is very common in modeling binomial responses as a function of predictors and also makes use of the assumption that the responses are from an underlying binomial or binary distribution. The purpose of logistic modelling (and also probit) is the same as that of other modelling techniques used in statistics, that is, to find a model that fits the data best and is the simplest, yet physically reasonable, in describing the relationship between the response and the explanatory variables [1]. There is little to distinguish between logit and probit models. Both curves are so similar as to yield essentially identical results. It was found that probit and logit analysis applied to the same set of data produce coefficient estimates which differ approximately by a factor of proportionality, and that factor should be about 1.8 [2].
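The proportionality claim is easy to check on simulated counted-proportion data. Here is a minimal Python sketch using statsmodels GLM fits with logit and probit links; the loads, group sizes, and parameter values are invented for illustration, not the paper's settings.

import numpy as np
from scipy.stats import norm
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 64                                    # hypothetical teeth tested per load level
load = np.linspace(14, 26, 7)
p_true = norm.cdf((load - 20.0) / 2.0)    # assumed normal strength, mean 20, sd 2
fails = rng.binomial(n, p_true)           # r failures out of n tested at each load

X = sm.add_constant(load)
y = np.column_stack([fails, n - fails])   # counted-proportion response (r, n - r)

logit = sm.GLM(y, X, family=sm.families.Binomial()).fit()
probit = sm.GLM(y, X,
                family=sm.families.Binomial(sm.families.links.Probit())).fit()
print(logit.params / probit.params)       # coefficient ratios, roughly 1.6 to 1.8

The two fitted curves are nearly indistinguishable over the observed range; only the coefficient scale differs, by about the stated factor of proportionality.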
There is little to distinguish between logit and probit models. Both curves are so similar as to yield essentially identical results. It was found that probit and logit analyses applied to the same set of data produce coefficient estimates which differ approximately by a factor of proportionality, and that factor should be about 1.8 [2].

2. Materials and Methods

2.1. Logit Versus Probit Regression Techniques

The logit model can be presented as

$$p = \frac{\exp(Z)}{1 + \exp(Z)} \quad (1)$$

where $p$ is the proportion of occurrences, $Z = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k$, and $x_1, \ldots, x_k$ are the explanatory variables. The inverse relation of equation (1) is

$$Z = \ln\left(\frac{p}{1-p}\right) \quad (2)$$

that is, the natural logarithm of the odds ratio, known as the logit. It transforms $p$, which is restricted to the range $[0, 1]$, to the range $(-\infty, \infty)$.

Probit regression analysis involves modelling the response function with the normal cumulative distribution function. The probit of a proportion $p$ is just the point on a normal curve with mean 0 and standard deviation 1 which has this proportion to the left of it. The model can be presented as

$$\Phi^{-1}(p) = Z = \beta_0 + \beta_1 x_1 + \cdots + \beta_k x_k \quad (3)$$

where $p$ is the proportion and $\Phi^{-1}$ is the inverse of the cumulative distribution function of the standard normal distribution. That is,

$$p = \Phi(Z) = \int_{-\infty}^{Z} \frac{1}{\sqrt{2\pi}} \exp(-u^2/2)\, du \quad (4)$$

where $\Phi$ is the cumulative distribution function of the standard normal distribution.

For logistic and probit regression, the binomial, rather than the normal, distribution describes the distribution of the errors and is the basis of the analysis. The principles used for ordinary linear regression analysis can be adapted to fit both regressions. However, instead of the least squares method, for logistic and probit regression it is more appropriate to use maximum likelihood estimation. The likelihood function is given as

$$L = \prod_{i=1}^{m} p_i^{r_i} (1 - p_i)^{n_i - r_i}$$

where the $p_i$ are defined in terms of the parameters $\beta_0, \ldots, \beta_k$ and the known values of the predictor variables. This has to be maximized with respect to the parameters.

2.2. Experimental Design

Gear teeth are commonly tested by applying oscillatory loads, using a special machine called a pulsator-test machine. In the experiment, the test specimen, in this case the gear tooth, is subjected to vibrations of a resonant spring/mass system. When this happens, it experiences stresses and crack propagation takes place. Eventually, after a certain number of cycles, the tooth fails, and the number of cycles to failure can be recorded. If the tooth does not fail after a certain fixed number of cycles, it is considered to have survived the experiment. The experimental procedure used is the well-known staircase design (SCD). The SCD experiment is also known as sensitivity testing or the 'up-and-down' method [3], where specimens are tested close to the anticipated mean level. In the experiment the first test piece should be tested at a load level assumed to be near the mean value of the fatigue strength. If failure occurs before N cycles, the next test piece is tested at one step, a fixed change in load, below the first load level. Otherwise, the next test is at the load one step above the first level. This procedure is continued until all the pieces have been tested. The increment between load levels should be equal for steps up and down and should be approximately one standard deviation of the fatigue strength distribution.
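The staircase procedure is straightforward to simulate. Below is a minimal sketch (my own illustration, not the authors' code) of one run under the paper's normal case: strengths with mean 20 and standard deviation 2, and a load step of one standard deviation:

```python
# Minimal sketch of one staircase (up-and-down) run, assuming normally
# distributed strengths (mean 20, sd 2) and a load step of one sd.
import numpy as np

def staircase(n_items=50, start=20.0, step=2.0, mean=20.0, sd=2.0, seed=0):
    """Return a list of (load, failed) pairs, one per specimen tested."""
    rng = np.random.default_rng(seed)
    strengths = rng.normal(mean, sd, n_items)  # each specimen's fatigue strength
    load, results = start, []
    for s in strengths:
        failed = bool(s < load)                # fails if strength is below the load
        results.append((load, failed))
        load += -step if failed else step      # step down on failure, up on survival
    return results

run = staircase()
print(sum(f for _, f in run), "failures out of", len(run), "specimens")
```

Aggregating such runs by load level yields the counted proportions (r failures out of n tested at each load) that the logit and probit analyses below take as input.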
Since the data obtained are categorical in nature, specifically counted proportions, the fatigue strength of a gear is then determined by analysing the data with appropriate statistical techniques, in this case logit and probit.

2.3. Analysis for SCD

The results obtained in the experiment are analysed using logit and compared with probit. The logit transformation of $p$ is defined by $\ln\left(\frac{p}{1-p}\right)$, and the line

$$\ln\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 x$$

is fitted using maximum likelihood. A comparison is made with fitting the line

$$\ln\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1 \ln x$$

which is equivalent to assuming a log-normal distribution of strengths.

For probit, results of the SCD experiment are analysed by fitting the line

$$\Phi^{-1}(p) = \beta_0 + \beta_1 x$$

where $p$ is the proportion failing and $x$ is the corresponding load. This is equivalent to assuming a normal distribution of strengths. A comparison is then made with

$$\Phi^{-1}(p) = \beta_0 + \beta_1 \ln x$$

The estimated mean fatigue strength, $\mu$, and the lower 1% point of the distribution of fatigue strength, $x_{0.99}$, are the values of $x$ corresponding to $p = 0.5$ and $p = 0.01$ respectively. The standard deviation can be estimated from $\hat{\sigma} = (\mu - x_{0.99})/2.33$.

The methods of analysis have been compared for simulated data sets. The population distribution of strengths is specified and 1000 SCD experiments are simulated. The sampling distributions of the various estimators can thus be compared for bias, standard deviation, and mean squared error. Two contrasting population distributions of strength are considered:

(i) a normal distribution with mean 20.0 and standard deviation 2.0;
(ii) a Weibull distribution [4-6], which has cumulative distribution function $\Pr(X < x) = F(x) = 1 - \exp[-(x/b)^c]$, with shape parameter c = 2 and scale parameter b = 22.56. These parameter values correspond to a mean of 20 and a standard deviation of 10.45.

The probability density functions of both distributions are plotted in Figure 1. The Weibull distribution has a substantial area near zero. This might be realistic for the strength of a component being tested under extreme conditions, as in an accelerated testing programme. It could also be interpreted as strength above some minimum value.

Figure 1: Probability density functions of the normal ($\mu = 20$, $\sigma = 2.0$) and Weibull ($\mu = 20$, $\sigma = 10.45$) distributions.

In the simulation, each SCD used 50 test specimens, and there are 1000 independent SCD experiments within a simulation. Each specimen put on test is randomly selected from the 50 specimens. For the normal distribution the load increment is chosen to be 2, while for the Weibull distribution it is chosen to be 6, since the standard deviation of this distribution is larger.

Results obtained from the above experiments are analysed using logit and then compared with probit analyses for each distribution. The means and standard deviations of the estimated mean, standard deviation, and lower 1% point of the strength distribution are computed from the 1000 SCD experiments for each distribution.
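As a sketch of how the estimates described in this subsection can be recovered in practice (my own illustration; the staircase counts below are invented), one can fit the probit line by maximum likelihood and read $\mu$, $x_{0.99}$, and $\sigma$ off the coefficients:

```python
# Recovering mu, the lower 1% point, and sigma from a fitted probit line
# Phi^{-1}(p) = b0 + b1*x. The (load, n tested, n failed) counts are made up.
import numpy as np
from scipy.stats import norm
import statsmodels.api as sm

data = np.array([[16, 5, 0], [18, 12, 3], [20, 17, 9], [22, 12, 9], [24, 4, 4]])
x, n, r = data[:, 0].astype(float), data[:, 1], data[:, 2]

probit = sm.GLM(np.column_stack([r, n - r]), sm.add_constant(x),
                family=sm.families.Binomial(sm.families.links.Probit())).fit()
b0, b1 = probit.params

mu = -b0 / b1                     # load where p = 0.5, since Phi^{-1}(0.5) = 0
x99 = (norm.ppf(0.01) - b0) / b1  # load where p = 0.01, the lower 1% point
sigma = (mu - x99) / 2.33         # the estimator quoted in the text
print(mu, x99, sigma)
```

Replacing `Probit()` with `Logit()` gives the corresponding logit fit, and replacing `x` with `np.log(x)` before fitting gives the logarithmic-load variants compared in the paper.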
These results are tabulated in Table 1; the standard deviations of these statistics are shown in brackets, and the root mean square error (RMSE) is also calculated using the formula

$$\text{RMSE} = \sqrt{(\text{standard deviation})^2 + (\text{bias})^2}$$

where bias = (actual value $-$ mean of estimated values). These RMSE values are presented in square brackets.

Table 1: Results of staircase experiments analysed by probit and logit regression techniques, with the load on a linear scale, for each distribution.

Table 2 shows results for the logit and probit analyses using a logarithmic scale for load.

Table 2: Results of staircase experiments analysed by probit and logit regression techniques, with the load on a logarithmic scale, for each distribution.

3. Results and Discussion

Table 1 indicates that for load on a linear scale, results obtained by probit analysis are more realistic, with less error, than those from logit for a normal distribution of strength. Logit analysis appears to overestimate the standard deviation and hence underestimate the lower one percent point of the distribution. However, for the Weibull distribution, which has a large standard deviation, both methods predict negative lower 1% points, which are physically impossible.

Regressing the sample probit against the logarithm of load (refer to Table 2) gives estimates with a smaller standard deviation and, somewhat surprisingly, a slightly smaller mean squared error. Regressing the logit of the sample proportion against the logarithm of load is a slight improvement on the probit analysis and a considerable improvement on a regression of logit against load.

When sampling from both normal and Weibull distributions, the regression of the logit against the logarithm of load gives an estimate of the lower 1% point with the smallest mean squared error. Overall, a regression of the logit on the logarithm of load seems to be the most robust approach if normality of strengths is in doubt.

References

[1] Hosmer, D.W., & Lemeshow, S. (1989). Applied Logistic Regression. Wiley Series in Probability and Mathematical Science. Wiley-Interscience Publication.
[2] Aldrich, J.H., & Nelson, F.D. (1984). Linear Probability, Logit and Probit Models. Sage University Paper Series on Quantitative Applications in the Social Sciences, 07-045. Beverly Hills and London: Sage Publications.
[3] Lloyd, D.K., & Lipow, M. (1989). Reliability: Management, Methods, and Mathematics (Second ed.). American Society for Quality Control.
[4] ISO/CD 12107 (1997). Draft for Public Comment, Metallic Materials - Fatigue Testing - Statistical Planning and Analysis of Data. British Standards Institution.
[5] Weibull, W. (1961). Fatigue Testing and the Analysis of Results. Oxford: Pergamon Press.
[6] Crowder, M.J., Kimber, A.C., Smith, R.L., & Sweeting, T.J. (1991). Statistical Analysis of Reliability Data. Chapman and Hall.