RMIT University An Empirical Investigation of the Impact of Discretization on Common Data D

格式：pdf
大小：230.45 KB
文档页数：33

下载文档原格式

职业教育投入对全要素生产率的r影响研究r——基于空间纠正系统GMM法

职业教育投入对全要素生产率的r影响研究r——基于空间纠正系统GMM法吕连菊;阚大学【摘要】将环境因素纳入TFP测度中,运用空间纠正系统GMM法实证分析发现,各层次职业教育公共投入、非公共投入与TFP均存在空间自相关,TFP存在空间溢出效应;全国和三大地区高职教育和中职教育各自公共投入与非公共投入均提高了TFP,两类职业教育非公共投入的作用均较小,高职教育两类投入对TFP的正面影响是互补的,且互补效应在东部地区更为突出,中职教育两类投入对TFP的正面影响在东部地区是前者挤出了后者,在全国和中西部地区两者是互补的,且只有西部地区两者对TFP的影响大于其高职教育相应投入;全国和三大地区初职教育两类投入均不利于TFP提高,前者负面影响较大,并且前者挤出了后者,这种挤出效应在东部地区更为突出.【期刊名称】《职业技术教育》【年(卷),期】2016(037)010【总页数】6页(P39-44)【关键词】职业教育投入;全要素生产率;空间自相关;GMM法【作者】吕连菊;阚大学【作者单位】南昌工程学院经济贸易学院南昌,330099;南昌工程学院经济贸易学院【正文语种】中文【中图分类】G710全国教育科学规划教育部青年课题“普通教育与职业教育对TFP的影响：教育规模及教育投入结构视角”（EFA150365），主持人：吕连菊改革开放以来，我国经济以年均9.8%的速度快速增长，但在经济发展过程中出现了较多问题，如何提高经济增长质量成为中国经济进入新常态下的突出问题之一。

在当前劳动力与原材料等生产要素成本上升、资源和环境约束加剧情况下，我国提高经济增长质量的重要途径之一显然是提高全要素生产率（TFP）。

而职业教育与TFP提高紧密相关，职业教育可通过促进要素重置、优化人口结构、产业结构和就业结构、提高劳动力质量、缩小城乡收入差距、实现教育公平、促进技术创新、有助于技术外溢吸收、加快技术扩散和推进专业化分工等方面作用于TFP。

VC毕业论文GMRES算法的加速收敛现象分析毕业论文

摘要随着科学和工程技术的发展，越来越多的问题需要求解大规模的线性方程组，对这类方程的快速求解已成为数值代数研究的热点之一，特别是具有稀疏结构的大型方程组的求解。

基于Galerkin原理的Arnoldi算法是求解这种线性代数方程组的近似算法，以下称这种方法为广义极小残余算法(GMRES算法)。

GMRES 方法是目前求解大型稀疏非对称线性方程组最为流行的一种迭代方法。

GMRES算法在迭代过程中通常表现出一种加速收敛行为，随着迭代次数的增加，这种加速收敛现象越明显，即残量收敛会随着迭代步数的增加而逐渐得到改善。

在CG方法中，这种加速收敛与Ritz值有密切关系。

通过分析，我们发现GMRES的加速收敛与其斜投影过程中产生的Ritz值对特征值的逼近程度有关系。

在实际应用中，为了减少存储量和计算量，我们通常使用GMRES算法的重新开始版本来求解大型非对称线性方程组。

本文描绘了GMRES和GMRES(m)的加速收敛现象，并通过实验给予解释。

关键字：广义最小残量； Krylov子空间； Ritz值；加速收敛；正交投影方法；非对称线性方程组On The Superlinear Convergence of GMRESAbstractWit h the d evelo p me nt o f science and p ro ject techno lo g y,mo re and mo re q uestio ns need the so lut io n o f b ig linear syste ms. T h is so lut io n is o ne o f the fastest ways fo r researchin g nu mer ica l algeb ra,esp ecia lly fo r the b ig sparse matr ix. The way o f Arno ld i is b ased up o n the p rinc ip le o f Galerk in, wh ich is clo se d to the so lut io n o f the linear nu mer ica l system.Here, we call the so lut io n as Generalized Min imu m Res id ua l (GMRES).GMRES is o ne o f the mo st p op u lar iterat ive met ho d s fo r the so lut io n o f b ig no ns in gu lar no nsy mmetr ic linear syste ms.It us ua lly has a so-called sup er linear co n vergence b ehav io r.The rate o f co nverge nce seems to imp ro ve as the iterat io n p ro ceed s.F o r ano ther say,the rate o f resid ua l var iab le w ill b e imp ro ved as we increase its iterat io n.F o r the co nju gate grad ie nts metho d, th is met ho d has b een related to a d egree o f co nverge nce o f t he Rit z va lue. Thro u g h so me ana lys is,we fo und that fo r GMRES to o, changes in co n vergence b ehav io r seem to be related to the co nverge nce o f R it z va lue. In o ur p ractica l app licat io n,we also usua lly use GMRES(m) fo r red uc in g sto rage and co unter so lv in g b ig linear systems.Th is p ap er stud ies the sup erlinear co nvergence b ehav io r o f GMRES and GMRES(m),and sup p lies exp la in thro u g h exp erimen t.Key wo rd: GMRES; K ry lo v sub sp ace; R it z va lue; sup er linear co nverge nce;o rtho go na lizat io n metho d; no nsy mmetr ic linear system目录摘要 (I)A B S T RA C T ................................................. I I 第一章引言.. (1)第二章G M R E S算法基础知识 (3)§2.1向量范数 (3)§2.2线性方程组最小二乘问题 (4)§2.2.1Gr a m-S ch m id t正交化方法 (4)§2.2.2Gi v en s变换 (4)第三章G M R E S算法理论 (6)§3.1K RYLOV子空间方法的基本理论 (6)§3.2A RNOLDI算法 (7)§3.3G MR E S算法结构 (8)第四章G M R E S算法的加速收敛现象分析 (9)第五章数值示例与算法实现 (19)§5.1数值实验 (19)§5.2算法改进与实现 (22)§5.2.1预处理技术 (22)§5.2.2算法实现 (24)§5.3实验总结 (34)致谢 (35)参考文献 (38)R E P O RT OF LI T E RA T UR E (39)文献报告 (43)第一章引言关于线性方程组的数值解法一般分为两大类:直接法和迭代法。

PbI2

Remnant PbI2, an unforeseen necessity in high-efficiency hybrid perovskite-based solar cells?a)Duyen H. Cao, Constantinos C. Stoumpos, Christos D. Malliakas, Michael J. Katz, Omar K. Farha, Joseph T. Hupp, and Mercouri G. KanatzidisCitation: APL Materials 2, 091101 (2014); doi: 10.1063/1.4895038View online: /10.1063/1.4895038View Table of Contents: /content/aip/journal/aplmater/2/9?ver=pdfcovPublished by the AIP PublishingArticles you may be interested inParameters influencing the deposition of methylammonium lead halide iodide in hole conductor free perovskite-based solar cellsAPL Mat. 2, 081502 (2014); 10.1063/1.4885548Air stability of TiO2/PbS colloidal nanoparticle solar cells and its impact on power efficiencyAppl. Phys. Lett. 99, 063512 (2011); 10.1063/1.3617469High efficiency mesoporous titanium oxide PbS quantum dot solar cells at low temperatureAppl. Phys. Lett. 97, 043106 (2010); 10.1063/1.3459146Near-IR activity of hybrid solar cells: Enhancement of efficiency by dissociating excitons generated in PbS nanoparticlesAppl. Phys. Lett. 96, 073505 (2010); 10.1063/1.3292183Effects of molecular interface modification in hybrid organic-inorganic photovoltaic cellsJ. Appl. Phys. 101, 114503 (2007); 10.1063/1.2737977APL MATERIALS2,091101(2014)Remnant PbI2,an unforeseen necessity in high-efﬁciency hybrid perovskite-based solar cells?aDuyen H.Cao,1Constantinos C.Stoumpos,1Christos D.Malliakas,1Michael J.Katz,1Omar K.Farha,1,2Joseph T.Hupp,1,band Mercouri G.Kanatzidis1,b1Department of Chemistry,and Argonne-Northwestern Solar Energy Research(ANSER)Center,Northwestern University,2145Sheridan Road,Evanston,Illinois60208,USA2Department of Chemistry,Faculty of Science,King Abdulaziz University,Jeddah,Saudi Arabia(Received2May2014;accepted26August2014;published online18September2014)Perovskite-containing solar cells were fabricated in a two-step procedure in whichPbI2is deposited via spin-coating and subsequently converted to the CH3NH3PbI3perovskite by dipping in a solution of CH3NH3I.By varying the dipping time from5s to2h,we observe that the device performance shows an unexpectedly remark-able trend.At dipping times below15min the current density and voltage of thedevice are enhanced from10.1mA/cm2and933mV(5s)to15.1mA/cm2and1036mV(15min).However,upon further conversion,the current density decreases to9.7mA/cm2and846mV after2h.Based on X-ray diffraction data,we determinedthat remnant PbI2is always present in these devices.Work function and dark currentmeasurements showed that the remnant PbI2has a beneﬁcial effect and acts as ablocking layer between the TiO2semiconductor and the perovskite itself reducingthe probability of back electron transfer(charge recombination).Furthermore,weﬁnd that increased dipping time leads to an increase in the size of perovskite crys-tals at the perovskite-hole-transporting material interface.Overall,approximately15min dipping time(∼2%unconverted PbI2)is necessary for achieving optimaldevice efﬁciency.©2014Author(s).All article content,except where otherwisenoted,is licensed under a Creative Commons Attribution3.0Unported License.[/10.1063/1.4895038]With the global growth in energy demand and with compelling climate-related environmental concerns,alternatives to the use of non-renewable and noxious fossil fuels are needed.1One such alternative energy resource,and arguably the only legitimate long-term solution,is solar energy. Photovoltaic devices which are capable of converting the photonﬂux to electricity are one such device.2Over the last2years,halide hybrid perovskite-based solar cells with high efﬁciency have engendered enormous interest in the photovoltaic community.3,4Among the perovskite choices, methylammonium lead iodide(MAPbI3)has become the archetypal light absorber.Recently,how-ever,Sn-based perovskites have been successfully implemented in functional solar cells.5,6MAPbI3 is an attractive light absorber due to its extraordinary absorption coefﬁcient of1.5×104cm−1 at550nm;7it would take roughly1μm of material to absorb99%of theﬂux at550nm.Further-more,with a band gap of1.55eV(800nm),assuming an external quantum efﬁciency of90%,a maximum current density of ca.23mA/cm2is attainable with MAPbI3.Recent reports have commented on the variability in device performance as a function of perovskite layer fabrication.8In our laboratory,we too have observed that seemingly identicalﬁlmsa Invited for the Perovskite Solar Cells special topic.b Authors to whom correspondence should be addressed.Electronic addresses:j-hupp@ and m-kanatzidis@2,091101-12166-532X/2014/2(9)/091101/7©Author(s)2014FIG.1.X-ray diffraction patterns of CH3NH3PbI3ﬁlms with increasing dipping time(%composition of PbI2was determined by Rietveld analysis(see Sec.S3of the supplementary material for the Rietveld analysis details).have markedly different device performance.For example,when ourﬁlms of PbI2are exposed to MAI for several seconds(ca.60s),then a light brown coloredﬁlm is obtained rather than the black color commonly observed for bulk MAPbI3(see Sec.S2of the supplementary material for the optical band gap of bulk MAPbI3).23This brown color suggests only partial conversion to MAPbI3and yields solar cells exhibiting a J sc of13.4mA/cm2and a V oc of960mV;these values are signiﬁcantly below the21.3mA/cm2and1000mV obtained by others.4Under the hypothesis that fully converted ﬁlms will achieve optimal light harvesting efﬁciency,we increased the conversion time from seconds to2h.Unexpectedly,the2-h dipping device did not show an improved photovoltaic response(J sc =9.7mA/cm2,V oc=846mV)even though conversion to MAPbI3appeared to be complete.With the only obvious difference between these two devices being the dipping time,we hypothesized that the degree of conversion of PbI2to the MAPbI3perovskite is an important parameter in obtaining optimal device performance.We thus set out to understand the correlation between the method of fabrication of the MAPbI3layer,the precise chemical compositions,and both the physical and photo-physical properties of theﬁlm.We report here that remnant PbI2is crucial in forming a barrier layer to electron interception/recombination leading to optimized J sc and V oc in these hybrid perovskite-based solar cells.We constructed perovskite-containing devices using a two-step deposition method according to a reported procedure with some modiﬁcations.4(see Sec.S1of the supplementary material for the experimental details).23MAPbI3-containing photo-anodes were made by varying the dipping time of the PbI2-coated photo-anode in MAI solution.In order to minimize the effects from unforeseen variables,care was taken to ensure that allﬁlms were prepared in an identical manner.The composi-tions ofﬁnal MAPbI3-containingﬁlms were monitored by X-ray diffraction(XRD).Independently of the dipping times,only theβ-phase of the MAPbI3is formed(Figure1).9However,in addition to theβ-phase,allﬁlms also showed the presence of unconverted PbI2(Figure1,marked with*) which can be most easily observed via the(001)and(003)reﬂections at2θ=12.56◦and38.54◦respectively.As the dipping time is increased,the intensities of PbI2reﬂections decrease with a concomitant increase in the MAPbI3intensities.In addition to the decrease in peak intensities of PbI2,the peak width increases as the dipping time increases indicating that the size of the PbI2 crystallites is decreasing,as expected,and the converse is observed for the MAPbI3reﬂections.This observation suggests that the conversion process begins from the surface of the PbI2crystallites and proceeds toward the center where the crystallite domain size of the MAPbI3phase increases and that of PbI2diminishes.Interestingly,the remnant PbI2phase can be seen in the data of other reports, but has not been identiﬁed as a primary source of variability in cell performance.8,10 Considering that the perovskite is the primary light absorber within the device,we wantedto further investigate how the optical absorption of theﬁlm changes with increasing dipping timeFIG.2.Absorption spectra of CH3NH3PbI3ﬁlms as a function of unconverted PbI2phase fraction.FIG.3.(a)J-V curves and(b)EQE of CH3NH3PbI3-based devices as a function of unconverted PbI2phase fraction.(Figure2).11,12The pure PbI2ﬁlm shows a band gap of2.40eV,consistent with the yellow color of PbI2.As the PbI2ﬁlm is gradually converted to the perovskite,the band gap is progressively shifted toward1.60eV.The deviation of MAPbI3’s band gap(1.60eV)from that of the bulk MAPbI3 material(1.55eV)could be explained by quantum conﬁnement effects related with the sizes of TiO2and MAPbI3crystallites and their interfacial interaction.13,14Interestingly,we also noticed the presence of a second absorption in the light absorber layer,in which the gap gradually red shifts from1.90eV to1.50eV as the PbI2concentration is decreased from9.5%to0.3%(Figure2—blue arrow).Having established the chemical compositions and optical properties of the light absorberﬁlms, we proceeded to examine the photo-physical responses of the corresponding functional devices in order to determine how the remnant PbI2affects device performance.The pure PbI2based device remarkably achieved a0.4%efﬁciency with a J sc of2.1mA/cm2and a V oc of564mV (Figure3(a)).Upon progressive conversion of the PbI2layer to MAPbI3,we observe two different regions(Figure4,Table I).In theﬁrst region,the expected behavior is observed;as more PbI2is converted to MAPbI3,the trend is toward higher photovoltaic efﬁciency,due both to J sc and V oc, until1.7%PbI2is reached.The increase in J sc is attributable,at least in part,to increasing absorption of light by the perovskite.We speculate that progressive elimination of PbI2,present as a layer between TiO2and the perovskite,also leads to higher net yields for electron injection into TiO2and therefore,higher J values.For a sufﬁciently thick PbI2spacer layer,electron injection would occur instepwise fashion,i.e.,perovskite→PbI2→TiO2.Finally,the photovoltage increase is attributable toFIG.4.Summary of J-V data vs.PbI2concentration of CH3NH3PbI3-based devices(Region1:0to15min dipping time, Region2:15min to2h dipping time).TABLE I.Photovoltaic performance of CH3NH3PbI3-based devices as a function of unconverted PbI2fraction.Dipping time PbI2concentration a J sc(mA/cm2)V oc(V)Fill factor(%)Efﬁciency(%) 0s100% 2.10.564320.45s9.5%10.10.93352 4.960s7.2%13.40.96052 6.72min 5.3%14.00.964557.45min 3.7%14.70.995578.315min 1.7%15.1 1.036629.730min0.8%13.60.968648.51h0.4%12.40.938657.62h0.3%9.70.84668 5.5a Determined from the Rietveld analysis of X-ray diffraction data.the positive shift in TiO2’s quasi-Fermi level as the population of photo-injected electrons is higher with increased concentration of MAPbI3.The second region yields a notably different trend;surprisingly,below a concentration of2% PbI2,J sc,V oc,and ultimatelyηdecrease.Considering that the light-harvesting efﬁciency would increase when the remaining2%PbI2is converted to MAPbI3(albeit to only a small degree),then the remnant PbI2must have some other role.We posit that remnant PbI2serves to inhibit detrimental electron-transfer processes(Figure5).Two such processes are back electron transfer from TiO2to holes in the valence band of the perovskite(charge-recombination)or to the holes in the HOMO of the HTM(charge-interception).This retardation of electron interception/recombination observation is reminiscent of the behavior of atomic layer deposited Al2O3/ZrO2layers that have been employed in dye-sensitized solar cells.15–18It is conceivable that the conversion of PbI2to MAPbI3occurs from the solution interface toward the TiO2/PbI2interface and thus would leave sandwiched between TiO2and MAPbI3a blocking layer of PbI2that inhibits charge-interception/recombination.For this hypothesis to be correct,it is crucial that the conduction-band-edge energy(E cb)of the PbI2be higher than the E cb of the TiO2.19–21 The work function of PbI2was measured by ultraviolet photoelectron spectroscopy(UPS)and was observed to be at6.35eV vs.vacuum level,which is0.9eV lower than the valence-band-edge energy(E vb)of MAPbI3(see Sec.S7of the supplementary material23for the work function of PbI2);the E cb(4.05eV)was calculated by subtracting the work function from the band gap(2.30eV).solar cell.FIG.6.Dark current of CH3NH3PbI3-based devices as a function of unconverted PbI2phase fraction.The E cb of PbI2is0.26eV higher than the E cb of TiO2and thus PbI2satisﬁes the conditions of a charge-recombination/interception barrier layer.In order to probe the hypothesis that PbI2acts as a charge-interception barrier,dark current measurements,in which electronsﬂow from TiO2to the HOMO of the HTM,were made.Consistent with our hypothesis,Figure6illustrates that the onset of the dark current occurs at lower potentials as the PbI2concentration decreases.In the absence of other effects,the increasing dark current with increasing fraction of perovskite(and decreasing fraction of PbI2)should result in progressively lower open-circuit photovoltages.Instead,the photocurrent density and the open-circuit photovoltage bothincrease,at least until to PbI2fraction reaches1.7%.As discussed above,thinning of a PbI2-basedFIG.7.Cross-sectional SEM images of CH3NH3PbI3ﬁlm with different dipping time.sandwich layer should lead to higher net injection yields,but excessive thinning would diminish the effectiveness of PbI2as a barrier layer for back electron transfer reactions.Given the surprising role of remnant PbI2in these devices,we further probed the two-step conversion process by using scanning-electron microscopy(SEM)(Figure7).Two domains of lead-containing materials(PbI2and MAPbI3)are present.Theﬁrst domain is sited within the mesoporous TiO2network(area1)while the second grows on top of the network(area2).Area2initially contains 200nm crystals.As the dipping time is increased,the crystals show marked changes in size and morphology.The formation of bigger perovskite crystals is likely the result of the thermodynamically driven Ostwald ripening process,i.e.,smaller perovskite crystals dissolves and re-deposits onto larger perovskite crystals.22The rate of charge-interception,as measured via dark current,is proportional to the contact area between the perovskite and the HTM.Thus,the eventual formation of large, high-aspect-ratio crystals,as shown in Figure7,may well lead to increases in contact area and thereby contributes to the dark-current in Figure6.Regardless,we found that the formation of large perovskite crystals greatly decreased our success rate in constructing high-functioning,non-shorting solar cells.In summary,residual PbI2appears to play an important role in boosting overall efﬁciencies for CH3NH3PbI3-containing photovoltaics.PbI2’s role appears to be that of a TiO2-supported blocking layer,thereby slowing rates of electron(TiO2)/hole(perovskite)recombination,as well as decreasing rates of electron interception by the hole-transporting material.Optimal performance for energy conversion is observed when ca.98%of the initially present PbI2has been converted to the perovskite. Conversion to this extent requires about15min.Pushing beyond98%(and beyond15min of reaction time)diminishes cell performance and diminishes the success rate in constructing non-shorting cells.The latter problem is evidently a consequence of conversion of small and more-or-less uniformly packed perovskite crystallites to larger,poorly packed crystallites of varying shape and size.Finally,the essential,but previously unrecognized,role played by remnant PbI2 provides an additional explanation for why cells prepared dissolving and then depositing pre-formed CH3NH3PbI3generally under-perform those prepared via the intermediacy of PbI2.We thank Prof.Tobin Marks for use of the solar simulator and EQE measurement system. Electron microscopy was done at the Electron Probe Instrumentation Center(EPIC)at Northwestern University.Ultraviolet Photoemission Spectroscopy was done at the Keck Interdisciplinary SurfaceScience facility(Keck-II)at Northwestern University.This research was supported as part of theANSER Center,an Energy Frontier Research Center funded by the U.S Department of Energy, Ofﬁce of Science,Ofﬁce of Basic Energy Sciences,under Award No.DE-SC0001059.1R.Monastersky,Nature(London)497(7447),13(2013).2H.J.Snaith,J.Phys.Chem.Lett.4(21),3623(2013).3M.M.Lee,J.Teuscher,T.Miyasaka,T.N.Murakami,and H.J.Snaith,Science338(6107),643(2012).4J.Burschka,N.Pellet,S.J.Moon,R.Humphry-Baker,P.Gao,M.K.Nazeeruddin,and M.Gratzel,Nature(London) 499(7458),316(2013).5F.Hao,C.C.Stoumpos,D.H.Cao,R.P.H.Chang,and M.G.Kanatzidis,Nat.Photonics8(6),489(2014);F.Hao,C.C. Stoumpos,R.P.H.Chang,and M.G.Kanatzidis,J.Am.Chem.Soc.136,8094–8099(2014).6N.K.Noel,S.D.Stranks,A.Abate,C.Wehrenfennig,S.Guarnera,A.Haghighirad,A.Sadhanala,G.E.Eperon,M.B. Johnston,A.M.Petrozza,L.M.Herz,and H.J.Snaith,Energy Environ.Sci.7,3061(2014).7H.S.Kim,C.R.Lee,J.H.Im,K.B.Lee,T.Moehl,A.Marchioro,S.J.Moon,R.Humphry-Baker,J.H.Yum,J.E.Moser, M.Gratzel,and N.G.Park,Sci.Rep.2,591(2012).8D.Y.Liu and T.L.Kelly,Nat.Photonics8(2),133(2014).9C.C.Stoumpos,C.D.Malliakas,and M.G.Kanatzidis,Inorg.Chem.52(15),9019(2013).10J.H.Noh,S.H.Im,J.H.Heo,T.N.Mandal,and S.I.Seok,Nano Lett.13(4),1764(2013).11Diffuse reﬂectance measurements of MAPbI3ﬁlms were converted to absorption spectra using the Kubelka-Munk equation,α/S=(1-R)2/2R,where R is the percentage of reﬂected light,andαand S are the absorption and scattering coefﬁcients, respectively.The band gap values are the energy value at the intersection point of the absorption spectrum’s tangent line and the energy axis.12L.F.Gate,Appl.Opt.13(2),236(1974).13O.V oskoboynikov,C.P.Lee,and I.Tretyak,Phys.Rev.B63(16),165306(2001).14X.X.Xue,W.Ji,Z.Mao,H.J.Mao,Y.Wang,X.Wang,W.D.Ruan,B.Zhao,and J.R.Lombardi,J.Phys.Chem.C 116(15),8792(2012).15E.Palomares,J.N.Clifford,S.A.Haque,T.Lutz,and J.R.Durrant,J.Am.Chem.Soc.125(2),475(2003).16C.Prasittichai,J.R.Avila,O.K.Farha,and J.T.Hupp,J.Am.Chem.Soc.135(44),16328(2013).17A.K.Chandiran,M.K.Nazeeruddin,and M.Gratzel,Adv.Funct.Mater.24(11),1615(2014).18M.J.Katz,M.J.D.Vermeer,O.K.Farha,M.J.Pellin,and J.T.Hupp,Langmuir29(2),806(2013).19M.J.DeVries,M.J.Pellin,and J.T.Hupp,Langmuir26(11),9082(2010).20C.Prasittichai and J.T.Hupp,J.Phys.Chem.Lett.1(10),1611(2010).21F.Fabregat-Santiago,J.Garcia-Canadas,E.Palomares,J.N.Clifford,S.A.Haque,J.R.Durrant,G.Garcia-Belmonte, and J.Bisquert,J.Appl.Phys.96(11),6903(2004).22Alan D.McNaught and Andrew Wilkinson,IUPAC Compendium of Chemical Terminology(Blackwell Scientiﬁc Publica-tions,Oxford,1997).23See supplementary material at /10.1063/1.4895038for experimental details,absorption spectrum of bulk CH3NH3PbI3,fraction,size,absorption spectrum,work function of unconverted PbI2,and average photovoltaic perfor-mance.。

MIT麻省理工学院算法导论公开课 Problem Set 4 solution

Introduction to Algorithms October 29, 2005 Massachusetts Institute of Technology 6.046J/18.410J Professors Erik D. Demaine and Charles E. Leiserson Handout 18Problem Set 4 SolutionsProblem 4-1. TreapsIf we insert a set of n items into a binary search tree using T REE-I NSERT, the resulting tree may be horribly unbalanced. As we saw in class, however, we expect randomly built binary search trees to be balanced. (Precisely, a randomly built binary search tree has expected height O(lg n).) Therefore, if we want to build an expected balanced tree for a ﬁxed set of items, we could randomly permute the items and then insert them in that order into the tree.What if we do not have all the items at once? If we receive the items one at a time, can we still randomly build a binary search tree out of them?We will examine a data structure that answers this question in the afﬁrmative. A treap is a binary search tree with a modiﬁed way of ordering the nodes. Figure 1 shows an example of a treap. As usual, each item x in the tree has a key key[x]. In addition, we assign priority[x], which is a random number chosen independently for each x. We assume that all priorities are distinct and also that all keys are distinct. The nodes of the treap are ordered so that (1) the keys obey the binary-search-tree property and (2) the priorities obey the min-heap order property. In other words,•if v is a left child of u, then key[v]<key[u];•if v is a right child of u, then key[v]>key[u]; and•if v is a child of u, then priority(v)>priority(u).(This combination of properties is why the tree is called a “treap”: it has features of both a binary search tree and a heap.)Figure 1: A treap. Each node x is labeled with key[x]:p riority[x]. For example, the root has key G and priority 4.It helps to think of treaps in the following way. Suppose that we insert nodes x1,x2,...,x n, each with an associated key, into a treap in arbitrary order. Then the resulting treap is the tree that wouldhave been formed if the nodes had been inserted into a normal binary search tree in the order given by their (randomly chosen) priorities. In other words, priority[x i]<priority[x j]means that x i is effectively inserted before x j.(a) Given a set of nodes x1,x2,...,x n with keys and priorities all distinct, show that thereis a unique treap with these nodes.Solution:Prove by induction on the number of nodes in the tree. The base case is a tree withzero nodes, which is trivially unique. Assume for induction that treaps with k−1orfewer nodes are unique. We prove that a treap with k nodes is unique. In this treap, theitem x with minimum priority must be at the root. The left subtree has items with keys<key[x]and the right subtree has items with keys >key[x]. This uniquely deﬁnes theroot and both subtrees of the root. Each subtree is a treap of size ≤k−1, so they areunique by induction.Alternatively, one can also prove this by considering a treap in which nodes are inserted in order of their priority. Assume for induction that the treap with the k−1nodes with smallest priority is unique. For k=0t his is trivially true. Now considerthe treap with the k nodes with smallest priority. Since we know that the structureof a treap is the same as the structure of a binary search tree in which the keys areinserted in increasing priority order, the treap with the k nodes with smallest priorityis the same as the treap with the k−1nodes with smallest priority after inserting thek-th node. Since BST insert is a deterministic algorithm, there is only one place thek-th node could be inserted. Therefore the treap with k nodes is also unique, provingthe inductive hypothesis.(b) Show that the expected height of a treap is O(lg n), and hence the expected time tosearch for a value in the treap is O(lg n).Solution: The idea is to realize that a treap on n nodes is equivalent to a randomlybuilt binary search tree on n nodes. Recall that assigning priorities to nodes as theyare inserted into the treap is the same as inserting the n nodes in the increasing orderdeﬁned by their priorities. So if we assign the priorities randomly, we get a randomorder of n priorities, which is the same as a random permutation of the n inputs, sowe can view this as inserting the n items in random order.The time to search for an item is O(h)where h is the height of the tree. As we saw inlecture, E[h]=O(lg n). (The expectation is taken over permutations of the n nodes,i.e., the random choices of the priorities.)Let us see how to insert a new node x into an existing treap. The ﬁrst thing we do is assign x a random priority priority[x]. Then we call the insertion algorithm, which we call T REAP-I NSERT, whose operation is illustrated in Figure 2.(e) (f)Figure 2: Operation of T REAP-I NSERT. As in Figure 1, each node x is labeled with key[x]: priority[x]. (a) Original treap prior to insertion. (b) The treap after inserting a node with key C and priority 25. (c)–(d) Intermediate stages when inserting a node with key D and priority 9.(e) The treap after insertion of parts (c) and (d) is done. (f) The treap after inserting a node with key F and priority 2.(c) Explain how T REAP-I NSERT works. Explain the idea in English and give pseudocode.(Hint: Execute the usual binary search tree insert and then perform rotations to restorethe min-heap order property.)Solution: The hint gives the idea: do the usual binary search tree insert and thenperform rotations to restore the min-heap order property.T REAP-I NSERT (T,x)inserts x into the treap T(by modifying T). It requires that xhas deﬁned key and priority values. We have used the subroutines T REE-I NSERT,R IGHT-R OTATE, and R IGHT-R OTATE as deﬁned in CLRS.T REAP-I NSERT (T,x)1T REE-I NSERT (T,x)2 while x =root[T]and priority[x]<priority[p[x]]3 do if x=left[p[x]]4 then R IGHT-R OTATE (T,p[x])5 else L EFT-R OTATE (T,p[x])Note that parent pointers simplify the code but are not necessary. Since we only needto know the parent of each node on the path from the root to x(after the call toT REE-I NSERT), we can keep track of these ourselves.(d) Show that the expected running time of T REAP-I NSERT is O(lg n). Solution:T REAP-I NSERT ﬁrst inserts an item in the tree using the normal binary search treeinsert and then performs a number of rotations to restore the min-heap property.The normal binary-search-tree insertion algorithm T REE-I NSERT always places thenew item at a new leaf of tree. Therefore the time to insert an item into a treap isproportional to the height of a randomly built binary search tree, which as we saw inlecture is O(lg n)in expectation.The maximum number of rotations occurs when the new item receives a priority lessthan all priorities in the tree. In this case it needs to be rotated from a leaf to the root.An upper bound on the number of rotations is therefore the height of a randomly builtbinary search tree, which is O(lg n)in expectation. (We will see that this is a fairlyloose bound.) Because each rotation take constant time, the expected time to rotate isO(lg n).Thus the expected running time of T REAP-I NSERT is O(lg n+lg n)=O(lg n).T REAP-I NSERT performs a search and then a sequence of rotations. Although searching and rotating have the same asymptotic running time, they have different costs in practice. A search reads information from the treap without modifying it, while a rotation changes parent and child pointers within the treap. On most computers, read operations are much faster than write operations. Thus we would like T REAP-I NSERT to perform few rotations. We will show that the expected number of rotations performed is bounded by a constant (in fact, less than 2)!(a) (b)Figure 3: Spines of a binary search tree. The left spine is shaded in (a), and the right spine is shaded in (b).In order to show this property, we need some deﬁnitions, illustrated in Figure 3. The left spine of a binary search tree T is the path which runs from the root to the item with the smallest key. In other words, the left spine is the maximal path from the root that consists only of left edges. Symmetrically, the right spine of T is the maximal path from the root consisting only of right edges. The length of a spine is the number of nodes it contains.(e) Consider the treap T immediately after x is inserted using T REAP-I NSERT. Let Cbe the length of the right spine of the left subtree of x. Let D be the length of theleft spine of the right subtree of x. Prove that the total number of rotations that wereperformed during the insertion of x is equal to C+D.Solution: Prove the claim by induction on the number of rotations performed. Thebase case is when x is the parent of y. Performing the rotation so that y is the new rootgives y exactly one child, so C+D=1.Assume for induction that the number of rotations k performed during the insertionof x equals C+D. The base case is when 0 rotations are necessary and x is insertedas a leaf. Then C+D=0. To prove the inductive step, we show that if after k−1rotations C+D=k−1, then after k rotations C+D=k. Draw a picture of aleft and right rotation and observe that C+D increases by 1 in each case. Let y bethe parent of x, and suppose x is a left child of y. After performing a right rotation, ybecomes the right child of x, and the previous right child of x becomes the left childof y. That is, the left spine of the right subtree of x before the rotation is tacked onto y, so the length of that spine increases by one. The left subtree of x is unaffectedby a right rotation. The case of a left rotation is symmetric. Therefore after one morerotation C+D increases by one and k=C+D, proving the inductive hypothesis.We will now calculate the expected values of C and D. For simplicity, we assume that the keys are 1,2,...,n. This assumption is without loss of generality because we only compare keys.�For two distinct nodes x and y , let k = key [x ]and i = key [y ], and deﬁne the indicator random variableX 1 if y is a node on the right spine of the left subtree of x (in T ),i,k =0 otherwise .(f) Show that X i,k =1if and only if (1) priority [y ]> priority [x ], (2) key [y ]< key [x ], and(3) for every z such that key [y ]< key [z ]< key [x ],we have p riority [y ]< priority [z ].Solution:To prove this statement, we must prove both directions of the “if and only if”. Firstwe prove the “if” direction. We prove that if (1) priority [y ]> priority [x ], (2) key [y ]<key [x ], and (3) for every z such that key [y ]< key [z ]< key [x ]are true, priority [y ]< priority [z ], then X i,k =1. We prove this by contradiction. Assume that X i,k =0. That is, assume that y is not on the right spine of the left subtree of x . We show thatthis leads to a contradiction. If y is not on the right spine of the left subtree of x ,it could be in one of three places:1. Suppose y is in the right subtree of x . This contradicts condition (2) becausekey [y ]< key [x ].2. Suppose y is not in one of the subtrees of x . Then x and y must share somecommon ancestor z . Since key [y ]< key [x ], we know that y is in the left subtreeof z and x is in the right subtree of z and key [y ]< key [z ] < key [x ]. Since y isbelow z in the tree, priority [z ]< priority [x ]and priority [z ]< priority [y ]. Thiscontradicts condition (3).3. Suppose that y is in the left subtree of x but not on the right spine of the leftsubtree of x . This implies that there exists some ancestor z of y in the left subtreeof x such that y is in the left subtree of z . Hence key [y ]< key [z ]< key [x ]. Sincez is an ancestor of y , priority [z ]< priority [y ], which contradicts condition (3).All possible cases lead to contradictions, and so X i,k =1.Now for the “only if” part. We prove that if X i,k =1, then statements (1), (2), and (3) are true. If X i,k =1, then y is in the right spine of the left subtree of x . Since y is ina subtree of x , y must have been inserted after x ,so p riority [y ]> priority [x ], proving(1). Since y is in the left subtree of x , key [y ]< key [x ], proving (2). We prove (3) by contradiction: suppose that X i,k =1and there exists a z such that key [y ]< key [z ]< key [x ]and priority [z ]< priority [y ]. In other words, z was inserted before y . There arethree possible cases that satisfy the condition key [z ]< key [x ]:1. Suppose z is in the right spine of the left subtree of x .For y to be inserted into theright spine of the left subtree of x , it will have to be inserted into the right subtreeof z . Since key [y ]< key [z ], this leads to a contradiction.2. Suppose z is in the left subtree of x but not in the right spine. This implies thatz is in the left subtree of some node z in the right spine of x . Therefore for y tobe inserted into the right spine of the left subtree of x , it must be inserted into theright subtree of z . This leads to a contradiction by reasoning similar to case 1.3. Suppose that z is not in one of the subtrees of x . Then z and x have a commonancestor z such that z is in the left subtree of z and x is in the right subtree of x .This implies key [z ] < key [z ] < key [x ]. Since key [y ]< key [z ]< key [z ], y cannotbe inserted into the right subtree of z . Therefore it cannot be inserted in a subtreeof x , which is a contradiction.Therefore there can be no z such that key [y ] < key [z ] < key [x ] and priority [z ] < priority [y ]. This proves statement (3). We have proven both the “if” and “only if” directions, proving the claim.(g) Show that(k − i − 1)! 1 Pr {X i,k =1} == . (k − i +1)! (k − i +1)(k − i )Solution: We showed in the previous part that X i,k =1if and only if the priorities of the items between y and x are ordered in a certain way. Since all orderings are equally likely, to calculate the probability we count the number of permutations of priorities that obey this order and divide by the number of total number of priority permutations. We proved in (e) that whether or not X i,k =1depends only on the relative ordering of the priorities of y , x , and all z such that key [y ] < key [z ] < key [x ]. Since we assumed that the keys of the items come from {1,...,n }, the keys of the items in question are i,i +1,i +2,...,k − 1,k . There are (k − i +1)!permutations of the priorities of these items. Of these permutations, the ones for which X i,k =1are those where i has minimum priority, k has the second smallest priority, and the priorities of the remaining k − i − 1 items follow in any order. There are (k − i − 1)! of these permutations. Thus the probability that we get a “bad” order is (k −i −1)!/(k −i +1)!= 1/(k − i )(k − i +1).(h) Show thatk −1 � 1 1 E [C ]= =1− . j (j +1)k j =1 Solution:X For a node x with key k , E [C ]is the expected number of nodes in the right spine of the left subtree of x . This equals the sum of the expected value of the random variables i,k for all i in the tree. Since X i,k =0for all nodes i ≥ k , we only need to consider i <k .� � � k −1 �k −1 � E [C ]=E [X i,k ]=E X i,k i =1 i =1 k −1 =Pr {X i,k =1}i =1 k −1 � 1 =(k − i )(k − i +1) i =1 k −1 � 1= j (j +1)j =1 1To simplify this sum, observe that j (j 1+1) = j +1−j = − 1 . Therefore the sumj (j +1) j j +1 telescopes and we have 1 E [C ]=1− . kIf you didn’t see this, you could have proven that the equationk −1 � 1 1 =1− j (j +1)k j =1 holds by induction on k . In the proving the inductive hypothesis you might have 11discovered 1 − = .j j +1 j (j +1)(i) Use a symmetry argument to show that1 E [D ]=1− . n − k +1Solution: The idea is to consider the treap produced if the ordering relationship amongthe keys is reversed. That is, for all items x , leave priority [x ]unchanged but replacekey [x ]with n − key [x ]+1.Let T be the binary tree we get when inserting the items (in increasing order of priority) using the original keys. Once we remap the keys and insert them into a newbinary search tree, we get a tree T whose shape is the mirror image of the shape ofT . (reﬂected left to right). Consider the item x with key k in T and therefore has key n − k +1 i n T . The length of the left spine of x ’s right subtree in T has becomethe length of the right spine of x ’s left subtree in T . We know by part (g) that the expected length of the right spine of a left subtree of a node y is 1− 1/idkey [y ],so the expected length of the right spine of the left subtree of x in T is 1− 1/(n − k +1), which means that 1 E [D ]=1− . n − k +1(j) Conclude that the expected number of rotations performed when inserting a node into a treap is less than 2.Solution:11 E [number of rotations ]= E [C +D ]=E [C ]+E [D ]=1− +1− k n − k +1 ≤ 1+1=2. Problem 4-2. Being balancedCall a family of trees balanced if every tree in the family has height O (lg n ), where n is the number of nodes in the tree. (Recall that the height of a tree is the maximum number of edges along any path from the root of the tree to a leaf of the tree. In particular, the height of a tree with just one node is 0.)For each property below, determine whether the family of binary trees satisfying that property is balanced. If you answer is “no”, provide a counterexample. If your answer is “yes”, give a proof (hint: it should be a proof by induction). Remember that being balanced is an asymptotic property, so your counterexamples must specify an inﬁnite set of trees in the family, not just one tree. (a) Every node of the tree is either a leaf or it has two children.Solution: No. Counterexample is a right chain, with each node having a leaf hanging off to the left(b) The size of each subtree can be written as 2k − 1, where k is an integer (k is not the same for each subtree).Solution: Yes.Consider any subtree with root r . We know from the condition that the size of this subtree is 2k 1 − 1. We also know that the size of the subtree rooted at the left child of r is 2k 2 − 1, and the size of the subtree rooted at the right child of r is 2k 3 − 1. But the size of the subtree at r is simply the node r together with the nodes in the left and right subtrees. This leads to the equation 2k 1 − 1=1+(2k 2 − 1)+(2k 3 − 1),or 2k 1 =2k 2 +2k 3. Now we show that k 2 =k 3. This is easy to see if you consider the binary representations of k 1, k 2, and k 3.Otherwise, if we assume WLOG that k 2 ≤ k 3, then we have 2k 1−k 2 =1+2k 3−k 2. 2Now, the only pair of integer powers of 2 that could satisfy this equation are 21 and 0 . Thus k 3 − k 2 =0,or k 2 =k 3, and the left and right subtrees of r have an equal number of nodes. It follows that the tree is perfectly balanced.(c) There is a constant c>0such that, for each node of the tree, the size of the smallerchild subtree of this node is at least c times the size of the larger child subtree.Solution:Yes1. The proof is by induction. Assume that the two subtrees of x with n nodes in itssubtree has two children y and z with subtree sizes n1 and n2. By inductive hypothesis,the height of y’s subtree is at most d lg n1 and the height of z’s subtree is at most d lg n2for some constant d. We now prove that the height of x’s subtree is at most d lg n.Assume wlog that n1 ≥n2. Therefore, by the problem statement, we have n2 ≥cn1.Therefore, we have n=n1 +n2 +1≥(1+c)n1 +1≥(1+c)n1 and the height hof x’s subtree is d lg n1 +1≤d lg(n/(c+1))+1≤d lg n−d lg(1+c)+1≤d lg nif d lg(1+c)≥1. Therefore, for sufﬁciently large d, the height of a tree with n nodesis at most d lg n.(d) There is a constant c such that, for each node of the tree, the heights of its childrensubtrees differ by at most c.Solution: Yes1. Let n(h)be the minimum number of nodes that a tree of height h thatsatisﬁes the stated property can have. We show by induction that n(h)≥(1+α)h−1,for some constant 0<α≤1. We can then conclude that for a tree with n nodes,h≤log1+α(n+1)=O(lg n).For the base case, a subtree of height 0has a single node, and 1≥(1+α)0 −1forany constant α≤1.In the inductive step, assume for all trees of height k<h, that the n(k)≥(1+α)k−1.Now consider a tree of height h, and look at its two subtrees. We know one subtreemust have height h−1, and the other must have height at least h−1−c. Therefore,we known(h)≥n(h−1)+n(h−1−c)+1.Using the inductive hypothesis, we getn(h) ≥(1+α)h−1 −1+(1+α)h−1−c−1+1≥2(1+α)h−1−c−1.Suppose we picked αsmall enough so that (1+α)<21/(c+1). Then (1+α)c+1 <2.Therefore, we getn(h)≥2(1+α)h−1−c−1≥(1+α)h−1.1Note that in this problem we assume that a nil pointer is a subtree of size 0, and so a node with only one child has two subtrees, one of which has size 0. If you assume that a node with only one child has only one subtree, then the answer to this problem part is “no”.�11Handout18: Problem Set4SolutionsTherefore, we satisfy the inductive hypothesis.Note that if we plug this value for αback into h≤log1+α(n+1),we getlg(n+1)h≤≤(c+1)l g(n+1).lg(1+2c+1)(e) The average depth of a node is O(lg n). (Recall that the depth of a node x is thenumber of edges along the path from the root of the tree to x.)Solution: No.√Consider a tree with n−n nodes organized as a complete binary tree, and the other √ √n nodes sticking out as a chain of length n from the balanced tree. The height of√√√the tree is lg(n−n)+n=Ω(n), while the average depth of a node is at most�√√ √(1/n)n·(n+lg n)+(n−n)·lg n√√=(1/n)(n+n lg n+n lg n−nlgn)=(1/n)(n+n lg n)=O(lg n)√Thus, we have a tree with average node depth O(lg n), but height Ω(n).。

POJ题目分类

1000 A+B Problem 送分题1001 Exponentiation 高精度1003 Hangover 送分题1004 Financial Management 送分题1005 I Think I Need a Houseboat 几何1006 Biorhythms 送分题1007 DNA Sorting 送分题1008 Maya Calendar 日期处理1010 STAMPS 搜索＋DP1011 Sticks 搜索1012 Joseph 模拟/数学方法1014 Dividing 数论/DP?/组合数学->母函数?1015 Jury Compromise DP1016 Numbers That Count 送分题1017 Packets 贪心1018 Communication System 贪心1019 Number Sequence 送分题1020 Anniversary Cake 搜索1023 The Fun Number System 数论1025 Department 模拟1026 Cipher 组合数学1027 The Same Game 模拟1028 Web Navigation 送分题1031 Fence 计算几何1034 The dog task 计算几何1037 A decorative fence DP/组合数学1039 Pipe 几何1042 Gone Fishing 贪心/DP1045 Bode Plot 送分题(用物理知识)1046 Color Me Less 送分题1047 Round and Round We Go 高精度1048 Follow My Logic 模拟1049 Microprocessor Simulation 模拟1050 To the Max DP1053 Set Me 送分题1054 The Troublesome Frog 搜索1060 Modular multiplication of polynomials 高精度1061 青蛙的约会数论1062 昂贵的聘礼DP1064 Cable master DP/二分查找1065 Wooden Sticks DP1067 取石子游戏博弈论1068 Parencodings 送分题1069 The Bermuda Triangle 搜索1070 Deformed Wheel 几何1071 Illusive Chase 送分题1072 Puzzle Out 搜索1073 The Willy Memorial Program 模拟1074 Parallel Expectations DP1075 University Entrance Examination 模拟1080 Human Gene Functions DP->LCS变形1082 Calendar Game 博弈论1084 Square Destroyer 搜索？1085 Triangle War 博弈论1086 Unscrambling Images 模拟?1087 A Plug for UNIX 图论->最大流1088 滑雪DFS/DP1090 Chain ->格雷码和二进制码的转换1091 跳蚤数论1092 Farmland 几何1093 Formatting Text DP1094 Sorting It All Out 图论->拓扑排序1095 Trees Made to Order 组合数学1096 Space Station Shielding 送分题1097 Roads Scholar 图论1098 Robots 模拟1099 Square Ice 送分题1100 Dreisam Equations 搜索1101 The Game 搜索->BFS1102 LC-Display 送分题1103 Maze 模拟1104 Robbery 递推1106 Transmitters 几何1107 W's Cipher 送分题1110 Double Vision 搜索1111 Image Perimeters 搜索1112 Team Them Up! DP1113 Wall 计算几何->convex hull1119 Start Up the Startup 送分题1120 A New Growth Industry 模拟1122 FDNY to the Rescue! 图论->Dijkstra 1125 Stockbroker Grapevine 图论->Dijkstra 1128 Frame Stacking 搜索1129 Channel Allocation 搜索（图的最大独立集）1131 Octal Fractions 高精度1135 Domino Effect 图论->Dijkstra1137 The New Villa 搜索->BFS1141 Brackets Sequence DP1142 Smith Numbers 搜索1143 Number Game 博弈论1147 Binary codes 构造1148 Utopia Divided 构造1149 PIGS 图论->网络流1151 Atlantis 计算几何->同等安置矩形的并的面积->离散化1152 An Easy Problem! 数论1157 LITTLE SHOP OF FLOWERS DP1158 TRAFFIC LIGHTS 图论->Dijkstra变形1159 Palindrome DP->LCS1160 Post Office DP1161 Walls 图论1162 Building with Blocks 搜索1163 The Triangle DP1170 Shopping Offers DP1177 Picture 计算几何->同等安置矩形的并的周长->线段树1179 Polygon DP1180 Batch Scheduling DP1182 食物链数据结构->并查集1183 反正切函数的应用搜索1184 聪明的打字员搜索1185 炮兵阵地DP->数据压缩1187 陨石的秘密DP（BalkanOI99 Par的拓展）1189 钉子和小球递推?1190 生日蛋糕搜索/DP1191 棋盘分割DP1192 最优连通子集图论->无负权回路的有向图的最长路->BellmanFord 1193 内存分配模拟1194 HIDDEN CODES 搜索+DP1197 Depot 数据结构->Young T ableau1201 Intervals 贪心/图论->最长路->差分约束系统1202 Family 高精度1209 Calendar 日期处理1217 FOUR QUARTERS 递推1218 THE DRUNK JAILER 送分题1233 Street Crossing 搜索->BFS1245 Programmer, Rank Thyself 送分题1247 Magnificent Meatballs 送分题1248 Safecracker 搜索1250 T anning Salon 送分题1251 Jungle Roads 图论->最小生成树1271 Nice Milk 计算几何1273 Drainage Ditches 图论->最大流1274 The Perfect Stall 图论->二分图的最大匹配1275 Cashier Employment 图论->差分约束系统->无负权回路的有向图的最长路->Bellman-Ford1280 Game 递推1281 MANAGER 模拟1286 Necklace of Beads 组合数学->Polya定理1288 Sly Number 数论->解模线性方程组1293 Duty Free Shop DP1298 The Hardest Problem Ever 送分题1316 Self Numbers 递推同Humble Number一样1322 Chocolate 递推/组合数学1323 Game Prediction 贪心1324 Holedox Moving BFS+压缩储存1325 Machine Schedule 图论->二分图的最大匹配1326 Mileage Bank 送分题1327 Moving Object Recognition 模拟?1328 Radar Installation 贪心（差分约束系统的特例）1338 Ugly Numbers 递推(有O(n)算法)1364 King 图论->无负权回路的有向图的最长路->BellmanFord1370 Gossiping (数论->模线性方程有无解的判断)+(图论->DFS)2184 Cow Exhibition DP2190 ISBN 送分题2191 Mersenne Composite Numbers 数论2192 Zipper DP->LCS变形2193 Lenny's Lucky Lotto Lists DP2194 Stacking Cylinders 几何2195 Going Home 图论->二分图的最大权匹配2196 Specialized Four-Digit Numbers 送分题2197 Jill's Tour Paths 图论->2199 Rate of Return 高精度2200 A Card Trick 模拟2210 Metric Time 日期处理2239 Selecting Courses 图论->二分图的最大匹配2243 Knight Moves 搜索->BFS2247 Humble Numbers 递推(最优O(n)算法)2253 Frogger 图论->Dijkstra变形(和1295是一样的)2254 Globetrotter 几何2261 France '98 递推2275 Flipping Pancake 构造2284 That Nice Euler Circuit 计算几何2289 Jamie's Contact Groups 图论->网络流?2291 Rotten Ropes 送分题2292 Optimal Keypad DP2299 Ultra-QuickSort 排序->归并排序2304 Combination Lock 送分题2309 BST 送分题2311 Cutting Game 博弈论2312 Battle City 搜索->BFS2314 POJ language 模拟2315 Football Game 几何2346 Lucky tickets 组合数学2351 Time Zones 时间处理2379 ACM Rank T able 模拟+排序2381 Random Gap 数论2385 Apple Catching DP(像NOI98“免费馅饼”)2388 Who's in the Middle 送分题(排序)2390 Bank Interest 送分题2395 Out of Hay 图论->Dijkstra变形2400 Supervisor, Supervisee 图论->二分图的最大权匹配?2403 Hay Points 送分题2409 Let it Bead 组合数学->Polya定理2416 Return of the Jedi 图论->2417 Discrete Logging 数论2418 Hardwood Species 二分查找2419 Forests 枚举2421 Constructing Roads 图论->最小生成树2423 The Parallel Challenge Ballgame 几何2424 Flo's Restaurant 数据结构->堆2425 A Chess Game 博弈论2426 Remainder BFS2430 Lazy Cows DP->数据压缩1375 Intervals 几何1379 Run Away 计算几何->1380 Equipment Box 几何1383 Labyrinth 图论->树的最长路1394 Railroad 图论->Dijkstra1395 Cog-Wheels 数学->解正系数的线性方程组1408 Fishnet 几何1411 Calling Extraterrestrial Intelligence Again 送分题1430 Binary Stirling Numbers 日期处理1431 Calendar of Maya 模拟1432 Decoding Morse Sequences DP1434 Fill the Cisterns! 计算几何->离散化/1445 Random number 数据结构->碓1447 Ambiguous Dates 日期处理1450 Gridland 图论(本来TSP问题是NP难的，但这个图比较特殊，由现成的构造方法)1458 Common Subsequence DP->LCS1459 Power Network 图论->最大流1462 Random Walk 模拟+解线性方程组1463 Strategic game 贪心1466 Girls and Boys 图论->n/a1469 COURSES 贪心1475 Pushing Boxes DP1476 Always On the Run 搜索->BFS1480 Optimal Programs 搜索->BFS1481 The Die Is Cast 送分题1482 It's not a Bug, It's a Feature! 搜索->BFS1483 Going in Circles on Alpha Centauri 模拟1484 Blowing Fuses 送分题1485 Fast Food DP(似乎就是ioi2000的postoffice)1486 Sorting Slides 图论->拓扑排序1505 Copying Books DP+二分查找1510 Hares and Foxes 数论1512 Keeps Going and Going and ... 模拟1513 Scheduling Lectures DP1514 Metal Cutting 几何1515 Street Directions 图论->把一个无向连通图改造成为有向强连通图1517 u Calculate e 送分题1518 Problem Bee 几何1519 Digital Roots 送分题(位数可能很大)1520 Scramble Sort 排序1547 Clay Bully 送分题1555 Polynomial Showdown 送分题(非常阴险)1563 The Snail 送分题1601 Pizza Anyone? 搜索1604 Just the Facts 送分题1605 Horse Shoe Scoring 几何1606 Jugs 数论/搜索1631 Bridging signals DP+二分查找1632 Vase collection 图论->最大完全图1633 Gladiators DP1634 Who's the boss? 排序1635 Subway tree systems 图论->不同表示法的二叉树判同1637 Sightseeing tour 图论->欧拉回路1638 A number game 博弈论1639 Picnic Planning 图论->1641 Rational Approximation 数论1646 Double Trouble 高精度1654 Area 几何1657 Distance on Chessboard 送分题1658 Eva's Problem 送分题1660 Princess FroG 构造1661 Help Jimmy DP1663 Number Steps 送分题1664 放苹果组合数学->递推1677 Girls' Day 送分题1688 Dolphin Pool 计算几何1690 (Your)((Term)((Project))) 送分题1691 Painting A Board 搜索/DP1692 Crossed Matchings DP1693 Counting Rectangles 几何1694 An Old Stone Game 博弈论?1695 Magazine Delivery 图论->1712 Flying Stars DP1713 Divide et unita 搜索1714 The Cave 搜索/DP1717 Dominoes DP1718 River Crossing DP1719 Shooting Contest 贪心1729 Jack and Jill 图论->1730 Perfect Pth Powers 数论1732 Phone numbers DP1734 Sightseeing trip 图论->Euler回路1738 An old Stone Game 博弈论?1741 Tree 博弈论?1745 Divisibility DP1751 Highways 图论->1752 Advertisement 贪心/图论->差分约束系统1753 Flip Game 搜索->BFS1755 Triathlon 计算几何?1770 Special Experiment 树形DP1771 Elevator Stopping Plan DP1772 New Go Game 构造?1773 Outernet 模拟1774 Fold Paper Strips 几何1775 Sum of Factorials 送分题1776 T ask Sequences DP1777 Vivian's Problem 数论1870 Bee Breeding 送分题1871 Bullet Hole 几何1872 A Dicey Problem BFS1873 The Fortified Forest 几何+回溯1874 Trade on Verweggistan DP1875 Robot 几何1876 The Letter Carrier's Rounds 模拟1877 Flooded! 数据结构->堆1879 Tempus et mobilius Time and motion 模拟+组合数学->Polya定理1882 Stamps 搜索+DP1883 Theseus and the Minotaur 模拟1887 Testing the CATCHER DP1889 Package Pricing DP1893 Monitoring Wheelchair Patients 模拟+几何1915 Knight Moves 搜索->BFS1916 Rat Attack 数据结构->?1936 All in All DP?1946 Cow Cycling DP1947 Rebuilding Roads 二分1985 Cow Marathon 图论->有向无环图的最长路1995 Raising Modulo Numbers 数论->大数的幂求余2049 Finding Nemo 图论->最短路2050 Searching the Web 模拟(需要高效实现)2051 Argus 送分题(最好用堆，不用也可以过)2054 Color a Tree 贪心2061 Pseudo-random Numbers 数论2080 Calendar 日期处理2082 Terrible Sets 分治/2083 Fractal 递归2084 Game of Connections 递推(不必高精度)2105 IP Address 送分题2115 C Looooops 数论->解模线性方程2136 Vertical Histogram 送分题2165 Gunman 计算几何2179 Inlay Cutters 枚举2181 Jumping Cows 递推2182 Lost Cows ->线段树/＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝＝1370 Gossiping (数论->模线性方程有无解的判断)+(图论->DFS)1090 Chain ->格雷码和二进制码的转换2182 Lost Cows ->线段树/2426 Remainder BFS1872 A Dicey Problem BFS1324 Holedox Moving BFS+压缩储存1088 滑雪DFS/DP1015 Jury Compromise DP1050 To the Max DP1062 昂贵的聘礼DP1065 Wooden Sticks DP1074 Parallel Expectations DP1093 Formatting Text DP1112 Team Them Up! DP1141 Brackets Sequence DP1157 LITTLE SHOP OF FLOWERS DP1160 Post Office DP1163 The Triangle DP1170 Shopping Offers DP1179 Polygon DP1180 Batch Scheduling DP1191 棋盘分割DP1293 Duty Free Shop DP2184 Cow Exhibition DP2193 Lenny's Lucky Lotto Lists DP2292 Optimal Keypad DP1432 Decoding Morse Sequences DP1475 Pushing Boxes DP1513 Scheduling Lectures DP1633 Gladiators DP1661 Help Jimmy DP1692 Crossed Matchings DP1712 Flying Stars DP1717 Dominoes DP1718 River Crossing DP1732 Phone numbers DP1745 Divisibility DP1771 Elevator Stopping Plan DP1776 T ask Sequences DP1874 Trade on Verweggistan DP1887 Testing the CATCHER DP1889 Package Pricing DP1946 Cow Cycling DP1187 陨石的秘密DP（BalkanOI99 Par的拓展）1485 Fast Food DP(似乎就是ioi2000的postoffice) 2385 Apple Catching DP(像NOI98“免费馅饼”) 1064 Cable master DP/二分查找1037 A decorative fence DP/组合数学1936 All in All DP?1505 Copying Books DP+二分查找1631 Bridging signals DP+二分查找1159 Palindrome DP->LCS1458 Common Subsequence DP->LCS1080 Human Gene Functions DP->LCS变形2192 Zipper DP->LCS变形1185 炮兵阵地DP->数据压缩2430 Lazy Cows DP->数据压缩1067 取石子游戏博弈论1082 Calendar Game 博弈论1085 Triangle War 博弈论1143 Number Game 博弈论2311 Cutting Game 博弈论2425 A Chess Game 博弈论1638 A number game 博弈论1694 An Old Stone Game 博弈论?1738 An old Stone Game 博弈论?1741 Tree 博弈论?2083 Fractal 递归1104 Robbery 递推1217 FOUR QUARTERS 递推1280 Game 递推2261 France '98 递推2181 Jumping Cows 递推1316 Self Numbers 递推同Humble Number一样2084 Game of Connections 递推(不必高精度) 1338 Ugly Numbers 递推(有O(n)算法)2247 Humble Numbers 递推(最优O(n)算法)1322 Chocolate 递推/组合数学1189 钉子和小球递推?1947 Rebuilding Roads 二分2418 Hardwood Species 二分查找2082 Terrible Sets 分治/1001 Exponentiation 高精度1047 Round and Round We Go 高精度1060 Modular multiplication of polynomials 高精度1131 Octal Fractions 高精度1202 Family 高精度2199 Rate of Return 高精度1646 Double Trouble 高精度1147 Binary codes 构造1148 Utopia Divided 构造2275 Flipping Pancake 构造1660 Princess FroG 构造1772 New Go Game 构造?1005 I Think I Need a Houseboat 几何1039 Pipe 几何1070 Deformed Wheel 几何1092 Farmland 几何1106 Transmitters 几何2194 Stacking Cylinders 几何2254 Globetrotter 几何2315 Football Game 几何2423 The Parallel Challenge Ballgame 几何1375 Intervals 几何1380 Equipment Box 几何1408 Fishnet 几何1514 Metal Cutting 几何1518 Problem Bee 几何1605 Horse Shoe Scoring 几何1654 Area 几何1693 Counting Rectangles 几何1774 Fold Paper Strips 几何1871 Bullet Hole 几何1875 Robot 几何1873 The Fortified Forest 几何+回溯1031 Fence 计算几何1034 The dog task 计算几何1271 Nice Milk 计算几何2284 That Nice Euler Circuit 计算几何1688 Dolphin Pool 计算几何2165 Gunman 计算几何1755 Triathlon 计算几何?1379 Run Away 计算几何->1113 Wall 计算几何->convex hull1434 Fill the Cisterns! 计算几何->离散化/1151 Atlantis 计算几何->同等安置矩形的并的面积->离散化1177 Picture 计算几何->同等安置矩形的并的周长->线段树2419 Forests 枚举2179 Inlay Cutters 枚举1025 Department 模拟1027 The Same Game 模拟1048 Follow My Logic 模拟1049 Microprocessor Simulation 模拟1073 The Willy Memorial Program 模拟1075 University Entrance Examination 模拟1098 Robots 模拟1103 Maze 模拟1120 A New Growth Industry 模拟1193 内存分配模拟1281 MANAGER 模拟2200 A Card Trick 模拟2314 POJ language 模拟1431 Calendar of Maya 模拟1483 Going in Circles on Alpha Centauri 模拟1512 Keeps Going and Going and ... 模拟1773 Outernet 模拟1876 The Letter Carrier's Rounds 模拟1883 Theseus and the Minotaur 模拟2050 Searching the Web 模拟(需要高效实现)1012 Joseph 模拟/数学方法1086 Unscrambling Images 模拟?1327 Moving Object Recognition 模拟?1893 Monitoring Wheelchair Patients 模拟+几何1462 Random Walk 模拟+解线性方程组2379 ACM Rank T able 模拟+排序1879 Tempus et mobilius Time and motion 模拟+组合数学->Polya定理1520 Scramble Sort 排序1634 Who's the boss? 排序2299 Ultra-QuickSort 排序->归并排序1008 Maya Calendar 日期处理1209 Calendar 日期处理2210 Metric Time 日期处理1430 Binary Stirling Numbers 日期处理1447 Ambiguous Dates 日期处理2080 Calendar 日期处理2351 Time Zones 时间处理1770 Special Experiment 树形DP1916 Rat Attack 数据结构->?1197 Depot 数据结构->Young T ableau1182 食物链数据结构->并查集2424 Flo's Restaurant 数据结构->堆1877 Flooded! 数据结构->堆1445 Random number 数据结构->碓1023 The Fun Number System 数论1061 青蛙的约会数论1091 跳蚤数论1152 An Easy Problem! 数论2191 Mersenne Composite Numbers 数论2381 Random Gap 数论2417 Discrete Logging 数论1510 Hares and Foxes 数论1641 Rational Approximation 数论1730 Perfect Pth Powers 数论1777 Vivian's Problem 数论2061 Pseudo-random Numbers 数论1014 Dividing 数论/DP?/组合数学->母函数?1606 Jugs 数论/搜索1995 Raising Modulo Numbers 数论->大数的幂求余2115 C Looooops 数论->解模线性方程1288 Sly Number 数论->解模线性方程组1395 Cog-Wheels 数学->解正系数的线性方程组1000 A+B Problem 送分题1003 Hangover 送分题1004 Financial Management 送分题1006 Biorhythms 送分题1007 DNA Sorting 送分题1016 Numbers That Count 送分题1019 Number Sequence 送分题1028 Web Navigation 送分题1046 Color Me Less 送分题1053 Set Me 送分题1068 Parencodings 送分题1071 Illusive Chase 送分题1096 Space Station Shielding 送分题1099 Square Ice 送分题1102 LC-Display 送分题1107 W's Cipher 送分题1119 Start Up the Startup 送分题1218 THE DRUNK JAILER 送分题1245 Programmer, Rank Thyself 送分题1247 Magnificent Meatballs 送分题1250 T anning Salon 送分题1298 The Hardest Problem Ever 送分题1326 Mileage Bank 送分题2190 ISBN 送分题2196 Specialized Four-Digit Numbers 送分题2291 Rotten Ropes 送分题2304 Combination Lock 送分题2309 BST 送分题2390 Bank Interest 送分题2403 Hay Points 送分题1411 Calling Extraterrestrial Intelligence Again 送分题1481 The Die Is Cast 送分题1484 Blowing Fuses 送分题1517 u Calculate e 送分题1547 Clay Bully 送分题1563 The Snail 送分题1604 Just the Facts 送分题1657 Distance on Chessboard 送分题1658 Eva's Problem 送分题1663 Number Steps 送分题1677 Girls' Day 送分题1690 (Your)((Term)((Project))) 送分题1775 Sum of Factorials 送分题1870 Bee Breeding 送分题2105 IP Address 送分题2136 Vertical Histogram 送分题1555 Polynomial Showdown 送分题(非常阴险) 2388 Who's in the Middle 送分题(排序)1519 Digital Roots 送分题(位数可能很大)1045 Bode Plot 送分题(用物理知识)2051 Argus 送分题(最好用堆，不用也可以过) 1011 Sticks 搜索1020 Anniversary Cake 搜索1054 The Troublesome Frog 搜索1069 The Bermuda Triangle 搜索1072 Puzzle Out 搜索1100 Dreisam Equations 搜索1110 Double Vision 搜索1111 Image Perimeters 搜索1128 Frame Stacking 搜索1142 Smith Numbers 搜索1162 Building with Blocks 搜索1183 反正切函数的应用搜索1184 聪明的打字员搜索1248 Safecracker 搜索1601 Pizza Anyone? 搜索1713 Divide et unita 搜索1129 Channel Allocation 搜索（图的最大独立集）1190 生日蛋糕搜索/DP1691 Painting A Board 搜索/DP1714 The Cave 搜索/DP1084 Square Destroyer 搜索？1010 STAMPS 搜索＋DP1194 HIDDEN CODES 搜索+DP1882 Stamps 搜索+DP1101 The Game 搜索->BFS1137 The New Villa 搜索->BFS1233 Street Crossing 搜索->BFS2243 Knight Moves 搜索->BFS2312 Battle City 搜索->BFS1476 Always On the Run 搜索->BFS1480 Optimal Programs 搜索->BFS1482 It's not a Bug, It's a Feature! 搜索->BFS 1753 Flip Game 搜索->BFS1915 Knight Moves 搜索->BFS1017 Packets 贪心1018 Communication System 贪心1323 Game Prediction 贪心1463 Strategic game 贪心1469 COURSES 贪心1719 Shooting Contest 贪心2054 Color a Tree 贪心1328 Radar Installation 贪心（差分约束系统的特例）1042 Gone Fishing 贪心/DP1752 Advertisement 贪心/图论->差分约束系统1201 Intervals 贪心/图论->最长路->差分约束系统1097 Roads Scholar 图论1161 Walls 图论1450 Gridland 图论(本来TSP问题是NP难的，但这个图比较特殊，由现成的构造方法)2197 Jill's Tour Paths 图论->2416 Return of the Jedi 图论->1639 Picnic Planning 图论->1695 Magazine Delivery 图论->1729 Jack and Jill 图论->1751 Highways 图论->1122 FDNY to the Rescue! 图论->Dijkstra1125 Stockbroker Grapevine 图论->Dijkstra1135 Domino Effect 图论->Dijkstra1394 Railroad 图论->Dijkstra1158 TRAFFIC LIGHTS 图论->Dijkstra变形2395 Out of Hay 图论->Dijkstra变形2253 Frogger 图论->Dijkstra变形(和1295是一样的)1734 Sightseeing trip 图论->Euler回路1466 Girls and Boys 图论->n/a1515 Street Directions 图论->把一个无向连通图改造成为有向强连通图1635 Subway tree systems 图论->不同表示法的二叉树判同1275 Cashier Employment 图论->差分约束系统->无负权回路的有向图的最长路->Bellman-Ford1274 The Perfect Stall 图论->二分图的最大匹配1325 Machine Schedule 图论->二分图的最大匹配2239 Selecting Courses 图论->二分图的最大匹配2195 Going Home 图论->二分图的最大权匹配2400 Supervisor, Supervisee 图论->二分图的最大权匹配?1637 Sightseeing tour 图论->欧拉回路1383 Labyrinth 图论->树的最长路1094 Sorting It All Out 图论->拓扑排序1486 Sorting Slides 图论->拓扑排序1149 PIGS 图论->网络流2289 Jamie's Contact Groups 图论->网络流?1192 最优连通子集图论->无负权回路的有向图的最长路->BellmanFord 1364 King 图论->无负权回路的有向图的最长路->BellmanFord1985 Cow Marathon 图论->有向无环图的最长路1087 A Plug for UNIX 图论->最大流1273 Drainage Ditches 图论->最大流1459 Power Network 图论->最大流1632 Vase collection 图论->最大完全图2049 Finding Nemo 图论->最短路1251 Jungle Roads 图论->最小生成树2421 Constructing Roads 图论->最小生成树1026 Cipher 组合数学1095 Trees Made to Order 组合数学2346 Lucky tickets 组合数学1286 Necklace of Beads 组合数学->Polya定理2409 Let it Bead 组合数学->Polya定理1664 放苹果组合数学->递推。

深圳人口英文版(优秀)

The P opulation of Shenzhen and T he M edical D emand PredictionFrist editorSecond editorThird editorYour school nameAdvisor:********Abstract::AbstractIn this paper,using the establishment of the Malthus index of population growth model,gray model and linear regression model predict the population growth and beds demand for medical care in Shenzhen city.For question One:first of all,through the analysis the household register population,the non-registered population,the change of the number of the population characteristics of Shenzhen in recent ten years,we build the Malthus index of population growth model on the household registration population.To predict the number of the non-registered population,we use the cubic exponential smoothing method.We establish the gray model GM(1,1)based on different age groups,moreover，we verify the feasibility of the ing the model,we predict the number of Shenzhen resident non-registered population numbers and different age population in the next ten years.Then,using linear fitting method forecasts the value of beds demand.For question Two:we mainly consider the relationship between age structure and prevalence characteristics.According to each kind of disease in high-risk population,we determine the effect of age structure of population.Then,combined with the incidence rate,we analyzed the number of sick.According to the clinic characteristics of medical institutions of different types,we figure out the bed-space requirements of various kinds of diseases in different types of medical institutions by MATLAB software.Keywords:Malthus index of population growth model,Cubic exponential smoothing method,GM(1,1),MATLAB software1.IntroductionShenzhen is one of the fastest growing city in economy development.In recent years,with the reform and opening up and the changes of industrial structure in Shenzhen,the population of Shenzhen is undergoing tremendous changes.Therefore，it is particularly important to predict the population’s trend of Shenzhen.2.Problem analysisAiming at a population prediction problem,the population is divided into the household register population and the non-registered population.Due to the mobility of the population is relatively low,and the time of the prediction is relatively short,so we can develop the Malthus index of population growth model to estimate the number of registered population.The non-registered population population growth is much affected by various factors and the non-domicile population increased by second-degree curve,so we can use the cubic exponential smoothing method to predict the number of non-registered population.For the problem of a bed demand prediction,in a certain extent,the total numberof beds can reflect the demand of beds in 1979-2010years .Because of serious tendency of the aging population of Shenzhen city ,the number of beds are still needed on the original basis and keep a year-on-year increase of population at least in the future.For problem two,we do analysis on children pneumonia and hypertension,children pneumonia is characterized by pneumonia in children,which only for the age is in 0-4years old group.Meanwhile,hypertension is mainly in the elderly population.3.Model assumptions1.The data are accurate and reliable;2.Shenzhen city population prevalence is in the average rate of medium-sized city National hospital.3.The district of Shenzhen city population distribution is not changed greatly in 10years.4.Assuming that economic development in Shenzhen city level maintain a stable in the future of a period of time.5.The population policy and the medical system does not change .6.There is not considering the impact of major disease and war effect on population.4.Symbolic descriptionpopulation Permanent :y populationRegistered :1y populationregistered -Non :2y 5.Establishing and Analyzing the Models5.1the Prediction of P opulation of Shenzhen (2011-2020)We assume that the number of the population of Shenzhen is y in t year,of which the household registration population is 1y and the non-registered population is 2y .We can get :21y y y +=.Next,we forecast on the registration population and non -household population.5.1.1the Prediction M odel of the Registered P opulationIn consideration of the household registration population,owing to the mobility population is low relatively and Δt is short,so there is no need to consider the blocking effect of natural resources,environmental conditions and other factors on population growth,so using Malthus index of []1growth population model to estimate the population.)(110)0()(t t r e y t y −=(1)According to the formula (1)and the number of register population （Appendix I attached),,Using nonlinear least squares fitting by MATLAB software can get the r parameter value is 0.0677.Get the model :)1979(0667.0126.31)(−=t e t y (2)According to the formula (2),we get the number of estimated population,its residual error and relative error from 1979to 2010.The results are shown in the following T able 1:Year Actual number of registered population Estimated number of registered population Residual error Relative error (%)197931.2631.2600198032.0933.4496-1.3596 4.236833905198133.3935.7925-2.40257.195268044198235.4538.2996-2.84968.038363893198340.5240.9823-0.4623 1.140918065198443.5243.8528-0.33280.764705882198547.8646.92450.9355 1.954659423198651.4550.2113 1.2387 2.407580175198755.653.7283 1.8717 3.366366906198860.1457.4916 2.6484 4.403724643198964.8261.5186 3.3014 5.093181117199068.6565.8276 2.8224 4.111289148199173.2270.4385 2.7815 3.798825458199280.2275.3723 4.8477 6.043006731199387.6980.65177.03838.026342798199493.9786.30087.66928.161328083199599.1692.3457 6.8143 6.872025011996103.3898.814 4.566 4.4167150321997109.46105.7354 3.7246 3.4027041841998114.6113.1415 1.4585 1.2726876091999119.85121.0664-1.2164 1.0149353362000124.92129.5464-4.6264 3.7034902342001132.04138.6204-6.5804 4.9836413212002139.45148.33-8.886.367873792003150.93158.7196-7.7896 5.1610680452004165.13169.837-4.707 2.8504814392005181.93181.73310.19690.108228442006196.83194.4625 2.3675 1.2028146122007212.38208.0835 4.2965 2.0230247672008228.07222.6585 5.4115 2.3727364412009241.45238.2545 3.1955 1.3234624152010251.03254.9428-3.9128 1.558698164 Table1the N umber of E stimated P opulationopulation,,Its R esidual E rror and R elative E rrorIt is seen from Table1,the actual population and estimated population are quite close and the absolute relative error is within9%.we can conclude that the model is reliable,efficient and accurate.Through the model to calculate the predictive value of2011-2020,we obtain the following table:Table2the predictive value of2011-2020Year20112012201320142015Predictive value(10000)272.8001291.9082312.3547334.2333357.6444Year20162017201820192020Predictive value(10000)382.6954409.501438.1842468.8765501.71865.1.2the Prediction M odel of the Non-registered P opulationFirst,we analyze in the non-registered population data in Shenzhen city from 1979to2010.The results are shown the flowing Figure1:Figure 1the Trend of the Non-registered P opulationFrom the Figure one ,we can arrival at a conclusion that the non-registered population increased with second-degree curve from 1979to 2010.Meanwhile ,cubic exponential smoothing method can be used to predicted the the trend of the non-registered population from 2011to 2020.Exponential smoothing method is a time sequence ，which based on the analysis and prediction means,in the moving average method.With some time series prediction model of certain phenomena in the future prediction ，it is calculated by exponential smoothing value.Its principle is exponential smoothing any period values are the actual observed and previous period of exponential smoothing weighted average.The forecasting model of cubic exponential smoothing method are shown as follows:],...2,1,2=++=+∧T T c T b a y t t t T t (3)of which⎪⎪⎪⎪⎩⎪⎪⎪⎪⎨⎧+−−=−+−−−−=+−=]2[)1(2])34()45(2)56[()1(233)3()2()1(22)3()2()1(2)3()2()1(t t tt t t t t t t t tS S S c S S S b S S S a ααααααα(4)Where )(k t S is smoothing value with an index of k and α（10<<α）is the weighted coefficient.Here α=0.4,i nitial value ,55.13321)0(3)0(2)0(1=++===y y y S S S .We can gain )1(t S ,)2(t S ,(By Appendix 1).Moreover ,we can get the values asfollows:)3(t S 7867.743)1(32=S ,4296.705)2(32=S ,7718.668)3(32=S .When t=32,from the formula (3),32a ，32b ，32c can be calculated :783.843132=a ，28.214732=b ，0.377632=c ;Therefore,the prediction model is:2113776.02147.288431.783T T y T ++=+∧（5）here t=11.This model can be forecast the number of the the non-registered population from 2011to 2020.As is shown in Table 3:Table 3the n on-on-registeredregistered p opulation from 2011to 2020Year 20112012201320142015Predictivevalue value(10000)(10000)812.43841.78871.88902.74934.35Year 20162017201820192020Predictivevalue value(10000)(10000)966.72999.841033.71068.41103.8When T=3,we took the formula (4)into the prediction model of formula (3).Then:)3(2)2(2)1(221)1(1)1(3)1(33tt tt S S S y αααααα−+−−−−+−=+∧（6）To gain the prediction value ,taking t(from 0to 32)into formula (6).The fitted resultcan be seen in Figure 2:Figure 2Seen from figure 2,the forecasting value basically agrees with the actual value.Therefore,we can predict the number of the Non-registered population.5.1.3t he D evelopment T rend of P opulation S tructureIn order to predict the development trend of Shenzhen population structure,we listed the population age distribution of Shenzhen city according to Annex2,3,4.As shown in the following table:According to the data from the table,we can see the population age structure and sex structure data is only in 2000,2005and 2010,and the interval is 5years.So we can predict the population structure model in 2015and 2020by using the gray prediction theory.In the process of establishing the gray model,we put the data of 32years as the initial sequence:))3(),2(),1((0000x x x x =(7)So we can get )(k λ:)()1(00k x k x k −=λ(8)Checking and calculating of ratio values to test whether they are in ),(2212++−n n e e ,if not,we need to take sequence x (0)to do the necessary transformation to fall into the admissible covering interval .To seek the proper constant C and translation transform:nk c k x k y ,...,2,1,)()(00=+=(9)Frist ,to make the accumulated generating operation (AGO)gain the sequence :)]3()2(),2()1(),1([)]2(),2(),1([)0()1()0()1()1()1()1()1()1(x x x x x x x x x ++==(10)Of which )3,2,1)(()(10)1(==∑=k i x k x ki .3,2),1(5.0)(5.0)()1()1()1(=−+=k k x k x k z (11)Figure3From the Figure3,the change of population and beds is linear on the whole.The population growth faster during1991-2003,while the level of medical development is relatively stable.Especially in the past ten years and the population is linear growth basically.By the MATLAB,the data is used to do quadratic fit in nearly ten years,we can conclude the Figure4:Figure 4According to Figure 4,we can calculate that the parameter is 0.0036and -1.4815.Then ,equations can be obtained:()134815.10036.0)(−=n n y By Formula (13),we can calculate the number of the corresponding pared with the actual data,the relative error is shown as following:Figure5R elative E rror of E stimated V alue and A ctual V alue During2000-2010From Figure5,the relative error of estimated value and actual value is almost within0.1,indicating that the prediction equation can simulate the change trend of hospital beds accurately.Therefore the development trend of beds can be predicted 2011-2020in the Shenzhen city,as is shown the follows in the Table7:Table7t he D evelopment Trend of B eds During2010-2020Year20112012201320142015 Predictivevalue(1042455826316281503006332059person)Year20162017201820192020Predictive3414236317385864095843434person)value(104The population distribution of Shenzhen city in2000and2010the district is shown in Figure6and Figure7:Figure6the P roportion of P opulation D istribution in2020000Figure7the P roportion of P opulation D istribution in2010See from Figure 6and Figure 7,the population distribution changes in the proportion of the district is very little in 10years.So we can assume that the population distribution of Shenzhen District remained basically unchanged within ten years ,then the proportion of the population is shown in Table 8:Table 8the P roportion of the P opulation in 2010in Shenzhen City Luohu Distric tFutian District Nan-sh an District Baoan District Long-g ang District Yanhu District ,Guang ming New District Ping-sh an New District K（i）0.090.130.110.390.190.020.050.02The bed demand of each area:)(*)()(i Q i K i q =,here q is the need of beds each area,K(i)isthe proportionof the population in 2010in Shenzhen city,Q(i)is the bed need of Shenzhen and the variable i representing the year.By calculating,we can get the following Table 9:Table 9the Bed Demand Prediction of Each AreaLuohu DistrictFutian District Nan-sha n District Baoan District LonggangDistrictYanhuDistrictGuangm ing New District Ping-sha n New District k(i)0.080.1270.1050.3880.1940.020.0460.032011196431182578952847644911129736201221053342276310210510552612107892013225235752955109225461563129484420142405381831561166458326011382901201525644071336612438.8621964114749612016273143363584132476623682157010242017290546123813140907045726167010892018308649004051149717485771177411572019327652014300158917945819188412282020347455164560168528426868199713035.2the A nalysis of B eds D emand According to D ifferent D isease 5.2.1Infantile PneumoniaInfantile pneumonia is a common clinical disease ；it is easy to occur in four seasons ，especially in winter and spring.If it is not treated thoroughly,easy to repeated attacks and affecting the child's development normally.According to the Shenzhen City Health Bureau of Health Statistics Yearbook from 2001to 2010,the prevalence rate of different years are listed in the following T able 9:T able 9.The prevalence rate and the duration of hospitalization with i nfantilepneumoniaYear 2001200220032004200520062007200820092010P revalenc e rate (%)0.08970.08120.06950.07190.06290.05190.04020.04020.0310.03D uration of hospitalization ation(day (day )6666676667It can be seen from Table 9，along with the modern medical technology improving,the prevalence rate of infantile pneumonia is falling.As stated in 5-1-3,Checking and calculating of ratio values,we can know that they are in ()9680.9666,1.2.Thus we can use the gray model GM (1,1).UsingMATLAB software,the parameters can be obtained.The absolute value of the relative error between the estimated value and the actual value is:Figure 8The absolute value of the relative errorAs you can see from figure 8,the relative error is rather small,so we can estimate the future incidence rate using this model.By this model,we can get the prevalence rate was 0.0163and 0.0087in 2015and 2020.According to table 6,we can get the number of infantile pneumonia were 10529and 8773.5.2.2senile cataractAccording to theShenzhen City Health Bureau of Health Statistics Yearbook in 2001-2010,the prevalence rate of different years are listed in the following table 10:rate rate(%)(%)D uratio n of hospital ization ization((day)444443 2.7Along with the development of Shenzhen City,senile cataract prevalence is increasing year by year from the data in table 10.Then,we predicted the senile cataract prevalence by the cubic exponential smoothing method,and verify the correctness of the model.Moreover,we work out the relative error of the senile cataract prevalence between estimation value and the actual value ,as shown in the following table F igure 8:F igure 8the relative error of the senile cataract prevalenceFrom the table we can see ,owing to having the smaller relative errors between model value and real value,so it is a reasonable model.By using this mode,lwe can estimate the prevalence rate ,and then get that the prevalence rate is 0.00361and 0.01433.According to table 6,the number of old people who suffering from senile cataract were 4394and 29262.5.2.3B ed demand of different type types s hospital According to the statistical yearbook of 2011medical institutions of various kindsvisits and inpatients in Shenzhen city ,health agencies can be divided for the hospital,nursing homes,clinics and so on 。

多尺度分类挖掘算法

收稿日期：２０２００１１３；修回日期：２０２００３０３基金项目：国家社科基金重大项目（１３＆ＺＤ０９１，１８ＺＤＡ２００）作者简介：张璐璐（１９９３），女，河北景县人，硕士，主要研究方向为数据挖掘、智能信息处理；赵书良（１９６７），男（通信作者），河北献县人，教授，博导，主要研究方向为数据挖掘、智能信息处理（ｚｈａｏｓｈｕｌｉａｎｇ＠ｓｉｎａ．ｃｏｍ）；田真真（１９９４），女，河北威县人，硕士，主要研究方向为数据挖掘、智能信息处理；陈润资（１９８１），男，河南潢川人，博士研究生，主要研究方向为数据挖掘、智能信息处理．多尺度分类挖掘算法张璐璐ａ，ｂ，ｃ，赵书良ａ，ｂ，ｃ，田真真ａ，ｂ，ｃ，陈润资ｄ（河北师范大学ａ．计算机与网络空间安全学院；ｂ．河北省供应链大数据分析与数据安全工程研究中心；ｃ．河北省网络与信息安全重点实验室；ｄ．数学科学学院，石家庄０５００２４）摘　要：多尺度分类挖掘多局限于空间数据，且对一般数据尺度特性进行分类的研究较少。

针对上述问题，进行普适的多尺度分类方法研究，以扩大多尺度适用范围。

从空间数据估计角度出发，结合层次理论和尺度特性，基于概率密度估计离散化方法，针对数据的多尺度特性进行分类挖掘。

以非局部均值和三次卷积插值为理论基础，利用Ｑ统计和不一致度量进行操作，提出多尺度分类尺度上推算法和多尺度分类尺度下推算法。

采用ＵＣＩ数据集和Ｈ省人口真实数据集进行实验，并与ＣＦＷ、ＭＳＣＳＵＡ和ＭＳＣＳＤＡ等算法进行对比，结果表明，该算法可行有效。

与其他算法相比，尺度上推算法正确率平均提高４．５％，Ｆｓｃｏｒｅ提高４．８％，ＮＭＩ提高１２．３％，尺度下推算法各个相应指标分别平均提高５．３％，６．６％和１１．８％。

关键词：多尺度；不一致度量；尺度转换；多尺度分类挖掘；Ｑ统计中图分类号：ＴＰ３９１文献标志码：Ａ文章编号：１００１３６９５（２０２１）０２０１６０４１４０７ｄｏｉ：１０．１９７３４／ｊ．ｉｓｓｎ．１００１３６９５．２０２０．０１．０００７ＭｕｌｔｉｓｃａｌｅｃｌａｓｓｉｆｉｃａｔｉｏｎａｌｇｏｒｉｔｈｍＺｈａｎｇＬｕｌｕａ，ｂ，ｃ，ＺｈａｏＳｈｕｌｉａｎｇａ，ｂ，ｃ，ＴｉａｎＺｈｅｎｚｈｅｎａ，ｂ，ｃ，ＣｈｅｎＲｕｎｚｉｄ（ａ．ＣｏｌｌｅｇｅｏｆＣｏｍｐｕｔｅｒ＆ＣｙｂｅｒＳｅｃｕｒｉｔｙ，ｂ．ＨｅｂｅｉＰｒｏｖｉｎｃｉａｌＥｎｇｉｎｅｅｒｉｎｇＲｅｓｅａｒｃｈＣｅｎｔｅｒｆｏｒＳｕｐｐｌｙＣｈａｉｎＢｉｇＤａｔａＡｎａｌｙｔｉｃｓ＆ＤａｔａＳｅｃｕｒｉｔｙ，ｃ．ＫｅｙＬａｂｏｒａｔｏｒｙｏｆＮｅｔｗｏｒｋ＆ＩｎｆｏｒｍａｔｉｏｎＳｅｃｕｒｉｔｙ，ｄ．ＳｃｈｏｏｌｏｆＭａｔｈｅｍａｔｉｃａｌＳｃｉｅｎｃｅｓ，ＨｅｂｅｉＮｏｒｍａｌＵｎｉｖｅｒｓｉｔｙ，Ｓｈｉｊｉａｚｈｕａｎｇ０５００２４，Ｃｈｉｎａ）Ａｂｓｔｒａｃｔ：Ｍｕｌｔｉｓｃａｌｅｃｌａｓｓｉｆｉｃａｔｉｏｎｍｉｎｉｎｇａｒｅｍｏｓｔｌｙｌｉｍｉｔｅｄｔｏｓｐａｔｉａｌｄａｔａ，ａｎｄｔｈｅｒｅａｒｅｆｅｗｒｅｓｅａｒｃｈｅｓｏｎｓｃａｌｅｃｈａｒａｃｔｅｒｉｓｔｉｃｓｏｆｇｅｎｅｒａｌｄａｔａ．Ｂｙｓｏｌｖｉｎｇｔｈｅａｂｏｖｅｐｒｏｂｌｅｍｓ，ｔｈｉｓｐａｐｅｒｔｒｉｅｄｔｏｓｔｕｄｙｔｈｅｕｎｉｖｅｒｓａｌｍｕｌｔｉｓｃａｌｅｃｌａｓｓｉｆｉｃａｔｉｏｎｍｅｔｈｏｄ，ｉｎｏｒｄｅｒｔｏｅｘｐａｎｄｔｈｅｓｃｏｐｅｏｆｍｕｌｔｉｓｃａｌｅａｐｐｌｉｃａｔｉｏｎ．Ｆｒｏｍｔｈｅｐｅｒｓｐｅｃｔｉｖｅｏｆｓｐａｔｉａｌｄａｔａｅｓｔｉｍａｔｉｏｎ，ｃｏｍｂｉｎｅｄｔｈｅｈｉｅｒａｒｃｈｉｃａｌｔｈｅｏｒｙａｎｄｓｃａｌｅｃｈａｒａｃｔｅｒｉｓｔｉｃｓ，ａｎｄｂａｓｅｄｏｎｔｈｅｄｉｓｃｒｅｔｉｚａｔｉｏｎｍｅｔｈｏｄｏｆｐｒｏｂａｂｉｌｉｔｙｄｅｎｓｉｔｙｅｓｔｉｍａｔｉｏｎ，ｔｈｉｓｐａｐｅｒｓｔｕｄｉｅｄｔｈｅｃｌａｓｓｉｆｉｃａｔｉｏｎｍｉｎｉｎｇｏｎｍｕｌｔｉｓｃａｌｅｃｈａｒａｃｔｅｒｉｓｔｉｃｓｏｆｇｅｎｅｒａｌｄａｔａ．Ｂａｓｅｄｏｎｔｈｅｔｈｅｏｒｙｏｆｎｏｎｌｏｃａｌｍｅａｎａｎｄｄｏｕｂｌｅｃｕｂｅｉｎｔｅｒｐｏｌａｔｉｏｎ，ｕｓｉｎｇＱｓｔａｔｉｓｔｉｃｓａｎｄｉｎｃｏｎｓｉｓｔｅｎｔｍｅａｓｕｒｅｍｅｎｔｔｏｏｐｅｒａｔｅ，ｉｔｐｒｏｐｏｓｅｄｔｈｅｕｐｓｃａｌｉｎｇａｌｇｏｒｉｔｈｍｏｆｍｕｌｔｉｓｃａｌｅｃｌａｓｓｉｆｉｃａｔｉｏｎａｎｄｄｏｗｎｓｃａｌｉｎｇａｌｇｏｒｉｔｈｍｏｆｍｕｌｔｉｓｃａｌｅｃｌａｓｓｉｆｉｃａｔｉｏｎ．ＴｈｉｓｐａｐｅｒｐｅｒｆｏｒｍｅｄｅｘｐｅｒｉｍｅｎｔｓｏｎＵＣＩｄａｔａｓｅｔｓａｎｄＨｐｒｏｖｉｎｃｅｒｅａｌｐｏｐｕｌａｔｉｏｎｄａｔａｓｅｔ，ａｎｄｃｏｍｐａｒｅｄｗｉｔｈＣＦＷ，ＭＳＣＳＵＡ，ＭＳＣＳＤＡａｎｄｏｔｈｅｒａｌｇｏｒｉｔｈｍｓ．Ｔｈｅｒｅｓｕｌｔｓｓｈｏｗｔｈａｔｔｈｅａｌｇｏｒｉｔｈｍｓｉｎｔｈｉｓｐａｐｅｒａｒｅｆｅａｓｉｂｌｅａｎｄｅｆｆｅｃｔｉｖｅ．Ｃｏｍｐａｒｅｄｗｉｔｈｏｔｈｅｒａｌｇｏｒｉｔｈｍｓ，ｔｈｅｕｐｓｃａｌｉｎｇａｌｇｏｒｉｔｈｍｉｍｐｒｏｖｅｓａｃｃｕｒａｃｙｂｙ４．５％，Ｆｓｃｏｒｅｂｙ４．８％ａｎｄＮＭＩｂｙ１２．３％ａｎｄｔｈｅｄｏｗｎｓｃａｌｉｎｇａｌｇｏｒｉｔｈｍｉｍｐｒｏｖｅｓｔｈｅｃｏｒｒｅｓｐｏｎｄｉｎｇｉｎｄｅｘｅｓｂｙ５．３％，６．６％ａｎｄ１１．８％．Ｋｅｙｗｏｒｄｓ：ｍｕｌｔｉｓｃａｌｅ；ｄｉｓａｇｒｅｅｍｅｎｔｍｅａｓｕｒｅ；ｓｃａｌｅｃｏｎｖｅｒｓｉｏｎ；ｍｕｌｔｉｓｃａｌｅｃｌａｓｓｉｆｉｃａｔｉｏｎｍｉｎｉｎｇ；Ｑｓｔａｔｉｓｔｉｃｓ０　引言尺度是各种数据自身的属性，普遍存在于客观世界中［１，２］。

政策不确定性的宏观经济后果2365

政策不确定性的宏观经济后果2365经济理论与经济管理2014年第2期政策不确定性的宏观经济后果金雪军钟意王义中(浙江大学经济学院，杭州310027) [提要]本文采取贝克的政策不确定性指数，运用FAVAR方法分析政策不确定性冲击对中国宏观经济的影响。

经验结果发现，政策不确定性冲击对GDP、投资、消费、出口和价格变动会带来负向影响，导致实际有效汇率贬值，促使股票价格和房地产价格下跌。

同时发现，政策不确定性作用于宏观经济的主要机制为预期渠道。

该结论表明，政府应当尽量保持宏观经济政策的稳定性和持续性，并加强引导公众合理预期。

[关键词]政策不确定性FAVAR模型;政策不确定性指数[中图分类号]F822.0 [文献标识码]A [文章编号]1000-596X (2014) 02一0017-10这种反向关系在不同国家(例如发达国家和发展中国家)和不同时间都是稳健的。

因一、引言一直以来，不确定性对经济的影响被学术界所经济萧条的一个最突出的特征就是不确定性的关注。

[5J尤其是2007年爆发的美国次贷危机，使得普遍蔓延。

[IJ美国联邦公开市场委员会一再强调不政府层面和学术界更加关注这一问题。

经济主体面确定性是导致美国2001年以及2007-2009年经济对的不确定性水平突然变化已经被视为驱动美国商J衰退的一个关键因素。

由斯托克和沃森(Stock业周期的一个重要的冲击。

[6J布卢姆(Bloom)[7and 使Watson)[3J也认为美国在2007-2009年经济衰退用一个简单的简约式(Reduced-form)VAR模型，期间，产出和就业下降的主要原因来自金融和不确估计不确定性冲击会减少大约1%的美国工业产定性冲击。

大量证据显示不确定性是反经济周期出。

并且初始下降以后，工业产出迅速恢复，随后的，在经济萧条时剧烈增加而在经济繁荣时下降，产生超调，超过其总体趋势大约1%。

古里奥等人并且不确定性强烈影响经济衰退和复苏程度。

Kalman filtering with intermittent observations

where xt ∈
n
is the state vector, yt ∈
m
the output vector, wt ∈
p
and vt ∈
m
are Gaussian
random vectors with zero mean and covariance matrices Q ≥ 0 and R > 0, respectively. wt is independent of ws for s < t. Assume that the initial state, x0 , is also a Gaussian vector of zero mean and covariance Σ0 . Under the hypothesis of stabilizability of the pair (A, Q) and detectability of the pair (A, C ), the estimation error covariance of the Kalman ﬁlter converges to a unique value from any initial condition [8]. These assumptions have been relaxed in various ways [8]. Extended Kalman ﬁltering attempts to cope with nonlinearities in the model; particle ﬁltering is also appropriate for nonlinear models, and additionally does not require the noise model to be Gaussian. Recently, more general observation processes have been studied. In particular, in [9], [10] the case in which observations are randomly spaced in time according to a Poisson process has been studied, where the underlying dynamics evolve in continuous time. These authors showed the existence of a lower bound on the arrival rate of the observations below which it is possible to maintain the estimation error covariance below a ﬁxed value, with high probability. The results were restricted to scalar SISO systems.

产品替代弹性——Globalization and gains from variety

Globalization and the Gains from Variety
I) Introduction It is striking that in the quarter-century since Krugman (1979) revolutionized international trade theory by modeling how countries could gain from trade through the import of new varieties, no one has structurally estimated the impact of increased variety on aggregate welfare. As a result, our understanding of the importance of new trade theory for national welfare rests on conjecture, calibration, and case studies. This paper represents the first attempt to provide an answer to the question of how much increases in traded varieties matter for the US. Analyzing the most disaggregated US import data available for the period between 1972 and 2001, we find that consumers have low elasticities of substitution across similar goods produced in different countries. This establishes the reasonableness of a key assumption underlying the Dixit-Stiglitz (1977) framework – namely, that consumers value variety. Moreover, we find that the four-fold increase in available global varieties arising in the last 30 years has produced a large welfare gain for the United States. Increases in imported varieties have raised US real income by about 3 percent. In short, our results provide stunning confirmation of the importance of thinking about international trade within the Dixit-Stiglitz framework. In order to generate these point estimates, our work makes contributions to a number of literatures. The first is toward the measurement of prices. One of the major problems confronting economists is how to build an aggregate price index that allows for the creation of new goods and varieties. While there is general agreement that this is important, efforts to accomplish it

1、下载文档前请自行甄别文档内容的完整性，平台不提供额外的编辑、内容补充、找答案等附加服务。
2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
3、如文档侵犯您的权益，请联系客服反馈,我们会尽快为您处理(人工客服工作时间：9:00-18:30)。

AN EMPIRICAL INVESTIGATION OF THE IMPACT OF DISCRETIZATION ON COMMON DATA DISTRIBUTIONS

Michael Kumaran Ismail

Master of Technology (Information Technology)

2003

RMIT University 2

An Empirical Investigation of the Impact of Discretization on Common Data Distributions

by Michael K. Ismail

A dissertation submitted in partial fulfillment of the requirements for the degree of Master of Technology (Information Technology)

Department of Computer Science RMIT University Melbourne, VIC 3001, AUSTRALIA

Abstract This study attempts to identify the merits of six of the most popular discretization methods when confronted with a randomly generated dataset consisting of attributes that conform to one of eight common statistical distributions. It is hoped that the analysis will enlighten as to a heuristic which identifies the most appropriate discretization method to be applied, given some preliminary analysis or visualization to determine the type of statistical distribution of the attribute to be discretized. Further, the comparative effectiveness of discretization given each data distribution is a primary focus. Analysis of the data was accomplished by inducing a decision tree classifier (C4.5) on the discretized data and an error measure was used to determine the relative value of discretization. The experiments showed that the method of discretization and the level of inherent error placed in the class attribute have a major impact on classification errors generated post-discretization. More importantly, the general effectiveness of discretization varies significantly depending on the shape of data distribution considered. Distributions that are highly skewed or have high peaks tend to result in higher classification errors, and the relative superiority of supervised discretization over unsupervised discretization is diminished significantly when applied to these data distributions. 3

Declaration I certify that all work on this dissertation was carried out between March 2003 and June 2003 and it was not submitted for any academic award at any other college, institute or university. The work presented was carried out under the supervision of Dr Vic Ciesielski who proposed the idea of creating random data and applying an inherent error in the construction of class labels. All work in the dissertation is my own except where acknowledged in the text.

Signed, Michael K. Ismail, 25th of June, 2003 4

TABLE OF CONTENTS 1. Introduction....................................................................................................................................................6 2. Data Generation Methodology.......................................................................................................................6 3. Data Distributions..........................................................................................................................................8 3.1 Normal Distribution....................................................................................................................................8 3.2 Uniform Distribution...................................................................................................................................8 3.3 Leptokurtic Distribution..............................................................................................................................9 3.4 Platykurtic Distribution............................................................................................................................10 3.5 Bimodal Distribution.................................................................................................................................10 3.6 Skewed Distribution..................................................................................................................................11 3.7 Exponential Distribution...........................................................................................................................12 3.8 Zipf Distribution.......................................................................................................................................12 4. Discretization Techniques............................................................................................................................13 4.1 Unsupervised Discretization.....................................................................................................................13 4.1.1 Equal Interval Width/Uniform Binning.............................................................................................13 4.1.2 Equal Frequency/Histogram Binning................................................................................................13

RMIT University An Empirical Investigation of the Impact of Discretization on Common Data D

合集下载

职业教育投入对全要素生产率的r影响研究r——基于空间纠正系统GMM法

VC毕业论文GMRES算法的加速收敛现象分析毕业论文

PbI2

MIT麻省理工学院算法导论公开课 Problem Set 4 solution

POJ题目分类

深圳人口英文版(优秀)

多尺度分类挖掘算法

政策不确定性的宏观经济后果2365

Kalman filtering with intermittent observations

产品替代弹性——Globalization and gains from variety

文档推荐

最新文档

RMIT University An Empirical Investigation of the Impact of Discretization on Common Data D

合集下载

职业教育投入对全要素生产率的r影响研究r——基于空间纠正系统GMM法

VC毕业论文GMRES算法的加速收敛现象分析毕业论文

PbI2

MIT麻省理工学院 算法导论公开课 Problem Set 4 solution

POJ题目分类

深圳人口英文版(优秀)

多尺度分类挖掘算法

政策不确定性的宏观经济后果2365

Kalman filtering with intermittent observations

产品替代弹性——Globalization and gains from variety

文档推荐

最新文档

MIT麻省理工学院算法导论公开课 Problem Set 4 solution