IDRQR: An Incremental Dimension Reduction Algorithm via QR Decomposition
Commonly used GTSAM factors

IMU preintegration. Inertial measurement unit (IMU) preintegration is a technique that integrates IMU measurements over a period of time, typically between two keyframes. This reduces the computational cost of integrating the measurements online and allows for more efficient optimization. GTSAM provides a robust and accurate implementation of IMU preintegration.

Stereo vision. Stereo vision estimates the depth of a scene using two or more cameras. GTSAM provides a variety of stereo vision factors, including the pinhole model and the fisheye model. These factors can be used to estimate the poses of the cameras and the depth of the scene.

Lidar. Lidar (light detection and ranging) is a remote sensing technology that uses laser pulses to measure the distance to objects. GTSAM provides a variety of lidar factors, including point-to-plane, point-to-line, and plane-to-plane models. These factors can be used to estimate the pose of the lidar sensor and the locations of objects in the scene.

GPS. The Global Positioning System (GPS) is a satellite-based navigation system that provides location and time information. GTSAM provides a variety of GPS factors, including a position-only model and a velocity-aided model. These factors can be used to estimate the pose of the GPS receiver and the velocity of the vehicle.

Odometry. Odometry estimates the pose of a vehicle using measurements from its wheel encoders. GTSAM provides a variety of odometry factors, including a differential-drive model and a unicycle model. These factors can be used to estimate the pose of the vehicle and the velocities of the wheels.
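As a concrete illustration of the preintegration idea described above, here is a tiny plain-R sketch (generic arithmetic only, not GTSAM's API; the sample values are hypothetical). The point is that raw samples between two keyframes are summarized once into relative-motion quantities that the optimizer can reuse:

dt <- 0.01                                  # IMU sample period in seconds (hypothetical)
acc <- rnorm(100, mean = 0.5, sd = 0.05)    # accelerometer samples between two keyframes (m/s^2)
delta_v <- sum(acc * dt)                    # preintegrated velocity change
delta_p <- sum(cumsum(acc * dt) * dt)       # preintegrated position change
c(delta_v, delta_p)                         # reusable summary; no need to re-integrate raw samples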
Curvature constraint regularization is a technique used in machine learning and optimization to control model complexity and prevent overfitting. It does so by adding a regularization term tied to the model's curvature.
Traditional regularizers such as L1 and L2 control model complexity by bounding the absolute values or the squared sum of the model parameters. These methods target the magnitude of the parameters and do not directly consider the model's curvature.
The core idea of curvature constraint regularization is to control complexity by limiting the model's curvature directly. Curvature can be viewed as the local rate of change of a function at a given point; it reflects how sharply the model bends in the neighborhood of that point. Bounding it yields a smoother model and lowers the risk of overfitting.
In practice, curvature constraint regularization is implemented by adding a curvature-dependent penalty to the loss function. For a neural network, for example, one can compute the model's Hessian matrix (the matrix of second derivatives) and use its norm or its eigenvalues as the regularization term. Minimizing that term constrains the model's curvature and yields a smoother fit.
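As a minimal illustration, here is an R sketch under simplifying assumptions: a one-dimensional regression problem, a polynomial basis, and curvature measured by second differences of the fitted curve (the simpler analogue of the Hessian-norm penalty described above):

set.seed(1)
x <- seq(0, 1, length.out = 50)
y <- sin(2 * pi * x) + rnorm(50, sd = 0.3)
B <- outer(x, 0:8, `^`)                              # polynomial basis, columns x^0 ... x^8
D2 <- diff(diag(length(x)), differences = 2)         # second-difference (curvature) operator
fit <- function(lambda) {
  # penalized least squares: ||y - B w||^2 + lambda * ||D2 B w||^2
  solve(t(B) %*% B + lambda * t(D2 %*% B) %*% (D2 %*% B), t(B) %*% y)
}
w_rough  <- fit(0)     # no curvature penalty: wiggly fit, prone to overfitting
w_smooth <- fit(50)    # curvature-constrained: visibly smoother fit
plot(x, y)
lines(x, B %*% w_rough, lty = 2)
lines(x, B %*% w_smooth)

Increasing lambda trades training fit for smoothness, which is exactly the complexity control the penalty is meant to provide.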
Note that curvature constraint regularization faces some practical challenges. Computing the model's curvature can require extra computational resources and time, especially for complex models. In addition, choosing a suitable curvature-constraint parameter may take some experience and experimental validation.
Overall, curvature constraint regularization is an interesting and useful technique that helps control model complexity and prevent overfitting. In practice, however, one must weigh its computational cost against the performance gain and choose a regularization method suited to the specific problem and dataset.
Electrical capacitance tomography image reconstruction based on a non-convex and non-separable regularization algorithm
LI Ning; ZHU Pengfei; ZHANG Lifeng; LU Dongchen
Journal: CIESC Journal (化工学报)
Year (Volume), Issue: 2024, 75(3)
Abstract: Two-phase mixing in stirred tanks is a common phenomenon in chemical production, and electrical capacitance tomography (ECT) is mainly used to reconstruct and visualize the two-phase distribution for monitoring purposes. Inspired by sparse Bayesian learning, a non-convex and non-separable regularization (NNR) algorithm is proposed for reconstructing ECT images. Matrix low-rankness is introduced on top of the sparsity prior; maximum a posteriori estimation is used to pose a new optimization problem in a latent space, and dual variables map the latent-space objective function back to the original space for iterative solution, recovering a matrix that is simultaneously sparse and low-rank. Compared with the convex L1-norm approximation, the NNR algorithm obtains more accurate reconstructed images, and it converges to a global optimum more readily than non-convex separable methods. To verify the reconstruction performance of NNR, it was compared with five other algorithms in numerical simulations and static experiments. The results show that the NNR algorithm effectively reduces reconstruction artifacts and improves the reconstruction quality of central objects, providing a high-quality reconstruction algorithm for two-phase distributions in stirred tanks.
Pages: 11 (836-846)
Authors: LI Ning; ZHU Pengfei; ZHANG Lifeng; LU Dongchen
Affiliations: Department of Chemistry and Chemical Engineering, Chongqing Technology and Business University; Department of Automation, North China Electric Power University
Language: Chinese
CLC number: TB937
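One ingredient of the abstract, recovering a matrix that is simultaneously sparse and low-rank from linear measurements, can be illustrated with a toy proximal-gradient sketch in R. Everything below (problem size, sensitivity matrix, thresholds) is hypothetical, and this is not the paper's NNR algorithm, which couples the two structures through a single non-convex, non-separable penalty in a latent space:

soft <- function(X, tau) sign(X) * pmax(abs(X) - tau, 0)                 # sparsity prox
svt  <- function(X, tau) { s <- svd(X); s$u %*% diag(pmax(s$d - tau, 0)) %*% t(s$v) }  # low-rank prox
set.seed(1)
n <- 16
S <- matrix(rnorm(60 * n^2), 60, n^2)                     # hypothetical sensitivity matrix
G_true <- tcrossprod(c(rep(0, 6), rep(1, 4), rep(0, 6)))  # sparse, rank-one "object"
c_meas <- S %*% as.vector(G_true)                         # simulated capacitance measurements
G <- matrix(0, n, n)
step <- 1 / max(svd(S)$d)^2                               # gradient step from the largest singular value
for (it in 1:300) {
  grad <- matrix(t(S) %*% (S %*% as.vector(G) - c_meas), n, n)
  G <- svt(soft(G - step * grad, 1e-3), 1e-2)             # sparsity step, then low-rank step
}
max(abs(G - G_true))                                      # reconstruction error of the toy recovery

Applying the two proximal steps in sequence is a common heuristic for combining priors; the paper instead handles both through one non-separable regularizer.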
Package 'xwf' (October 14, 2022)

Version 0.2-3
Date 2020-02-19
Title Extrema-Weighted Feature Extraction
Author Willem van den Boom [aut, cre]
Maintainer Willem van den Boom <*****************>
Description Extrema-weighted feature extraction for varying length functional data. Functional data analysis method that performs dimensionality reduction based on predefined features and allows for quantile weighting. Method implemented as presented in van den Boom et al. (2018) <doi:10.1093/bioinformatics/bty120>.
License MIT + file LICENSE
Imports mgcv
RoxygenNote 6.0.1
NeedsCompilation no
Repository CRAN
Date/Publication 2020-02-20 07:00:02 UTC

R topics documented: default_psi, xwf, xwfGAM, xwfGridsearch, XWFpValues

default_psi: Default psi list

Description: List with the same local feature functions psi as in the original paper.
Usage: default_psi()
Value: List with 4 different local features psi.
Examples: default_psi()

xwf: Compute XWFs

Description: Compute extrema-weighted features based on functions, predefined local features, and weighting functions.

Usage:
xwf(xx, t, n.i, psi,
    w = function(t, i) ifelse(left,
      min(1, (1 - F(xx[[i]](t))) / (1 - b)),
      min(1, F(xx[[i]](t)) / b)),
    b = 0.5, F = NULL, t.min = NULL, t.max = NULL, t.range = NULL,
    rel.shift = 0.001, left = TRUE)

Arguments:
xx: List of functions for which to compute the XWFs.
t: Matrix containing the times at which the functions xx were measured. Element (i, j) contains the time of the j-th measurement of the i-th function.
n.i: Vector containing the number of measurements for each function. The first n.i[i] elements of the i-th row of t should not be NA.
psi: Predefined local feature which is a function of a function (first argument) and a measurement time (second argument).
w: Weighting function. The default is the one used in the original paper.
b: Parameter of the weighting function. See the original paper for details. Ignored if the weighting function w is not the default.
F: CDF of the values of the functions xx. Ignored if the weighting function w is not the default.
t.min: Vector with the time of the first measurement for each function. Computed from t if omitted, but providing it saves computational cost.
t.max: Analogous to t.min, but for the time of the last measurement.
t.range: Vector with the differences between t.max and t.min. Can be supplied to avoid recomputation.
rel.shift: Optional relative reduction of the integration range to avoid instabilities at the ends of the integration ranges. Set to 0 if no such correction is desired.
left: Boolean specifying whether the left (TRUE) or right (FALSE) extrema-weighted features should be computed; left and right refer to the weighting function. Ignored if the weighting function w is not the default.

Value: Vector containing the extrema-weighted features obtained by numerical integration for each of the functions.

Examples:
xwf(xx = list(function(t) t), t = (1:10)/10, n.i = 10,
    psi = function(x, t) x(t), b = .2, F = function(x) x)

xwfGAM: Evaluate the GAM

Description: Evaluate the generalized additive model for a set of computed extrema-weighted features.
Usage: xwfGAM(wL, wR, y, z = NULL)
Arguments:
wL: Matrix with left extrema-weighted features.
wR: Matrix with right extrema-weighted features.
y: Binary vector with outcomes.
z: Optional matrix z with extra, linear predictors.
Examples:
xwf:::xwfGAM(wL = rep(1:45, 10), wR = rep(1:90, 5), y = c(rep(0:1, 225)))

xwfGridsearch: Adaptive grid search

Description: Adaptive grid search to optimize the weighting functions in the extrema-weighted features.

Usage:
xwfGridsearch(y, xx, t, n.i, psi.list = default_psi(), F = NULL, z = NULL,
    iter = 3,
    w = function(t, i, b, left) ifelse(left,
      min(1, (1 - F(xx[[i]](t))) / (1 - b)),
      min(1, F(xx[[i]](t)) / b)),
    rel.shift = 0.001, progressbar = TRUE)

Arguments:
y: Vector with binary outcomes data.
xx, t, n.i, F, rel.shift: As for xwf above.
psi.list: List of predefined local features which are functions of a function (first argument) and a measurement time (second argument).
z: Optional matrix with covariates to be included as linear predictors in the generalized additive model.
iter: Number of levels in the adaptive grid search. The resolution in b obtained is 2^(-1-iter).
w: Weighting function. The default is the one used in the original paper. See the default for the roles of its 3 arguments.
progressbar: Boolean specifying whether a progress bar indicating what level of the adaptive grid has been completed should be displayed.

Value: List containing the final XWFs (wL and wR), the parameters for the optimal weighting functions (b.left and b.right), and the mgcv::gamObject corresponding to the final optimal generalized additive model fit.

Examples:
# Data simulation similar to Section 3.2 of the paper
n <- 100                       # Sample size
n.i <- rep(5, n)               # Length of trajectories
max.n.i <- max(n.i)
# Times
t <- matrix(NA_integer_, nrow = n, ncol = max.n.i)
for (i in 1:n) t[i, 1:n.i[i]] <- 1:n.i[i]
phi <- runif(n = n, min = 1, max = 10)   # Sample periods
m <- 10 * runif(n = n)                   # Sample offsets
# Blood pressure measurements
x <- t
for (i in 1:n) x[i, 1:n.i[i]] <- sin(phi[i] * 2 * pi / max.n.i * t[i, 1:n.i[i]]) + m[i]
# Matrix with covariates z
q <- 2                                   # Number of covariates
z <- matrix(rnorm(n = n * q), nrow = n, ncol = q)
# Generate outcomes
temp <- phi * min(m, 7)
temp <- 40 * temp
prob <- 1 / (1 + exp(2 * (median(temp) - temp)))
y <- rbinom(n = n, size = 1, prob = prob)
xx <- list()
for (i in 1:n) xx[[i]] <- approxfun(x = t[i, 1:n.i[i]], y = x[i, 1:n.i[i]], rule = 2)
# Estimate f
weights <- matrix(1/n.i, ncol = max.n.i, nrow = n)[!is.na(t)]
f <- density(
  x = t(sapply(X = 1:n, FUN = function(i) c(xx[[i]](t[i, 1:n.i[i]]), rep(NA, max.n.i - n.i[i])))),
  weights = weights / sum(weights), na.rm = TRUE
)
# Define CDF of f, F
CDF <- c(0)
for (i in 2:length(f$x)) CDF[i] <- CDF[i - 1] + (f$x[i] - f$x[i - 1]) * (f$y[i] + f$y[i - 1]) / 2
F <- approxfun(x = f$x, y = CDF / max(CDF), yleft = 0, yright = 1)
psi <- list(function(x, t) abs(x(t) - x(t - 1)))
XWFresult <- xwfGridsearch(y = y, xx = xx, t = t, n.i = n.i, psi.list = psi, F = F, z = z)
summary(XWFresult$GAMobject)
XWFresult$b.left
XWFresult$b.right

XWFpValues: p-value computation for XWFs

Description: Randomization method to compute p-values for an optimized extrema-weighted features generalized additive model fit.

Usage:
XWFpValues(GAMobject, xx, t, n.i, psi.list = NULL, F, z = NULL,
    w = function(t, i, b, left) ifelse(left,
      min(1, (1 - F(xx[[i]](t))) / (1 - b)),
      min(1, F(xx[[i]](t)) / b)),
    n.boot = 100, progressbar = TRUE)

Arguments:
GAMobject: The GAMobject returned by xwfGridsearch.
xx, t, n.i, psi.list, F, z, w: As for xwfGridsearch above.
n.boot: Number of randomizations used to obtain the p-values. The resolution of the p-values is 1/n.boot.
progressbar: Boolean specifying whether a progress bar indicating which randomizations have been completed should be displayed.

Value: Named vector with p-values.

Examples:
# Data simulation as in the xwfGridsearch example above, then:
XWFresult <- xwfGridsearch(y = y, xx = xx, t = t, n.i = n.i, psi.list = psi, F = F, z = z)
XWFpValues(GAMobject = XWFresult$GAMobject, xx = xx, t = t, n.i = n.i,
           psi.list = psi, F = F, z = z, n.boot = 3)
Obtaining information from imagery: Measuring the Rondônia rain forest

Satellite or aerial imagery can be described as a direct measurement or primary data source. Products derived from the primary source are considered secondary sources. Secondary sources can be derived by digitizing and obtaining vector information from imagery. With imagery serving as the basemap, you can use a mouse to perform manual digitizing, the process of converting geographic features on an analog map into digital format. As in any human procedure, errors can be introduced.

Deforestation is defined as the removal of a forest or stand of trees where the land is thereafter converted to a nonforest use. Clear-cutting is a type of deforestation logging practice in which most or all trees in an area are uniformly cut down. The practice of clear-cutting can be clearly seen from imagery in the state of Rondônia in Brazil. The deforestation in Rondônia appears in a pattern that looks like a fish skeleton because the cutting fans out along the roads. You have been asked to provide a quick approximation of how much of Rondônia's land has been affected by clear-cutting. You have access to both high-resolution imagery and a Landsat 8 image that shows dry and wet areas.

Build skills in these areas:
- Opening an existing online map
- Interpreting clear-cut forest regions
- Digitizing on screen
- Calculating percentage
- Interpreting Landsat 8 (Moisture Index) imagery
- Visualizing derived moisture content imagery

What you need:
- Account not required
- Estimated time: 30 minutes to 1 hour
- Publication date: March 14, 2019

Investigating Rondônia

1. Click Imagery with Metadata.
2. Click Open in Map Viewer.
3. Click Content in the left panel.
4. Zoom all the way out to the world view of the map.
5. Search for Rondonia, Brazil.

Take a few minutes to zoom in and out and observe the fish-skeleton-like pattern of clear-cutting in Rondônia. As you zoom inward, you can see that the land is now agricultural.

6. Click Basemap on the left and change to Imagery with Labels.

Q1 What are the major cities in Rondônia?
Q2 Zoom into the city areas. Describe the urban areas.

To calculate the percentage of Rondônia that has been deforested by clear-cutting, you need to know two numbers: the total number of square miles of Rondônia and the number of square miles already clear-cut. Remember, at the beginning of the exercise, it was determined that this secondary information can be obtained from the primary image source.

7. Zoom out so the entire state of Rondônia is visible.
8. On the upper right, click Measure and then the Area icon. The unit is Sq Miles.
9. After clicking the Area icon, digitize around the state by selecting points and creating a polygon. When the polygon is complete, record the area.
Total area of Rondônia ___________ sq miles.
10. Zoom in and digitize around the areas of Rondônia that have been deforested by clear-cutting. You need to digitize several polygons and add the areas together for a total.
Total area of deforestation ___________ sq miles.
Total area of deforestation / total area of Rondônia x 100 = ___________ % of deforested land.

In the previous section, high-resolution aerial imagery was used to observe Rondônia. In the next section, a Landsat 8 Moisture Index image will be used. The moisture index has been derived from the most recent Landsat 8 imagery and is an estimated level of moisture in vegetation. Wetlands and other vegetated areas with high levels of moisture appear as blue, whereas deserts appear as tan to brown.
The Normalized Difference Moisture Index (NDMI) is calculated from the near infrared (band 5) and shortwave infrared (band 6) bands. The map is updated daily, retaining the four most recent scenes for each path/row that has cloud coverage of less than 50 percent.

11. Open The Moisture Index: How wet or dry?
12. On the right panel at the bottom, click Data Source. This opens the Landsat 8 (Moisture Index).
13. Click Landsat 8 (Moisture Index) and then Open in Map Viewer.
14. In the upper-right search box, search for Rondonia, Brazil.
15. Below Details on the left, click Contents. This shows the two layers on the map. You can turn off the Landsat 8 (Moisture Index) to observe the outline of Rondônia.

Q3 Write a description of the deforested land using the Moisture Index legend.

16. Following the procedure explained in steps 8-10, digitize around the deforested region and perform the calculation for percentage of clear-cutting again. You may use the total area of Rondônia that you obtained previously.
Total area of Rondônia: ___________ sq miles.
Total area of deforestation: ___________ sq miles.
Total area of deforestation / total area of Rondônia x 100 = ___________ % of deforested land.

In this lesson, you have identified deforestation areas using both aerial and derived remote sensing imagery. You have used this imagery to obtain measurements to quantify the area of deforestation.
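For reference, the NDMI described above is the standard normalized difference of the two bands. A one-line R version (reflectance inputs assumed), together with the percentage formula used in steps 10 and 16:

ndmi <- function(nir, swir) (nir - swir) / (nir + swir)   # Landsat 8: band 5 = NIR, band 6 = SWIR1
ndmi(0.45, 0.20)   # moist vegetation: positive value, renders toward blue
ndmi(0.15, 0.30)   # dry ground: negative value, renders tan to brown
pct_deforested <- function(area_cut, area_total) 100 * area_cut / area_total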
PRODUCT FEATURES

Tolerance Analysis

This release introduces a powerful new feature that can help you determine when your illumination design is ready for production: tolerance analysis. With tools for analyzing performance changes based on errors expected to be introduced in the manufacturing process, the LightTools tolerance analysis feature allows you to evaluate performance, adjust the required precision to achieve acceptable results, and predict manufacturability and production yields. Making adjustments before manufacturing allows you to control production costs and meet performance requirements for your illumination system.

Tolerance analysis is part of the LightTools Optimization Module and supports:
• Sensitivity analysis to evaluate how sensitive each performance measure is to changes in tolerances that affect the system
• Interactive tolerancing to fine-tune tolerance limits and instantly see the impact of changes
• Monte Carlo tolerance analysis to predict system performance

NURBS and Interpolated Curves Added to the 2D Objects Tools Palette

LightTools can now create native NURBS (non-uniform rational basis spline) and interpolated curves. The new curves can be used for annotations like polylines and for swept light guides in place of imported geometry created in 3D CAD programs. The native curves are parametrized and available for optimization and tolerancing.

Ray Data Source Support for Backward Simulations

To address the frequent need to use measured ray data files in illumination design, LightTools now supports ray data sources for backward simulations. This improvement allows designers to perform more efficient luminance calculations made possible with these types of simulations.

Freeform Design Enhancements

LightTools freeform design has been enhanced to support disk, rectangle, sphere and ray file sources for evaluation.

Light Guide Designer Enhancement

The Light Guide Designer now includes an option to enable path angle optimization during spatial optimization.

For more information or to start your free 30-day evaluation, please contact Synopsys' Optical Solutions Group at (626) 795-9101, visit /optical-solutions/lighttools, or send an e-mail to *******************.
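To make the Monte Carlo tolerancing idea listed above concrete, here is a generic R sketch. The merit function, tolerance value and yield specification are invented for illustration and say nothing about LightTools' internals:

set.seed(1)
efficiency <- function(tilt) 0.92 - 0.05 * tilt^2   # hypothetical merit function of a tilt error (deg)
tilt_err <- rnorm(10000, mean = 0, sd = 0.5 / 3)    # +/-0.5 deg tolerance treated as 3 sigma
mean(efficiency(tilt_err) >= 0.915)                 # predicted yield against a 0.915 efficiency spec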
Several inequalities for the L_p-polar curvature image
LI Xinhong; MA Tongyi
Journal: Chinese Quarterly Journal of Mathematics (数学季刊:英文版)
Year (Volume), Issue: 2016, 31(4)
Abstract: Zhu, Lü and Leng extended the concept of the L_p-polar curvature image. We continue the study of the L_p-polar curvature image, and in this article we mainly expound the relations between the volumes of star bodies and their L_p-polar curvature images. We first establish the L_p-affine isoperimetric inequality associated with the L_p-polar curvature image. Secondly, we give a monotonicity property for the L_p-polar curvature image. Finally, we obtain an interesting equation relating the L_p-projection body of the L_p-polar curvature image and the L_p-centroid body.
Pages: 10 (349-358)
Keywords: star bodies; convex bodies; L_p-curvature image; L_p-polar curvature image
Authors: LI Xinhong; MA Tongyi
Affiliations: College of Mathematics and Statistics, Northwest Normal University; College of Mathematics and Statistics, Hexi University
Language: Chinese
CLC number: O184
A Mathematical Model for Inter-Cellular Inductive Signaling
Jeyaraman Srividhya
Journal: Computational Molecular Bioscience (计算分子生物学(英文))
Year (Volume), Issue: 2012, 2(3)
Abstract: In the vertebrate limb, a group of specialized epithelial cells called the Apical Ectodermal Ridge (AER) forms at the boundary of the dorsal and ventral limb ectoderm. Recent experiments suggest that the AER forms at the boundary of Fringe-expressing and Fringe-non-expressing cells by a specific type of receptor-ligand interaction called inductive signaling, involving the transmembrane proteins Notch, Serrate and Delta. Experiments conducted on the Drosophila wing disc have shown that Fringe inhibits the binding ability of the Serrate ligand to Notch and enhances that of Delta to Notch. Although several of the signaling elements have been identified experimentally, it remains unclear how the inter-cellular interactions can give rise to such a boundary of specialized cells. Here we present an ordinary differential equation (ODE) model involving Delta→Notch and Serrate→Notch interactions between juxtaposed Fringe-expressing and Fringe-non-expressing cells. When simulated in a compartmentalized setup, this model gives rise to high Notch levels at the boundary of Fringe-expressing and Fringe-non-expressing cells.
Pages: 6 (102-107)
Keywords: Delta-Notch signaling; Apical Ectodermal Ridge; CompuCell3D; boundary formation; inductive signaling
Author: Jeyaraman Srividhya
Affiliation: The Biocomplexity Institute, Indiana University, Bloomington, IN
Language: Chinese
CLC number: R73
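A hedged two-cell sketch of this kind of model in R, using the deSolve package: the equations, Hill exponents and parameter values are illustrative stand-ins, not the paper's system. Fringe is modeled simply as rescaling the two ligand-binding strengths on the receiving cell:

library(deSolve)
act  <- function(x, k = 2) x^k / (1 + x^k)   # activating Hill function
rep_ <- function(x, k = 2) 1 / (1 + x^k)     # repressing Hill function
rhs <- function(t, y, p) {
  with(as.list(c(y, p)), {
    # Cell 1 expresses Fringe: Delta binding enhanced (aD > 1), Serrate inhibited (bS < 1)
    dN1 <- act(aD * D2 + bS * S2) - N1
    dD1 <- rep_(N1) - D1
    dS1 <- rep_(N1) - S1
    # Cell 2 does not express Fringe: unmodified binding weights
    dN2 <- act(D1 + S1) - N2
    dD2 <- rep_(N2) - D2
    dS2 <- rep_(N2) - S2
    list(c(dN1, dD1, dS1, dN2, dD2, dS2))
  })
}
y0 <- c(N1 = 0.1, D1 = 0.5, S1 = 0.5, N2 = 0.1, D2 = 0.5, S2 = 0.5)
out <- ode(y = y0, times = seq(0, 50, 0.5), func = rhs, parms = c(aD = 3, bS = 0.2))
tail(out, 1)   # steady state: Notch settles at distinct levels across the Fringe boundary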
Package 'mvna' (October 13, 2022)

Title Nelson-Aalen Estimator of the Cumulative Hazard in Multistate Models
Version 2.0.1
Author Arthur Allignol
Description Computes the Nelson-Aalen estimator of the cumulative transition hazard for arbitrary Markov multistate models <ISBN:978-0-387-68560-1>.
Maintainer Arthur Allignol <*************************>
License MIT + file LICENSE
Imports lattice
NeedsCompilation yes
Repository CRAN
Date/Publication 2017-09-11 23:21:40 UTC

R topics documented: abortion, lines.mvna, mvna, plot.mvna, predict.mvna, print.mvna, sir.adm, sir.cont, summary.mvna, xyplot.mvna

abortion: Pregnancies exposed to coumarin derivatives

Description: Outcomes of pregnancies exposed to coumarin derivatives. The aim is to investigate whether exposure to coumarin derivatives increases the probability of spontaneous abortion. Apart from spontaneous abortion, pregnancy may end in induced abortion or live birth. Moreover, data are left-truncated as women usually enter the study several weeks after conception.
Usage: data(abortion)
Format: A data frame with 1186 observations on the following 5 variables.
id: Identification number
entry: Entry times into the cohort
exit: Event times
group: Group. 0: control, 1: exposed to coumarin derivatives
cause: Cause of failure. 1: induced abortion, 2: live birth, 3: spontaneous abortion
Source: Meister, R. and Schaefer, C. (2008). Statistical methods for estimating the probability of spontaneous abortion in observational studies: Analyzing pregnancies exposed to coumarin derivatives. Reproductive Toxicology, 26, 31-35.
Examples: data(abortion)

lines.mvna: Lines method for 'mvna' objects

Description: Lines method for mvna objects.
Usage:
## S3 method for class 'mvna'
lines(x, tr.choice, col = 1, lty, conf.int = FALSE, level = 0.95,
      var.type = c("aalen", "greenwood"),
      ci.fun = c("log", "linear", "arcsin"),
      ci.col = col, ci.lty = 3, ...)
Arguments:
x: An object of class mvna.
tr.choice: A character vector of the form c("from to", "from to") specifying which transitions should be displayed. By default, all the transition hazards are plotted.
col: A vector of colours. Default is black.
lty: A vector of line types. Default is 1:number of transitions.
conf.int: Logical. Indicates whether to display pointwise confidence intervals. Default is FALSE.
level: Level of the confidence interval. Default is 0.95.
var.type: Specifies the variance estimator that should be used to compute the confidence interval. One of "aalen" or "greenwood". Default is "aalen".
ci.fun: Specifies the transformation applied to the confidence interval. Choices are "linear", "log", "arcsin". Default is "log".
ci.col: Colours of the confidence interval lines. By default, ci.col equals col.
ci.lty: Line types for the confidence intervals. Default is 3.
...: Further arguments for lines.
Value: No value returned.
Author(s): Arthur Allignol <*************************>
See Also: mvna, plot.mvna
Examples:
data(sir.adm)
## data set transformation
id <- sir.adm$id
from <- sir.adm$pneu
to <- ifelse(sir.adm$status == 0, "cens", sir.adm$status + 1)
times <- sir.adm$time
dat.sir <- data.frame(id, from, to, time = times)
## Possible transitions
tra <- matrix(ncol = 4, nrow = 4, FALSE)
tra[1:2, 3:4] <- TRUE
na.pneu <- mvna(dat.sir, c("0", "1", "2", "3"), tra, "cens")
plot(na.pneu, tr.choice = c("0 2"), conf.int = TRUE, col = 1, lty = 1, legend = FALSE)
lines(na.pneu, tr.choice = c("1 2"), conf.int = TRUE, col = 2, lty = 1)

mvna: Nelson-Aalen estimator in multistate models

Description: This function computes the multivariate Nelson-Aalen estimator of the cumulative transition hazards in multistate models, that is, for each possible transition, it computes an estimate of the cumulative hazard.
Usage: mvna(data, s, tra, cens.name)
Arguments:
data: A data.frame of the form data.frame(id, from, to, time) or (id, from, to, entry, exit), where id is the patient id, from the state from where the transition occurs, to the state to which a transition occurs, time the time when a transition occurs, entry the entry time in a state, and exit the exit time from a state. This data.frame is transition-oriented, i.e. it contains one row per transition, and possibly several rows per patient. Specifying an entry and exit time permits taking left-truncation into account.
s: A vector of characters giving the state names.
tra: A quadratic matrix of logical values describing the possible transitions within the multistate model.
cens.name: A character giving the code for censored observations in the column to of data. If there are no censored observations in your data, put NULL.
Details: This function computes the Nelson-Aalen estimator as described in Andersen et al. (1993), along with the two variance estimators described in eq. (4.1.6) and (4.1.7) of Andersen et al. (1993) at each transition time.
Value: Returns a list named after the possible transitions; e.g., for a multistate model with two possible transitions, from state 0 to state 1 and from state 0 to state 2, the returned list has two parts named "0 1" and "0 2". Each part contains a data.frame with columns:
na: Nelson-Aalen estimates at each transition time.
var.aalen: Variance estimator given in eq. (4.1.6) of Andersen et al. (1993).
var.greenwood: Variance estimator given in eq. (4.1.7) of Andersen et al. (1993).
time: The transition times.
The list also contains:
time: All the event times.
n.risk: A matrix giving the number of individuals at risk in the transient states just before an event.
n.event: An array which gives the number of transitions at each event time.
n.cens: A matrix giving the number of censored observations at each event time.
s: The same as in the function call.
cens.name: The same as in the function call.
trans: A data frame, with columns from and to, that gives the possible transitions.
Note: The variance estimator (4.1.6) may overestimate the true variance, and the one defined in eq. (4.1.7) may underestimate the true variance (see Klein (1991) and Andersen et al. (example IV.1.1, 1993)), especially with small samples. Klein (1991) recommends the variance estimator of eq. (4.1.6, "aalen") because he found it to be less biased.
Author(s): Arthur Allignol <*************************>
References:
Andersen, P. K., Borgan, O., Gill, R. D. and Keiding, N. (1993). Statistical models based on counting processes. Springer Series in Statistics. New York, NY: Springer.
Beyersmann, J., Allignol, A., Schumacher, M. Competing Risks and Multistate Models with R (Use R!), Springer Verlag, 2012.
Klein, J. P. Small sample moments of some estimators of the variance of the Kaplan-Meier and Nelson-Aalen estimators. Scandinavian Journal of Statistics, 18: 333-340, 1991.
See Also: sir.adm, sir.cont
Examples:
data(sir.cont)
# Modification for patients entering and leaving a state at the same date
sir.cont <- sir.cont[order(sir.cont$id, sir.cont$time), ]
for (i in 2:nrow(sir.cont)) {
  if (sir.cont$id[i] == sir.cont$id[i - 1]) {
    if (sir.cont$time[i] == sir.cont$time[i - 1]) {
      sir.cont$time[i - 1] <- sir.cont$time[i - 1] - 0.5
    }
  }
}
# Matrix of logicals giving the possible transitions
tra <- matrix(ncol = 3, nrow = 3, FALSE)
tra[1, 2:3] <- TRUE
tra[2, c(1, 3)] <- TRUE
# Computation of the Nelson-Aalen estimates
na <- mvna(sir.cont, c("0", "1", "2"), tra, "cens")
# plot
if (require(lattice)) xyplot(na)
### example with left-truncation
data(abortion)
# Data set modification in order to be used by mvna
names(abortion) <- c("id", "entry", "exit", "from", "to")
abortion$to <- abortion$to + 1
## computation of the matrix giving the possible transitions
tra <- matrix(FALSE, nrow = 5, ncol = 5)
tra[1:2, 3:5] <- TRUE
na.abortion <- mvna(abortion, as.character(0:4), tra, NULL)
plot(na.abortion, tr.choice = c("0 4", "1 4"),
     curvlab = c("Control", "Exposed"), bty = "n", legend.pos = "topleft")

plot.mvna: Plot method for a mvna object

Description: Plot method for an object of class mvna. This function plots estimates of the cumulative transition hazards in one panel, which permits drawing several cumulative transition hazards on the same panel.
Usage:
## S3 method for class 'mvna'
plot(x, tr.choice, xlab = "Time", ylab = "Cumulative Hazard", col = 1, lty,
     xlim, ylim, conf.int = FALSE, level = 0.95,
     var.type = c("aalen", "greenwood"),
     ci.fun = c("log", "linear", "arcsin"),
     ci.col = col, ci.lty = 3, legend = TRUE, legend.pos, curvlab,
     legend.bty = "n", ...)
Arguments:
x: An object of class mvna.
tr.choice: A character vector of the form c("from to", "from to") specifying which transitions should be plotted. By default, all the cumulative transition hazards are plotted.
xlab: x-axis label. Default is "Time".
ylab: y-axis label. Default is "Cumulative Hazard".
col, lty: As in lines.mvna.
xlim, ylim: Limits of the x-axis and y-axis for the plot.
conf.int, level, var.type, ci.fun, ci.col, ci.lty: As in lines.mvna.
legend: A logical specifying if a legend should be added.
legend.pos: A vector giving the legend's position. See legend for further details.
curvlab: A character or expression vector to appear in the legend. Default is the name of the transitions.
legend.bty: Box type for the legend.
...: Further arguments for the plot method.
Value: No value returned.
Author(s): Arthur Allignol <*************************>
See Also: mvna
Examples:
data(sir.cont)
# Modification for patients entering and leaving a state at the same date
sir.cont <- sir.cont[order(sir.cont$id, sir.cont$time), ]
for (i in 2:nrow(sir.cont)) {
  if (sir.cont$id[i] == sir.cont$id[i - 1]) {
    if (sir.cont$time[i] == sir.cont$time[i - 1]) {
      sir.cont$time[i - 1] <- sir.cont$time[i - 1] - 0.5
    }
  }
}
tra <- matrix(ncol = 3, nrow = 3, FALSE)
tra[1, 2:3] <- TRUE
tra[2, c(1, 3)] <- TRUE
na.cont <- mvna(sir.cont, c("0", "1", "2"), tra, "cens")
plot(na.cont, tr.choice = c("0 2", "1 2"))

predict.mvna: Calculates Nelson-Aalen estimates at specified time-points

Description: This function gives the Nelson-Aalen estimates at time-points specified by the user.
Usage:
## S3 method for class 'mvna'
predict(object, times, tr.choice, level = 0.95,
        var.type = c("aalen", "greenwood"),
        ci.fun = c("log", "linear", "arcsin"), ...)
Arguments:
object: An object of class mvna.
times: Time-points at which one wants the estimates.
tr.choice: A vector of characters giving the transitions for which one wants estimates. By default, the function gives the Nelson-Aalen estimates for all transitions.
level, var.type, ci.fun: As in lines.mvna.
...: Other arguments to predict.
Value: Returns a list named after the possible transitions (as for mvna above). Each part contains a data.frame with columns:
times: Time-points specified by the user.
na: Nelson-Aalen estimates at the specified times.
var.aalen or var.greenwood: Depending on what was specified in var.type.
lower: Lower bound of the pointwise confidence intervals.
upper: Upper bound.
Author(s): Arthur Allignol <*************************>
References: Andersen, P. K., Borgan, O., Gill, R. D. and Keiding, N. (1993). Statistical models based on counting processes. Springer Series in Statistics. New York, NY: Springer.
See Also: mvna, summary.mvna
Examples:
# sir.cont data preparation as in the mvna example above, then:
na <- mvna(sir.cont, c("0", "1", "2"), tra, "cens")
predict(na, times = c(1, 5, 10, 15))

print.mvna: Print method for 'mvna' object

Description: Print method for an object of class mvna. It prints estimates of the cumulative hazard along with estimates of the variance described in eq. (4.1.6) and (4.1.7) of Andersen et al. (1993) at several time points obtained with the quantile function.
Usage:
## S3 method for class 'mvna'
print(x, ...)
Arguments:
x: An object of class mvna.
...: Other arguments for the print method.
Value: No value returned.
Author(s): Arthur Allignol <*******************************>
See Also: mvna

sir.adm: Pneumonia on admission in intensive care unit patients

Description: Pneumonia status on admission for intensive care unit (ICU) patients, a random sample from the SIR-3 study.
Usage: data(sir.adm)
Format: The data contains 747 rows and the following variables:
id: Randomly generated patient id
pneu: Pneumonia indicator. 0: no pneumonia, 1: pneumonia
status: Status indicator. 0: censored observation, 1: discharged, 2: dead
time: Follow-up time in days
age: Age at inclusion
sex: Sex. F for female and M for male
Source: Beyersmann, J., Gastmeier, P., Grundmann, H., Baerwolff, S., Geffers, C., Behnke, M., Rueden, H., and Schumacher, M. Use of multistate models to assess prolongation of intensive care unit stay due to nosocomial infection. Infection Control and Hospital Epidemiology, 27: 493-499, 2006.
Examples:
# data set transformation
data(sir.adm)
id <- sir.adm$id
from <- sir.adm$pneu
to <- ifelse(sir.adm$status == 0, "cens", sir.adm$status + 1)
times <- sir.adm$time
dat.sir <- data.frame(id, from, to, time = times)
# Possible transitions
tra <- matrix(ncol = 4, nrow = 4, FALSE)
tra[1:2, 3:4] <- TRUE
na.pneu <- mvna(dat.sir, c("0", "1", "2", "3"), tra, "cens")
if (require("lattice")) {
  xyplot(na.pneu, tr.choice = c("0 2", "1 2", "0 3", "1 3"), aspect = 1,
         strip = strip.custom(bg = "white",
           factor.levels = c("No pneumonia on admission -- Discharge",
                             "Pneumonia on admission -- Discharge",
                             "No pneumonia on admission -- Death",
                             "Pneumonia on admission -- Death"),
           par.strip.text = list(cex = 0.9)),
         scales = list(alternating = 1), xlab = "Days",
         ylab = "Nelson-Aalen estimates")
}

sir.cont: Ventilation status in intensive care unit patients

Description: Time-dependent ventilation status for intensive care unit (ICU) patients, a random sample from the SIR-3 study.
Usage: data(sir.cont)
Format: A data frame with 1141 rows and 6 columns:
id: Randomly generated patient id
from: State from which a transition occurs
to: State to which a transition occurs
time: Time when a transition occurs
age: Age at inclusion
sex: Sex. F for female and M for male
The possible states are 0: no ventilation, 1: ventilation, 2: end of stay; "cens" stands for censored observations.
Details: This data frame consists of a random sample of the SIR-3 cohort data. It focuses on the effect of ventilation on the length of stay (combined endpoint discharge/death). Ventilation status is considered as a transient state in an illness-death model. The data frame is directly formatted to be used with the mvna function, i.e., it is transition-oriented with one row per transition.
Source: Beyersmann, J., Gastmeier, P., Grundmann, H., Baerwolff, S., Geffers, C., Behnke, M., Rueden, H., and Schumacher, M. Use of multistate models to assess prolongation of intensive care unit stay due to nosocomial infection. Infection Control and Hospital Epidemiology, 27: 493-499, 2006.
Examples:
data(sir.cont)
# Matrix of possible transitions
tra <- matrix(ncol = 3, nrow = 3, FALSE)
tra[1, 2:3] <- TRUE
tra[2, c(1, 3)] <- TRUE
# Modification for patients entering and leaving a state at the same date
sir.cont <- sir.cont[order(sir.cont$id, sir.cont$time), ]
for (i in 2:nrow(sir.cont)) {
  if (sir.cont$id[i] == sir.cont$id[i - 1]) {
    if (sir.cont$time[i] == sir.cont$time[i - 1]) {
      sir.cont$time[i - 1] <- sir.cont$time[i - 1] - 0.5
    }
  }
}
# Computation of the Nelson-Aalen estimates
na.cont <- mvna(sir.cont, c("0", "1", "2"), tra, "cens")
if (require("lattice")) {
  xyplot(na.cont, tr.choice = c("0 2", "1 2"), aspect = 1,
         strip = strip.custom(bg = "white",
           factor.levels = c("No ventilation -- Discharge/Death",
                             "Ventilation -- Discharge/Death"),
           par.strip.text = list(cex = 0.9)),
         scales = list(alternating = 1), xlab = "Days",
         ylab = "Nelson-Aalen estimates")
}

summary.mvna: Summary method for objects of class 'mvna'

Description: Summary method for mvna objects. The function returns a list containing the cumulative transition hazards, variance and other information.
Usage:
## S3 method for class 'mvna'
summary(object, level = 0.95, var.type = c("aalen", "greenwood"),
        ci.fun = c("log", "linear", "arcsin"), ...)
## S3 method for class 'summary.mvna'
print(x, ...)
Arguments:
object: An object of class mvna.
level: Level of the pointwise confidence interval. Default is 0.95.
var.type: Which of the "aalen" or "greenwood" variance estimators should be displayed and used to compute the pointwise confidence intervals. Default is "aalen".
ci.fun: Which transformation to apply to the confidence intervals. One of "linear", "log", "arcsin". Default is "log".
...: Further arguments.
x: An object of class summary.mvna.
Value: Returns a list of data frames named after the possible transitions. Each data frame contains the following columns:
time: Event times at which the cumulative hazards are estimated.
na: Estimated cumulative transition hazards.
var.aalen or var.greenwood: Variance estimates. The name depends on the var.type argument; default is var.aalen.
lower: Lower bound of the pointwise confidence interval.
upper: Upper bound.
n.risk: Number of individuals at risk of experiencing an event just before t.
n.event: Number of transitions at time t.
Author(s): Arthur Allignol <*************************>
See Also: mvna
Examples:
# data set transformation as in the sir.adm example above, then:
na.pneu <- mvna(dat.sir, c("0", "1", "2", "3"), tra, "cens")
summ.na.pneu <- summary(na.pneu)
## cumulative hazard for the 0 -> 2 transition:
summ.na.pneu$"0 2"$na

xyplot.mvna: Panel plots for objects of class 'mvna'

Description: xyplot function for objects of class mvna. Estimates of the cumulative hazards are plotted as a function of time for all the transitions specified by the user. The function can also plot several types of pointwise confidence intervals (see Andersen et al. (1993), p. 208).
Usage:
## S3 method for class 'mvna'
xyplot(x, data = NULL, xlab = "Time", ylab = "Cumulative Hazard",
       tr.choice = "all", conf.int = TRUE,
       var.type = c("aalen", "greenwood"),
       ci.fun = c("log", "linear", "arcsin"), level = 0.95,
       col = c(1, 1, 1), lty = c(1, 3, 3), ci.type = c(1, 2), ...)
Arguments:
x: An object of class mvna.
data: Unused.
xlab: x-axis label. Default is "Time".
ylab: y-axis label. Default is "Cumulative Hazard".
tr.choice: A character vector of the form c("from to", "from to") specifying which transitions should be plotted. Default is "all".
conf.int: A logical indicating whether to plot pointwise confidence intervals. Default is TRUE.
var.type: One of "aalen" or "greenwood". Specifies which variance estimator is used to compute the confidence intervals.
ci.fun: One of "log", "linear" or "arcsin". Indicates the transformation applied to the pointwise confidence intervals. Default is "log".
level: Level of the confidence interval. Default is 0.95.
col: Vector of colours for the plot. Default is black.
lty: Vector of line types. Default is c(1, 3, 3).
ci.type: DEPRECATED.
...: Other arguments for xyplot.
Value: An object of class trellis.
Note: These plots are highly customizable; see Lattice and xyplot. For example, to change the strip background colour and the title of each strip, add 'strip = strip.custom(bg = "a color", factor.levels = c("a title", "another title"))'. One can use 'aspect = "1"' to make the panels isometric.
Author(s): Arthur Allignol <*******************************>
References:
Andersen, P. K., Borgan, O., Gill, R. D. and Keiding, N. (1993). Statistical models based on counting processes. Springer Series in Statistics. New York, NY: Springer.
Deepayan Sarkar (2006). lattice: Lattice Graphics. R package version 0.13-8.
See Also: xyplot, mvna, sir.adm, sir.cont
Robust face recognition via spectral regression optimized by locally linear embedding
WANG Lirong
Journal: Computer Engineering and Applications (计算机工程与应用)
Year (Volume), Issue: 2014, (19)
Abstract: For the robust face recognition problem with high-dimensional, small-sample data, a spectral regression classification algorithm optimized by locally linear embedding is proposed. First, the feature vectors of the training samples are calculated. Then, locally linear embedding is used to construct the embedding needed for classification, and the embedding of each class's sub-manifold is learned. Finally, the spectral regression classification algorithm is used to compute the projection matrices, and a nearest-neighbor classifier is used to recognize faces. Experimental results on the common face datasets FERET, AR and Extended YaleB show that the proposed algorithm recognizes faces more effectively than several other spectral regression algorithms.
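A minimal sketch of the spectral-regression step in R (toy data; the graph construction and regularization are simplified relative to the paper, and the LLE step is not shown):

set.seed(1)
X <- scale(matrix(rnorm(100 * 20), 100, 20))       # toy features: 100 samples, 20 dimensions
W <- exp(-as.matrix(dist(X))^2 / 2)                # dense affinity graph (a simplification)
L <- diag(rowSums(W)) - W                          # graph Laplacian
y_emb <- eigen(L, symmetric = TRUE)$vectors[, 99]  # smoothest non-constant embedding vector
a <- solve(crossprod(X) + 0.1 * diag(20), crossprod(X, y_emb))  # ridge regression gives the projection
# New faces are projected with X_new %*% a and labeled by a nearest-neighbor classifier.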
Package 'tidydr' (March 8, 2023)

Title Unify Dimensionality Reduction Results
Version 0.0.5
Description Dimensionality reduction (DR) is widely used in many domains for analyzing and visualizing high-dimensional data. 'tidydr' provides uniform output and is compatible with multiple methods, including 'prcomp', 'mds', 'Rtsne', etc.
Imports ggfun, ggplot2, grid, rlang, utils
Suggests knitr, rmarkdown, prettydoc, SingleCellExperiment, SummarizedExperiment
VignetteBuilder knitr
ByteCompile true
License Artistic-2.0
URL https:///YuLab-SMU/tidydr/
BugReports https:///YuLab-SMU/tidydr/issues
Encoding UTF-8
RoxygenNote 7.2.3
NeedsCompilation no
Author Guangchuang Yu [aut, cre, cph] (<https:///0000-0002-6485-8781>), Shuangbin Xu [aut] (<https:///0000-0003-3513-5362>), Erqiang Hu [ctb]
Maintainer Guangchuang Yu <***********************>
Repository CRAN
Date/Publication 2023-03-08 09:20:02 UTC

R topics documented: available_methods, dr, dr_extract, element_line2, theme_dr, theme_noaxis

available_methods: List dimensionality reduction methods currently available

Description: This function shows the available methods that work with the dr() function.
Usage: available_methods(method = "all")
Arguments:
method: one of 'data', 'distance' or 'all' (default)
Value: A character vector of available DR methods.
Author(s): Lang Zhou and Guangchuang Yu
Examples: available_methods()

dr: dimensional reduction

Description: dimensional reduction
Usage: dr(data, fun, ...)
Arguments:
data: input data
fun: function to perform dimensional reduction
...: additional parameters passed to 'fun'
Details: This function calls the user-provided function ('fun') to perform dimensional reduction on the input data ('data').
Value: a DrResult object, which contains 'data' (original data), 'drdata' (coordinates after dimensionality reduction), eigenvalue (standard deviation explained by each dimension) and stress (to evaluate the effect of dimensionality reduction).
Author(s): Guangchuang Yu
Examples:
x = dr(iris[, 1:4], prcomp)
autoplot(x, aes(color = .group), metadata = iris$Species)

dr_extract: dr_extract

Description: dr_extract generic
Usage: dr_extract(result)
Arguments:
result: DrResult object
Value: a list that contains components to construct a 'DrResult' object.
Author(s): Guangchuang Yu

element_line2: element_line2

Description: element_line2 for drawing shortened axis lines.
Usage:
element_line2(colour = NULL, size = NULL, linetype = NULL, lineend = NULL,
              color = NULL, arrow = NULL, inherit.blank = FALSE, id,
              xlength = 0.3, ylength = 0.3, ...)
Arguments:
colour: line colour
size: line size in pts
linetype: line type
lineend: line end style (round, butt, square)
color: alias of colour
arrow: arrow specification, as created by 'grid::arrow()'
inherit.blank: whether to inherit 'element_blank'
id: 1 or 2; 1 for axis.line.x.bottom and 2 for axis.line.y.left, only these two axes are supported
xlength: length of the x axis
ylength: length of the y axis
...: additional parameters
Value: an element_line2 object, which is a tailored element_line object.
Author(s): Guangchuang Yu

theme_dr: theme_dr

Description: Dimensional reduction scatter plot axis theme.
Usage:
theme_dr(xlength = 0.3, ylength = 0.3,
         arrow = grid::arrow(length = unit(0.15, "inches"), type = "closed"))
Arguments:
xlength: length of the x axis
ylength: length of the y axis
arrow: arrow specification, as created by 'grid::arrow()'
Value: a theme object with shortened axes.
Author(s): Guangchuang Yu

theme_noaxis: theme_noaxis

Description: theme that removes axes.
Usage: theme_noaxis(...)
Arguments:
...: additional theme settings
Value: a theme object that disables axes.
Author(s): Guangchuang Yu
Honeycomb lattice Chern number

The honeycomb lattice Chern number is a physical quantity that describes the topological properties of a honeycomb lattice. It is relevant to fields such as the quantum Hall effect and topological insulators.
In solid materials, the behavior of electrons can be described by the band structure. The honeycomb lattice is a common two-dimensional lattice structure; graphene, for example, has a honeycomb lattice. On a honeycomb lattice, the behavior of electrons is described by a Hamiltonian, and the topological character of that Hamiltonian is captured by the Chern number.
The honeycomb lattice Chern number expresses the lattice's topological properties and the signature of the quantum Hall effect. It is obtained by integrating the Berry curvature of an energy band over the Brillouin zone and dividing by 2π. The Chern number is an integer, and a nonzero value signals a topologically nontrivial phase.
The Chern number has important applications in condensed matter physics, for example in research on topological insulators and topological semimetals. It is a key quantity for describing the topological properties of materials, and it is important for understanding their electronic properties and for research in topological electronics.
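To make the computation concrete, the following R sketch evaluates the Chern number of the lower band of the Haldane model on the honeycomb lattice using the Fukui-Hatsugai-Suzuki lattice method; the hopping parameters and sign conventions are illustrative assumptions:

t1 <- 1; t2 <- 0.1; phi <- pi / 2; m <- 0.2   # |m| < 3*sqrt(3)*t2*|sin(phi)|: nontrivial phase
a1 <- c(3/2,  sqrt(3)/2); a2 <- c(3/2, -sqrt(3)/2)                # Bravais lattice vectors
b1 <- (2*pi/3) * c(1,  sqrt(3)); b2 <- (2*pi/3) * c(1, -sqrt(3))  # reciprocal lattice vectors
lower_band <- function(k) {
  # 2x2 Bloch Hamiltonian of the Haldane model in a periodic gauge
  f  <- t1 * (1 + exp(-1i * sum(k * a1)) + exp(-1i * sum(k * a2)))       # nearest-neighbour term
  hz <- m + 2 * t2 * sin(phi) *
        (sin(sum(k * (a1 - a2))) - sin(sum(k * a1)) + sin(sum(k * a2)))  # staggered NNN term
  H <- matrix(c(hz, Conj(f), f, -hz), 2, 2)   # identity part omitted: it leaves eigenvectors unchanged
  eigen(H, symmetric = TRUE)$vectors[, 2]     # eigenvector of the lower band
}
N <- 24                                        # k-grid resolution over the reciprocal unit cell
u <- vector("list", N * N)
for (i in 0:(N - 1)) for (j in 0:(N - 1))
  u[[i * N + j + 1]] <- lower_band((i / N) * b1 + (j / N) * b2)
at   <- function(i, j) u[[(i %% N) * N + (j %% N) + 1]]
link <- function(p, q) { o <- sum(Conj(p) * q); o / Mod(o) }   # U(1) link variable
Fsum <- 0
for (i in 0:(N - 1)) for (j in 0:(N - 1))
  Fsum <- Fsum + Arg(link(at(i, j), at(i + 1, j)) *
                     link(at(i + 1, j), at(i + 1, j + 1)) *
                     link(at(i + 1, j + 1), at(i, j + 1)) *
                     link(at(i, j + 1), at(i, j)))
round(Fsum / (2 * pi))                         # integer Chern number of the lower band, here +1 or -1

On this grid the plaquette fluxes sum to an exact multiple of 2π, so the printed value is already the quantized integer; flipping the sign of phi or pushing |m| past 3*sqrt(3)*t2 drives it to -1 or 0.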
Package 'rdmulti' (June 20, 2023)

Type Package
Title Analysis of RD Designs with Multiple Cutoffs or Scores
Version 1.1
Author Matias D. Cattaneo, Rocio Titiunik, Gonzalo Vazquez-Bare
Maintainer Gonzalo Vazquez-Bare <******************.edu>
Description The regression discontinuity (RD) design is a popular quasi-experimental design for causal inference and policy evaluation. The 'rdmulti' package provides tools to analyze RD designs with multiple cutoffs or scores: rdmc() estimates pooled and cutoff-specific effects for multi-cutoff designs, rdmcplot() draws RD plots for multi-cutoff designs and rdms() estimates effects in cumulative cutoffs or multi-score designs. See Cattaneo, Titiunik and Vazquez-Bare (2020) <https://rdpackages.github.io/references/Cattaneo-Titiunik-VazquezBare_2020_Stata.pdf> for further methodological details.
Imports ggplot2, rdrobust
License GPL-2
Encoding UTF-8
RoxygenNote 7.2.3
NeedsCompilation no
Repository CRAN
Date/Publication 2023-06-20 21:00:02 UTC

R topics documented: rdmulti-package, rdmc, rdmcplot, rdms

rdmulti-package: analysis of RD designs with multiple cutoffs or scores

Description: The regression discontinuity (RD) design is a popular quasi-experimental design for causal inference and policy evaluation. The rdmulti package provides tools to analyze RD designs with multiple cutoffs or scores: rdmc() estimates pooled and cutoff-specific effects in multi-cutoff designs, rdmcplot() draws RD plots for multi-cutoff RD designs and rdms() estimates effects in cumulative cutoffs or multi-score designs. For more details, and related Stata and R packages useful for the analysis of RD designs, visit https://rdpackages.github.io/.
Author(s): Matias Cattaneo, Princeton University <**********************>; Rocio Titiunik, Princeton University <**********************>; Gonzalo Vazquez-Bare, UC Santa Barbara <******************.edu>
References:
Calonico, S., M. D. Cattaneo, M. Farrell and R. Titiunik. (2017). rdrobust: Software for Regression Discontinuity Designs. Stata Journal 17(2): 372-404.
Calonico, S., M. D. Cattaneo, and R. Titiunik. (2014). Robust Data-Driven Inference in the Regression-Discontinuity Design. Stata Journal 14(4): 909-946.
Calonico, S., M. D. Cattaneo, and R. Titiunik. (2015). rdrobust: An R Package for Robust Nonparametric Inference in Regression-Discontinuity Designs. R Journal 7(1): 38-51.
Cattaneo, M. D., L. Keele, R. Titiunik and G. Vazquez-Bare. (2016). Interpreting Regression Discontinuity Designs with Multiple Cutoffs. Journal of Politics 78(4): 1229-1248.
Cattaneo, M. D., L. Keele, R. Titiunik and G. Vazquez-Bare. (2020). Extrapolating Treatment Effects in Multi-Cutoff Regression Discontinuity Designs. Journal of the American Statistical Association 116(536): 1941-1952.
Cattaneo, M. D., R. Titiunik and G. Vazquez-Bare. (2020). Analysis of Regression Discontinuity Designs with Multiple Cutoffs or Multiple Scores. Stata Journal 20(4): 866-891.
Keele, L. and R. Titiunik. (2015). Geographic Boundaries as Regression Discontinuities. Political Analysis 23(1): 127-155.

rdmc: Analysis of RD designs with multiple cutoffs

Description: rdmc() analyzes RD designs with multiple cutoffs.
Usage:
rdmc(Y, X, C, fuzzy = NULL, derivvec = NULL, pooled_opt = NULL,
     verbose = FALSE, pvec = NULL, qvec = NULL, hmat = NULL, bmat = NULL,
     rhovec = NULL, covs_mat = NULL, covs_list = NULL, covs_dropvec = NULL,
     kernelvec = NULL, weightsvec = NULL, bwselectvec = NULL,
     scaleparvec = NULL, scaleregulvec = NULL, masspointsvec = NULL,
     bwcheckvec = NULL, bwrestrictvec = NULL, stdvarsvec = NULL,
     vcevec = NULL, nnmatchvec = NULL, cluster = NULL, level = 95,
     plot = FALSE, conventional = FALSE)
Arguments:
Y: outcome variable.
X: running variable.
C: cutoff variable.
fuzzy: specifies a fuzzy design. See rdrobust() for details.
derivvec: vector of cutoff-specific orders of derivatives. See rdrobust() for details.
pooled_opt: options to be passed to rdrobust() to calculate the pooled estimand.
verbose: displays the output from rdrobust for estimating the pooled estimand.
pvec: vector of cutoff-specific polynomial orders. See rdrobust() for details.
qvec: vector of cutoff-specific polynomial orders for bias estimation. See rdrobust() for details.
hmat: matrix of cutoff-specific bandwidths. See rdrobust() for details.
bmat: matrix of cutoff-specific bandwidths for bias estimation. See rdrobust() for details.
rhovec: vector of cutoff-specific values of rho. See rdrobust() for details.
covs_mat: matrix of covariates. See rdrobust() for details.
covs_list: list of covariates to be used in each cutoff.
covs_dropvec: vector indicating whether collinear covariates should be dropped at each cutoff. See rdrobust() for details.
kernelvec: vector of cutoff-specific kernels. See rdrobust() for details.
weightsvec: vector of length equal to the number of cutoffs indicating the names of the variables to be used as weights in each cutoff. See rdrobust() for details.
bwselectvec: vector of cutoff-specific bandwidth selection methods. See rdrobust() for details.
scaleparvec: vector of cutoff-specific scale parameters. See rdrobust() for details.
scaleregulvec: vector of cutoff-specific scale regularization parameters. See rdrobust() for details.
masspointsvec: vector indicating how to handle repeated values at each cutoff. See rdrobust() for details.
bwcheckvec: vector indicating the value of bwcheck at each cutoff. See rdrobust() for details.
bwrestrictvec: vector indicating whether computed bandwidths are restricted to the range of runvar at each cutoff. See rdrobust() for details.
stdvarsvec: vector indicating whether variables are standardized at each cutoff. See rdrobust() for details.
vcevec: vector of cutoff-specific variance-covariance estimation methods. See rdrobust() for details.
nnmatchvec: vector of cutoff-specific nearest neighbors for variance estimation. See rdrobust() for details.
cluster: cluster ID variable. See rdrobust() for details.
level: confidence level for confidence intervals. See rdrobust() for details.
plot: plots cutoff-specific estimates and weights.
conventional: reports conventional, instead of robust bias-corrected, p-values and confidence intervals.
Value:
tau: pooled estimate.
se.rb: robust bias-corrected standard error for the pooled estimate.
pv.rb: robust bias-corrected p-value for the pooled estimate.
ci.rb.l: left limit of the robust bias-corrected CI for the pooled estimate.
ci.rb.r: right limit of the robust bias-corrected CI for the pooled estimate.
hl: bandwidth to the left of the cutoff for the pooled estimate.
hr: bandwidth to the right of the cutoff for the pooled estimate.
Nhl: sample size within bandwidth to the left of the cutoff for the pooled estimate.
Nhr: sample size within bandwidth to the right of the cutoff for the pooled estimate.
B: vector of bias-corrected estimates.
V: vector of robust variances of the estimates.
Coefs: vector of conventional estimates.
W: vector of weights for each cutoff-specific estimate.
Nh: vector of sample sizes within bandwidth.
CI: robust bias-corrected confidence intervals.
H: matrix of bandwidths.
Pv: vector of robust p-values.
rdrobust.results: results from rdrobust for the pooled estimate.
cfail: cutoffs where rdrobust() encountered problems.
Author(s): Matias Cattaneo, Princeton University <**********************>; Rocio Titiunik, Princeton University <**********************>; Gonzalo Vazquez-Bare, UC Santa Barbara <******************.edu>
References: Cattaneo, M. D., R. Titiunik and G. Vazquez-Bare. (2020). Analysis of Regression Discontinuity Designs with Multiple Cutoffs or Multiple Scores. Stata Journal, forthcoming.
Examples:
# Toy dataset
X <- runif(1000, 0, 100)
C <- c(rep(33, 500), rep(66, 500))
Y <- (1 + X + (X >= C)) * (C == 33) + (.5 + .5 * X + .8 * (X >= C)) * (C == 66) + rnorm(1000)
# rdmc with standard syntax
tmp <- rdmc(Y, X, C)

rdmcplot: RD plots with multiple cutoffs

Description: rdmcplot() draws RD plots with multiple cutoffs.
Usage:
rdmcplot(Y, X, C, nbinsmat = NULL, binselectvec = NULL, scalevec = NULL,
         supportmat = NULL, pvec = NULL, hmat = NULL, kernelvec = NULL,
         weightsvec = NULL, covs_mat = NULL, covs_list = NULL,
         covs_evalvec = NULL, covs_dropvec = NULL, ci = NULL,
         col_bins = NULL, pch_bins = NULL, col_poly = NULL, lty_poly = NULL,
         col_xline = NULL, lty_xline = NULL, nobins = FALSE, nopoly = FALSE,
         noxline = FALSE, nodraw = FALSE)
Arguments:
Y: outcome variable.
X: running variable.
C: cutoff variable.
nbinsmat: matrix of cutoff-specific numbers of bins. See rdplot() for details.
binselectvec: vector of cutoff-specific bin selection methods. See rdplot() for details.
scalevec: vector of cutoff-specific scale factors. See rdplot() for details.
supportmat: matrix of cutoff-specific support conditions. See rdplot() for details.
pvec: vector of cutoff-specific polynomial orders. See rdplot() for details.
hmat: matrix of cutoff-specific bandwidths. See rdplot() for details.
kernelvec: vector of cutoff-specific kernels. See rdplot() for details.
weightsvec: vector of cutoff-specific weights. See rdplot() for details.
covs_mat: matrix of covariates. See rdplot() for details.
covs_list: list of covariates to be used in each cutoff.
covs_evalvec: vector indicating the evaluation point for additional covariates. See rdrobust() for details.
covs_dropvec: vector indicating whether collinear covariates should be dropped at each cutoff. See rdrobust() for details.
ci: adds confidence intervals of the specified level to the plot. See rdrobust() for details.
col_bins: vector of colors for bins.
pch_bins: vector of characters (pch) for bins.
col_poly: vector of colors for polynomial curves.
lty_poly: vector of lty for polynomial curves.
col_xline: vector of colors for vertical lines.
lty_xline: vector of lty for vertical lines.
nobins: omits the bins plot.
nopoly: omits the polynomial curve plot.
noxline: omits the vertical lines indicating the cutoffs.
nodraw: omits the plot.
Value:
clist: list of cutoffs.
cnum: number of cutoffs.
X0: matrix of X values for control units.
X1: matrix of X values for treated units.
Yhat0: estimated polynomial for control units.
Yhat1: estimated polynomial for treated units.
Xmean: bin averages of X values.
Ymean: bin averages of Y values.
CI_l: lower ends of confidence intervals.
CI_r: upper ends of confidence intervals.
cfail: cutoffs where rdrobust() encountered problems.
Author(s): Matias Cattaneo, Princeton University <**********************>; Rocio Titiunik, Princeton University <**********************>; Gonzalo Vazquez-Bare, UC Santa Barbara <******************.edu>
References: Cattaneo, M. D., R. Titiunik and G. Vazquez-Bare. (2020). Analysis of Regression Discontinuity Designs with Multiple Cutoffs or Multiple Scores. Stata Journal, forthcoming.
Examples:
# Toy dataset
X <- runif(1000, 0, 100)
C <- c(rep(33, 500), rep(66, 500))
Y <- (1 + X + (X >= C)) * (C == 33) + (.5 + .5 * X + .8 * (X >= C)) * (C == 66) + rnorm(1000)
# rdmcplot with standard syntax
tmp <- rdmcplot(Y, X, C)

rdms: Analysis of RD designs with cumulative cutoffs or two running variables

Description: rdms() analyzes RD designs with cumulative cutoffs or two running variables.
Usage:
rdms(Y, X, C, X2 = NULL, zvar = NULL, C2 = NULL, rangemat = NULL,
     xnorm = NULL, fuzzy = NULL, derivvec = NULL, pooled_opt = NULL,
     pvec = NULL, qvec = NULL, hmat = NULL, bmat = NULL, rhovec = NULL,
     covs_mat = NULL, covs_list = NULL, covs_dropvec = NULL,
     kernelvec = NULL, weightsvec = NULL, bwselectvec = NULL,
     scaleparvec = NULL, scaleregulvec = NULL, masspointsvec = NULL,
     bwcheckvec = NULL, bwrestrictvec = NULL, stdvarsvec = NULL,
     vcevec = NULL, nnmatchvec = NULL, cluster = NULL, level = 95,
     plot = FALSE, conventional = FALSE)
Arguments:
Y: outcome variable.
X: running variable.
C: vector of cutoffs.
X2: if specified, second running variable.
zvar: if X2 is specified, treatment indicator.
C2: if specified, second vector of cutoffs.
rangemat: matrix of cutoff-specific ranges for the running variable.
xnorm: normalized running variable to estimate the pooled effect.
fuzzy: specifies a fuzzy design. See rdrobust() for details.
derivvec: vector of cutoff-specific orders of derivatives. See rdrobust() for details.
pooled_opt: options to be passed to rdrobust() to calculate the pooled estimand.
pvec, qvec, hmat, bmat, rhovec: as in rdmc() above. See rdrobust() for details.
covs_mat: matrix of covariates. See rdplot() for details.
covs_list: list of covariates to be used in each cutoff.
covs_dropvec, kernelvec, weightsvec, bwselectvec, scaleparvec, scaleregulvec, masspointsvec, bwcheckvec, bwrestrictvec, stdvarsvec, vcevec, nnmatchvec, cluster, level: as in rdmc() above. See rdrobust() for details.
plot: plots cutoff-specific and pooled estimates.
conventional: reports conventional, instead of robust bias-corrected, p-values and confidence intervals.
Value:
B: vector of bias-corrected coefficients.
V: variance-covariance matrix of the estimators.
Coefs: vector of conventional coefficients.
Nh: vector of sample sizes within bandwidth at each cutoff.
CI: bias-corrected confidence intervals.
H: bandwidth used at each cutoff.
Pv: vector of robust p-values.
Author(s): Matias Cattaneo, Princeton University <**********************>; Rocio Titiunik, Princeton University <**********************>; Gonzalo Vazquez-Bare, UC Santa Barbara <******************.edu>
References: Cattaneo, M. D., R. Titiunik and G. Vazquez-Bare. (2020). Analysis of Regression Discontinuity Designs with Multiple Cutoffs or Multiple Scores. Stata Journal, forthcoming.
Examples:
# Toy dataset: cumulative cutoffs
X <- runif(1000, 0, 100)
C <- c(33, 66)
Y <- (1 + X) * (X < C[1]) + (0.8 + 0.8 * X) * (X >= C[1] & X < C[2]) +
     (1.2 + 1.2 * X) * (X >= C[2]) + rnorm(1000)
# rdms: basic syntax
tmp <- rdms(Y, X, C)
IDR/QR: An Incremental Dimension Reduction Algorithm via QR Decomposition

Jieping Ye*, Qi Li†, Hui Xiong*, Haesun Park*, Ravi Janardan*, Vipin Kumar*

ABSTRACT

Dimension reduction is critical for many database and data mining applications, such as efficient storage and retrieval of high-dimensional data. In the literature, a well-known dimension reduction scheme is Linear Discriminant Analysis (LDA). The common aspect of previously proposed LDA-based algorithms is the use of Singular Value Decomposition (SVD). Due to the difficulty of designing an incremental solution for the eigenvalue problem on the product of scatter matrices in LDA, there has been little work on designing incremental LDA algorithms. In this paper, we propose an LDA-based incremental dimension reduction algorithm, called IDR/QR, which applies QR Decomposition rather than SVD. Unlike other LDA-based algorithms, this algorithm does not require the whole data matrix in main memory. This is desirable for large data sets. More importantly, with the insertion of new data items, the IDR/QR algorithm can constrain the computational cost by applying efficient QR-updating techniques. Finally, we evaluate the effectiveness of the IDR/QR algorithm in terms of classification accuracy on the reduced dimensional space. Our experiments on several real-world data sets reveal that the accuracy achieved by the IDR/QR algorithm is very close to the best possible accuracy achieved by other LDA-based algorithms. However, the IDR/QR algorithm has much lower computational cost, especially when new data items are dynamically inserted.

Categories and Subject Descriptors: H.2.8 [Database Management]: Database Applications - Data Mining

General Terms: Algorithms

Keywords: Dimension reduction, Linear Discriminant Analysis, incremental learning, QR Decomposition

*Department of Computer Science & Engineering, University of Minnesota, Minneapolis, MN 55455, U.S.A.
{jieping,huix,hpark,janardan,kumar}@

†Department of Computer Science, University of Delaware, Newark, DE, U.S.A. qili@

1. INTRODUCTION

Efficient storage and retrieval of high-dimensional data is one of the central issues in database and data mining research. In the literature, many efforts have been made to design multi-dimensional index structures [2], such as R-trees, R*-trees, X-trees, SR-trees, etc., for speeding up query processing. However, the effectiveness of queries using any indexing scheme deteriorates rapidly as the dimension increases, which is the so-called curse of dimensionality. A standard approach to overcoming this problem is dimension reduction, which transforms the original high-dimensional data into a lower-dimensional space with limited loss of information. Once the high-dimensional data is transformed into a low-dimensional space, indexing techniques can be applied effectively to organize this low-dimensional space and facilitate efficient retrieval of data [14].

A well-known dimension reduction scheme is Linear Discriminant Analysis (LDA) [7, 9], which computes a linear transformation by maximizing the ratio of the between-class distance to the within-class distance, thereby achieving maximal discrimination. LDA has been applied in many domains, such as text retrieval [3] and face recognition [1, 17, 21].

In the past, many LDA extensions have been developed to deal with the singularity problem encountered by classical LDA. There are three major extensions: regularized LDA, PCA+LDA, and LDA/GSVD. The common point of these algorithms is the use of Singular Value Decomposition (SVD) or Generalized Singular Value Decomposition (GSVD). The differences among these LDA variants are as follows: regularized LDA increases the magnitude of the diagonal elements of the scatter matrix by adding a scaled identity matrix; PCA+LDA first applies PCA to the raw data to obtain a more compact representation, so that the singularity of the scatter matrix is reduced; LDA/GSVD solves a trace optimization problem using GSVD.

The above LDA algorithms have certain limitations. First, SVD or GSVD requires that the whole data matrix be stored in main memory. This requirement makes it difficult to scale the LDA algorithms to large data sets. Also, the expensive computation of SVD or GSVD can significantly degrade the computational performance of the LDA algorithms when dealing with large data sets. Finally, since it is difficult to design an incremental solution for the eigenvalue problem on the product of scatter matrices, little effort has been made to design incremental LDA algorithms. However, in many practical applications, acquisition of representative training data is expensive and time-consuming.
It is thus common for data to become available in small chunks over a period of time. In such settings, it is necessary to develop an algorithm that can run in an incremental fashion to accommodate the new data.

The goal of this paper is to design an efficient and incremental dimension reduction algorithm while preserving competitive performance. More precisely, when queries are conducted on the reduced-dimensional data produced by the proposed algorithm, the query accuracy should be comparable with the best possible query accuracy achieved by other LDA-based algorithms.

To resolve these issues, we design an LDA-based incremental dimension reduction algorithm, called IDR/QR, which applies QR Decomposition rather than SVD or GSVD. This algorithm has two stages. The first stage maximizes the separability between different classes; this is accomplished by QR Decomposition. The distinct property of this stage is its low time and space complexity. The second stage incorporates both between-class and within-class information by applying LDA on the "reduced" scatter matrices resulting from the first stage. Unlike other LDA-based algorithms, IDR/QR does not require the whole data matrix in main memory, which is desirable for large data sets. Also, our theoretical analysis indicates that the computational complexity of IDR/QR is linear in the number of data items in the training set as well as in the number of dimensions. More importantly, the IDR/QR algorithm can work incrementally: when new data items are dynamically inserted, the computational cost of the IDR/QR algorithm can be constrained by applying efficient QR-updating techniques.

Finally, we have conducted extensive experiments on several well-known real-world datasets. The experimental results show that the IDR/QR algorithm can be an order of magnitude faster than SVD- or GSVD-based LDA algorithms, while the accuracy achieved by IDR/QR is very close to the best possible accuracy achieved by other LDA-based algorithms. Moreover, when dealing with dynamic updating, the computational advantage of IDR/QR over SVD- or GSVD-based LDA algorithms becomes even more substantial, while still achieving comparable accuracy.

Overview: The rest of this paper is organized as follows. Section 2 introduces related work. In Section 3, we give a review of LDA. The batch implementation of the IDR/QR algorithm is presented in Section 4. Section 5 describes the incremental implementation of the IDR/QR algorithm. A comprehensive empirical study of the performance of the proposed algorithms is presented in Section 6. We conclude in Section 7 with a discussion of future work.

2. RELATED WORK

Principal Component Analysis (PCA) is one of the standard and well-known methods for dimension reduction [13].
Because of its simplicity and its ability to extract the global structure of a given data set, PCA is widely used in computer vision [22].

Most previous work on PCA and LDA requires that all the training data be available before the dimension reduction step; this is known as the batch method. There is some recent work in the vision and numerical linear algebra literature on computing PCA incrementally [4, 11]. Despite the popularity of LDA in the vision community, there is little work on computing it incrementally. The main difficulty is the involvement of the eigenvalue problem on the product of scatter matrices, which is hard to maintain incrementally. Although iterative algorithms have been proposed for neural-network-based LDA [5, 16], they require $O(d^2)$ time for each update, where $d$ is the dimension of the data. This is still expensive, especially when the data has high dimension.

3. LINEAR DISCRIMINANT ANALYSIS

For convenience, we present in Table 1 the important notations used in the paper.

Table 1: Notations
  $n$      number of training data points
  $n_i$    number of data points in the $i$-th class
  $d$      dimension of the training data
  $c$      number of classes
  $G$      transformation matrix
  $A$      data matrix
  $A_i$    data matrix of the $i$-th class
  $S_b$    between-class scatter matrix
  $S_w$    within-class scatter matrix
  $C$      centroid matrix
  $m$      global centroid of the training set
  $m_i$    centroid of the $i$-th class

This section gives a brief review of classical LDA, as well as its three extensions: regularized LDA, PCA+LDA, and LDA/GSVD.

Given a data matrix $A \in \mathbb{R}^{d \times n}$, we consider finding a linear transformation $G \in \mathbb{R}^{d \times \ell}$ that maps each column $a_j$ of $A$, for $1 \le j \le n$, in the $d$-dimensional space to a vector $y_j = G^T a_j$ in the $\ell$-dimensional space. Classical LDA aims to find the transformation $G$ such that the class structure of the original high-dimensional space is preserved in the reduced space.

Let the data matrix $A$ be partitioned into $c$ classes as $A = [A_1, A_2, \cdots, A_c]$, where $A_i \in \mathbb{R}^{d \times n_i}$ and $\sum_{i=1}^{c} n_i = n$. Let $I_i$ be the set of column indices that belong to the $i$-th class, i.e., $a_j$, for $j \in I_i$, belongs to the $i$-th class.

In general, if each class is tightly grouped but well separated from the other classes, the quality of the clustering is considered to be high. In discriminant analysis, two scatter matrices, the within-class and between-class scatter matrices, are defined to quantify the quality of the clustering, as follows [9]:

$$S_w = \sum_{i=1}^{c} \sum_{j \in I_i} (a_j - m_i)(a_j - m_i)^T,$$

$$S_b = \sum_{i=1}^{c} \sum_{j \in I_i} (m_i - m)(m_i - m)^T = \sum_{i=1}^{c} n_i (m_i - m)(m_i - m)^T,$$

where $m_i$ is the centroid of the $i$-th class and $m$ is the global centroid. Define the matrices

$$H_w = [A_1 - m_1 e_1^T, \cdots, A_c - m_c e_c^T] \in \mathbb{R}^{d \times n}, \qquad (1)$$

$$H_b = [\sqrt{n_1}(m_1 - m), \cdots, \sqrt{n_c}(m_c - m)] \in \mathbb{R}^{d \times c}, \qquad (2)$$

where $e_i = (1, \cdots, 1)^T \in \mathbb{R}^{n_i}$. Then the scatter matrices $S_w$ and $S_b$ can be expressed as $S_w = H_w H_w^T$ and $S_b = H_b H_b^T$.

The traces of the two scatter matrices can be computed as follows:

$$\mathrm{trace}(S_w) = \sum_{i=1}^{c} \sum_{j \in I_i} \|a_j - m_i\|^2, \qquad \mathrm{trace}(S_b) = \sum_{i=1}^{c} n_i \|m_i - m\|^2.$$

Hence, $\mathrm{trace}(S_w)$ measures the closeness of the vectors within classes, while $\mathrm{trace}(S_b)$ measures the separation between classes. In the lower-dimensional space resulting from the linear transformation $G$, the within-class and between-class scatter matrices become

$$S_w^L = (G^T H_w)(G^T H_w)^T = G^T S_w G, \qquad S_b^L = (G^T H_b)(G^T H_b)^T = G^T S_b G.$$

An optimal transformation $G$ would maximize $\mathrm{trace}(S_b^L)$ and minimize $\mathrm{trace}(S_w^L)$. A common optimization in classical LDA [9] is

$$G = \arg\max_{g_i^T S_w g_j = 0,\ \forall i \ne j} \mathrm{trace}\left((G^T S_w G)^{-1} (G^T S_b G)\right), \qquad (3)$$

where $g_i$ is the $i$-th column of $G$.

The solution to the optimization in Eq. (3) can be obtained by solving the eigenvalue problem on $S_w^{-1} S_b$, if $S_w$ is non-singular, or on $S_b^{-1} S_w$, if $S_b$ is non-singular. There are at most $c - 1$ eigenvectors corresponding to nonzero eigenvalues, since the rank of the matrix $S_b$ is bounded above by $c - 1$. Therefore, the reduced dimension achievable by classical LDA is at most $c - 1$. A stable way to solve this eigenvalue problem is to apply SVD to the scatter matrices; details can be found in [21].
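As a concrete illustration of the above formulation, the following Python/NumPy sketch builds $S_w$ and $S_b$ from a labeled data matrix and solves the eigenvalue problem on $S_w^{-1} S_b$. It is a minimal sketch, not the paper's code: it assumes $S_w$ is non-singular, and the function name and interface are our own.

```python
import numpy as np

def classical_lda(A, labels, ell):
    """Classical LDA as reviewed above (minimal sketch; assumes S_w non-singular).

    A      : d x n data matrix, one data point per column.
    labels : length-n array of class labels.
    ell    : reduced dimension, at most c - 1.
    """
    labels = np.asarray(labels)
    d, n = A.shape
    m = A.mean(axis=1, keepdims=True)                   # global centroid
    S_w = np.zeros((d, d))
    S_b = np.zeros((d, d))
    for cls in np.unique(labels):
        A_i = A[:, labels == cls]                       # d x n_i block of class i
        m_i = A_i.mean(axis=1, keepdims=True)           # class centroid
        S_w += (A_i - m_i) @ (A_i - m_i).T              # within-class scatter
        S_b += A_i.shape[1] * (m_i - m) @ (m_i - m).T   # between-class scatter
    # Eigenvectors of S_w^{-1} S_b; at most c - 1 eigenvalues are nonzero.
    evals, evecs = np.linalg.eig(np.linalg.solve(S_w, S_b))
    order = np.argsort(-evals.real)
    return evecs[:, order[:ell]].real                   # d x ell transformation G
```

A data point $a$ is then mapped to $G^T a$ in the reduced space.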
Classical LDA requires that one of the scatter matrices be non-singular. For many applications involving undersampled data, such as text and image retrieval, all scatter matrices are singular, and classical LDA is thus not applicable. This is the so-called singularity or undersampled problem. To cope with this challenge, several methods, including two-stage PCA+LDA, regularized LDA, and LDA/GSVD, have been proposed in the past.

A common way to deal with the singularity problem is to apply an intermediate dimension reduction stage, such as PCA, to reduce the dimension of the original data before classical LDA is applied. The resulting algorithm is known as PCA+LDA, or subspace LDA. In this two-stage PCA+LDA algorithm, the discriminant stage is preceded by a dimension reduction stage using PCA. A limitation of this approach is that the optimal value of the reduced dimension for the PCA stage is difficult to determine.

Another common way to deal with the singularity problem is to add some constant value to the diagonal elements of $S_w$, i.e., to use $S_w + \mu I_d$ for some $\mu > 0$, where $I_d$ is the identity matrix [8]. It is easy to check that $S_w + \mu I_d$ is positive definite, hence non-singular. This approach is called regularized LDA (RLDA). A limitation of RLDA is that the optimal value of the parameter $\mu$ is difficult to determine; cross-validation is commonly applied to estimate it [15].

The LDA/GSVD algorithm in [12, 23] is a recent work along the same lines. A new criterion for generalized LDA was presented in [23]. The inversion of the matrix $S_w$ is avoided by applying the Generalized Singular Value Decomposition (GSVD). LDA/GSVD computes the solution exactly, without losing any information. However, one limitation of this method is the high computational cost of GSVD, which limits its applicability to large datasets, such as image and text data.

4. BATCH IDR/QR

In this section, we present the batch implementation of the IDR/QR algorithm. This algorithm has two stages. The first stage maximizes the separation between different classes via QR Decomposition [10]. Without the concern of minimizing the within-class distance, this stage can be used independently as a dimension reduction algorithm. The second stage addresses the within-class distance, while keeping the time and space complexity low.

The first stage of IDR/QR aims to solve the following optimization problem:

$$G = \arg\max_{G^T G = I} \mathrm{trace}(G^T S_b G). \qquad (4)$$

Note that this optimization only addresses the concern of maximizing the between-class distance. The solution can be obtained by solving the eigenvalue problem on $S_b$. The solution can also be obtained through QR Decomposition of the centroid matrix $C$, which is the so-called Orthogonal Centroid Method (OCM) [19], where

$$C = [m_1, m_2, \cdots, m_c] \qquad (5)$$

consists of the $c$ centroids. More specifically, let $C = QR$ be the QR Decomposition of $C$, where $Q \in \mathbb{R}^{d \times c}$ has orthonormal columns and $R \in \mathbb{R}^{c \times c}$ is upper triangular. Then

$$G = QM \qquad (6)$$

solves the optimization problem in Eq. (4) for any orthogonal matrix $M$. Note that the choice of the orthogonal matrix $M$ is arbitrary, since $\mathrm{trace}(G^T S_b G) = \mathrm{trace}(M^T G^T S_b G M)$ for any orthogonal matrix $M$. In OCM [19], $M$ is set to be the identity matrix for simplicity.
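Computationally, this first stage amounts to forming the centroid matrix and taking its skinny QR Decomposition. The short sketch below (with our own naming) returns the factors $Q$ and $R$; choosing $M = I$ recovers OCM, i.e., $G = Q$.

```python
import numpy as np

def idrqr_stage1(A, labels):
    """Stage I of IDR/QR: skinny QR Decomposition of the centroid matrix C."""
    labels = np.asarray(labels)
    C = np.column_stack([A[:, labels == cls].mean(axis=1)
                         for cls in np.unique(labels)])   # d x c centroid matrix
    Q, R = np.linalg.qr(C)    # Q: d x c with orthonormal columns, R: c x c
    return Q, R               # with M = I (OCM), G = Q solves Eq. (4)
```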
The second stage of IDR/QR refines the first stage by addressing the within-class distance. It incorporates the within-class scatter information by applying a relaxation scheme to $M$ in Eq. (6), relaxing $M$ from an orthogonal matrix to an arbitrary matrix. Note that the trace value in Eq. (3) is the same for any non-singular $M$; however, the constraints in Eq. (3) will not be satisfied for an arbitrary $M$. In the second stage of IDR/QR, we look for a transformation $G$ such that $G = QM$ for some $M$, where $M$ is not required to be orthogonal. The original problem of computing $G$ is then equivalent to computing $M$. Since

$$G^T S_b G = M^T (Q^T S_b Q) M, \qquad G^T S_w G = M^T (Q^T S_w Q) M,$$

the original optimization of finding the optimal $G$ is equivalent to finding the optimal $M$, with $B = Q^T S_b Q$ and $W = Q^T S_w Q$ as the reduced between-class and within-class scatter matrices, respectively. Note that $B$ is much smaller than the original scatter matrix $S_b$ (and similarly for $W$).

The optimal $M$ can be computed efficiently using many existing LDA-based methods, since we are dealing with the matrices $B$ and $W$ of much smaller size $c \times c$. A key observation is that the singularity problem of $W$ is not as severe as that of the original $S_w$, since $W$ is much smaller than $S_w$. We can compute the optimal $M$ by simply applying regularized LDA; that is, we compute $M$ by solving a small eigenvalue problem on $(W + \mu I_c)^{-1} B$, for some positive constant $\mu$. The pseudo-code for this algorithm is given in Algorithm 1.

Algorithm 1: Batch IDR/QR
Input: data matrix A
Output: optimal transformation matrix G
/* Stage I */
1. Construct the centroid matrix C;
2. Compute the QR Decomposition of C as C = QR, where Q ∈ R^{d×c}, R ∈ R^{c×c};
/* Stage II */
3. Z ← H_w^T Q;
4. Y ← H_b^T Q;
5. W ← Z^T Z;  /* reduced within-class scatter matrix */
6. B ← Y^T Y;  /* reduced between-class scatter matrix */
7. Compute the c eigenvectors φ_i of (W + μ I_c)^{-1} B, ordered by decreasing eigenvalues;
8. G ← QM, where M = [φ_1, ..., φ_c].
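The following NumPy transcription of Algorithm 1 is a sketch under our own naming; the default regularization value $\mu = 1$ is an arbitrary assumption (the paper only requires $\mu > 0$).

```python
import numpy as np

def idrqr_batch(A, labels, mu=1.0):
    """Batch IDR/QR (Algorithm 1). A is d x n; returns the d x c transformation G.

    mu is the Stage-II regularization constant; its default here is our own
    arbitrary choice, not a value prescribed by the paper.
    """
    labels = np.asarray(labels)
    classes = np.unique(labels)
    c = len(classes)
    m = A.mean(axis=1, keepdims=True)                 # global centroid
    cent_cols, Hw_blocks, Hb_cols = [], [], []
    for cls in classes:
        A_i = A[:, labels == cls]
        n_i = A_i.shape[1]
        m_i = A_i.mean(axis=1, keepdims=True)
        cent_cols.append(m_i)
        Hw_blocks.append(A_i - m_i)                   # block of H_w, Eq. (1)
        Hb_cols.append(np.sqrt(n_i) * (m_i - m))      # column of H_b, Eq. (2)
    C = np.hstack(cent_cols)                          # Line 1: centroid matrix
    Q, R = np.linalg.qr(C)                            # Line 2: skinny QR
    Z = np.hstack(Hw_blocks).T @ Q                    # Line 3: Z = H_w^T Q
    Y = np.hstack(Hb_cols).T @ Q                      # Line 4: Y = H_b^T Q
    W = Z.T @ Z                                       # Line 5
    B = Y.T @ Y                                       # Line 6
    evals, evecs = np.linalg.eig(np.linalg.solve(W + mu * np.eye(c), B))
    M = evecs[:, np.argsort(-evals.real)].real        # Line 7
    return Q @ M                                      # Line 8: G = QM
```

Line 3 (the product $H_w^T Q$) dominates the cost, consistent with the complexity analysis that follows.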
4.1 Time and space complexity

We close this section by analyzing the time and space complexity of the IDR/QR algorithm.

It takes $O(dn)$ time to form the centroid matrix $C$ in Line 1. The QR Decomposition in Line 2 costs $O(c^2 d)$ [10]. Lines 3 and 4 take $O(ndc)$ and $O(dc^2)$, respectively, for the matrix multiplications. The matrix multiplications in Lines 5 and 6 then take $O(c^2 n)$ and $O(c^3)$, respectively. Line 7 computes the eigendecomposition of a $c \times c$ matrix, which takes $O(c^3)$ [10]. The matrix multiplication in Line 8 takes $O(dc^2)$. Since the dimension $d$ and the number of points $n$ are usually much larger than the number of classes $c$, the most expensive step in Algorithm 1 is Line 3, which takes $O(ndc)$. Therefore, the time complexity of IDR/QR is linear in the number of points, linear in the number of classes, and linear in the dimension of the dataset.

It is clear that only the $c$ centroids are required to reside in main memory; hence the space complexity of IDR/QR is $O(dc)$. Table 2 lists the time and space complexity of several dimension reduction algorithms discussed in this paper. It is clear from the table that IDR/QR and OCM are more efficient than the other methods.

Table 2: Complexity comparison ($n$ is the number of training data points, $d$ is the dimension, and $c$ is the number of classes)
  Method      Time complexity    Space complexity
  IDR/QR      O(ndc)             O(dc)
  PCA+LDA     O(n^2 d)           O(nd)
  LDA/GSVD    O((n+c)^2 d)       O(nd)
  OCM         O(nd + c^2 d)      O(dc)
  PCA         O(n^2 d)           O(nd)

5. INCREMENTAL IDR/QR

The incremental implementation of the IDR/QR algorithm is discussed in detail in this section. We adopt the following convention: for any variable $X$, its updated version after the insertion of a new instance is denoted by $\tilde{X}$. For example, the number $n_i$ of elements in the $i$-th class changes to $\tilde{n}_i$, while the centroid $m_i$ changes to $\tilde{m}_i$.

With the insertion of a new instance, the centroid matrix $C$, $H_w$, and $H_b$ change accordingly, as do $W$ and $B$. The incremental updating in IDR/QR proceeds in three steps: (1) QR-updating of the centroid matrix $C = [m_1, \cdots, m_k]$ in Line 2 of Algorithm 1; (2) updating of the reduced within-class scatter matrix $W$ in Line 5; and (3) updating of the reduced between-class scatter matrix $B$ in Line 6.

Let $x$ be a newly inserted instance belonging to the $i$-th class. Without loss of generality, assume we have data from the 1st to the $k$-th class just before $x$ is inserted; in general, this can be arranged by switching class labels between different classes. In the rest of this section, we consider the incremental updating in IDR/QR in two distinct cases: (1) $x$ belongs to an existing class, i.e., $i \le k$; (2) $x$ belongs to a new class, i.e., $i > k$. As will be seen, the techniques for these two cases are quite different.

5.1 Insertion of a new instance from an existing class (i ≤ k)

Recall that we have data from the 1st to the $k$-th class when a new instance $x$ is inserted. Since $x$ belongs to the $i$-th class, with $1 \le i \le k$, the insertion of $x$ does not create a new class. In this subsection, we show how to do the incremental updating in three steps.

5.1.1 Step 1: QR-updating of the centroid matrix C

Since the new instance $x$ belongs to the $i$-th class, $\tilde{C} = [m_1, \cdots, m_i + f, \cdots, m_k]$, where $f = (x - m_i)/\tilde{n}_i$ and $\tilde{n}_i = n_i + 1$. Hence, $\tilde{C}$ can be rewritten as $\tilde{C} = C + f g^T$, for $g = (0, \cdots, 1, \cdots, 0)^T$, where the 1 appears in the $i$-th position. The problem of QR-updating of the centroid matrix $C$ can then be formulated as follows: given the QR Decomposition of the centroid matrix $C = QR$, with $Q \in \mathbb{R}^{d \times k}$ and $R \in \mathbb{R}^{k \times k}$, compute the QR Decomposition of $\tilde{C}$.

Since $\tilde{C} = C + f g^T$, the QR-updating of the centroid matrix $C$ can be formulated as a rank-one QR-updating. However, the algorithm in [10] cannot be applied directly, since it requires the complete QR Decomposition, i.e., a square matrix $Q$, while we use the skinny QR Decomposition, in which $Q$ is rectangular. Instead, we apply a slight variation of the algorithm in [6] via the following two-stage QR-updating: (1) a complete rank-one updating, as in [10], on a small matrix; (2) a QR-updating for the insertion of a new row. Details are given below.

Partition $f$ into two parts: the projection onto the orthogonal basis $Q$, and its orthogonal complement. Mathematically, $f$ can be partitioned as $f = QQ^T f + (I - QQ^T) f$. It is easy to check that $Q^T (I - QQ^T) f = 0$, i.e., $(I - QQ^T) f$ is orthogonal to, or lies in the orthogonal complement of, the subspace spanned by the columns of $Q$. It follows that

$$\tilde{C} = C + f g^T = QR + QQ^T f \cdot g^T + (I - QQ^T) f \cdot g^T = Q(R + f_1 g^T) + f_2 g^T,$$

where $f_1 = Q^T f$ and $f_2 = (I - QQ^T) f$. Next, we show how to compute the QR Decomposition of $\tilde{C}$ in two stages.

The first stage updates the QR Decomposition of $Q(R + f_1 g^T)$. This corresponds to a rank-one updating and can be done in $O(kd)$ time [10]. The result is the updated QR Decomposition $Q(R + f_1 g^T) = Q_1 R_1$, where $Q_1 = QP_1$ and $P_1 \in \mathbb{R}^{k \times k}$ is orthogonal.

Assume $\|f_2\| \ne 0$ and denote $q = f_2 / \|f_2\|$. Since $q$ is orthogonal to the subspace spanned by the columns of $Q$, it is also orthogonal to the subspace spanned by the columns of $Q_1 = QP_1$, i.e., $[Q_1, q]$ has orthonormal columns. The second stage computes the QR-updating of

$$\tilde{C} = [Q_1, q] \begin{bmatrix} R_1 \\ \|f_2\| g^T \end{bmatrix},$$

which corresponds to inserting $\|f_2\| g^T$ as a new row. This stage can be done in $O(dk)$ time [10]. The updated QR Decomposition is

$$[Q_1, q] \begin{bmatrix} R_1 \\ \|f_2\| g^T \end{bmatrix} = [\tilde{Q}, \tilde{q}] \begin{bmatrix} \tilde{R} \\ 0 \end{bmatrix} = \tilde{Q} \tilde{R},$$

where $[\tilde{Q}, \tilde{q}] = [Q_1, q] P_2$, for some orthogonal matrix $P_2$. Combining both stages, we have

$$\tilde{C} = Q_1 R_1 + \|f_2\| q g^T = [Q_1, q] \begin{bmatrix} R_1 \\ \|f_2\| g^T \end{bmatrix} = \tilde{Q} \tilde{R}$$

as the updated QR Decomposition of $\tilde{C}$, assuming $\|f_2\| \ne 0$. If $\|f_2\| = 0$, then $\tilde{C} = Q_1 R_1$ is the updated QR Decomposition of $\tilde{C}$. Note that $f_2$ can be computed efficiently as $f_2 = f - Q(Q^T f)$, by two matrix-vector multiplications. Hence, the total time complexity for the QR-updating of the centroid matrix $C$ is $O(dk)$.
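In code, the two-stage updating can be sketched as follows. For readability, this sketch re-triangularizes the small $k \times k$ and $(k+1) \times k$ matrices with a dense QR factorization, an $O(k^3)$ shortcut rather than the $O(dk)$ Givens-based updating of [10]; the algebra is identical, and the function name and the zero tolerance are our own choices.

```python
import numpy as np

def qr_update_rank_one(Q, R, f, i):
    """QR-updating of C~ = C + f g^T with g = e_i (Step 1, existing class).

    Q (d x k), R (k x k): skinny QR Decomposition of the centroid matrix C.
    f: the column correction (x - m_i) / (n_i + 1); i: index of the class.
    """
    d, k = Q.shape
    g = np.zeros(k)
    g[i] = 1.0
    f1 = Q.T @ f                          # projection of f onto range(Q)
    f2 = f - Q @ f1                       # component orthogonal to range(Q)
    # Stage 1: rank-one update of the small factor: Q(R + f1 g^T) = Q1 R1.
    P1, R1 = np.linalg.qr(R + np.outer(f1, g))
    Q1 = Q @ P1
    norm_f2 = np.linalg.norm(f2)
    if norm_f2 < 1e-12:                   # f already lies in range(Q)
        return Q1, R1
    # Stage 2: append the row ||f2|| g^T and re-triangularize:
    # C~ = [Q1, q] [R1; ||f2|| g^T] = Q~ R~.
    q = f2 / norm_f2
    P2, R_new = np.linalg.qr(np.vstack([R1, norm_f2 * g]))   # (k+1) x k stack
    Q_new = np.hstack([Q1, q[:, None]]) @ P2                 # d x k
    return Q_new, R_new
```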
5.1.2 Step 2: Updating of W

Next, we consider the updating of the reduced within-class scatter matrix $W = Q^T H_w H_w^T Q$ (Line 5 of Algorithm 1). Let $\tilde{W} = \tilde{Q}^T \tilde{H}_w \tilde{H}_w^T \tilde{Q}$ be its updated version.

Note that $H_w = [A_1 - m_1 e_1^T, \cdots, A_k - m_k e_k^T] \in \mathbb{R}^{d \times n}$. Its updated version $\tilde{H}_w$ differs from $H_w$ only in the $i$-th block. Let the $i$-th block of $H_w$ be $H_i = A_i - m_i e_i^T$. Then the $i$-th block of the updated version $\tilde{H}_w$ is

$$\tilde{H}_i = \tilde{A}_i - \tilde{m}_i \tilde{e}_i^T = [A_i, x] - \tilde{m}_i \tilde{e}_i^T = [A_i - m_i e_i^T, x - m_i] - (\tilde{m}_i - m_i) \tilde{e}_i^T = [H_i, u] - v \tilde{e}_i^T, \qquad (7)$$

where $u = x - m_i$, $v = \tilde{m}_i - m_i$, and $\tilde{e}_i = (1, \cdots, 1)^T \in \mathbb{R}^{n_i + 1}$. The product $\tilde{H}_i \tilde{H}_i^T$ can be computed as

$$\tilde{H}_i \tilde{H}_i^T = ([H_i, u] - v \tilde{e}_i^T)([H_i, u] - v \tilde{e}_i^T)^T = [H_i, u][H_i, u]^T - v \tilde{e}_i^T [H_i, u]^T - [H_i, u] \tilde{e}_i v^T + (v \tilde{e}_i^T)(\tilde{e}_i v^T)$$
$$= H_i H_i^T + u u^T - v u^T - u v^T + (n_i + 1) v v^T = H_i H_i^T + (u - v)(u - v)^T + n_i v v^T, \qquad (8)$$

where the third equality follows since $[H_i, u] \tilde{e}_i = \sum_{j \in I_i} (a_j - m_i) + u = u$ and $(v \tilde{e}_i^T)(\tilde{e}_i v^T) = (\tilde{e}_i^T \tilde{e}_i) v v^T = (n_i + 1) v v^T$.

Since $H_w H_w^T = \sum_{j=1}^{k} H_j H_j^T$, we have

$$\tilde{H}_w \tilde{H}_w^T = \sum_{j=1}^{k} \tilde{H}_j \tilde{H}_j^T = \sum_{j \ne i} \tilde{H}_j \tilde{H}_j^T + \tilde{H}_i \tilde{H}_i^T = \sum_{j=1}^{k} H_j H_j^T + (u - v)(u - v)^T + n_i v v^T.$$

It follows that

$$\tilde{W} = \tilde{Q}^T \tilde{H}_w \tilde{H}_w^T \tilde{Q} = \tilde{Q}^T H_w H_w^T \tilde{Q} + (\tilde{u} - \tilde{v})(\tilde{u} - \tilde{v})^T + n_i \tilde{v} \tilde{v}^T \approx Q^T H_w H_w^T Q + (\tilde{u} - \tilde{v})(\tilde{u} - \tilde{v})^T + n_i \tilde{v} \tilde{v}^T = W + (\tilde{u} - \tilde{v})(\tilde{u} - \tilde{v})^T + n_i \tilde{v} \tilde{v}^T, \qquad (9)$$

where $\tilde{u} = \tilde{Q}^T u$ and $\tilde{v} = \tilde{Q}^T v$. The approximation in Eq. (9) assumes that the updated $\tilde{Q}$ after the insertion of a new instance is close to $Q$.

The computation of $\tilde{u}$ and $\tilde{v}$ takes $O(dk)$. Thus, the computation for updating $W$ takes $O(dk)$.

5.1.3 Step 3: Updating of B

Finally, let us consider the updating of the reduced between-class scatter matrix $B = Q^T H_b H_b^T Q$ (Line 6 of Algorithm 1). Its updated version is $\tilde{B} = \tilde{Q}^T \tilde{H}_b \tilde{H}_b^T \tilde{Q}$.

The key observation for efficient updating of $B$ is that

$$\tilde{H}_b = [\sqrt{\tilde{n}_1}(\tilde{m}_1 - \tilde{m}), \cdots, \sqrt{\tilde{n}_k}(\tilde{m}_k - \tilde{m})]$$

can be rewritten as

$$\tilde{H}_b = [\tilde{m}_1, \tilde{m}_2, \cdots, \tilde{m}_k, \tilde{m}] F = [\tilde{C}, \tilde{m}] F, \quad \text{where } F = \begin{bmatrix} D \\ -h^T \end{bmatrix},$$

$D = \mathrm{diag}(\sqrt{\tilde{n}_1}, \cdots, \sqrt{\tilde{n}_k})$, and $h = [\sqrt{\tilde{n}_1}, \cdots, \sqrt{\tilde{n}_k}]^T$. By the updated QR Decomposition $\tilde{C} = \tilde{Q}\tilde{R}$, we have

$$\tilde{Q}^T \tilde{H}_b = [\tilde{Q}^T \tilde{C}, \tilde{Q}^T \tilde{m}] F = [\tilde{R}, \tilde{Q}^T \tilde{m}] F = \tilde{R} D - \tilde{Q}^T \tilde{m} \cdot h^T.$$

It is easy to check that $\tilde{m} = \frac{1}{\tilde{n}} \tilde{C} r$, where $r = (\tilde{n}_1, \cdots, \tilde{n}_k)^T$. Hence,

$$\tilde{Q}^T \tilde{m} = \frac{1}{\tilde{n}} \tilde{Q}^T \tilde{C} r = \frac{1}{\tilde{n}} \tilde{R} r.$$

It follows that

$$\tilde{B} = \tilde{Q}^T \tilde{H}_b \tilde{H}_b^T \tilde{Q} = \left(\tilde{R} D - \frac{1}{\tilde{n}} \tilde{R} r h^T\right) \left(\tilde{R} D - \frac{1}{\tilde{n}} \tilde{R} r h^T\right)^T.$$

Therefore, it takes $O(k^3)$ time to update $B$. Overall, the total time for the QR-updating of $C$ and the updating of $W$ and $B$ upon the insertion of a new instance from an existing class is $O(dk + k^3)$. The pseudo-code is given in Algorithm 2.

Algorithm 2: Updating Existing Class
Input: centroid matrix C = [m_1, ..., m_k], its QR Decomposition C = QR, the matrix W, the size n_j of the j-th class for each j, and a new point x from the i-th class, i ≤ k
Output: updated matrix W̃, updated centroid matrix C̃, its QR Decomposition C̃ = Q̃R̃, and updated matrix B̃
1. ñ_j ← n_j, for j ≠ i; ñ_i ← n_i + 1; ñ ← n + 1; f ← (x − m_i)/ñ_i;
2. m̃_i ← m_i + f; m̃_j ← m_j, for each j ≠ i;
3. C̃ ← [m̃_1, ..., m̃_i, ..., m̃_k];
4. f_1 ← Q^T f; f_2 ← (I − QQ^T) f; g ← (0, ..., 1, ..., 0)^T, with the 1 in the i-th position;
5. do rank-one QR-updating of Q(R + f_1 g^T) as Q(R + f_1 g^T) = Q_1 R_1;
6. if ||f_2|| = 0
7.   Q̃ ← Q_1; R̃ ← R_1;
8. else
9.   q ← f_2 / ||f_2||;
10.  do QR-updating of [Q_1, q][R_1; ||f_2|| g^T] as [Q_1, q][R_1; ||f_2|| g^T] = Q_2 R_2;
11.  Q̃ ← Q_2; R̃ ← R_2;
12. endif
13. u ← x − m_i; v ← m̃_i − m_i;
14. ũ ← Q̃^T u; ṽ ← Q̃^T v;
15. W̃ ← W + (ũ − ṽ)(ũ − ṽ)^T + n_i ṽ ṽ^T;
16. D ← diag(√ñ_1, ..., √ñ_k); h ← (√ñ_1, ..., √ñ_k)^T;
17. r ← (ñ_1, ..., ñ_k)^T; r̃ ← (1/ñ) R̃ r;
18. B̃ ← (R̃D − r̃ h^T)(R̃D − r̃ h^T)^T;
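Lines 13-18 of Algorithm 2 translate into a few lines of NumPy. The following sketch applies Eq. (9) to update $W$ and rebuilds $\tilde{B}$ from the updated factor $\tilde{R}$; the names (e.g., `counts_new` for $(\tilde{n}_1, \cdots, \tilde{n}_k)$) are ours.

```python
import numpy as np

def update_W_B_existing(Q_new, R_new, W, x, m_i, n_i, counts_new):
    """Update W via Eq. (9) and rebuild B from R~ (existing-class insertion).

    Q_new, R_new : updated QR factors Q~, R~ of the centroid matrix.
    W            : current reduced within-class scatter matrix (k x k).
    x, m_i, n_i  : new point, old centroid, and old size of its class.
    counts_new   : array (n~_1, ..., n~_k) of updated class sizes.
    """
    u = x - m_i                            # u = x - m_i
    v = u / (n_i + 1.0)                    # v = m~_i - m_i = f
    u_t = Q_new.T @ u                      # u~ = Q~^T u
    v_t = Q_new.T @ v                      # v~ = Q~^T v
    W_new = W + np.outer(u_t - v_t, u_t - v_t) + n_i * np.outer(v_t, v_t)
    h = np.sqrt(counts_new)                # h = (sqrt(n~_1), ..., sqrt(n~_k))^T
    r_t = (R_new @ counts_new) / counts_new.sum()   # r~ = (1/n~) R~ r
    T = R_new @ np.diag(h) - np.outer(r_t, h)
    B_new = T @ T.T                        # B~ = (R~D - r~ h^T)(R~D - r~ h^T)^T
    return W_new, B_new
```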
5.2 Insertion of a new instance from a new class (i > k)

Recall that we have data from the 1st to the $k$-th class upon the insertion of $x$. Since $x$ belongs to the $i$-th class, with $i > k$, the insertion of $x$ results in a new class. Without loss of generality, assume $i = k + 1$. Hence the $(k+1)$-th centroid is $\tilde{m}_{k+1} = x$, and the updated centroid matrix is $\tilde{C} = [m_1, m_2, \cdots, m_k, x] = [C, x]$. In the following, we focus on the case when $x$ does not lie in the space spanned by the $k$ centroids $\{m_i\}_{i=1}^{k}$.

5.2.1 Step 1: QR-updating of the centroid matrix C

Given the QR Decomposition $C = QR$, it is straightforward to compute the QR Decomposition of $\tilde{C} = [C, x]$ as $\tilde{C} = \tilde{Q}\tilde{R}$ by the Gram-Schmidt procedure [10], where $\tilde{Q} = [Q, q]$ for some $q$. The time complexity of this step is $O(dk)$.

5.2.2 Step 2: Updating of W

With the insertion of $x$ from the new class $(k+1)$, the $(k+1)$-th block $\tilde{H}_{k+1}$ is created, while $H_j$, for $j = 1, \cdots, k$, remains unchanged. It is easy to check that $\tilde{H}_{k+1} = 0$. It follows that $\tilde{H}_w \tilde{H}_w^T = H_w H_w^T$. Hence

$$\tilde{W} = \tilde{Q}^T \tilde{H}_w \tilde{H}_w^T \tilde{Q} = [Q, q]^T H_w H_w^T [Q, q] \approx \begin{bmatrix} Q^T H_w H_w^T Q & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} W & 0 \\ 0 & 0 \end{bmatrix}.$$

The assumption in the above approximation is that $W$ is the dominant part of $\tilde{W}$.

5.2.3 Step 3: Updating of B

The updating of $B$ follows the same idea as in the previous case. Note that

$$\tilde{H}_b = [\sqrt{\tilde{n}_1}(\tilde{m}_1 - \tilde{m}), \cdots, \sqrt{\tilde{n}_{k+1}}(\tilde{m}_{k+1} - \tilde{m})]$$

can be rewritten as

$$\tilde{H}_b = [\tilde{m}_1, \tilde{m}_2, \cdots, \tilde{m}_{k+1}, \tilde{m}] F, \quad \text{where } F = \begin{bmatrix} D \\ -h^T \end{bmatrix},$$

$D = \mathrm{diag}(\sqrt{\tilde{n}_1}, \cdots, \sqrt{\tilde{n}_{k+1}})$ is a diagonal matrix, and $h = [\sqrt{\tilde{n}_1}, \cdots, \sqrt{\tilde{n}_{k+1}}]^T$. By the updated QR Decomposition $\tilde{C} = \tilde{Q}\tilde{R}$, we have

$$\tilde{Q}^T \tilde{H}_b = \tilde{Q}^T [\tilde{C}, \tilde{m}] F = [\tilde{R}, \tilde{Q}^T \tilde{m}] F = \tilde{R} D - \tilde{Q}^T \tilde{m} \cdot h^T.$$

Since $\tilde{m} = \frac{1}{\tilde{n}} \tilde{C} r$, where $r = (\tilde{n}_1, \cdots, \tilde{n}_{k+1})^T$, we have $\tilde{Q}^T \tilde{m} = \frac{1}{\tilde{n}} \tilde{R} r$. Then $\tilde{B}$ can be computed by the same argument as in the previous case. Therefore, it takes $O(k^3)$ time to update $B$. Thus, the total time for the QR-updating of $C$ and the updating of $W$ and $B$ upon the insertion of a new instance from a new class is $O(dk + k^3)$. The pseudo-code is given in Algorithm 3.

Algorithm 3: Updating New Class
Input: centroid matrix C = [m_1, ..., m_k], its QR Decomposition C = QR, the size n_j of the j-th class for each j, and a new point x from the (k+1)-th class
Output: updated matrix W̃, updated centroid matrix C̃, its QR Decomposition C̃ = Q̃R̃, and updated matrix B̃
1. ñ_j ← n_j, for j = 1, ..., k; ñ_{k+1} ← 1; ñ ← n + 1;
2. do QR-updating of C̃ = [C, x] as C̃ = Q̃R̃;
3. W̃ ← [W, 0; 0, 0];
4. D ← diag(√ñ_1, ..., √ñ_{k+1}); h ← (√ñ_1, ..., √ñ_{k+1})^T;
5. r ← (ñ_1, ..., ñ_{k+1})^T; r̃ ← (1/ñ) R̃ r;
6. B̃ ← (R̃D − r̃ h^T)(R̃D − r̃ h^T)^T;
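A sketch of Algorithm 3 follows: append the new point as a column, extend the QR factors by one Gram-Schmidt step, zero-pad $W$, and rebuild $\tilde{B}$ as before. The naming is ours, and the sketch assumes, as in the text, that $x$ does not lie in the span of the existing centroids.

```python
import numpy as np

def update_new_class(Q, R, W, x, counts):
    """Algorithm 3: insertion of a new point x that opens class k + 1.

    Q (d x k), R (k x k) : QR factors of the current centroid matrix C.
    W                    : current reduced within-class scatter (k x k).
    counts               : array (n_1, ..., n_k) of current class sizes.
    """
    k = R.shape[0]
    # Step 1: one Gram-Schmidt step extends C = QR to C~ = [C, x] = Q~ R~.
    r_col = Q.T @ x
    q = x - Q @ r_col                     # component of x orthogonal to range(Q)
    alpha = np.linalg.norm(q)             # nonzero by the spanning assumption
    Q_new = np.hstack([Q, (q / alpha)[:, None]])
    R_new = np.block([[R, r_col[:, None]],
                      [np.zeros((1, k)), np.array([[alpha]])]])
    # Step 2: W~ is W padded with a zero row and column.
    W_new = np.zeros((k + 1, k + 1))
    W_new[:k, :k] = W
    # Step 3: rebuild B~ from R~ exactly as in the existing-class case.
    counts_new = np.append(counts, 1.0)   # n~_{k+1} = 1
    h = np.sqrt(counts_new)
    r_t = (R_new @ counts_new) / counts_new.sum()
    T = R_new @ np.diag(h) - np.outer(r_t, h)
    return Q_new, R_new, W_new, T @ T.T
```

In a full incremental run, following the policy of Section 5.3 below, one dispatches each arriving point either to the existing-class updates (`qr_update_rank_one` plus `update_W_B_existing`) or to `update_new_class` according to its label, and finally extracts $M$ from the eigenvectors of $(\tilde{W} + \mu I)^{-1}\tilde{B}$.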
5.3 Main algorithm

With the above two incremental updating schemes, the incremental IDR/QR algorithm works as follows. For a given new instance $x$, determine whether it is from an existing class or a new class. If it is from an existing class, update the QR Decomposition of the centroid matrix $C$, as well as $W$ and $B$, by applying Algorithm 2; otherwise, update them by applying Algorithm 3. This procedure is repeated until all points have been considered. With the final updated $\tilde{W}$ and $\tilde{B}$, we compute the $\tilde{k}$ eigenvectors $\{\phi_i\}_{i=1}^{\tilde{k}}$ of $(\tilde{W} + \mu I_{\tilde{k}})^{-1} \tilde{B}$ and assign $[\phi_1, \cdots, \phi_{\tilde{k}}]$ to $M$, where $\tilde{k}$ is the updated number of classes ($\tilde{k}$ equals $k$ if $x$ is from an existing class, and $k + 1$ otherwise). The transformation is then $G = \tilde{Q} M$, where $\tilde{C} = \tilde{Q}\tilde{R}$ is the updated QR Decomposition.

The proposed incremental IDR/QR obeys the following general criteria for an incremental learning algorithm [20]: (1) it is able to learn new information from new data; (2) it does not require access to the original data; (3) it preserves previously acquired knowledge; and (4) it is able to accommodate new classes that may be introduced with new data.

6. EMPIRICAL EVALUATION

In this section, we evaluate both the batch version and the incremental version of the IDR/QR algorithm. The performance is measured mainly in terms of classification accuracy and execution time. In