chap16 Stata Programming Basics
Basic Operation of Stata and Introduction to Data Analysis (complete lecture notes)

Lecture 1: Getting Started with Stata
Zhang Wentong, Zhao Naiqing

Section 1: Overview

Stata was originally developed by the Computing Resource Center in the United States and is now a product of StataCorp; its latest release is version 7.0.
Stata is flexible, simple, and easy to learn and use. It is a statistical package with real character, it has attracted more and more attention, and together with SAS and SPSS it is now spoken of as one of the new "big three" authoritative statistical packages.
Stata's most striking feature is that it is compact yet powerful: the entire version 7.0 system takes up only about 10 MB, yet it already contains the full range of statistical analysis, data management, and graphics functions. Its statistical coverage in particular is remarkably complete and loses nothing in comparison with the SAS system, which weighs in at over 1 GB.
In addition, because Stata reads the whole dataset into memory and exchanges data with the disk only after all computation is complete, it runs extremely fast.
Because Stata's user base has always been professional statistical analysts, its mode of operation is also distinctive: even in the era when Windows swept all before it, Stata stuck to a command-line/programming interface and declined to introduce a menu-driven system.
Stata's command syntax, however, is remarkably concise and clear, and its statistical commands are organized very systematically: statistical models of the same type are grouped under a single command family, and different families can use options with the same function, which makes the language very easy for users to pick up.
More impressive still, Stata's syntax combines this brevity with a high degree of flexibility: users can bring their own ingenuity to bear, apply all sorts of techniques fluently, and truly make the software do whatever they want.
Besides the concise commands, the rest of Stata's user interface is equally lean: its data format is simple, and its output is compact and easy to read. All of this makes Stata a software package very well suited to teaching statistics.
Another of Stata's characteristics is that many of its advanced statistical modules are program files (ado-files) written in Stata's own macro language; these files can be modified, extended, and downloaded by users.
Users can visit the Stata website at any time to find and download the latest update files.
In fact, this feature keeps Stata at the leading edge of developments in statistical methods: users can almost always find a Stata implementation of the newest statistical algorithms quickly, and it also makes Stata the most frequently updated of the major statistical packages.
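In a current Stata installation, this update-and-extend workflow looks like the following minimal sketch (the -update- and -ssc- commands are standard Stata; the package name outreg2 is only an illustration and is not part of these notes):

    * check for official updates (requires internet access)
    update query

    * describe and install a community-contributed package from the SSC archive;
    * outreg2 is just an example package name
    ssc describe outreg2
    ssc install outreg2, replace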
Stata and Data Processing

Contents

Chapter 1: Stata Basics
1.1 Command syntax
1.2 Abbreviations, relational expressions, and error messages
1.3 Do-files
1.4 Scalars and matrices
1.5 Using the results of Stata commands
1.6 Macros
1.7 Loops
1.8 User-written programs
1.9 References
1.10 Exercises
Chapter 2: Data Management and Graphing
2.1 Data types and formats
2.2 Data entry
2.3 Graphing
Chapter 3: Linear Regression Basics
3.1 Data and data description
3.1.1 Describing variables
3.1.2 Simple statistics
3.1.3 Two-way tables
3.1.4 One-way tables with added statistics
3.1.5 Statistical tests
3.1.6 Graphing the data
3.2 Regression analysis
3.2.1 Correlation analysis
3.2.2 Linear regression
3.2.3 Hypothesis testing: the Wald test
3.2.4 Presenting estimation results
3.3 Prediction
3.4 Stata resources
Chapter 4: Organizing Data-Processing Work
1. Writing and running executable programs
Method 1: do-files
Method 2: the interactive -program- command
Method 3: using -program- within a do-file
Method 4: combining do-files
Method 5: ado-files
2. Organizing do-files
3. Importing data
4. The uses of _n and _N

Chapter 1: Stata Basics

Stata can be used in two ways: menu-driven and command-driven.
The menu-driven mode suits beginners and is easy to get started with, while the command-driven mode is more efficient and suits advanced users.
Since our focus here is empirical analysis, we concentrate on the command-driven mode.
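As a first taste of the command-driven mode, the following minimal session uses only standard commands and a demonstration dataset shipped with Stata:

    * load a demonstration dataset shipped with Stata
    sysuse auto, clear

    * inspect and summarize the data
    describe
    summarize price mpg weight

    * fit a simple linear regression
    regress price mpg weight

Each line typed in the Command window runs immediately; the same lines can also be saved in a do-file and run as a batch.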
[Figure 1.1: The basic interface of Stata 12.1]
For more on using Stata, consult the Stata manuals, especially [GS] Getting Started with Stata, in particular Chapter 1, "A sample session", and Chapter 2, "The Stata user interface".
One Weird Trick
Austin Nichols (@austnnchols)

How to analyze experiments
• The only way to be sure we are estimating unbiased causal impacts of a "treatment" (intervention, policy, program) is to compare means via an experiment (Freedman 2008a,b; Lin 2013).
• But we can always do better by conditioning on observable (pre-treatment) characteristics: these "covariates" can reduce MSE.
– Stratification/blocking is preferred to post hoc statistical adjustment but has its own limitations (Kallus 2018).
– How should one adjust for covariates if using a regression to analyze the experimental data? Which variables should be included?
• Use the LASSO! Specifically, poregress, dsregress, xporegress, etc., new to Stata as of Stata 16, explained in the new [LASSO] manual and in Drukker (2019).

Partialing out
• A series of seminal papers by Belloni, Chernozhukov, and many others (see references) derived partialing-out estimators that provide reliable inference for δ after one uses covariate selection to determine which of many covariates "belong" in the model for outcome Y:

Y = Aδ + Xγ + e

where A is a treatment variable of interest and X measures the (possibly very large) set of potential covariates, but many elements of γ are zero.
• Essentially, run separate LASSO regressions of Y and A on X, then regress the residualized Ỹ on the residualized Ã (where Ã = A − Â).
• The cost of using the poregress, dsregress, and xporegress methods is that they do not produce estimates of the covariate coefficients γ.

[Screenshot: Stata 16 [LASSO] manual, page 12]

Additional Stata implementations
• ssc describe lassopack and ssc describe pdslasso (Ahrens, Hansen, and Schaffer 2018), released prior to the Stata 16 implementations.
– They implement the LASSO (Tibshirani 1996) and the square-root lasso (Belloni et al. 2011, 2014).
– These estimators can be used to select controls (pdslasso) or instruments (ivlasso) from a large set of variables (possibly numbering more than the number of observations), in a setting where the researcher is interested in estimating the causal impact of one or more (possibly endogenous) variables of interest.
– Two approaches are implemented in pdslasso and ivlasso: (1) the "post-double-selection" (PDS) methodology of Belloni et al. (2012, 2013, 2014, 2015, 2016); (2) the "post-regularization" (CHS) methodology of Chernozhukov, Hansen, and Spindler (2015). For instrumental-variable estimation, ivlasso implements weak-identification-robust hypothesis tests and confidence sets using the Chernozhukov et al. (2013) sup-score test.

Regression for experiments
• Note that in the model for outcome Y above, Y = Aδ + Xγ + e, we should never care about the "effect" of any element of X conditional on A and the other elements of X; that is, we should not care one whit about estimates of γ.
• In expectation, A and X are uncorrelated; we just want a data-driven way to eliminate chance correlation between X and A, for any X that also has effects on Y, in order to reduce the variance of our estimates of δ.
• These and other points arose in email correspondence in 2016–2017 with David Judkins, who has used the LASSO in subsequent studies (Judkins 2019).
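A minimal sketch of how these commands are invoked (the variable names y, treat, and x1-x100 are illustrative placeholders, not from the slides):

    * partialing-out lasso linear regression (plug-in lambda by default)
    poregress y treat, controls(x1-x100)

    * double-selection lasso
    dsregress y treat, controls(x1-x100)

    * cross-fit partialing-out, here with cross-validated lambda instead
    xporegress y treat, controls(x1-x100) selection(cv)

All three report an estimate and standard error only for the coefficient on treat; the selected covariates enter the fit, but their coefficients are not reported.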
Okay, LASSO, but what kind?
• Chetverikov, Liao, and Chernozhukov (2019) show that "the cross-validated LASSO estimator achieves the fastest possible rate of convergence in the prediction norm up to a small logarithmic factor".
• Drukker (2019) suggests the plug-in estimator has better small-sample performance in simulations (not reported).
• A bootstrap could give out-of-sample performance measures akin to random-forest regressions.

Simulations
• Suppose we have hundreds of candidate regressors, all distributed lognormal and all uncorrelated with each other.
• A few (every 20th) are correlated with Y.
• How big an improvement might we expect with xporegress, the cross-fit partialing-out lasso linear regression, with the plug-in optimal lambda?

[Figure: Typical simulation results from 10,000 iterations with N = 100. Regressions use all available controls, from zero to 80+; horizontal lines show the performance of xporegress with the CV or plug-in selection options.]
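A sketch of a simulation in this spirit follows; the slides do not fully specify the data-generating process, so the details below (200 lognormal candidate regressors, every 20th of the first hundred entering Y, N = 100, a single simulated draw rather than 10,000 iterations) are assumptions:

    * simulate N = 100 observations and 200 candidate regressors
    clear
    set seed 12345
    set obs 100
    forvalues j = 1/200 {
        generate x`j' = exp(rnormal())   // lognormal, mutually uncorrelated
    }
    generate treat = runiform() > 0.5    // randomized binary treatment
    * every 20th regressor among the first hundred affects y
    generate y = treat + x20 + x40 + x60 + x80 + x100 + rnormal()

    * cross-fit partialing-out estimate of the effect of treat
    xporegress y treat, controls(x1-x200)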
Conclusions
• As we add useless regressors, MSE increases, and the occasional useful regressor does not (necessarily) make up for that; xporegress does better in every realistic case examined.
• Alternatives, such as those in Judkins (2019), can introduce bias or size errors (rejection rates deviating from nominal size), but xporegress is safe on both fronts.

Credit (blame) for the title goes to Tim.

References
Ahrens, A., C. Hansen, and M. E. Schaffer. 2018. pdslasso and ivlasso: Programs for post-selection and post-regularization OLS or IV estimation and inference. /c/boc/bocode/s458459.html
Belloni, A., D. Chen, V. Chernozhukov, and C. Hansen. 2012. Sparse models and methods for optimal instruments with an application to eminent domain. Econometrica 80: 2369–2429.
Belloni, A., and V. Chernozhukov. 2013. Least squares after model selection in high-dimensional sparse models. Bernoulli 19: 521–547.
Belloni, A., V. Chernozhukov, and C. Hansen. 2013. Inference for high-dimensional sparse econometric models. In Advances in Economics and Econometrics: 10th World Congress, Vol. 3: Econometrics, 245–295. Cambridge: Cambridge University Press.
Belloni, A., V. Chernozhukov, and C. Hansen. 2014a. High-dimensional methods and inference on structural and treatment effects. Journal of Economic Perspectives 28(2): 29–50.
Belloni, A., V. Chernozhukov, and C. Hansen. 2014b. Inference on treatment effects after selection among high-dimensional controls. The Review of Economic Studies 81(2): 608–650.
Belloni, A., V. Chernozhukov, C. Hansen, and D. Kozbur. 2016. Inference in high-dimensional panel models with an application to gun control. Journal of Business and Economic Statistics 34(4): 590–605.
Belloni, A., V. Chernozhukov, and L. Wang. 2011. Square-root lasso: Pivotal recovery of sparse signals via conic programming. Biometrika 98: 791–806.
Belloni, A., V. Chernozhukov, and L. Wang. 2014. Pivotal estimation via square-root lasso in nonparametric regression. Annals of Statistics 42(2): 757–788.
Belloni, A., V. Chernozhukov, and Y. Wei. 2016. Post-selection inference for generalized linear models with many controls. Journal of Business & Economic Statistics 34: 606–619.
Bühlmann, P., and S. van de Geer. 2011. Statistics for High-Dimensional Data: Methods, Theory and Applications. Berlin: Springer.
Chernozhukov, V., D. Chetverikov, M. Demirer, E. Duflo, C. Hansen, W. Newey, and J. Robins. 2018. Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal 21(1): C1–C68.
Chernozhukov, V., D. Chetverikov, and K. Kato. 2013. Gaussian approximations and multiplier bootstrap for maxima of sums of high-dimensional random vectors. Annals of Statistics 41(6): 2786–2819.
Chernozhukov, V., C. Hansen, and M. Spindler. 2015. Post-selection and post-regularization inference in linear models with many controls and instruments. American Economic Review: Papers & Proceedings 105(5): 486–490.
Chetverikov, D., Z. Liao, and V. Chernozhukov. 2019. On cross-validated lasso. arXiv Working Paper No. arXiv:1605.02214. /abs/1605.02214
Drukker, D. 2019. Using the lasso in Stata for inference in high-dimensional models. Presentation at the London Stata Conference, 5–6 September 2019.
Freedman, D. A. 2008a. On regression adjustments to experimental data. Advances in Applied Mathematics 40: 180–193.
Freedman, D. A. 2008b. On regression adjustments in experiments with several treatments. Annals of Applied Statistics 2: 176–196.
Hastie, T., R. Tibshirani, and J. Friedman. 2009. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. New York: Springer.
Hastie, T., R. Tibshirani, and M. Wainwright. 2015. Statistical Learning with Sparsity: The Lasso and Generalizations. Boca Raton, FL: CRC Press.
Judkins, D. 2019. Covariate selection in small randomized studies. https:///meetings/jsm/2019/onlineprogram/AbstractDetails.cfm?abstractid=307372
Kallus, N. 2018. Optimal a priori balance in the design of controlled experiments. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 80(1): 85–112.
Lin, W. 2013. Agnostic notes on regression adjustments to experimental data: Reexamining Freedman's critique. Annals of Applied Statistics 7(1): 295–318.
Spindler, M., V. Chernozhukov, and C. Hansen. 2016. High-dimensional metrics. https:///package=hdm
Tibshirani, R. 1996. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B 58: 267–288.
Yamada, H. 2017. The Frisch-Waugh-Lovell theorem for the lasso and the ridge regression. Communications in Statistics - Theory and Methods 46(21): 10897–10902.
Zou, H. 2006. The adaptive lasso and its oracle properties. Journal of the American Statistical Association 101: 1418–1429.
Zou, H., and T. Hastie. 2005. Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society, Series B 67: 301–320.
An Introductory Stata Tutorial
Angran Li
Department of Sociology, Zhejiang University
Email: ********************
Version: 2020/02/05

1. Introduction

This tutorial is a quick introduction to some basic operating techniques and features of Stata (version 16).
For a detailed introduction to Stata, readers can consult the official English-language Stata manuals and the learning materials referenced in this tutorial.
Like most other statistical packages, Stata can be operated both through its pull-down menus and through command syntax.
Beginners can use the menu options to become familiar with Stata step by step, but command syntax is the best choice for Stata users.
This tutorial therefore focuses on the use of command syntax.
Chinese-speaking users can, after opening Stata, set Chinese as the default language through the user-interface language option in the pull-down menus.
Alternatively, type set locale_ui zh_CN in the Command window to switch the display to Chinese.
After choosing the language, remember to restart Stata.
Note that although Stata's user interface can be displayed in Chinese, the statistical output will still appear in English.
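For example, the language can be set from the Command window as follows (zh_CN is the locale code given above; en_US, shown for switching back, is an assumed locale code):

    * switch the user interface to Simplified Chinese,
    * then restart Stata for the change to take full effect
    set locale_ui zh_CN

    * switch back to English (locale code assumed)
    set locale_ui en_US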
The example data used in this tutorial come from the China Family Panel Studies (CFPS).
Specifically, they are the data used in my article "Unfulfilled Promise of Educational Meritocracy? Academic Ability and China's Urban-Rural Gap in Access to Higher Education", published in Chinese Sociological Review in 2019.
For specific questions about the data, please contact me.
The tutorial also provides the corresponding do-file and data files for download, and with the do-file you can reproduce everything in this tutorial.
The download address is my personal website: https://angranli.me/teaching/
A friendly reminder: answers to most questions about operating Stata can be found in the official manuals.
Also, typing help [command] in Stata brings up detailed information on how to use any command.
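Putting these pieces together, a do-file for working through the tutorial might look like the following sketch (the log and data file names are hypothetical placeholders for the files provided with the tutorial):

    * minimal do-file sketch; file names are hypothetical
    capture log close
    log using tutorial01.log, replace

    use cfps_sample.dta, clear    // stand-in name for the tutorial dataset
    describe
    summarize

    * open the help file for any command, for example
    help summarize

    log close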