phmm - Poisson GLM, Cox PH, & degrees of freedom
- 格式:pdf
- 大小:98.71 KB
- 文档页数:8
Poisson 自回归模型的平稳性检验赵志文;毕利;张美丽【期刊名称】《吉林大学学报(理学版)》【年(卷),期】2015(53)2【摘要】利用经验似然方法研究 Poisson 自回归模型的平稳性检验问题,通过构造相关检验的经验似然比检验统计量,在零假设下获得了检验统计量的极限分布,并通过 Monte Carlo 数值模拟考察该检验方法的有限样本特征。
%We studied the stationarity test problem for Poisson autoregressive model using the empirical likelihood method.The empirical likelihood ratio statistic was established and its limiting distribution was obtained under null hypothesis.The finite sample property was also examined through Monte Carlo simulations.【总页数】6页(P235-240)【作者】赵志文;毕利;张美丽【作者单位】吉林师范大学数学学院,吉林四平 136000;吉林师范大学数学学院,吉林四平 136000;海军大连舰艇学院基础部,辽宁大连 116000【正文语种】中文【中图分类】O212.1【相关文献】1.一种基于Poisson过程的双险种Poisson-Geometric过程模型研究 [J], 成军祥;杨婷婷2.非齐次POISSON过程转换为POISSON过程的方法 [J], 刘宝慧3.门限 Poisson 对数线性自回归模型的参数估计 [J], 张旭利;杨凯;王德辉4.多项式Poisson代数上的有限维单Poisson模 [J], 陈炜红; 吕家凤5.Poisson (Co)homology of a Class of Frobenius Poisson Algbras [J], Mengyao Wang因版权原因,仅展示原文概要,查看原文内容请购买。
传染病链二项分布资料的Poisson回归模型
叶小华;荀鹏程;于浩;陈峰
【期刊名称】《中国卫生统计》
【年(卷),期】2005(022)006
【摘要】目的本文旨在介绍Poisson回归模型在具有链结构的传染病资料分析中的应用.方法借助极大似然法,对五口之家的普通感冒资料分别拟合五种Poisson回归模型,其中"代数"以亚变量的形式进入模型,而"已感染者数"以连续性指示变量进入模型.结果本文介绍的模型以Greenwood和Reed-Frost链二项分布模型为特例,Poisson回归模型与相应的链二项分布模型分析结果往往非常近似.结论用Poisson回归模型处理和分析传染病链二项分布资料更简便易行,必要时可同时分析协变量的作用.
【总页数】3页(P377-379)
【作者】叶小华;荀鹏程;于浩;陈峰
【作者单位】南京医科大学流行病与卫生统计学系,210029;南京医科大学流行病与卫生统计学系,210029;南京医科大学流行病与卫生统计学系,210029;南京医科大学流行病与卫生统计学系,210029
【正文语种】中文
【中图分类】R1
【相关文献】
1.链二项分布模型在传染病资料分析中的应用 [J], 荀鹏程;顾海雁;陈峰
2.二项分布、Poisson分布与指数分布之间关系的探讨 [J], 王丽芳
3.Poisson回归模型和负二项回归模型在林火预测领域的应用 [J], 孙龙;尚喆超;胡海清
4.Poisson回归模型估计启东肝病高危人群14年随访资料的PAR [J], 恽振先
5.Poisson和Cox回归模型在队列随访资料分析中的应用和对比 [J], 项永兵;高玉堂;金凡;杨工;邓春勤
因版权原因,仅展示原文概要,查看原文内容请购买。
Poisson回归中过度离散的检验方法*曾平1△ 赵晋芳2 刘桂芬2【摘要】@@ 在数理统计中,Poisson分布有着悠久的历史,最早可追溯到1838年.对当时广泛研究的二项分布,在事件的发生概率p很小、试验次数n很大的情况下,法国数学家Poisson(1)推导出了二项分布的极限分布,为了纪念他而称为Poisson分布.其早期一个著名的应用例子是Bortkiewicz(1898)观察到普鲁士的骑兵部队中每年被马踢死的士兵数服从Poisson分布(2).【期刊名称】中国卫生统计【年(卷),期】2011(028)002【总页数】2在数理统计中,Poisson分布有着悠久的历史,最早可追溯到1838年。
对当时广泛研究的二项分布,在事件的发生概率p很小、试验次数n很大的情况下,法国数学家Poisson〔1〕推导出了二项分布的极限分布,为了纪念他而称为Poisson分布。
其早期一个著名的应用例子是Bortkiewicz(1898)观察到普鲁士的骑兵部队中每年被马踢死的士兵数服从Poisson分布〔2〕。
Poisson回归也被Ernst(1863)用来计算血红细胞的数目〔3〕,此后Poisson回归在农业、生物医学和人口学等方面得到广泛应用,已经成为计数资料的基本统计模型。
Poisson回归在应用中需要满足一个十分重要的假设:事件的条件均值等于条件方差,称为等离散(equal-dispersion)。
然而计数资料常表现为事件的方差大于均值,从而使得计数数据存现出比Poisson分布下名义方差更大的变异。
事件的条件方差超过条件均数称为过度离散(over-dispersion)。
对过度离散的计数资料,Poisson回归常常低估参数估计值的标准误,导致出现较大的统计量,从而增大Ⅰ类错误,夸大解释变量效应。
因此对计数资料过度离散的识别和检验就具有重要的意义,这是正确应用Poisson回归的前提之一。
过度离散检验过度离散检验(overdispersion test)有基于残差和样本均数方差等多种不同的检验方法,本文主要介绍其中几种方法。
Package‘DynNom’October12,2022Type PackageTitle Visualising Statistical Models using Dynamic NomogramsVersion5.0.2Author Amirhossein Jalali,Davood Roshan,Alberto Alvarez-Iglesias,John NewellMaintainer Amirhossein Jalali<*****************>Description Demonstrate the results of a statistical model object as a dynamic nomogram in an RStu-dio panel or web browser.The package provides two generics functions:DynNom,which dis-play statistical model objects as a dynamic nomogram;DNbuilder,which builds re-quired scripts to publish a dynamic nomo-gram on a web server such as the<https://www.shinyapps.io/>.Current version of'Dyn-Nom'supports stats::lm,stats::glm,sur-vival::coxph,rms::ols,rms::Glm,rms::lrm,rms::cph,mgcv::gam and gam::gam model objects. License GPL-2Depends magrittrImports survival(>=2.38-3),shiny,ggplot2(>2.1.0),plotly,stargazer,prediction,rms,dplyr,compare,BBmiscSuggests gam,mgcvNeedsCompilation noRepository CRANDate/Publication2022-09-0910:53:07UTCR topics documented:DNbuilder (2)DynNom (4)getclass.DN (7)getdata.DN (8)getpred.DN (9)Index101DNbuilder Publishing a dynamic nomogramDescriptionDNbuilder is a generic function which builds required scripts to publish a dynamic nomogram ona web server such as the https://www.shinyapps.io/.This application can then access througha URL and be used independent of R.DNbuilder supports a large number of model objects from avariety of packages.UsageDNbuilder(model,data=NULL,clevel=0.95,m.summary=c("raw","formatted"), covariate=c("slider","numeric"),ptype=c("st","1-st"),DNtitle=NULL,DNxlab=NULL,DNylab=NULL,DNlimits=NULL,KMtitle=NULL,KMxlab=NULL,KMylab=NULL)DNbuilder.core(model,data,clevel,m.summary,covariate,DNtitle,DNxlab,DNylab,DNlimits)DNbuilder.surv(model,data,clevel,m.summary,covariate,ptype,DNtitle,DNxlab,DNylab,KMtitle,KMxlab,KMylab)Argumentsmodel an lm,glm,coxph,ols,Glm,lrm,cph,mgcv::gam or gam::gam model objects.data a dataframe of the accompanying dataset for the model(if required).clevel a confidence level for constructing the confidence interval.If not specified,a 95%level will be used.m.summary an option to choose the type of the model output represented in the’Summary Model’tab."raw"(the default)returns an unformatted summary of the model;"formatted"returns a formatted table of the model summary using stargazerpackage.covariate an option to choose the type of input control widgets used for numeric values."slider"(the default)picks out sliderInput;"numeric"picks out numericInput.ptype an option for coxph or cph model objects to choose the type of plot which dis-plays in"Survival plot"tab."st"(the default)returns plot of estimated survivorprobability(S(t))."1-st"returns plot of estimated failure probability(1-S(t)).DNtitle a character vector used as the app’s title.If not specified,"Dynamic Nomogram"will be used.DNxlab a character vector used as the title for the x-axis in"Graphical Summary"tab.If not specified,"Probability"will be used for logistic model and Cox proportionalmodel objects;or"Response variable"for other model objects.DNylab a character vector used as the title for the y-axis in"Graphical Summary"tab (default is NULL).DNlimits a vector of2numeric values used to set x-axis limits in"Graphical Summary"tab.Note:This also removes the’Set x-axis ranges’widget in the sidebar panel.KMtitle a character vector used as KM plot’s title in"Survival plot"tab.If not specified, "Estimated Survival Probability"for ptype="st"and"Estimated Probability"for ptype="1-st"will be used.KMxlab a character vector used as the title for the x-axis in"Survival plot"tab.If not specified,"Follow Up Time"will be used.KMylab a character vector used as the title for the y-axis in"Survival plot"tab.If not specified,"S(t)"for ptype="st"and"F(t)"for ptype="1-st"will be used.ValueA new folder called’DynNomapp’will be created in the current working directory which containsall the required scripts to deploy this dynamic nomogram on a host server such as the https: //www.shinyapps.io/.This folder includes ui.R,server.R,global.R and data.RData which needs to publish the app.A user guide textfile(README.txt)will be also added to explain how to deploy the app using thesefiles.Please cite as:Jalali,A.,Roshan,D.,Alvarez-Iglesias,A.,Newell,J.(2019).Visualising statistical models using dynamic nomograms.R package version5.0.Author(s)Amirhossein Jalali,Davood Roshan,Alberto Alvarez-Iglesias,John NewellMaintainer:Amirhossein Jalali<**********************>ReferencesBanks,J.2006.Nomograms.Encyclopedia of Statistical Sciences.8.Easy web applications in R.https:///Frank E Harrell Jr(2017).rms:Regression Modeling Strategies.R package version4.5-0.https: ///package=rms/See AlsoDynNom,getpred.DNExamples##Not run:#Simple linear regression modelsfit1<-lm(uptake~Plant+conc+Plant*conc,data=CO2)DNbuilder(fit1)t.data<-datadist(swiss)options(datadist= t.data )ols(Fertility~Agriculture+Education+rcs(Catholic,4),data=swiss)%>%DNbuilder(clevel=0.9,m.summary="formatted")#Generalized regression modelsfit2<-glm(Survived~Age+Class+Sex,data=as.data.frame(Titanic),weights=Freq,binomial("probit"))DNbuilder(fit2,DNtitle="Titanic",DNxlab="Probability of survival")counts<-c(18,17,15,20,10,20,25,13,12)outcome<-gl(3,1,9)treatment<-gl(3,3)d<-datadist(treatment,outcome)options(datadist="d")Glm((2*counts)~outcome+treatment,family=poisson(),data=data.frame(counts,outcome,treatment))%>%DNbuilder()#Proportional hazard modelscoxph(Surv(time,status)~age+strata(sex)+ph.ecog,data=lung)%>%DNbuilder()data.kidney<-kidneydata.kidney$sex<-as.factor(data.kidney$sex)levels(data.kidney$sex)<-c("male","female")coxph(Surv(time,status)~age+strata(sex)+disease,data.kidney)%>%DNbuilder(ptype="1-st")d<-datadist(veteran)options(datadist="d")fit3<-cph((Surv(time/30,status))~rcs(age,4)*strat(trt)+diagtime+strat(prior)+lsp(karno,60),veteran)DNbuilder(fit3,DNxlab="Survival probability",KMtitle="Kaplan-Meier plot",KMxlab="Time(Days)",KMylab="Survival probability")#Generalized additive modelsmgcv::gam(Fertility~s(Agriculture)+Education+s(Catholic),data=swiss)%>%DNbuilder(DNlimits=c(0,110),m.summary="formatted")fit4<-gam::gam(Fertility~Education+Catholic+s(Agriculture),fit=FALSE,data=swiss) DNbuilder(fit4)##End(Not run)if(interactive()){data(rock)lm(area~I(log(peri)),data=rock)%>%DNbuilder()}DynNom Dynamic nomogram to visualise statistical modelsDescriptionDynNom is a generic function to display the results of statistical model objects as a dynamic nomo-gram in an’RStudio’panel or web browser.DynNom supports a large number of model objects froma variety of packages.UsageDynNom(model,data=NULL,clevel=0.95,m.summary=c("raw","formatted"),covariate=c("slider","numeric"),ptype=c("st","1-st"),DNtitle=NULL,DNxlab=NULL,DNylab=NULL,DNlimits=NULL,KMtitle=NULL,KMxlab=NULL,KMylab=NULL)DynNom.core(model,data,clevel,m.summary,covariate,DNtitle,DNxlab,DNylab,DNlimits) DynNom.surv(model,data,clevel,m.summary,covariate,ptype,DNtitle,DNxlab,DNylab,KMtitle,KMxlab,KMylab)Argumentsmodel an lm,glm,coxph,ols,Glm,lrm,cph,mgcv::gam or gam::gam model objects.data a dataframe of the accompanying dataset for the model(if required).clevel a confidence level for constructing the confidence interval.If not specified,a95%level will be used.m.summary an option to choose the type of the model output represented in the’SummaryModel’tab."raw"(the default)returns an unformatted summary of the model;"formatted"returns a formatted table of the model summary using stargazerpackage.covariate an option to choose the type of input control widgets used for numeric values."slider"(the default)picks out sliderInput;"numeric"picks out numericInput.ptype an option for coxph or cph model objects to choose the type of plot which dis-plays in"Survival plot"tab."st"(the default)returns plot of estimated survivorprobability(S(t))."1-st"returns plot of estimated failure probability(1-S(t)).DNtitle a character vector used as the app’s title.If not specified,"Dynamic Nomogram"will be used.DNxlab a character vector used as the title for the x-axis in"Graphical Summary"tab.Ifnot specified,"Probability"will be used for logistic model and Cox proportionalmodel objects;or"Response variable"for other model objects.DNylab a character vector used as the title for the y-axis in"Graphical Summary"tab(default is NULL).DNlimits a vector of2numeric values used to set x-axis limits in"Graphical Summary"tab.Note:This also removes the’Set x-axis ranges’widget in the sidebar panel.KMtitle a character vector used as KM plot’s title in"Survival plot"tab.If not specified,"Estimated Survival Probability"for ptype="st"and"Estimated Probability"for ptype="1-st"will be used.KMxlab a character vector used as the title for the x-axis in"Survival plot"tab.If not specified,"Follow Up Time"will be used.KMylab a character vector used as the title for the y-axis in"Survival plot"tab.If not specified,"S(t)"for ptype="st"and"F(t)"for ptype="1-st"will be used.ValueA dynamic nomogram in a shiny application providing individual predictions which can be used asa model visualisation or decision-making tools.The individual predictions with a relative confidence interval are calculated using the predict function,displaying either graphically as an interactive plot in the Graphical Summary tab or a table in the Numerical Summary tab.A table of model output is also available in the Model Summary tab.In the case of the Cox proportional hazards model,an estimated survivor/failure function will be additionally displayed in a new tab.Please cite as:Jalali,A.,Roshan,D.,Alvarez-Iglesias,A.,Newell,J.(2019).Visualising statistical models using dynamic nomograms.R package version5.0.Author(s)Amirhossein Jalali,Davood Roshan,Alberto Alvarez-Iglesias,John NewellMaintainer:Amirhossein Jalali<**********************>ReferencesBanks,J.2006.Nomograms.Encyclopedia of Statistical Sciences.8.Easy web applications in R.https:///Frank E Harrell Jr(2017).rms:Regression Modeling Strategies.R package version4.5-0.https: ///package=rms/See AlsoDNbuilder,getpred.DNExamples##Not run:#Simple linear regression modelsfit1<-lm(uptake~Plant+conc+Plant*conc,data=CO2)DynNom(fit1)t.data<-datadist(swiss)options(datadist= t.data )ols(Fertility~Agriculture+Education+rcs(Catholic,4),data=swiss)%>%DynNom(clevel=0.9,m.summary="formatted")#Generalized regression modelsfit2<-glm(Survived~Age+Class+Sex,getclass.DN7 data=as.data.frame(Titanic),weights=Freq,family=binomial("probit"))DynNom(fit2,DNtitle="Titanic",DNxlab="Probability of survival")counts<-c(18,17,15,20,10,20,25,13,12)outcome<-gl(3,1,9)treatment<-gl(3,3)d<-datadist(treatment,outcome)options(datadist="d")Glm((2*counts)~outcome+treatment,family=poisson(),data=data.frame(counts,outcome,treatment))%>%DynNom()#Proportional hazard modelscoxph(Surv(time,status)~age+strata(sex)+ph.ecog,data=lung)%>%DynNom()data.kidney<-kidneydata.kidney$sex<-as.factor(data.kidney$sex)levels(data.kidney$sex)<-c("male","female")coxph(Surv(time,status)~age+strata(sex)+disease,data.kidney)%>%DynNom(ptype="1-st")d<-datadist(veteran)options(datadist="d")fit3<-cph((Surv(time/30,status))~rcs(age,4)*strat(trt)+diagtime+strat(prior)+lsp(karno,60),veteran)DynNom(fit3,DNxlab="Survival probability",KMtitle="Kaplan-Meier plot",KMxlab="Time(Days)",KMylab="Survival probability") #Generalized additive modelsmgcv::gam(Fertility~s(Agriculture)+Education+s(Catholic),data=swiss)%>%DynNom(DNlimits=c(0,110),m.summary="formatted")fit4<-gam::gam(Fertility~Education+Catholic+s(Agriculture),fit=FALSE,data=swiss) DynNom(fit4)##End(Not run)if(interactive()){data(rock)lm(area~I(log(peri)),data=rock)%>%DynNom()}getclass.DN Extract class and family of a model objectDescriptiongetclass.DN extracts class and family of a model object(supported in DynNom).8getdata.DNUsagegetclass.DN(model)Argumentsmodel an lm,glm,coxph,ols,Glm,lrm,cph,mgcv::gam or gam::gam model objects. ValueA list including the model class and the family name of the model(if relevant).See AlsoDynNom,DNbuilderExamplesfit1<-glm(Survived~Age+Class+Sex,data=as.data.frame(Titanic),weights=Freq,family=binomial("probit"))getclass.DN(fit1)library(survival)fit2<-coxph(Surv(time,status)~age+strata(sex)+ph.ecog,data=lung)getclass.DN(fit2)getdata.DN Extract dataset from a model objectDescriptiongetdata.DN extracts dataset that was used to produce the model object(supported in DynNom). Usagegetdata.DN(model)Argumentsmodel an lm,glm,coxph,ols,Glm,lrm,cph,mgcv::gam or gam::gam model objects. ValueA data.frame containing the dataset used in thefitted model object.See AlsoDynNom,DNbuildergetpred.DN9Examplesfit1<-glm(Survived~Age+Class+Sex,data=as.data.frame(Titanic),weights=Freq,family=binomial("probit"))getdata.DN(fit1)library(survival)fit2<-coxph(Surv(time,status)~age+strata(sex)+ph.ecog,data=lung)getdata.DN(fit2)getpred.DN Extract predictions from a Model ObjectDescriptiongetpred.DN extracts class,family and inverse of link function from a model object(supported in DynNom).Usagegetpred.DN(model,newd,set.rms=F)Argumentsmodel an lm,glm,coxph,ols,Glm,lrm,cph,mgcv::gam or gam::gam model objects.newd a data frame of predictors for predictionset.rms a logical value indicating if data should be updated in the model object(required for rms model objects in DNbuilder).ValueA list including the prediction(pred)and the standard error of prediction(SEpred).See AlsoDynNom,DNbuilderExamplesfit1<-glm(Survived~Age+Class+Sex,data=as.data.frame(Titanic),weights=Freq,family=binomial("probit"))getpred.DN(fit1,newd=data.frame(Class="1st",Sex="Male",Age="Child"))Index∗dynamic nomogramDNbuilder,2DynNom,4∗model visualisationDNbuilder,2DynNom,4∗shinyDNbuilder,2DynNom,4DNbuilder,2,6,8,9DynNom,3,4,8,9getclass.DN,7getdata.DN,8getpred.DN,3,6,910。
咨询QQ:3025393450有问题百度搜索“”就可以了欢迎登陆官网:/datablogR语言进行数值模拟:模拟泊松回归模型数据分析报告来源:大数据部落模拟回归模型的数据验证回归模型的首选方法是模拟来自它们的数据,并查看模拟数据是否捕获原始数据的相关特征。
感兴趣的基本特征是平均值。
我喜欢这种方法,因为它可以扩展到广义线性模型(logistic,Poisson,gamma,...)和其他回归模型,比如t -regression。
您的标准回归模型假设存在将预测变量与结果相关联的真实/固定参数。
但是,当我们执行回归时,我们只估计这些参数。
因此,回归软件返回表示系数不确定性的标准误差。
我将用一个例子来证明我的意思。
示范我将使用泊松回归来证明这一点。
我模拟了两个预测变量,使用50的小样本。
n <- 50set.seed(18050518) xc的系数为0.5 ,xb的系数为1 。
我对预测进行取幂,并使用该rpois()函数生成泊松分布结果。
# Exponentiate prediction and pass to rpois()咨询QQ:3025393450有问题百度搜索“”就可以了欢迎登陆官网:/datablogsummary(dat)xc xb yMin. :-2.903259Min. :0.00Min. :0.001st Qu.:-0.6487421st Qu.:0.001st Qu.:1.00Median :-0.011887 Median :0.00Median :2.00Mean : 0.006109 Mean :0.38Mean :2.023rd Qu.: 0.8085873rd Qu.:1.003rd Qu.:3.00Max. : 2.513353Max. :1.00Max. :7.00接下来是运行模型。
Call:glm(formula = y ~ xc + xb, family = poisson, data = dat)Deviance Residuals:Min 1Q Median 3Q Max-1.9065-0.9850-0.13550.5616 2.4264Coefficients:Estimate Std. Error z value Pr(>|z|)(Intercept) 0.208390.15826 1.3170.188xc 0.461660.09284 4.9736.61e-07 ***xb 0.809540.20045 4.0395.38e-05 ***---Signif. codes:0‘***’ 0.001‘**’ 0.01‘*’ 0.05‘.’ 0.1‘ ’ 1 (Dispersion parameter for poisson family taken to be 1)Null deviance:91.087on 49degrees of freedomResidual deviance:52.552on 47degrees of freedom AIC:161.84咨询QQ:3025393450有问题百度搜索“”就可以了欢迎登陆官网:/datablogNumber of Fisher Scoring iterations:5估计的系数与人口模型相距不太远,.21代表截距而不是0,.46而不是.5,而0.81而不是1。
cox模型的偏似然函数r 解释说明1. 引言1.1 概述引言部分将着重介绍本篇文章的主题,即偏似然函数R对Cox模型的解释说明。
我们将会探讨Cox模型在生存分析中的应用,并通过具体案例分析来演示其参数估计和解释性分析。
此外,我们还将对偏似然函数概念进行简要说明,并介绍如何利用R语言进行偏似然函数的应用。
1.2 文章结构本文主要分为五个章节来逐步展开我们关于Cox模型和偏似然函数R的讨论。
首先,引言部分将提供一个概述并介绍文章结构。
在第二章中,我们将对Cox 模型进行简介,并解释偏似然函数的概念。
第三章将以实际数据为基础,探索Cox模型在生存分析中的应用,并涵盖数据准备、描述性统计分析、参数估计与解释性分析以及结果讨论与推断性分析。
在第四章中,我们将总结研究的主要发现并提出局限性,并建议进一步研究方向。
最后一个章节为参考文献部分。
1.3 目的本篇文章旨在深入探讨和解释Cox模型的偏似然函数R在生存分析中的应用。
通过详细讨论相关概念,并结合实际案例进行分析,我们希望读者能够更好地理解和应用偏似然函数R以推断影响生存时间的因素。
此外,我们还将探讨Cox 模型在现实场景中的局限性,并提出一些可能的未来研究方向。
通过本篇文章,读者将能够对Cox模型和偏似然函数R有一个全面而深入的了解,并将其应用于相关领域的研究和实践中。
以上为“1. 引言”部分内容。
2. Cox模型的偏似然函数R解释说明2.1 Cox模型简介Cox模型是一种常用的生存分析方法,用于研究事件发生时间和影响因素之间的关系。
它基于假设比例风险(Proportional Hazards)假设,即各个时间点上不同个体相对风险的比例保持不变。
Cox模型能够同时考虑多种协变量对风险的影响,并得出协变量的影响程度。
2.2 偏似然函数的概念在Cox模型中,为了估计协变量对风险比例的影响,需要使用偏似然函数进行参数估计。
偏似然函数衡量了给定观测数据下参数估计值与真实参数之间的差异。
1.根据以下表数据求生成函数在204=t 的Kaplan-Meier 估计并用R 绘出生存函数曲线。
其中变量),min(i i i C T X =,i T 是第一次检查的时间,i C 是审查时间,示性函数i δ表示为:当第i 个观察值缺失时为0;否则为1.分析:根据以上数据来求生成函数在204=t 的Kaplan-Meier 估计并用R 绘出生存函数曲线。
步骤:首先对观察死亡的时间进行排序:)()1(...k t t <<,i d 为在时刻)(i t 死亡的个体数;i n 为在时刻)(i t 面临危险的个体数,则在)(i t 时刻危险函数的估计为:ii i n d /ˆ=λ,从而得到生存函数在t 时刻的Kaplan-Meier 估计为: ∏∏≤≤-=-=tttt iii i i n d t S )()()1()ˆ1()(ˆλ 结论:经R 语言计算得到:生成函数在204=t 的Kaplan-Meier 估计为:0.1702381。
R 语言程序:cancer<-data.frame(times=c(214,184,150,70,16,141,210,132,30,204,84,36,38,69),stat us=c(1,1,1,1,1,1,0,1,0,1,1,1,0,1)); with(cancer,Surv(times,status))summary(survfit(Surv(times,status),data=cancer))fit1<-survfit(Surv(times, status),data=cancer, conf.type="none")plot(fit1, main="KM forCancer", xlab="Time(days)", ylab="survival rate") Call: survfit(formula = Surv(times, status), data = cancer)time n.risk n.event survival std.err lower 95% CI upper 95% CI16 14 1 0.929 0.0688 0.8030 1.000 36 12 1 0.851 0.0973 0.6803 1.000 69 10 1 0.766 0.1191 0.5648 1.000 70 9 1 0.681 0.1329 0.4646 0.998 84 8 1 0.596 0.1409 0.3748 0.947 132 7 1 0.511 0.1442 0.2936 0.888 141 6 1 0.426 0.1431 0.2202 0.823 150 5 1 0.340 0.1375 0.1543 0.751 184 4 1 0.255 0.1268 0.0965 0.676 204 3 1 0.170 0.1094 0.0483 0.600 214 1 1 0.000 NA NA NA(1-1/14)*(1-1/12)*(1-1/10)*(1-1/9)*(1-1/8)*(1-1/7)*(1-1/6)*(1-1/5)*(1-1/4)*(1-1/3) [1] 0.17023810501001502000.00.20.40.60.81.0KM forCancerTime(days)s u r v i v a l r a te2.根据一下数据求生存函数在1408=t 的Kaplan-Meier 估计。
R语言中的泊松分布z检验1. 泊松分布简介泊松分布是一种离散型概率分布,常用于描述单位时间或单位面积内随机事件发生的次数。
泊松分布的概率质量函数(PMF)可以表示为:其中,λ是事件发生的平均次数。
2. 泊松分布z检验的背景和目的泊松分布z检验用于判断观测数据是否符合泊松分布的假设。
它的背景和目的如下:•背景:假设我们有一组观测数据,我们希望知道这组数据是否符合泊松分布的假设。
•目的:通过进行泊松分布z检验,我们可以判断观测数据是否与泊松分布假设一致,从而对数据的统计特征进行评估和推断。
3. 泊松分布z检验的步骤泊松分布z检验的步骤如下:步骤1:假设检验的设定•零假设(H0):观测数据符合泊松分布。
•备择假设(H1):观测数据不符合泊松分布。
步骤2:计算观测数据的样本均值和样本方差•样本均值(X̄):观测数据的平均值。
•样本方差(S^2):观测数据的方差。
步骤3:计算泊松分布的理论均值和理论方差•泊松分布的理论均值(μ):根据观测数据的样本均值计算,即μ = X̄。
•泊松分布的理论方差(σ2):根据观测数据的样本均值计算,即σ2 = X̄。
步骤4:计算标准化统计量•标准化统计量(z):根据观测数据的样本均值、样本方差和泊松分布的理论均值、理论方差计算,即z = (X̄ - μ) / sqrt(σ^2 / n)。
步骤5:计算p值•p值:根据标准化统计量和备择假设计算,即p值 = P(Z > |z|)。
步骤6:假设检验的结论•如果p值小于显著性水平(通常为0.05),则拒绝零假设,认为观测数据不符合泊松分布。
•如果p值大于等于显著性水平,则接受零假设,认为观测数据符合泊松分布。
4. R语言中的泊松分布z检验实现在R语言中,我们可以使用poisson.test()函数进行泊松分布z检验。
下面是一个示例代码:```{r} # 生成符合泊松分布的观测数据 observed <- rpois(n = 100, lambda = 2)进行泊松分布z检验result <- poisson.test(x = observed)输出检验结果print(result)在这个示例代码中,我们首先使用`rpois()`函数生成100个符合泊松分布的观测数据。
Poisson GLM,Cox PH,°rees of freedomMichael C.DonohueDivision of Biostatistics and BioinformaticsUniversity of California,San DiegoApril2,20121IntroductionWe discuss connections between the Cox proportional hazards model and Poisson generalized linear models as described in Whitehead(1980).Wefit a sample dataset using coxph()and glm() and show that the model degrees of freedom differ by the number of events.2A simple Cox PH example2.1Generate dataWe generate proportional hazards mixed model data.>options(width=75)>library(phmm)>n<-50#total sample size>nclust<-5#number of clusters>clusters<-rep(1:nclust,each=n/nclust)>beta0<-c(1,2)>set.seed(13)>Z<-cbind(Z1=sample(0:1,n,replace=TRUE),+Z2=sample(0:1,n,replace=TRUE),+Z3=sample(0:1,n,replace=TRUE))>b<-cbind(rep(rnorm(nclust),each=n/nclust),+rep(rnorm(nclust),each=n/nclust))>Wb<-matrix(0,n,2)>for(j in1:2)Wb[,j]<-Z[,j]*b[,j]>Wb<-apply(Wb,1,sum)1>T<--log(runif(n,0,1))*exp(-Z[,c('Z1','Z2')]%*%beta0-Wb)>C<-runif(n,0,1)>time<-ifelse(T<C,T,C)>event<-ifelse(T<=C,1,0)>sum(event)[1]31>phmmd<-data.frame(Z)>phmmd$cluster<-clusters>phmmd$time<-time>phmmd$event<-event2.2Fit the Cox PH model>fit.ph<-coxph(Surv(time,event)~Z1+Z2,+phmmd,method="breslow",x=TRUE,y=TRUE)>summary(fit.ph)Call:coxph(formula=Surv(time,event)~Z1+Z2,data=phmmd,method="breslow", x=TRUE,y=TRUE)n=50,number of events=31coef exp(coef)se(coef)z Pr(>|z|)Z10.8549 2.35130.3918 2.1820.02909*Z2 1.0888 2.97080.3684 2.9550.00312**---Signif.codes:0‘***’0.001‘**’0.01‘*’0.05‘.’0.1‘’1exp(coef)exp(-coef)lower.95upper.95Z1 2.3510.4253 1.091 5.067Z2 2.9710.3366 1.443 6.116Concordance=0.71(se=0.055)Rsquare=0.237(max possible=0.984)Likelihood ratio test=13.55on2df,p=0.001141Wald test=13.52on2df,p=0.001158Score(logrank)test=14.63on2df,p=0.0006671>fit.ph$loglik[2][1]-95.97131Next we create data tofit an auxilary Poisson model as described in Whitehead(1980)using the pseudoPoisPHMM()function provided in the phmm package.This function also extracts the linear predictors as estimated from the Cox PH model so that we can calculate likelihoods and degrees of freedom.22.3Likelihood and degrees of freedom for Poisson GLM from Cox PHparameters>ppd<-as.data.frame(as.matrix(pseudoPoisPHMM(fit.ph)))>#pois likelihood>poisl<-c()>eventtimes<-sort(phmmd$time[phmmd$event==1])>for(h in1:length(eventtimes)){+js<-ppd$time==eventtimes[h]&ppd$m>=1#j star+j<-ppd$time==eventtimes[h]+if(sum(js)>1)stop("tied event times")+poisl<-c(poisl,+ppd[js,"N"]*exp(-1)*exp(ppd[js,"linear.predictors"])/+sum(ppd[j,"N"]*exp(ppd[j,"linear.predictors"])))+}Poisson likelihood:>sum(log(poisl))[1]-66.5633>sum(log(poisl))-fit.ph$loglik[2][1]29.40801Poisson degrees of freedom>length(fit.ph$coef)+sum(phmmd$event)[1]332.4Fit auxiliary Poisson GLMWefit an auxiliary Poisson GLM and note that the parameter estimates for z1and z2are identical to the coxph()fit,and the likelihood and degrees of freedom are as expected.>ppd$t<-as.factor(ppd$time)>fit.glm<-glm(m~-1+t+z1+z2+offset(log(N)),+ppd,family=poisson)>summary(fit.glm)Call:glm(formula=m~-1+t+z1+z2+offset(log(N)),family=poisson, data=ppd)Deviance Residuals:Min1Q Median3Q Max3-0.9685-0.7531-0.55530.4293 1.6823Coefficients:Estimate Std.Error z value Pr(>|z|)t0.000277233256778163-5.0494 1.0704-4.717 2.39e-06*** t0.000285092717793308-5.0035 1.0679-4.685 2.79e-06*** t0.000382448373472765-4.9876 1.0683-4.669 3.03e-06*** t0.00559427171447325-4.9388 1.0655-4.635 3.57e-06*** t0.00764335258097282-4.8875 1.0625-4.600 4.22e-06*** t0.00808285780728387-4.8648 1.0635-4.574 4.78e-06*** t0.0216256697018544-4.8013 1.0609-4.526 6.02e-06*** t0.0219649983261458-4.7930 1.0622-4.512 6.41e-06*** t0.0233956453029104-4.7681 1.0634-4.4847.34e-06*** t0.0235837855332384-4.7069 1.0598-4.4418.95e-06*** t0.0237625311885084-4.6797 1.0612-4.410 1.03e-05*** t0.027482795605763-4.6127 1.0572-4.363 1.28e-05*** t0.0278642961804028-4.5890 1.0573-4.340 1.42e-05*** t0.0316525538364514-4.5401 1.0576-4.293 1.76e-05*** t0.0357745779481545-4.5147 1.0578-4.268 1.97e-05*** t0.0366185731334857-4.4351 1.0529-4.212 2.53e-05*** t0.066999301944422-4.3869 1.0556-4.156 3.24e-05*** t0.0742904888064418-4.3572 1.0557-4.127 3.67e-05*** t0.09491415021304-4.2493 1.0513-4.042 5.30e-05*** t0.125132209250348-4.2151 1.0513-4.010 6.08e-05*** t0.132722661166308-4.1798 1.0513-3.9767.01e-05*** t0.140357744467437-4.0667 1.0439-3.8969.79e-05*** t0.163527928343998-3.9258 1.0448-3.7570.000172*** t0.193971448733795-3.7760 1.0443-3.6160.000299*** t0.204887967162952-3.7054 1.0458-3.5430.000396*** t0.227852125295401-3.6459 1.0457-3.4860.000490*** t0.266238317485871-3.5253 1.0513-3.3530.000799*** t0.276177426334698-3.2951 1.0356-3.1820.001464** t0.360993505812205-3.2039 1.0353-3.0950.001970** t0.426697507683412-2.7934 1.0367-2.6940.007051** t0.511995413073629-1.8487 1.0105-1.8300.067323.z10.85490.3918 2.1820.029092*z2 1.08880.3684 2.9550.003123** ---Signif.codes:0‘***’0.001‘**’0.01‘*’0.05‘.’0.1‘’1 (Dispersion parameter for poisson family taken to be1) Null deviance:1743.184on121degrees of freedom Residual deviance:71.127on88degrees of freedomAIC:199.134Number of Fisher Scoring iterations:6>fit.ph$coefZ1Z20.8549497 1.0888337>logLik(fit.glm)'log Lik.'-66.5633(df=33)>logLik(fit.glm)[1]-sum(log(poisl))[1]-1.421085e-14The additional parameter estimates correspond to the estimated log baseline hazard,which we verify using the basehaz()function.>bh<-basehaz(fit.ph,centered=FALSE)>log(bh$hazard-c(0,bh$hazard[1:(length(bh$hazard)-1)]))[1:10][1]-5.049378-5.003546-4.987633-4.938810-4.887479-4.864823-Inf[8]-4.801254-4.793001-4.7680723Extending to PHMM3.1Fit PHMM>fit.phmm<-phmm(Surv(time,event)~Z1+Z2+(Z1+Z2|cluster),+phmmd,Gbs=100,Gbsvar=1000,VARSTART=1,+NINIT=10,MAXSTEP=100,CONVERG=90)>summary(fit.phmm)Proportional Hazards Mixed-Effects Model fit by MCMC-EMModel:Surv(time,event)~Z1+Z2+(Z1+Z2|cluster)Data:phmmdLog-likelihood:Conditional Laplace RIS-83.3-122.7-122.5Fixed effects:Surv(time,event)~Z1+Z2Estimate Std.ErrorZ10.80840.5195Z2 1.50090.5269Random effects:(Z1+Z2|cluster)Estimated variance-covariance matrix:5(Intercept)Z1Z2(Intercept)0.220.00000.0000Z10.000.38640.0000Z20.000.00000.4009Number of Observations:50Number of Groups:53.2Likelihood and degrees of freedom for Poisson GLMM from PHMMparameters>ppd<-as.data.frame(as.matrix(pseudoPoisPHMM(fit.phmm)))>poisl<-c()>eventtimes<-sort(phmmd$time[phmmd$event==1])>for(h in1:length(eventtimes)){+js<-ppd$time==eventtimes[h]&ppd$m>=1#j star+j<-ppd$time==eventtimes[h]+if(sum(js)>1)stop("tied event times")+poisl<-c(poisl,+ppd[js,"N"]*exp(-1)*exp(ppd[js,"linear.predictors"])/+sum(ppd[j,"N"]*exp(ppd[j,"linear.predictors"])))+}Poisson likelihood:>sum(log(poisl))[1]-93.33492>sum(log(poisl))-fit.phmm$loglik[1]Conditional-10.03456Poisson degrees of freedom>#Poisson GLMM degrees of freedom length(unique(x$cluster))*x$nrandom+x$nfixed >traceHat(fit.phmm,"pseudoPois")#+2*sum(phmmd$event)[1] 6.57453863.3Fit auxiliary Poisson GLMMWefit an auxiliary Poisson GLMM,although with a general variance-covariance matrix for the random effects(phmm()onlyfits models with diagonal variance-covariance matrix).>library(lme4)>ppd$t<-as.factor(ppd$time)>fit.lmer<-lmer(m~-1+t+z1+z2++(z1+z2|cluster)+offset(log(N)),+data=ppd,family=poisson)>summary(fit.lmer)@coefsEstimate Std.Error z value Pr(>|z|)t0.000277233256778163-5.9512956 1.1621163-5.121084 3.037838e-07t0.000285092717793308-5.8056724 1.1490883-5.052416 4.362559e-07t0.000382448373472765-5.7868566 1.1507599-5.028726 4.937484e-07t0.00559427171447325-5.6896123 1.1452649-4.967944 6.766637e-07t0.00764335258097282-5.5818834 1.1399068-4.8967899.741524e-07t0.00808285780728387-5.5730569 1.1412531-4.883279 1.043363e-06t0.0216256697018544-5.3492694 1.1177552-4.785725 1.703708e-06t0.0219649983261458-5.3458676 1.1185289-4.779374 1.758422e-06t0.0233956453029104-5.3007245 1.1213999-4.726881 2.279943e-06t0.0235837855332384-5.0000069 1.0919868-4.578816 4.676152e-06t0.0237625311885084-4.9356002 1.0944292-4.509748 6.490459e-06t0.027482795605763-4.9049547 1.0929053-4.4879967.189630e-06t0.0278642961804028-4.8723449 1.0932413-4.4567888.319670e-06t0.0316525538364514-4.8144685 1.0953376-4.395420 1.105589e-05t0.0357745779481545-4.7629359 1.0974195-4.340123 1.424028e-05t0.0366185731334857-4.4645437 1.0678178-4.180998 2.902326e-05t0.066999301944422-4.3400404 1.0743862-4.039553 5.355311e-05t0.0742904888064418-4.3164308 1.0737771-4.019857 5.823338e-05t0.09491415021304-4.2590295 1.0717742-3.9738127.073125e-05t0.125132209250348-4.1958685 1.0721767-3.9134119.100127e-05t0.132722661166308-4.1804168 1.0723683-3.8983039.686912e-05t0.140357744467437-4.0478261 1.0565985-3.830997 1.276250e-04t0.163527928343998-3.8492343 1.0551018-3.648211 2.640724e-04t0.193971448733795-3.5671855 1.0538864-3.3847917.123241e-04t0.204887967162952-3.4476935 1.0549744-3.268035 1.082969e-03t0.227852125295401-3.3890549 1.0542687-3.214603 1.306252e-03t0.266238317485871-3.2575659 1.0623132-3.066483 2.165928e-03t0.276177426334698-3.0812878 1.0453359-2.947653 3.201961e-03t0.360993505812205-2.8597107 1.0454713-2.735332 6.231745e-03t0.426697507683412-2.3978047 1.0400162-2.305546 2.113604e-02t0.511995413073629-1.6399451 1.0154918-1.614927 1.063265e-01z10.75038260.4575567 1.639977 1.010099e-01z2 1.54673170.4543462 3.404302 6.633339e-047>fit.phmm$coefZ1Z20.8084308 1.5008833>logLik(fit.lmer)'log Lik.'-69.97819(df=39)>sum(log(poisl))-logLik(fit.lmer)[1][1]-23.35673>log(fit.phmm$lambda)[1:10][1]-5.875868-5.752520-5.729987-5.634761-5.529504-5.520721-Inf [8]-5.335017-5.330244-5.2741398。