1 Analysis of a Case-Control Study Concerning Bladder Cancer and Occupational Exposures
- 格式:pdf
- 大小:78.45 KB
- 文档页数:13
1(经典老题)case-control study&cohort study的异同点2 什么是选择偏倚?病例对照研究,队列研究,现况调查会出现哪些选择偏倚?选择偏倚,不同类型(就研究的暴露、结局特征而言) 的个体入选研究的概率不同。
使研究对象与目标人群的特征存在系统误差,而使效应估计值与真值之间发生偏差。
既可产生于研究初期对象的入选(病例对照研究和横断面研究),也可产生于收集资料过程中出现失访或无应答(队列研究)病例对照研究:入院率偏倚/伯克森偏倚;检出征候偏倚/暴露偏倚;现患病例-新发病例偏倚/奈曼偏倚;队列研究:拒绝偏倚;志愿者偏倚;失访偏倚;无应答偏倚现况调查:现患病例-新发病例偏倚/奈曼偏倚;无应答偏倚3 clinical study设置对照的意义确定新疗法是否有效消除或减少非研究因素对结果的影响确定治疗的毒副反应4 clinical trial的对照类型按分组方法:随机对照;非随机对照按对照性质(措施):标准对照;安慰剂对照;空白对照按研究设计(时间性):同期对照;前后对照;交叉对照;历史对照5 混杂因素的概念和形成条件混杂因素:与研究的因素和研究的疾病均有关,若在比较的人群组中分布不均可以掩盖或夸大因素和疾病之间真正联系的因素。
形成条件:与所研究的疾病相关,与所研究的暴露因素相关,且不是研究因素与研究疾病因果链的中间环节。
6 给了两个研究,分别是病例对照研究和队列研究,让你比较(年年考...)然后研究如何继续进行,给出步骤,不需要计算(OR值 RR值之类吧...)7 每一届都有一道流行病的题必考:病例对照研究和队列研究(异同)这次出的是两个实例比较问题1 写出这两个研究的类型1.女性绝经患乳腺癌与未绝经患乳腺癌患病与未患病对照属于病例对照研究2.青石棉污染,死亡与未污染死亡追踪访问数年属于队列研究问题2 比较两研究的不同点问题3 给出了患病死亡人数让你写出具体计算步骤(不要求计算)8 原发性高血压病因在遗传背景轻度异常的基础上,加上不健康生活习惯的诱发而发病。
《临床流行病学》复习资料natural history of disease ),通过周密设计、准确的测量,对临床的诊断方法、治疗效果及其预后进行综合评价的一门学科。
Clinical Epidemiology: The application of the logical and quantitative concepts and methods of epidemiology to problems (diagnostic, prognostic, therapeutic, and preventive) encountered in the clinical delivery of care to individual patients. The population aspect of epidemiology is present because these individual patients are members of conceptual populations.,rate ratio)。
比值>1表示危险效应(因素),比0~attributable risk percent,AR%。
是患者渴望取悦于他们的医师,使医师感到其治疗活动是成功的,是患者的一种心理、生理效应,对治疗效果将会产生正向效应(positive effect)。
当然,有时会因厌恶某医生或不信任某医生而产生效果的负向效应(negative5%的人,即使不治疗,经过一而是由研究设计者来安排和控制全部试验,这样可以极大地减少两者由于主观因素造成的信息偏倚对临床试验结果的影响,这是此法的优点。
缺点实施、分析和解释的各个阶段由于设计的不合理,使研究的结果与事物的真实结果之间产生差异,出现的系统误(发病、死亡)比较两组或各组的发病率或死亡率,从而判断暴露因素与发病有无因果联系及联系程度大小的一种观察性研究方法。
risk factor)或),而把导致疾病发病概率降低的暴露称为保护因素),所以暴露既可以是致病因素也可是保护因素。
病例—对照研究中混杂因素的控制及其统计分析方法发表时间:2011-06-02T11:42:42.450Z 来源:《中外健康文摘》2011年第7期供稿作者:李新刘海英关红阳[导读] 病例-对照研究是最常用的流行病学研究方法之一。
在研究过程中,各种混杂因素对研究结果有着不同程度的影响。
李新刘海英关红阳(辽宁中医药大学辽宁沈阳 110032)【中图分类号】R197.3 【文献标识码】A 【文章编号】1672-5085(2011)7-0359-03【摘要】病例-对照研究是最常用的流行病学研究方法之一。
在研究过程中,各种混杂因素对研究结果有着不同程度的影响。
在众多的控制混杂因素方法中:匹配设计可以保证病例组与对照组在匹配的混杂因素方面分布一致,避免其混杂影响;分析某危险因素与疾病的关系时,如果存在可能的混杂因素,可以按混杂因素的特征进行分层分析;研究分析多种危险因素与研究疾病的联系,可以利用多元统计分析方法,获得更多的信息,得到更科学的结论。
【关键词】病例-对照研究混杂因素设计分层多元分析【Abstract】case-control study is one of the most commonly used method of epidemiological studies. In the course of the study, a variety of confounding factors have great effect on the results of the study. In many of the control of confounding factors method: matching design can guarantees that case groups and control groups in terms of matching confounding factors having their distribution consistent; analysis of a mixture of risk factors and diseases, if there is a possible confounding factors, you can press the confounding factors features tiered analysis; research and analysis of the various risk factors and of the Association, you can use multivariate statistical analysis for more information and get more scientific conclusions.【Key words】Case-control study Keywords confounding factors design layered multivariate analysis病例-对照研究是最常用的流行病学研究方法之一,它广泛应用于探讨疾病病因、干预措施、项目评价以及公共卫生与医学实践的许多方面。
巢式病例对照研究(Nested case-contro l study)特点及常见问题解答作者:MedSci来源:MedSci巢式病例对照研究(nested case contro l study,NCCS)又被译做嵌入式病例对照研究,也称队列内病例对照研究(case contro l studywithin a cohort),是在全队列内套用病例对照设计。
这一设计方案于1973年由美国流行病学家M antel最早提出,称其为综合性病例对照研究(synthe tic case contro l study),1982年正式提出巢式病例对照研究(nested case-contro l study ),这是将队列研究与病例对照研究相结合的一种研究方法。
它在国际流行病学协会出版的《流行病学词典》(John st主编,1988)中的定义为:N CCS 是这样一种病例对照研究,它的病例和对照人群都从一个全队列(即通常所说的队列研究)中选取。
因为病例和对照的某些资料已经存在,一些潜在混杂因素的影响就能被减少或避免。
如病例和对照的暴露资料等均在疾病或死亡未发生前获得,故不存在暴露与疾病的时间顺序问题,观察偏倚也能得到很好地控制。
1.概念:是在队列内套用病例对照研究的一种设计,其研究对象是在队列研究的基础上确定的,以队列中所有的病例作为病例组,再根据病例发病时间,在研究队列的非病例中随机匹配一个或多个对照,组成对照组。
但是其研究方法和分析方法仍与病例对照研究相同(主要是配比病例对照研究)。
此种研究设计尤其适合于研究因素包括有复杂的化学或生化分析的前瞻性研究。
2.研究类型:由于巢式病例对照研究是在队列研究的基础上设计和实施的,因此也有前瞻性,回顾性,双向性三类。
巢式病例对照研究(Nested case-control study)特点及常见问题解答作者:MedSci 来源:MedSci巢式病例对照研究(nested case control study,NCCS)又被译做嵌入式病例对照研究,也称队列内病例对照研究(case control study within a cohort),是在全队列内套用病例对照设计。
这一设计方案于1973年由美国流行病学家Mantel最早提出,称其为综合性病例对照研究(synthetic case control study),1982年正式提出巢式病例对照研究(nested case-control study),这是将队列研究与病例对照研究相结合的一种研究方法。
它在国际流行病学协会出版的《流行病学词典》(John st主编,1988)中的定义为:NCCS 是这样一种病例对照研究,它的病例和对照人群都从一个全队列(即通常所说的队列研究)中选取。
因为病例和对照的某些资料已经存在,一些潜在混杂因素的影响就能被减少或避免。
如病例和对照的暴露资料等均在疾病或死亡未发生前获得,故不存在暴露与疾病的时间顺序问题,观察偏倚也能得到很好地控制。
1.概念:是在队列内套用病例对照研究的一种设计,其研究对象是在队列研究的基础上确定的,以队列中所有的病例作为病例组,再根据病例发病时间,在研究队列的非病例中随机匹配一个或多个对照,组成对照组。
但是其研究方法和分析方法仍与病例对照研究相同(主要是配比病例对照研究)。
此种研究设计尤其适合于研究因素包括有复杂的化学或生化分析的前瞻性研究。
2.研究类型:由于巢式病例对照研究是在队列研究的基础上设计和实施的,因此也有前瞻性,回顾性,双向性三类。
3.步骤:1)确定研究队列人群(注意:人群进入队列时都没有发生所要的疾病,和队列研究相同)2)收集队列内每个成员的相关信息和生物标本。
3)预定期间的随访4)确定随访期内发生的某疾病全部病例组成病例组5)随访结束时还未发生所研究疾病的人群随机抽取一定数量组成对照组,一般采用比配的方式为每个病例抽取一定数量的对照组成对照组。
case-control study名词解释
病例对照研究(case-control study)是一种观察性研究设计,用于探索疾病或健康状况与一个或多个潜在因素之间的关系。
这种研究方法通过比较患有特定疾病的个体(病例)与未患该疾病的个体(对照)来评估暴露于某种因素与疾病之间的关联。
在病例对照研究中,研究者首先选择一组已经患有感兴趣疾病的个体作为病例组,然后选择一组没有该疾病的个体作为对照组。
接下来,研究者回顾性地收集两组个体关于过去暴露于潜在风险因素的信息。
通过比较病例组和对照组的暴露历史,研究者可以估计特定暴露因素与疾病之间的关联强度。
病例对照研究的关键特点包括:
1. 回顾性:数据收集通常是回顾性的,即在选定病例和对照后,研究者会询问他们过去的暴露情况。
2. 匹配:为了控制混杂因素,有时研究者会选择与病例在某些关键特征(如年龄、性别)上相似的对照。
3. 比率:研究结果通常以比值比(odds ratio, OR)的形式表示,这是一种衡量暴露与疾病之间关联强度的指标。
4. 效率:与队列研究相比,病例对照研究通常成本较低,时间较短,适用于罕见疾病的研究。
病例对照研究的局限性包括回忆偏倚(由于参与者对过去暴露的记忆可能不准确)、选择偏倚(如果病例和对照的选择不是随机的)以及无法确定因果关系(只能提供关联证据)。
病例对照研究是流行病学研究中的一种重要工具,它帮助科学家们发现了许多疾病的风险因素,从而为公共卫生干预提供了依据。
Analysis of a Case-Control Study ConcerningBladder Cancer and Occupational Exposures*byF. Geller1, W. Urfer2 and K. Golka11 - Institut für Arbeitsphysiologie an der Universität Dortmund, 44139 Dortmund, Germany2 - Fachbereich Statistik, Universität Dortmund, 44221 Dortmund, GermanyAbstractThis work presents the results from a case-control study about occupationalexposures as risk factors for bladder cancer. Odds ratio analysis and logisticregression give results, which show the influence of different exposures topolycyclic aromatic hydrocarbons and an exposure to paints on bladder canceretiology. Further an outlook on the upcoming studies about genetic predispo-sitions as additional risk factors is given.1. IntroductionThe StudyThe relationships between assumed carcinogenic substances and bladder cancer were assessed from a case-control study conducted between 1992 and 1996 in Northrhine-Westphalia, Germany. The analysis was conducted by means of smoking-adjusted odds ratios and logistic regression.The study concerned 156 cases with bladder cancer and 336 controls with prostate cancer. The main focus was the role of polycyclic aromatic hydrocarbons (PAH), since most of the considered exposures contained substances of this group. Data were collected from male patients who had applied for a cancer rehabilitation treatment.A questionnaire was sent to all patients, containing questions about occupational history, occupational exposures to potential carcinogenic substances, smoking habits, diseases of relatives and hobbies. The exposures to chemicals, paints, bitumen, tar, pitch and other combustion fumes were divided into seldom, often and permanent. Problems occured for pitch and chemicals. The question about pitch was placed unfortunately on the questionnaire leading to many missings. For chemicals a huge variety of substances was stated, including common detergents. Thephysician reported medical informations, including height, weight, further diseases and staging and grading of the tumor.1991).Occupational Exposures as Risk Factors for Bladder Cancer* Research supported by Deutsche Forschungsgemeinschaft through SFB 475 and GK “Toxikologie und Umwelthygiene“.The relationship between occupational exposures and bladder cancer is examined since Rehn (1895) reported an undue incidence of bladder tumors in a group of men employed in the manufacture of fuchsine. He concluded that aniline was the most suspicious of the substances used in this process. Up to now it is known that an exposure to aromatic amines or bioavailable azo dyes (which are cleaved in the human body to aromatic amines) lead to an elevated bladder cancer risk. The simplifying two stage model of carcinogenesis divides the whole process in two phases:•the initiation sets the stage for carcinogenesis by changing the genetic information. Initiating agens bind to DNA and cause an irreversible genetic damage. In the further development of the tumor these substances may play no role.•the promotion is necessary for an onset of disease within life time. The effect of promotors is reversible. After cessation of exposure the effect vanishes over time.Aromatic amines have to be considered as initiating agens. Especially the pathway from the initiating agens in the liver to the formation of the DNA-adduct in the bladder epithelium is shown in fig. 1.The role of polycyclic aromatic hydrocarbons is mainly on the stage of promotion. Therefore the frequency of exposure is more important than for aromatic amines.2.Evaluation of Non-Occupational FactorsThere were differences in the distribution of non-occupational factors for age, smoking and further tumors in the two groups. The cases had a mean age of 62.5 compared to 65.4 years for the controls. The differences became obvious when four age groups were considered. The youngest patients, under 55 years, contained 17% of the cases and only 7% of the controls. While 49% of the cases had an age between 55 and 64, 36% of the controls fell in this age group. For an age between 65 and 74 years the effect was contrary, with frequencies of 25% for the cases and 49% for the controls. The age group above 74 years showed no differences. The smoking habits underlined the role of smoking for bladder cancer etiology. For the cases there were 43% smokers reported compared to 31% for the controls. Further tumors were observed for 10% of the cases and 5% of the controls. For obesity, measured by a Broca-index greater than one, a non-significant odds ratio smaller than one was observed, since obesity is an assumed risk factor for prostate cancer.3.Odds Ratio AnalysisThe analysis of the crude and smoking-adjusted odds ratios was divided in frequent and ever exposure.“Frequent“ exposure compared the answers often and permanent with seldom and no exposure. “Ever“exposure combined seldom, often and permanent exposure. The ratios for the PAH-group were always higher for a frequent exposure. For paints an ever exposure had the higher value. Table 1 shows the crude and smoking-adjusted odds ratio estimates for the exposures and the significant non-occupational factors. Significant odds ratios were obtained for the three exposures paints (1.69), bitumen (2.93), tar (2.49) and the cumulative measure any PAH (1.67). For work in a coke oven plant an exceeded crude odds ratio of 2.16 was calculated.Crude Smoking-adjusted Risk Factor Odds Ratio95% - CI Odds Ratio95% - CI(Mantel-Haenszel)Occupational Factors Aromatic Amines- ever exposed to paints1.69 1.12 -2.55 1.69 1.10 - 2.61 Polycyclic Aromatic Hydrocarbons- frequently exposed to bitumen 2.93 1.34 - 6.41 2.92 1.32 - 6.48 - frequently exposed to tar 2.49 1.27 - 4.89 2.09 1.04 - 4.21 - frequently exposed to pitch*3.490.89 - 13.62 3.060.77 - 12.10 - frequently exposed to other combustion fumes1.350.77 -2.38 1.210.68 - 2.15 work in a coke oven plant 2.160.76 - 6.13 1.920.58 - 6.29 work at a blast furnace1.620.75 - 3.49 1.730.77 - 3.90 - frequently exposed to any PAH 1.67 1.08 -2.60 1.560.98 - 2.47 Chemicals- frequently exposed to chemicals*1.360.88 -2.12 1.290.81 - 2.05Non-Occupational FactorsSmoking1.79 1.20 -2.68 Age : younger than 65 years 2.59 1.75 -3.83 2.28 1.51 - 3.44 Obesity ( Broca-Index > 1 )0.700.43 - 1.130.730.44 - 1.24 Further tumors2.291.13 - 4.632.120.99 - 4.54Legend:* - uncertain informationTable 1 : Crude and smoking-adjusted odds ratio estimates with 95% confidence intervals.ResultsThe estimated smoking-adjusted odds ratios were significantly higher for − an ever exposure to paints, OR ∧=1.69, 95%-C.I.=(1.10, 2.61),− a frequent exposure to tar, OR ∧=2.09, 95%-C.I.=(1.04, 4.21).− a frequent exposure to pitch, OR ∧=3.06, 95%-C.I.=(0.77, 12.10),Elevated risks were observed for these occupational exposures:− work in a coke oven plant, OR ∧=1.92, 95%-C.I.=(0.58, 6.29),− work at a blast furnace, OR ∧=1.73, 95%-C.I.=(0.77, 3.90).The occupational exposures to other combustion fumes and chemicals showed only slightly increased odds ratio estimates about 1.2 .When the power of the study is considered, the low frequencies of exposed in the control group made it hard to detect significant ratios. The frequencies for bitumen, tar, pitch, work in a coke oven plant and work at a blast-furnace were smaller than 0.1 and led to poor values for the power, see Geller (1997).The analysis of odds ratios for smokers and non-smokers did not provide uniform results, see Geller (1997).4.Logistic Regression AnalysisA logistic regression model was fitted to the data, including the non-occupational factors age, smoking and further tumors. Age group 2 (55 - 64 years) was chosen as reference group since the most cases reported an age in this group. For 52 patients important informations were missing, leaving 137 cases and 303 controls. Table 2 shows the model of the main effects. The only occupational factors which met the entry criteria of a p-value less than 0.2 in the multiple model were paints and bitumen.Estimated Odds RatioParameter EstimateStandard Error P-value of the χ2 -Test Variable$β $S Intercept-1.0010.2100.0001 Age group 1 ( younger than 55 years) 2.020.7040.3540.047 Age group 3 ( 65 - 74 years )0.41-0.8960.2530.0004 Age group 4 (older than 74 years) 1.090.0820.4000.837 Smoking1.700.5320.2260.018 Ever exposed to paints1.530.4220.2420.081 Frequently exposed to bitumen2.350.8530.4470.056 Further tumors2.430.8880.4120.031Table 2 : Logistic regression model for main effects.Since age and smoking often interact with other factors, two important questions arise:• Are the bladder cancer patients with an occupational exposure younger than those without ?•Does smoking strengthen the effect of occupational exposures ?Table 3 contains the model with interactions, which led to the following conclusions:•The occupational exposures interact with younger ages at first diagnosis.•Smoking is related with further tumors and also with younger ages at first diagnosis but shows no association to the exposures.The interactions caused substantial changes in the parameter estimates for the main effects. The parameter estimates went down and the standard errors grew. Especially for smoking and bitumen the p-values increased above 0.5 , underlining the importance of the interactions for these effects. There were three interactions for the younger patients, diminishing the effect of age group 1. The effect of further tumors was to a certain extent explained by the interaction with smoking.5. Summary Measures for the Fit of the Logistic Regression ModelThe discussion about the fit of the model is based on covariate patterns, which describe a single set of values for the covariates in the model. Since the logistic regression model contains age groups instead of age in years, the obtainable number of covariate patterns is limited. There are 64 possible patterns with observations for 39 of them.Pearson-χ2-Test and Deviance-TestThe Pearson-χ2-statistic X2 and the deviance-statistic D test the nullhypothesis that the model is correct in every aspect. Under H0 the teststatistics are supposed to be χ2-distributed with J-(p+1) degrees offreedom, with J denoting the number of observed covariate patterns and (p+1) the number of parameters in the model (p effects plus intercept). The model has 7 main effects and 4 inter-actions. For 39 covariate patterns there are observations, leading to 27 degrees of freedom:Test Statistic DF p-value Pearson-χ2X2 = 25.95270.521Deviance D = 31.60270.247The p-values differ notable. The value of the Pearson-statistic is satisfying, whilst the result for the Deviance is too small. It has to be stated that the assumptions for the χ2-distribution are violated, since the m j in some covariate patterns are very small. For 12 of the 39 patterns there is just one observation. Especially the small p-value for the deviance does not necessary result from a poor fit of the model. It has to be kept in mind that the model contains two variables which do not improve the model. The effect of bitumen is mainly modelled by the interaction younger than 65 × bitumen and age group 4 has no clear differences to the reference group in any model.The Hosmer-Lemeshow-TestBecause of the high number of covariate patterns with very few observations, another approach shall be introduced here. The chosen form of the Hosmer-Lemeshow-test is based on a grouping of the data according to their estimated probabilities. In these groups the observed and the expected frequencies are compared.If possible, SAS® divides the data into 10 groups of size n/10. Since the number of groups has to be higher than the number of parameters, an individual division into 13 groups is done. The data include many ties, for some patterns there are more than 33.8 (= n/13) observations, so that the huge covariate patterns with more than 50 observations limit the options for grouping, as seen in table 4. The 89 observations for covariate pattern 5 are randomly assigned to two groups, in other cases the frequent observed pattern build a group on their own. The 4 groups with these patterns include already 204 observations. The rest of 236 observations has to be divided into 9 groups. It can be arranged, that almost every group contains at least 20 observations, only group 3 has a size of 13 persons.The resulting expected group frequencies are also given in table 4. The accordance is reckonable.The statistic $C is under the nullhypothesis χ2 - distributed with (G-2) degrees of freedom, where G denotes the number of groups. The distribution assumption is based on m-asymptotics, that means if J < n is fixed and n increases every m j will tend to become large. This is justified, when the expected frequencies are large enough. This condition is basically fulfilled, because only group 4 has a smaller number of cases and group 13 has a smaller number of controls.Test Statistic DF p-valueHosmer-Lemeshow$C = 7.143110.787The test provides a p-value of 0.787, stating a good fit of the model. Therefore all the covariate patterns with very few observations probably caused the deviations in the other statistics. Just this effect is controlled by grouping. A disadvantage of this variant of the Hosmer-Lemeshow-test is caused by the fact, that the probabilities are solely grouped with the aim of considerably large groupPattern No.Estimated Obs.Cases Controls Contrib. to case-prob.expected observed expected observed$C70.14235 4.97530.0330Group 10.14235 4.97530.03300.000350.16689 14.751674.2573Group 20.166457.46737.54380.033 Group 30.166447.29936.71350.480 230.16720.330 1.672390.18210 1.8218.189210.19410.1900.811Group 40.18113 2.35110.65120.945 370.21128 5.90622.1022550.21310.2100.79160.2245 1.122 3.883530.24520.490 1.512Group 50.215367.73828.27280.01290.2465914.511444.4945Group 60.2465914.511444.49450.024 380.2805 1.401 3.60430.2925 1.462 3.543130.29310 2.9337.077Group 70.28920 5.79614.21140.011 410.305278.231118.7716540.32010.3210.680100.3215 1.612 3.393Group 80.3083310.161422.8419 2.09510.33121 6.96514.0416350.35720.712 1.290420.38910.3900.611330.4005 2.002 3.00320.41910.4200.581Group 90.3493010.48919.52210.322 110.4205623.532432.4732Group100.4205623.532432.47320.016 490.44810.4510.550150.479157.1977.818430.49412 5.923 6.089Group110.4842813.561114.44170.938 250.5313 1.591 1.41280.5705 2.853 2.152570.6044 2.412 1.592400.64110.6400.361450.6717 4.704 2.303Group120.6102012.20107.8010 1.016 270.7154 2.863 1.141460.74810.7500.251590.7723 2.3230.680470.8197 5.747 1.260120.85310.8510.150610.87610.8810.120160.88110.8810.120440.8872 1.7720.230630.94010.9410.060Group130.8092116.9819 4.022 1.251 Table 4: Grouping for the Hosmer-Lemeshow-test, groups consist of the covariate patterns listed directly above.sizes. Since this study contains far more controls than cases, most observations have a $πj smaller than 0.5. Therefore the first 8 groups contain probabilities between 0.142 und 0.312, whilst group 13 includes all probabilities from 0.715 up to 0.940. On the other hand a grouping based on percentiles would have led to many groups of small size. It has to be stated that the properties of the Hosmer-Lemeshow-test in the given setting, where the number of covariate patterns is substantly smaller than n are hardly investigated in yet.Classification tablesClassification tables examine the predictive power of a model. Therefore the observations are classified into cases and controls according to their estimated probabilities. Usually the division is made for $π = 0.5. Since the number of observations differs that much for the two collectives, a division at $π = 0.311 makes more sense, because the small fraction of 31.1% cases in the study is taken into account. Elsewise the observations would mainly be classified into the larger group.Here the sensitivity, the specifity, the false positive and the false negative rate are of interest.The sensitivity φ is the proportion of cases that are also predicted to be cases, whilst the specifity ψgives the proportion of controls that are predicted to be controls. The false positive rate describes the proportion of predicted cases that are observed as controls and the false negative rate states the proportion of predicted controls which are in reality cases.Classified Prob. of classification Observed Cases Controls Sum Cases ControlsCases6869137φ = 0.496 f. neg. = 0.240 Controls85218303 f. pos. = 0.556ψ = 0.719Sum153287440Table 5: Classification table with classification at an estimated probability of 0.311.Tables 5 and 6 show the results for $π= 0.311 and $π= 0.5. As expected, the controls are better predicted than the cases. The total rate of right predictions for $π = 0.311 is 65%, for $π = 0.5 a slightly higher proportion of 68.6% is achieved. The differences in the false rates are small. The poor sensitivity for $π = 0.5 is striking, resulting from the low number of predicted cases, only 57 compared to 137 observed.Classified Prob. of classification Observed Cases Controls Sum Cases ControlsCases28109137φ = 0.204 f. neg. = 0.285 Controls29274303 f. pos. = 0.509ψ = 0.904Sum57383440Table 6: Classification table with classification at an estimated probability of 0.5.In conclusion the general fit of the model can be rated as good. Many covariate patterns with just one observation violate the conditions of the Pearson-χ2-test and the Deviance-test, resulting in p-values which are not very informative. The Hosmer-Lemeshow-test is based on a grouping of the data and gives a p-value of 0.79, which means that the model fits quite good. Since there are far more controls than cases in the study, this status was better modelled. The classification table for $π = 0.311 fulfilled a minimum requirement for a good model: the addition of sensitivity und specifity came out as 1.22, a value greater than one. Next local weaknesses of the model were examined by regression diagnostics leading to the following results:•High leverage values were obtained for the covariate patterns without occupational exposure, which is closely related to the high number of observations for these patterns.High Pearson- and Deviance-residuals were mainly reported for patterns with an exposure to paints, which all have small numbers of observation.For further details about the fit of the model see Geller (1997).6. DiscussionThe higher odds ratios for an ever exposure to paints and for frequent exposure to the PAH-group is probably related to the role of these substances in cancerogenesis. The interactions with the age groups support the theses that:•aromatic amines mainly initiate tumors very early and lead to quite young ages at first diagnosis (an ever exposure to paints has a positive interaction with an age younger than 55).•polycyclic aromatic hydrocarbons mainly promote tumors and the age at first diagnosis is thereby lessened (a frequent exposure to bitumen has a positive interaction with an age younger than 65).7. OutlookThe next step in investigating these occupation-related diseases is to search for individual susceptibility factors, which promote carcinogenesis and can interfere in many ways on the path from exposure to disease. Sen and Margolin (1995) combined aspects of genetic toxico-logy with epidemiologic research. The main ideas of this combination are shown in fig. 2.11Exposure PrognosisEnvironmentLife StyleFig. 2.: The way from exposure to disease (Sen and Margolin, 1992).12Interesting features of this topic are genetic predispositions which are related to certain exposures. In the development of bladder cancer the impacts of polymorphic enzymes involved in the metabolism of aromatic amines (N-acetyltransferase 2) and polycyclic aromatic hydrocarbons (glutathione transferase µ), see for example Golka et al. (1997) have to be considered. Further investigation may examine if an association effects the latency period for the patients.References1.Bolt, H.-M.; Heitmann, P.; Gieseler, F.; Reich, S.E.; Urfer, W.; Golka, K. (1997):“ Risikofaktoren für Harnblasentumoren bei Krebsnachsorgepatienten in Nordrhein-West-falen. ” In Verhandl. Dt. Ges. Arbeitsmed. Umweltmed. 37; 127-131.2.Breslow, N.E.; Day, N.E. (1980): “ Statistical Methods in Cancer Research. Volume 1: TheAnalysis of Case-Control Studies. ” International Agency for Research on Cancer, Scientific Publications No. 32, Lyon.3.Geller, F. (1997): “ Berufliche Expositionen als Risikofaktoren für Harnblasenkrebs:Auswertung einer Fall-Kontroll-Studie mittels logistischer Regression. ” Diplomtheses, Fachbereich Statistik, Universität Dortmund.4.Golka, K.; Reckwitz, T.; Kempkes, M.; Cascorbi, I.; Blaszkewicz, M.; Reich, S.E.;Roots, I.; Soekeland, J.; Schulze, H.; Bolt, H.M. (1997): “ N-acetyltransferase 2 (NAT2) and glutathione S-transferase µ(GSTM1) in bladder-cancer patients in a highly industrialized area. ” Int. J. Occup. Environ. Health 1997; 3: 105-110.5.Hosmer, D.W.; Lemeshow, S. (1989): “ Applied Logistic Regression. ” Wiley, New York.ng, N.P.; Kadlubar, F.F. (1991): “ Aromatic and heterocyclic amine metabolism andphenotyping in humans. ” In Gledhill, B.L. (Ed.): New horizons in biological dosimetry. Proc.of the International Symposium on Trends in Biological Dosimetry, held in Levici, Italy, Oct.23-27, 1990; 33-47. Wiley-Liss, New York.7.Sen, P.K.; Margolin, B.H. (1995): “ Inhalation Toxicology: Awareness, Identifiability,statistical perspectives and risk assessment. ” Sankhya 1995; 57,Series B, 252-276.13。