Subjective Bayesian Analysis Principles and Practice
《统计学》_各章关键术语(中英⽂对照)第⼆部分各章关键术语(中英⽂对照)第1章统计学(statistics)随机性(randomness)描述统计学(descriptive statistics)推断统计学(inferential statistics)总体(population)母体(parent)(parent population)样本、⼦样(sample)调查对象总体(respondents population)有限总体(finite population)调查的理论总体(survey’s heoretical population)超总体(super population)变量(variable)数据(data)原始数据(original data)派⽣数据(derived data)定类尺度(nominal scale)定类尺度变量(nominal scale level variable)定类尺度数据(nominal scale level data)定序尺度(ordinal scale)定序尺度变量(ordinal scale level variable)定序尺度数据(ordinal scale level data)定距尺度(interval scale)定距尺度变量(interval scale level variable)定距尺度数据(interval scale level data)定⽐尺度(ratio scale)定⽐尺度变量(ratio scale level variable)定⽐尺度数据(ratio scale level data)分类变量(categorical variable)定性变量、属性变量(qualitative variable)数值变量(numerical variable)定量变量、数量变量(quantitative variable)绝对数变量(absolute number level variable)绝对数数据(absolute number level data)⽐率变量(ratio level variable)⽐率数据(ratio level data)实验数据(experimental data)调查数据(survey data)观察数据(observed data)第2章随机性(randomness)随机现象(random phenomenon)随机试验(random experiment)事件(event)基本事件(elementary event)复合事件(union of event)必然事件(certain event)不可能事件(impossible event)基本事件空间(elementary event space)互不相容事件(mutually exclusive events)统计独⽴(statistical independent)统计相依(statistical dependence)概率(probability)古典⽅法概率(classical method probability)相对频数⽅法概率(relative frequency method probability)主观⽅法概率(subjective method probability)⼏何概率(geometric probability)条件概率(conditional probability)全概率公式(formula of total probability)贝叶斯公式(Bayes’ formula)先验概率(prior probability)后验概率(posterior probability)随机变量(random variable)离散型随机变量(discrete type random variable)连续型随机变量(continuous type random variable)概率分布(probability distribution)特征数(characteristic number)位置特征数(location characteristic number)数学期望(mathematical expectation)散布特征数(scatter characteristic number)⽅差(variance)标准差(standard deviation)变异系数(variable coefficient)贝努⾥分布(Bernoulli distribution)⼆点分布(two-point distribution) 0-1分布(zero-one distribution)贝努⾥试验(Bernoulli 
trials)⼆项分布(binomial distribution)超⼏何分布(hyper-geometric distribution)正态分布(normal distribution)正态概率密度函数(normal probability density function)正态概率密度曲线(normal probability density curve)正态随机变量(normal random variable)卡⽅分布(chi-square distribution)F_分布(F-distribution)t_分布(t-distribution) “学⽣”⽒t_分布(Student’s t-distribution)列联表(contingency table)联合概率分布(joint probability distribution)边缘概率分布(marginal probability distribution)条件分布(conditional distribution)协⽅差(covariance)相关系数(correlation coefficient)第3章统计调查(statistical survey)数据收集(collection of data)统计单位(statistical unit)统计个体(statistical individual)社会经济总体(socioeconomic population)调查对象总体(respondents population)有限总体(finite population)标志(character)标志值(character value)属性标志(attributive character )品质标志(qualitative character )数量标志(numerical indication)不变标志(invariant indication)变异(variation)调查条⽬(item of survey)指标(indicator)统计指标(statistical indicator)总量指标(total amount indicator)绝对数(absolute number)统计单位总量(total amount of statistical unit )标志值总量(total amount of indication value)(total amount of character value)时期性总量指标(time period total amount indicator)流量指标(flow indicator)时点性总量指标(time point total amount indicator)存量指标(stock indicator)平均指标(average indicator)平均数(average number)相对指标(relative indicator)相对数(relative number)动态相对指标(dynamic relative indicator)发展速度(speed of development)增长速度(speed of growth)增长量(growth amount)百分点(percentage point)计划完成相对指标(relative indicator of fulfilling plan)⽐较相对指标(comparison relative indicator)结构相对指标(structural relative indicator)强度相对指标(intensity relative indicator)基期(base period)报告期(given period)分组(classification)(grouping)统计分组(statistical classification)(statistical grouping)组(class)(group)分组设计(class divisible design)(group divisible design)互斥性(mutually exclusive)包容性(hold)分组标志(classification character)(grouping character)按品质标志分组(classification by qualitative character)(grouping by qualitative character)按数量标志分组(classification by numerical indication)(grouping by numerical indication)离散型分组标志(discrete classification 
character)(discrete grouping character)连续型分组标志(continuous classification character)(continuous grouping character)单项式分组设计(single-valued class divisible design)(single-valued group divisible design)组距式分组设计(class interval divisible design)(group interval divisible design)组界(class boundary)(group boundary)频数(frequency)(frequency number)频率(frequency)组距(class interval)(group interval)组限(class limit)(group limit)下限(lower limit)上限(upper limit)组中值(class mid-value)(group mid-value)开⼝组(open class)(open-end class)(open-end group)开⼝式分组(open-end grouping)等距式分组设计(equal class interval divisible design)(equal group interval divisible design)不等距分组设计(unequal class interval divisible design)(unequal group interval divisible design)调查⽅案(survey plan)抽样调查(sample survey)有限总体概率抽样(probability sampling in finite populations)抽样单位(sampling unit)个体抽样(elements sampling)等距抽样(systematic sampling)整群抽样(cluster sampling)放回抽样(sampling with replacement)不放回抽样(sampling without replacement)分层抽样(stratified sampling)概率样本(probability sample)样本统计量(sample statistic)估计量(estimator)估计值(estimate)⽆偏估计量(unbiased estimator)有偏估计量(biased estimator)偏差(bias)精度(degree of precision)估计量的⽅差(variance of estimates)标准误(standard error)准确度(degree of accuracy)均⽅误差(mean square error)估计(estimation)点估计(point estimation)区间估计(interval estimate)置信区间(confidence interval)置信下限(confidence lower limit)置信上限(confidence upper limit)置信概率(confidence probability)总体均值(population mean)总体总值(population total)总体⽐例(population proportion)总体⽐率(population ratio)简单随机抽样(simple random sampling)简单随机样本(simple random sample)研究域(domains of study)⼦总体(subpopulations)抽样框(frame)估计量的估计⽅差(estimated variance of estimates)第4章频数(frequency)(frequency number)频率(frequency)分布列(distribution series)经验分布(empirical distribution)理论分布(theoretical distribution)品质型数据分布列(qualitative data distribution series)数量型数据分布列(quantitative data distribution series)单项式数列(single-valued distribution series)组距式数列(class interval distribution series)频率密度(frequency density)分布棒图(bar graph of 
distribution)分布直⽅图(histogram of distribution)分布折线图(polygon of distribution)累积分布数列(cumulative distribution series)累积分布图(polygon of cumulative distribution)位置特征(location characteristic)位置特征数(location characteristic number)平均值、均值(mean)平均数(average number)权数(weight number)加权算术平均数(weighted arithmetic average)加权算术平均值(weighted arithmetic mean)简单算术平均数(simple arithmetic average)简单算术平均值(simple arithmetic mean)加权调和平均数(weighted harmonic average)加权调和平均值(weighted harmonic mean)简单调和平均数(simple harmonic average)简单调和平均值(simple harmonic mean)加权⼏何平均数(weighted geometric average)加权⼏何平均值(weighted geometric mean)简单⼏何平均数(simple geometric average)简单⼏何平均值(simple geometric mean)绝对数数据(absolute number data)⽐率类型数据(ratio level data)中位数(median)众数(mode)耐抗性(resistance)散布特征(scatter characteristic)散布特征数(scatter characteristic number)极差、全距(range)四分位差(quartile deviation)四分间距(inter-quartile range)上四分位数(upper quartile)下四分位数(lower quartile)在外截断点(outside cutoffs)平均差(mean deviation)⽅差(variance)标准差(standard deviation)变异系数(variable coefficient)第5章随机样本(random sample)简单随机样本(simple random sample)参数估计(parameter estimation)矩(moment)矩估计(moment estimation)修正样本⽅差(modified sample variance)极⼤似然估计(maximum likelihood estimate)参数空间(space of paramete)似然函数(likelihood function)似然⽅程(likelihood equation)点估计(point estimation)区间估计(interval estimation)假设检验(test of hypothesis)原假设(null hypothesis)备择假设(alternative hypothesis)检验统计量(statistic for test)观察到的显著⽔平(observed significance level)显著性检验(test of significance)显著⽔平标准(critical of significance level)临界值(critical value)拒绝域(rejection region)接受域(acceptance region)临界值检验规则(test regulation by critical value)双尾检验(two-tailed tests)显著⽔平(significance level)单尾检验(one-tailed tests)第⼀类错误(first-kind error)第⼀类错误概率(probability of first-kind error)第⼆类错误(second-kind error)第⼆类错误概率(probability of second-kind error)P_值(P_value)P_值检验规则(test regulation by P_value)经典统计学(classical statistics)贝叶斯统计学(Bayesian statistics)第6章⽅差分析(analysis of variance,ANOVA)⽅差分析恒等式(analysis of variance identity 
equation)单因子方差分析(one-factor analysis of variance)双因子方差分析(two-factor analysis of variance)总变差平方和(total variation sum of squares)总平方和SST(total sum of squares)组间变差平方和(among class(group) variation sum of squares),回归平方和SSR(regression sum of squares)组内变差平方和(within variation sum of squares)误差平方和SSE(error sum of squares)皮尔逊χ2统计量(Pearson's chi-square statistic)分布拟合(fitting of distribution)分布拟合检验(test of fitting of distribution)皮尔逊χ2检验(Pearson's chi-square test)列联表(contingency table)独立性检验(test of independence)数量变量(quantitative variable)属性变量(qualitative variable)对数线性模型(loglinear model)回归分析(regression analysis)随机项(random term)随机扰动项(random disturbance term)回归系数(regression coefficient)总体一元线性回归模型(population linear regression model with a single regressor)总体多元线性回归模型(population multiple linear regression model)完全多重共线性(perfect multicollinearity)遗漏变量(omitted variable)遗漏变量偏差(omitted variable bias)面板数据(panel data)面板数据回归(panel data regressions)工具变量(instrumental variable)工具变量回归(instrumental variable regressions)两阶段最小平方估计量(two stage least squares estimator)随机化实验(randomized experiment)准实验(quasi-experiment)自然实验(natural experiment)普通最小平方准则(ordinary least squares criterion)最小平方准则(least squares criterion)普通最小平方(ordinary least squares,OLS)最小平方(least squares)最小平方法(least squares method)第7章简单总体(simple population)复合总体(combined population)个体指数:价比(price relative),量比(quantity relative)总指数(general index)(combined index)统计指数(statistical indices)类指数、组指数(class index)动态指数(dynamic index)比较指数(comparison index)计划完成指数(index of fulfilling plan)数量指标指数(quantitative indicator index)物量指数(quantitative index)(quantity index)(quantum index)质量指标指数(qualitative indicator index)价格指数、物价指数(price index)综合指数(aggregative index)(composite index)拉斯贝尔指数(Laspeyres' index)派许指数(Paasche's index)阿斯·杨指数(Arthur Young's index)马歇尔—埃奇沃斯指数(Marshall-Edgeworth's index)理想指数(ideal index)加权综合指数(weighted aggregate index)平均指数(average index)加权算术平均指数(weighted arithmetic average index)加权调和平均指数(weighted harmonic average
index)因子互换(factor-reversal)购买力平价(purchasing power parity,PPP)环比指数(chain index)定基指数(fixed base index)连环替代因素分析法(factor analysis by chain substitution method)不变结构指数、固定构成指数(index of invariable construction)结构指数、结构影响指数(structural index)第8章截面数据(cross-section data)时序数据(time series data)动态数据(dynamic data)时间数列(time series)发展水平(level of development)基期水平(level of base period)报告期水平(level of given period)平均发展水平(average level of development)序时平均数(chronological average)增长量(growth quantity)平均增长量(average growth amount)发展速度(speed of development)增长速度(speed of growth)增长率(growth rate)环比发展速度(chained speed of development)定基发展速度(fixed base speed of development)环比增长速度(chained growth speed)定基增长速度(fixed base growth speed)平均发展速度(average speed of development)平均增长速度(average speed of growth)平均增长率(average growth rate)算术图(arithmetic chart)半对数图(semilog graph)时间数列散点图(scatter diagram of time series)时间数列折线图(broken line graph of time series)水平型时间数列(horizontal patterns in time series data)趋势型时间数列(trend patterns in time series data)季节型时间数列(season patterns in time series data)趋势—季节型时间数列(trend-season patterns in time series data)一次指数平滑平均数(simple exponential smoothing mean)一次指数平滑法(simple exponential smoothing method)最小平方法(least squares method)最小平方准则(least squares criterion)原资料平均法(average of original data method)季节模型(seasonal model)(seasonal pattern)长期趋势(secular trends)季节变动(变差)(seasonal variation)季节波动(seasonal fluctuations)不规则变动(变差)(erratic variation)不规则波动(random fluctuations)时间数列加法模型(additive model of time series)时间数列乘法模型(multiplicative model of time series)。
Bayesian Mixed Effects Models

1. Introduction

The Bayesian mixed effects model is a statistical modeling approach commonly used to analyze data with hierarchical structure and repeated measurements.
The model combines Bayesian statistics with the ideas of mixed effects modeling: it can model both individual-level and group-level differences, and it estimates parameters through the posterior distribution.
This article introduces the basic concepts of Bayesian mixed effects models, the steps of building them, and their application in real data analysis.
It also discusses the model's strengths and limitations, and points to related resources for readers who want to learn and explore further.
2. Foundations of Bayesian Statistics

Before introducing the Bayesian mixed effects model, we first review the basic concepts of Bayesian statistics.

2.1 Bayes' Formula

Bayes' formula is the core idea of Bayesian statistics: it describes how beliefs about a parameter are updated from observed data.
Let θ be the parameter to be estimated and x the observed data. By Bayes' formula, the posterior probability can be written as

    P(θ|x) = P(x|θ) P(θ) / P(x)

where P(x|θ) is the likelihood function, the probability of observing the data x given the parameter θ; P(θ) is the prior probability, representing the beliefs about θ held before the data are seen; and P(x) is the marginal probability of observing the data x.
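As a minimal numerical illustration of the update above (a hypothetical coin-bias example, not taken from the original text), the posterior over a discrete grid of θ values can be computed directly:

```python
# Posterior over a discrete grid of parameter values via Bayes' formula.
# Hypothetical example: theta = probability of heads, data = 7 heads in 10 flips.
from math import comb

thetas = [i / 10 for i in range(11)]          # candidate parameter values
prior = [1 / len(thetas)] * len(thetas)       # uniform prior P(theta)

def likelihood(theta, heads=7, n=10):
    """Binomial likelihood P(x | theta)."""
    return comb(n, heads) * theta**heads * (1 - theta)**(n - heads)

unnorm = [likelihood(t) * p for t, p in zip(thetas, prior)]
evidence = sum(unnorm)                        # P(x), the normalizing constant
posterior = [u / evidence for u in unnorm]

best = thetas[posterior.index(max(posterior))]
print(best)  # posterior mode sits at theta = 0.7
```

With a uniform prior the posterior is proportional to the likelihood, so the mode lands on the empirical frequency 7/10; a non-uniform prior would pull it away from that value.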
2.2 Bayesian Models

Bayesian statistics treats parameters as random variables and introduces prior distributions to describe the uncertainty about them.
In a Bayesian model, we can compute the posterior distribution from the likelihood function and the prior distribution, and thereby obtain more accurate inferences about the parameters.
Common Bayesian models include linear regression models, mixed effects models, and others.
Among them, the mixed effects model is a method widely applied in multilevel data analysis.
3. Foundations of Mixed Effects Models

The mixed effects model, also called the hierarchical linear model, is a statistical modeling approach for analyzing data with hierarchical structure and repeated measurements.

3.1 Model Structure

A mixed effects model divides the data into different levels and assumes that each level has its own random effects.
The basic structure of the model can be written as

    y_ij = X_ij β + Z_ij b_i + ε_ij

where y_ij is the observation of the i-th individual at the j-th level; X_ij and Z_ij are the design matrices for the fixed and random effects, respectively; β is the fixed-effect coefficient vector; b_i is the random effect of the i-th individual; and ε_ij is the error term.
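A minimal simulation of this structure (group counts, coefficient values and noise levels are illustrative assumptions, not from the original text) shows how group-level random intercepts b_i shift observations around the fixed-effect line:

```python
# Simulate y_ij = X_ij * beta + b_i + eps_ij with a random intercept per group.
import numpy as np

rng = np.random.default_rng(0)
n_groups, n_per_group = 5, 20
beta = 2.0                                   # fixed-effect slope (assumed)
b = rng.normal(0.0, 1.5, size=n_groups)      # random intercepts b_i
x = rng.uniform(0, 1, size=(n_groups, n_per_group))
eps = rng.normal(0.0, 0.1, size=(n_groups, n_per_group))
y = beta * x + b[:, None] + eps              # broadcast b_i over each group

# Within-group centering removes b_i, so a pooled slope estimate recovers beta.
xc = x - x.mean(axis=1, keepdims=True)
yc = y - y.mean(axis=1, keepdims=True)
slope = (xc * yc).sum() / (xc ** 2).sum()
print(round(slope, 2))
```

The centering step is only a sanity check that the fixed effect is recoverable; a full Bayesian treatment would instead place priors on β, the variance of b_i and the error variance, and sample the joint posterior.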
APPLICATION OF BAYESIAN REGULARIZED BP NEURAL NETWORK MODEL FOR TREND ANALYSIS, ACIDITY AND CHEMICAL COMPOSITION OF PRECIPITATION IN NORTH CAROLINA

MIN XU(1), GUANGMING ZENG(1,2,*), XINYI XU(1), GUOHE HUANG(1,2), RU JIANG(1) and WEI SUN(2)

(1) College of Environmental Science and Engineering, Hunan University, Changsha 410082, China; (2) Sino-Canadian Center of Energy and Environment Research, University of Regina, Regina, SK, S4S 0A2, Canada
(* author for correspondence, e-mail: zgming@, ykxumin@, Tel.: 86-731-882-2754, Fax: 86-731-882-3701)

(Received 1 August 2005; accepted 12 December 2005)

Abstract. Bayesian regularized back-propagation neural network (BRBPNN) was developed for trend analysis, acidity and chemical composition of precipitation in North Carolina using precipitation chemistry data in NADP. This study included two BRBPNN application problems: (i) the relationship between precipitation acidity (pH) and other ions (NH4+, NO3−, SO42−, Ca2+, Mg2+, K+, Cl− and Na+) was performed by BRBPNN and the achieved optimal network structure was 8-15-1. Then the relative importance index, obtained through the sum of square weights between each input neuron and the hidden layer of BRBPNN (8-15-1), indicated that the ions' contribution to the acidity declined in the order of NH4+ > SO42− > NO3−; and (ii) investigations were also carried out using BRBPNN with respect to temporal variation of monthly mean NH4+, SO42− and NO3− concentrations, and their optimal architectures for the 1990–2003 data were 4-6-1, 4-6-1 and 4-4-1, respectively. All the estimated results of the optimal BRBPNNs showed that the relationship between the acidity and other ions, or that between NH4+, SO42−, NO3− concentrations and precipitation amount and the time variable, was obviously nonlinear, since in contrast to multiple linear regression (MLR), BRBPNN was clearly better, with less error in prediction and higher correlation coefficients. Meanwhile, results also exhibited that BRBPNN was of automated regularization parameter selection capability and may ensure the excellent fitting and
robustness. Thus, this study laid the foundation for the application of BRBPNN in the analysis of acid precipitation.

Keywords: Bayesian regularized back-propagation neural network (BRBPNN), precipitation, chemical composition, temporal trend, the sum of square weights

Water, Air, and Soil Pollution (2006) 172: 167–184. DOI: 10.1007/s11270-005-9068-8. © Springer 2006

1. Introduction

Characterization of the chemical nature of precipitation is currently under considerable investigation due to the increasing concern about man's atmospheric inputs of substances and their effects on land, surface waters, vegetation and materials. Particularly, temporal trend and chemical composition have been the subject of extensive research in North America, Canada and Japan in the past 30 years (Zeng and Flopke, 1989; Khawaja and Husain, 1990; Lim et al., 1991; Sinya et al., 2002; Grimm and Lynch, 2005).

Linear regression (LR) methods such as multiple linear regression (MLR) have been widely used to develop the model of temporal trend and chemical composition analysis in precipitation (Sinya et al., 2002; George, 2003; Aherne and Farrell, 2002; Christopher et al., 2005; Migliavacca et al., 2004; Yasushi et al., 2001). However, LR is an "ill-posed" problem in statistics and sometimes results in the instability of the models when trained with noisy data, besides the requirement of subjective decisions to be made on the part of the investigator as to the likely functional (e.g. nonlinear) relationships among variables (Burden and Winkler, 1999; 2000).
On the other hand, recently there has been increasing interest in estimating the uncertainties and nonlinearities associated with impact prediction of atmospheric deposition (Page et al., 2004). Besides precipitation amount and human activities, such as local and regional land cover and emission sources, the actual role each plays in determining the concentration at a given location is unknown and uncertain (Grimm and Lynch, 2005). Therefore, it is of much significance that the model of temporal variation and precipitation chemistry is efficient, gives unambiguous models and doesn't depend upon any subjective decisions about the relationships among ionic concentrations.

In this study, we propose a Bayesian regularized back-propagation neural network (BRBPNN) to overcome MLR's deficiencies and investigate nonlinearity and uncertainty in acid precipitation. The network is trained through Bayesian regularized methods, a mathematical process which converts the regression into a well-behaved, "well-posed" problem. In contrast to MLR and traditional neural networks (NNs), BRBPNN performs better when the relationship between variables is nonlinear (Sovan et al., 1996; Archontoula et al., 2003) and generalizes better, because BRBPNN is of automated regularization parameter selection capability to obtain the optimal network architecture of the posterior distribution and avoid the over-fitting problem (Burden and Winkler, 1999; 2000). Thus, the main purpose of our paper is to apply the BRBPNN method to modeling the nonlinear relationship between the acidity and chemical compositions of precipitation and to improve the accuracy of the monthly ionic concentration model used to provide precipitation estimates. Both of them are helpful to predict precipitation variables and interpret mechanisms of acid precipitation.

2. Theories and Methods

2.1. THEORY OF BAYESIAN REGULARIZED BP NEURAL NETWORK

Traditional NN modeling was based on back-propagation, which was created by generalizing the Widrow-Hoff learning rule to
multiple-layer networks and nonlinear differentiable transfer functions. Commonly, a BPNN comprises three types of neuron layers: an input layer, one or several hidden layers and an output layer comprising one or several neurons. In most cases only one hidden layer is used (Figure 1) to limit the calculation time.

Figure 1. Structure of the neural network used. R = number of elements in input vector; S = number of hidden neurons; p is a vector of R input elements. The hidden layer computes a1 = tansig(IW{1,1} p + b1) and the output layer computes a2 = purelin(LW{2,1} a1 + b2). The network input to the transfer function tansig is n1 plus the bias b1; the network input to the transfer function purelin is n2 plus the bias b2. IW{1,1} is the input weight matrix and LW{2,1} is the layer weight matrix. a1 is the output of the hidden layer by the tansig transfer function and y (a2) is the network output.

Although BPNNs with biases, a sigmoid layer and a linear output layer are capable of approximating any function with a finite number of discontinuities (The MathWorks,), we select the tansig and purelin transfer functions of MATLAB to improve the efficiency (Burden and Winkler, 1999; 2000).

Bayesian methods are the optimal methods for solving learning problems of neural networks, which can automatically select the regularization parameters and integrate the high convergence rate of traditional BPNN with the prior information of Bayesian statistics (Burden and Winkler, 1999; 2000; Jouko and Aki, 2001; Sun et al., 2005). To improve the generalization ability of the network, the regularized training objective function F is denoted as:

    F = α E_w + β E_D    (1)

where E_w is the sum of squared network weights, E_D is the sum of squared network errors, and α and β are objective function parameters (regularization parameters). Setting the correct values for the objective parameters is the main problem with implementing regularization, and their relative size dictates the emphasis for
training. Specifically, in this study the mean square errors (MSE) are chosen as a measure of the network training approximation. Set a desired neural network with a training data set D = {(p1, t1), (p2, t2), ..., (pi, ti), ..., (pn, tn)}, where pi is an input to the network and ti is the corresponding target output. As each input is applied to the network, the network output is compared to the target, and the error is calculated as the difference between the target output and the network output. Then we want to minimize the average of the sum of these errors (namely, MSE) through the iterative network training.

    MSE = (1/n) Σ_{i=1..n} e(i)^2 = (1/n) Σ_{i=1..n} (t(i) − a(i))^2    (2)

where n is the number of samples, e(i) is the error and a(i) is the network output.

In the Bayesian framework the weights of the network are considered random variables and the posterior distribution of the weights can be updated according to Bayes' rule:

    P(w | D, α, β, M) = P(D | w, β, M) P(w | α, M) / P(D | α, β, M)    (3)

where M is the particular neural network model used and w is the vector of network weights. P(w | α, M) is the prior density, which represents our knowledge of the weights before any data are collected. P(D | w, β, M) is the likelihood function, which is the probability of the data occurring given the weights w.
P(D | α, β, M) is a normalization factor, which guarantees that the total probability is 1. Thus, we have

    Posterior = (Likelihood × Prior) / Evidence    (4)

Likelihood: A network with a specified architecture M and weights w can be viewed as making predictions about the target output as a function of input data in accordance with the probability distribution:

    P(D | w, β, M) = exp(−β E_D) / Z_D(β)    (5)

where Z_D(β) is the normalization factor:

    Z_D(β) = (π/β)^(n/2)    (6)

Prior: A prior probability is assigned to alternative network connection strengths w, written in the form:

    P(w | α, M) = exp(−α E_w) / Z_w(α)    (7)

where Z_w(α) is the normalization factor:

    Z_w(α) = (π/α)^(K/2)    (8)

Finally, the posterior probability of the network connections w is:

    P(w | D, α, β, M) = exp(−(α E_w + β E_D)) / Z_F(α, β) = exp(−F(w)) / Z_F(α, β)    (9)

Setting regularization parameters α and β. The regularization parameters α and β determine the complexity of the model M. Now we apply Bayes' rule to optimize the objective function parameters α and β. Here, we have

    P(α, β | D, M) = P(D | α, β, M) P(α, β | M) / P(D | M)    (10)

If we assume a uniform prior density P(α, β | M) for the regularization parameters α and β, then maximizing the posterior is achieved by maximizing the likelihood function P(D | α, β, M). We also notice that the likelihood function P(D | α, β, M) on the right side of Equation (10) is the normalization factor for Equation (3).
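In this framework the hyperparameters α and β are re-estimated from E_w, E_D and the Hessian of F (the re-estimation formulas of Equation (14)). A toy arithmetic sketch of one such update (all matrices and values below are invented for illustration) is:

```python
# One Bayesian re-estimation step for the regularization parameters:
# gamma = K - 2*alpha*tr(H^-1), alpha_new = gamma/(2*E_w),
# beta_new = (n - gamma)/(2*E_D). All values are illustrative only.
import numpy as np

K, n = 6, 100                      # total parameters, training samples (assumed)
E_w, E_D = 0.8, 2.5                # sum of squared weights / errors (assumed)
alpha, beta = 0.1, 1.0             # current regularization parameters
H = beta * 4.0 * np.eye(K) + alpha * 2.0 * np.eye(K)   # stand-in Hessian of F

gamma = K - 2.0 * alpha * np.trace(np.linalg.inv(H))   # effective parameters
alpha_new = gamma / (2.0 * E_w)
beta_new = (n - gamma) / (2.0 * E_D)
print(gamma, alpha_new, beta_new)
```

In the actual algorithm H would come from the Gauss-Newton approximation inside Levenberg-Marquardt training rather than the diagonal stand-in used here, and the step would be iterated to convergence.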
According to Foresee and Hagan (1997), we have:

    P(D | α, β, M) = P(D | w, β, M) P(w | α, M) / P(w | D, α, β, M) = Z_F(α, β) / (Z_w(α) Z_D(β))    (11)

In Equation (11), the only unknown part is Z_F(α, β). Since the objective function has the shape of a quadratic in a small area surrounding the minimum point, we can expand F(w) around the minimum point of the posterior density w_MP, where the gradient is zero. Solving for the normalizing constant yields:

    Z_F(α, β) = (2π)^(K/2) det^(−1/2)(H) exp(−F(w_MP))    (12)

where H is the Hessian matrix of the objective function:

    H = β ∇²E_D + α ∇²E_w    (13)

Substituting Equation (12) into Equation (11), we can find the optimal values for α and β at the minimum point by taking the derivative of the log of Equation (11) with respect to each parameter and setting them equal to zero. We have:

    α_MP = γ / (2 E_w(w_MP))  and  β_MP = (n − γ) / (2 E_D(w_MP))    (14)

where γ = K − 2 α_MP tr(H_MP^−1) is the number of effective parameters, n is the number of samples and K is the total number of parameters in the network. The number of effective parameters is a measure of how many parameters in the network are effectively used in reducing the error function. It can range from zero to K. After training, we need to do the following checks: (i) If γ is very close to K, the network may be not large enough to properly represent the true function. In this case, we simply add more hidden neurons and retrain the network to make a larger network. If the larger network has the same final γ, then the smaller network was large enough; and (ii) if the network is sufficiently large, then a second larger network will achieve comparable values for γ.

The Bayesian optimization of the regularization parameters requires the computation of the Hessian matrix of the objective function F(w) at the minimum point w_MP. To overcome this problem, the Gauss-Newton approximation to the Hessian matrix has been proposed by Foresee and Hagan (1997). Here are the steps required for Bayesian optimization of the regularization parameters: (i) Initialize α, β and the weights. After the first training step, the objective
function parameters will recover from the initial setting; (ii) Take one step of the Levenberg-Marquardt algorithm to minimize the objective function F(w); (iii) Compute γ using the Gauss-Newton approximation to the Hessian matrix in the Levenberg-Marquardt training algorithm; (iv) Compute new estimates for the objective function parameters α and β; and (v) iterate steps ii through iv until convergence.

2.2. WEIGHT CALCULATION OF THE NETWORK

Generally, one of the difficult research topics of the BRBPNN model is how to obtain effective information from a neural network. To a certain extent, the network weights and biases can reflect the complex nonlinear relationships between the input variables and the output variable. When the output layer only involves one neuron, the influences of the input variables on the output variable are directly presented in the influences of the input parameters upon the network. Simultaneously, in case of the connection along the paths from the input layer to the hidden layer and along the paths from the hidden layer to the output layer, it is attempted to study how the input variables react to the hidden layer, which can be considered as the impacts of the input variables on the output variable. According to Joseph et al. (2003), the relative importance of an individual input variable upon the output variable can be expressed as:

    I_i = Σ_{j=1..S} |w_ji| / Σ_{i=1..Num} Σ_{j=1..S} |w_ji|    (15)

where w_ji is the connection weight from input neuron i to hidden neuron j, |·| (ABS) denotes the absolute value, and Num and S are the numbers of input variables and hidden neurons, respectively.

2.3. MULTIPLE LINEAR REGRESSION

This study attempts to ascertain whether BRBPNNs are preferred to the MLR models widely used in the past for temporal variation of acid precipitation (Buishand et al.,
natural logarithm of the monthly mean concentration(mg/L)in precipitation for the i th month.The term a0represents the intercept.P i represents the natural logarithm of the precipitation amount(ml)for the i th month.The term bi,where i(month) goes from1to12N,represents the monotonic trend in concentration in precipitation over time.To facilitate the estimation of the coefficients a0,a,b,c andφfollowing Buishand et al.(1988)and John et al.(2000),the reparameterized MLR model was established and thefinal form of Equation(16)becomes:Y i=a0+αcos(2πi/12)+βsin(2πi/12)+bi+cP i+e i i=1,2,...12N(17)whereα=a cosϕandβ=a sinϕ.a0,α,β,b and c of the regression coefficients in Equation(17)are estimated using ordinary least squares method.2.4.D ATA SET SELECTIONPrecipitation chemistry data used are derived from NADP(the National At-mospheric Deposition Program),a nationwide precipitation collection network founded in1978.Monthly precipitation information of nine species(pH,NH+4, NO−3,SO2−4,Ca2+,Mg2+,K+,Cl−and Na+)and precipitation amount in1990–2003are collected in Clinton Crops Research Station(NC35),North Carolina, rmation on the data validation can be found at the NADP website: .The BRBPNN advantages are that they are able to produce models that are robust and well matched to the data.At the end of training,a Bayesian regularized neural network has the optimal generalization qualities and thus there is no need for a test set(MacKay,1992;1995).Husmeier et al.(1999)has also shown theoretically and by example that in a Bayesian regularized neural network,the training and test set performance do not differ significantly.Thus,this study needn’t select the test set and only the training set problem remains.i.Training set of BRBPNN between precipitation acidity and other ions With regard to the relationship between precipitation acidity and other ions,the input neurons are taken from monthly concentrations of NH+4,NO−3,SO2−4,Ca2+, Mg2+,K+,Cl−and Na+.And precipitation acidity(pH)is 
regarded as the output of the network.

ii. Training set of BRBPNN for temporal trend analysis

Based on the weight calculations of BRBPNN between precipitation acidity and other ions, this study will simulate the temporal trend of three main ions using BRBPNN and MLR, respectively. In Equation (17) of MLR, we allow a0, α, β, b and c for the estimated coefficients and i, P_i, cos(2πi/12), and sin(2πi/12) for the independent variables. To try to achieve satisfactory fitting results of the BRBPNN model, we similarly employ four unknown items (i, P_i, cos(2πi/12), and sin(2πi/12)) as the input neurons of BRBPNN, the availability of which will be proved in the following.

2.5. SOFTWARE AND METHOD

MLR is carried out through SPSS 11.0 software. BRBPNN is debugged in the neural network toolbox of MATLAB 6.5 for the algorithm described in Section 2.1. Concretely, the BRBPNN algorithm is implemented through the "trainbr" network training function in the MATLAB toolbox, which updates the weights and biases according to Levenberg-Marquardt optimization. The function minimizes both squared errors and weights, provides the number of network parameters being effectively used by the network, and then determines the correct combination so as to produce a network that generalizes well. The training is stopped if the maximum number of epochs is reached, the performance has been minimized to a suitable small goal, or the performance gradient falls below a suitable target. Each of these targets and goals is set at the default values of the MATLAB implementation if we don't want to set them artificially. To eliminate the guesswork required in determining the optimum network size, the training should be carried out many times to ensure convergence.

3. Results and Discussions

3.1. CORRELATION COEFFICIENTS OF PRECIPITATION IONS

Table I shows the correlation coefficients for the ion components and precipitation amount in NC35, which illustrates that the acidity of precipitation results from the integrative interactions of anions and cations and
mainly depends upon four species, i.e. SO42−, NO3−, Ca2+ and NH4+. Especially, pH is strongly correlated with SO42− and NO3−, and their correlation coefficients are −0.708 and −0.629, respectively. In addition, it can be found that all the ionic species have a negative correlation with precipitation amount, which accords with the theory that the higher the precipitation amount, the lower the ionic concentration (Li, 1999).

3.2. RELATIONSHIP BETWEEN PH AND CHEMICAL COMPOSITIONS

3.2.1. BRBPNN Structure and Robustness

For the BRBPNN of the relationship between pH and chemical compositions, the number of input neurons is determined based on that of the selected input variables, comprising eight ions of NH4+, NO3−, SO42−, Ca2+, Mg2+, K+, Cl− and Na+, and the output neuron only includes pH.

TABLE I. Correlation coefficients of precipitation ions

                Ca2+    Mg2+    K+      Na+     NH4+    NO3−    Cl−     SO42−   pH      Precip. amount
Ca2+            1.000   0.462   0.548   0.349   0.449   0.627   0.349   0.654   -0.342  -0.369
Mg2+                    1.000   0.381   0.980   0.051   0.132   0.980   0.123    0.006  -0.303
K+                              1.000   0.320   0.248   0.226   0.327   0.316   -0.024  -0.237
Na+                                     1.000  -0.031   0.021   0.992   0.021    0.074  -0.272
NH4+                                            1.000   0.733   0.011   0.610   -0.106  -0.140
NO3−                                                    1.000   0.050   0.912   -0.629  -0.258
Cl−                                                             1.000   0.049    0.075  -0.265
SO42−                                                                   1.000   -0.708  -0.245
pH                                                                               1.000   0.132
Precip. amount                                                                           1.000

Generally, the number of hidden neurons for a traditional BPNN is roughly estimated through investigating the effects of the repeatedly trained network. But BRBPNN can automatically search the optimal network parameters in the posterior distribution (MacKay, 1992; Foresee and Hagan, 1997). Based on the algorithms of Section 2.1 and Section 2.5, the "trainbr" network training function is used to implement BRBPNNs with a tansig hidden layer and a purelin output layer. To acquire the optimal architecture, the BRBPNNs are trained independently 20 times to eliminate spurious effects caused by the random set of initial weights, and the network training is stopped when the maximum number of repetitions reaches 3000 epochs. Add the number
of hidden neurons (S) is then increased from 1 to 20, and the BRBPNNs are retrained until the network performance (the number of effective parameters, MSE, E_W and E_D, etc.) remains approximately the same. In order to determine the optimal BRBPNN structure, Figure 2 summarizes the results of training many different networks of the 8-S-1 architecture for the relationship between pH and the chemical constituents of precipitation. It describes how MSE and the number of effective parameters change along with the number of hidden neurons (S). When S is less than 15, the number of effective parameters becomes larger and MSE becomes smaller as S increases. It is noted, however, that when S is larger than 15, MSE and the number of effective parameters remain roughly constant for any network. This is the minimum number of hidden neurons required to properly represent the true function. From Figure 2, the number of hidden neurons (S) can be increased up to 20, but MSE and the number of effective parameters are still roughly equal to those of the network with 15 hidden neurons, which suggests that BRBPNN is robust. Therefore, using the BRBPNN technique, we can determine the optimal network size of 8-15-1.

176 MIN XU ET AL.

Figure 2. Changes of optimal BRBPNNs along with the number of hidden neurons.

Figure 3. Comparison of calculations between BRBPNN (8-15-1) and MLR.

3.2.2. Prediction Results Comparison
Figure 3 illustrates the output response of the BRBPNN (8-15-1), with a quite good fit. Obviously, the calculations of BRBPNN (8-15-1) have a much higher correlation coefficient (R² = 0.968) and are more concentrated near the isoline than those of MLR.
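As an illustrative sketch (not the paper's code), the objective that a Bayesian-regularization trainer such as "trainbr" minimizes combines the data-misfit term E_D with the weight-decay term E_W as F = βE_D + αE_W. The weights, residuals and hyperparameter values below are invented for illustration:

```python
import numpy as np

# Sketch of the MacKay-style Bayesian-regularization objective
# F = beta * E_D + alpha * E_W, where E_D is the (half) sum of squared
# residuals and E_W the (half) sum of squared network weights.
# All numbers below are illustrative, not taken from the paper.

def regularized_objective(weights, residuals, alpha, beta):
    E_W = 0.5 * np.sum(weights ** 2)    # weight-decay (complexity) term
    E_D = 0.5 * np.sum(residuals ** 2)  # data-misfit term
    return beta * E_D + alpha * E_W, E_D, E_W

w = np.array([0.5, -1.0, 2.0])          # hypothetical network weights
r = np.array([0.1, -0.2, 0.05, 0.0])    # hypothetical residuals
F, E_D, E_W = regularized_objective(w, r, alpha=0.01, beta=1.0)
```

Sweeping the hidden-layer size S while watching F, MSE and the number of effective parameters plateau is exactly the model-selection loop described above.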
In contrast with the previous relationships between the acidity and other ions obtained by MLR, most of the average regression R² values reach less than 0.769 (Yu et al., 1998; Baez et al., 1997; Li, 1999). Additionally, Figures 2 and 3 show that any BRBPNN of the 8-S-1 architecture has better approximating qualities. Even if S is equal to 1, the MSE of BRBPNN (8-1-1) is much smaller than, and superior to, that of MLR. Thus, we can judge that there are strong nonlinear relationships between the acidity and the other ion concentrations, which cannot be explained by MLR, and that it may be quite reasonable to apply a neural network methodology to interpret the nonlinear mechanisms between the acidity and the other input variables.

Table II. Sum of square weights (SSW) and the relative importance (I) from input neurons to the hidden layer
        Ca²⁺    Mg²⁺    K⁺      Na⁺     NH₄⁺     NO₃⁻    Cl⁻     SO₄²⁻
SSW     2.9589  2.7575  1.7417  0.8805  10.4063  4.0828  1.3771  5.2050
I (%)   10.06   9.38    5.92    2.99    35.38    13.88   4.68    17.70

3.2.3. Weight Interpretation for the Acidity of Precipitation
To interpret the weights of the optimal BRBPNN (8-15-1), Equation (15) is used to evaluate the significance of each input variable; the calculations are illustrated in Table II. Among the eight inputs of BRBPNN (8-15-1), NH₄⁺, SO₄²⁻, NO₃⁻, Ca²⁺ and Mg²⁺ have comparatively greater impacts upon the network, which also indicates that these five factors are of more significance for the acidity. Table II shows that NH₄⁺ contributes by far the most (35.38%) to the acidity prediction, while SO₄²⁻ and NO₃⁻ contribute 17.70% and 13.88%, respectively. On the other hand, Ca²⁺ and Mg²⁺ contribute 10.06% and 9.38%, respectively.

3.3. Temporal Trend Analysis

3.3.1. Determination of BRBPNN Structure
Universally, there have always been low fitting results in the analysis of temporal trend estimation in precipitation. For example, the regression R² values of NH₄⁺ and NO₃⁻ for the Chesapeake Bay Watershed in Grimm and Lynch (2005) are 0.3148 and 0.4940, and the R² values of SO₄²⁻, NH₄⁺ and NO₃⁻ for Japan in Sinya et al. (2002) are 0.4205,
0.4323 and 0.4519, respectively. This study therefore also applies BRBPNN to estimate the temporal trends of precipitation chemistry. According to the weight results, we select NH₄⁺, SO₄²⁻ and NO₃⁻ for temporal trend prediction using BRBPNN. The four items (i, P_i, cos(2πi/12) and sin(2πi/12)) in Equation (17) are taken as the input neurons of the BRBPNNs. Specially, two periods (i.e. 1990–1996 and 1990–2003) of input variables for the NH₄⁺ temporal trend are selected so as to compare with the past MLR results of the NH₄⁺ trend analysis in 1990–1996 (John et al., 2000). Similar to Figure 2, with 20 independent training runs and a maximum of 3000 epochs, Figure 4 summarizes the results of training many different networks of the 4-S-1 architecture to approximate the temporal variation of the three ions, and shows how MSE and the number of effective parameters change along with the number of hidden neurons (S). It is found that MSE and the number of effective parameters converge and stabilize as S gradually increases.

Figure 4. Changes of optimal BRBPNNs along with the number of hidden neurons for different ions. *a: the period of 1990–2003; b: the period of 1990–1996.

For the 1990–2003 data, when the number of hidden neurons (S) is increased up to 10, we find that the minimum numbers of hidden neurons required to properly represent the accurate function and achieve satisfactory results are at least 6, 6 and 4 for the trend analyses of NH₄⁺, SO₄²⁻ and NO₃⁻, respectively. Thus, the best BRBPNN structures for NH₄⁺, SO₄²⁻ and NO₃⁻ are 4-6-1, 4-6-1 and 4-4-1, respectively. Additionally, for the NH₄⁺ data in 1990–1996, the optimal network is BRBPNN (4-10-1), which differs from the BRBPNN (4-6-1) of the 1990–2003 data and also indicates that the optimal BRBPNN architecture changes when different data are input.

3.3.2. Comparison between BRBPNN and MLR
Figures 5–8 summarize the comparison of the trend analyses for the different ions using BRBPNN and MLR, respectively. In particular, for Figure 5, John et al.
(2000) examined the R² of NH₄⁺ through the MLR Equation (17): it is just 0.530 for the 1990–1996 data at NC35. But if the BRBPNN method is used to train the same 1990–1996 data, R² can reach 0.760. This shows that it is indispensable to consider the nonlinear characteristics in the NH₄⁺ trend analysis, which can make up for the insufficiencies of MLR to some extent. Figures 6–8 demonstrate the pervasive feasibility and applicability of the BRBPNN model in the temporal trend analyses of NH₄⁺, SO₄²⁻ and NO₃⁻; the model reflects nonlinear properties and is much more precise than MLR.

3.3.3. Temporal Trend Prediction
Using the above optimal BRBPNNs for the ion components, we can obtain the optimal predictions of the ionic temporal trends. Figures 9–12 illustrate the typical seasonal cycles of the monthly NH₄⁺, SO₄²⁻ and NO₃⁻ concentrations at NC35, in agreement with the trends of John et al. (2000).

Figure 5. Comparison of NH₄⁺ calculations between BRBPNN (4-10-1) and MLR in 1990–1996.
Figure 6. Comparison of NH₄⁺ calculations between BRBPNN (4-6-1) and MLR in 1990–2003.
Figure 7. Comparison of SO₄²⁻ calculations between BRBPNN (4-6-1) and MLR in 1990–2003.

Based on Figure 9, the estimated increase of the NH₄⁺ concentration in precipitation for the 1990–1996 data corresponds to an annual increase of approximately 11.12%, which is slightly higher than the 9.5% obtained by the MLR of John et al. (2000).
Here, we can confirm that the results of BRBPNN are more reasonable and objective because BRBPNN considers the nonlinear characteristics. In contrast with

Figure 8. Comparison of NO₃⁻ calculations between BRBPNN (4-4-1) and MLR in 1990–2003.
Figure 9. Temporal trend in the natural log (log NH₄⁺) of the NH₄⁺ concentration in 1990–1996. *Dots (o) represent monitored values. The solid and dashed lines represent the predicted values and the estimated trend, respectively, given by the BRBPNN method.
Figure 10. Temporal trend in the natural log (log NH₄⁺) of the NH₄⁺ concentration in 1990–2003. *Dots (o) represent monitored values. The solid and dashed lines represent the predicted values and the estimated trend, respectively, given by the BRBPNN method.
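The four trend regressors fed to the 4-S-1 networks in Equation (17) can be sketched as follows; the month indices and precipitation amounts below are invented for illustration, and the 12-month harmonic pair encodes the seasonal cycle:

```python
import math

# Hypothetical sketch of the four inputs used by the 4-S-1 trend
# networks: the month index i, the precipitation amount P_i, and the
# seasonal pair cos(2*pi*i/12), sin(2*pi*i/12).

def trend_features(month_index, precip):
    angle = 2.0 * math.pi * month_index / 12.0
    return [month_index, precip, math.cos(angle), math.sin(angle)]

# Three invented monthly precipitation amounts (mm), months 1..3.
rows = [trend_features(i, p) for i, p in enumerate([85.0, 60.2, 44.1], start=1)]
```

Each row is one input vector; a linear model on these same four regressors is the MLR baseline the paper compares against.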
Journal of Computer Research and Development, ISSN 1000-1239 / CN 11-1777/TP, 42(9): 1527–1532, 2005
Received 2003-11-13; revised 2004-04-05. Supported by the National Natural Science Foundation of China (60175011, 60375011), the Natural Science Foundation of Anhui Province (03042207), and the Outstanding Young Scientists Foundation of Anhui Province (04042044).

Research on an Explanation Function for Inference Conclusions with Bayesian Networks
Wang Ronggui, Zhang Yousheng, Gao Jun, and Peng Qingsong
(College of Computer and Information, Hefei University of Technology, Hefei 230009)
(wangrgui@mail.hf.ah.cn)

Abstract: In this paper, an explanation function for Bayesian networks is presented. With it, the degree, direction and paths of the effect of evidence on an inference conclusion can be explained. A necessity factor and a sufficiency factor are designed as measures to evaluate the degree of the evidence's effect on the posterior distribution. By qualitatively analyzing the characteristics of the network structure, the nodes relevant to the inference conclusion are found out. Based on those nodes, and combined with quantitative analysis, the sub-chains which make up the effect paths are also found out. Those sub-chains are evaluated to generate and explain the effect paths. Experimental results show the effectiveness of the explanation function.

Key words: Bayesian network; posterior distribution; 1-norm; effect degree; effect direction; effect path

CLC number: TP181

1. Introduction
The knowledge representation and inference algorithms of the Bayesian network model [1,2] are based on the joint probability distribution, so explanations cannot be generated automatically by translating an inference chain, as in systems such as MYCIN [3,4]. Research on explanation mechanisms for Bayesian network inference conclusions is becoming a hot topic [5]. The core problem in studying explanation methods for Bayesian network inference is to find an appropriate way to evaluate the degree and the paths of the effect of evidence on the posterior probability distribution of a node of interest [2,6]. This paper proposes a technique called the deletion method to study this problem and, on this basis, builds an explanation method for Bayesian network inference; experimental results verify the effectiveness of the method.

2. The Degree of the Evidence's Effect on the Conclusion
Bayesian network inference computes, given the states of a set of evidence nodes E, the probability distribution of any other (non-evidence) node X in the network, i.e. the posterior distribution P(X|E). We now propose a technique called the deletion method to analyze the degree of the evidence's effect on the inference conclusion: when examining the effect of one (or several) pieces of evidence on the conclusion, delete it (them) from the evidence set (i.e. treat it as a non-evidence node in the Bayesian network model), and then compute the change in the conclusion. For a particular non-evidence node X in the Bayesian network model
, the inference conclusion under the evidence set E is the posterior distribution P(X|E). To examine the effect of a piece of evidence L in E on P(X|E), delete L from E to obtain the set E−L, with the corresponding posterior distribution P(X|E−L). The 1-norm of the vector difference P(X|E) − P(X|E−L) is used to measure the difference between the two, denoted M(P(X|E), P(X|E−L)):

M(P(X|E), P(X|E−L)) = Σᵢ |pᵢ − qᵢ|,  (1)

where P(X = xᵢ|E) = pᵢ, P(X = xᵢ|E−L) = qᵢ, i = 1, 2, …, m, and m is the number of states of X. Let

η(L, X) = M(P(X|E), P(X|E−L)) / M(P(X|E), P(X)),

and call η(L, X) the necessity factor of L for X. It expresses the proportion of L's necessity within the evidence set E. Let the threshold of η(L, X) be τ = 1/(|E| + 1), where |E| denotes the cardinality of E. If η(L, X) > τ, then L has significant necessity for the inference conclusion P(X|E).

Figure 1 shows a simplified Bayesian network about college students' academic performance, called the score network. It consists of a directed acyclic graph and six marginal or conditional probability matrices. The variables are intelligence (X1), diligence (X2), test-taking ability (X3), knowledge mastery (X4), examination score (X5) and homework score (X6). The independence and conditional independence relations implied by the network structure are: P(X1, X2) = P(X1)P(X2); P(X3|X1, X2) = P(X3|X1); P(X4|X1, X2, X3) = P(X4|X1, X2); P(X5|X1, X2, X3, X4) = P(X5|X3, X4); P(X6|X1, X2, X3, X4, X5) = P(X6|X4). The corresponding knowledge representation is P(X1, X2, X3, X4, X5, X6) = P(X1)P(X2)P(X3|X1)P(X4|X1, X2)P(X5|X3, X4)P(X6|X4).

Fig. 1. Bayesian network for student scores.

Let the evidence set be E = {X1 = "high", X2 = "diligent"}; what is to be explained is the posterior distribution P(X5|E). To examine the effect of the evidence "X2 = diligent" in E on P(X5|E), take L = "X2 = diligent", so E−L = {X1 = "high"}. The results (Table 1) show that both high intelligence and diligence are important for achieving an excellent score.

Table 1. The necessity factor of P(X5|E) for L
Probability distribution      A       B       C       D       E       1-norm   Effect factor (%)
P(X5|E)                       0.5529  0.2888  0.0934  0.0458  0.0191
P(X5)                         0.2398  0.2226  0.2472  0.2147  0.0756
M(P(X5|E), P(X5))                                                     0.5896
P(X5|E−L)                     0.3740  0.2475  0.1756  0.1435  0.0594
M(P(X5|E), P(X5|E−L))                                                 0.3014
η(L, X5)                                                                       51.12

The value of η(L, X5) in Table 1 indicates how much would be lost from the inference conclusion if the state of X2 were unknown. If η(L, X) is large, the evidence L has great necessity in E for forming the conclusion P(X|E), and hence a large effect on P(X|E). However, if η(L, X) is small, one cannot conclude that L's effect on P(X|E) is small, because L's effect on the conclusion may overlap with that of other evidence in the set. Suppose a new piece of evidence, homework score (X6) = "excellent", is added to the original evidence set, i.e. E = {X1 = "high", X2 = "diligent", X6 = "excellent"}, and what is to be explained is again the posterior distribution P(X5|E). To examine the effect of the evidence X2 = "diligent" on P(X5|E), take L to be that evidence, so E−L = {X1 = "high", X6 = "excellent"}. The results are shown in Table 2. It can be seen that the necessity factor η(L, X5) of L for X5 is now small. This means that leaving the state of X2 unknown (not knowing whether the student is "diligent") has little effect on the conclusion. Since the effect of the evidence "X6 = excellent" on the conclusion overlaps with that of "X2 = diligent", a small η(L, X5) does not mean that the evidence "X2 = diligent" alone is unimportant to the conclusion. To eliminate the interference of this overlap with the explanation, we now look for the degree of sufficiency of L for the inference conclusion P(X|E).
If all evidence nodes except L are regarded as non-evidence nodes, the posterior distribution of X changes from P(X|E) to P(X|L), and the corresponding necessity factor is η(E−L, X), which measures the degree of necessity, for forming the conclusion, of all the evidence E−L other than L. Let φ(L, X) = 1 − η(E−L, X); φ(L, X) is called the sufficiency factor of L for X. Table 2 shows that the sufficiency factor φ(L, X5) of L for X5 is large, reflecting the importance of diligence in study.

Table 2. The effect factors of P(X5|E) for L
Probability distribution      A       B       C       D       E       1-norm   Effect factor (%)
P(X5|E)                       0.5760  0.2941  0.0829  0.0332  0.0139
P(X5)                         0.2398  0.2226  0.2472  0.2147  0.0756
M(P(X5|E), P(X5))                                                     0.8152
φ(L, X5)                                                                       42.32
P(X5|E−L)                     0.5364  0.2849  0.1010  0.0548  0.0228
M(P(X5|E), P(X5|E−L))                                                 0.0974
η(L, X5)                                                                       11.95
P(X5|L)                       0.3663  0.2686  0.1843  0.1347  0.0460
M(P(X5|E), P(X5|L))                                                   0.4702
η(E−L, X5)                                                                     57.68

According to the rules shown in Table 3, using the threshold τ together with the two values η(L, X) and φ(L, X), evidence can be classified into four types: key evidence, necessary evidence, important evidence and minor evidence. For example, with the threshold τ = 1/4, Tables 2 and 3 yield the following explanation: in the evidence set {intelligence (X1) = "high", diligence (X2) = "diligent", homework score (X6) = "excellent"}, the evidence "diligence (X2) = diligent" is important evidence, and the effects of "diligence (X2) = diligent" and "homework score (X6) = excellent" on the conclusion overlap, because without the evidence "homework score (X6) = excellent", "diligence (X2) = diligent" would have significant necessity. Therefore, for the evidence set {intelligence (X1) = "high", diligence (X2) = "diligent"}, "diligence (X2) = diligent" is key evidence.

Table 3. Explanation rules for the degree of effect on the inference conclusion
η(L, X)  φ(L, X)  The effect of L on X, i.e. on the inference conclusion P(X|E)                                                                   Type
> τ      > τ      L contributes greatly to the conclusion, and this contribution cannot be substituted by the other evidence in E                Key evidence
> τ      < τ      Although L itself does not contribute much, it is important in E: without L, the contribution of the other evidence in E would decrease remarkably   Necessary evidence
< τ      > τ      L contributes greatly to the conclusion, but part of the contribution overlaps with that of other evidence in E, so L could be substituted by other evidence in E   Important evidence
< τ      < τ      L itself does not contribute much and could be substituted by other evidence in E; it is unimportant evidence in E             Minor evidence

3. The Paths of the Evidence's Effect on the Conclusion

3.1. The Bayesian Network Generated by P(X|E)
According to the relation between the structural characteristics of a Bayesian network and conditional independence, the conclusion P(X|E) is generally related to only part of the nodes in the network. To find the paths through which evidence influences the conclusion, only the nodes possibly involved in computing P(X|E) need to be examined. For a large Bayesian network, deleting the nodes irrelevant to computing P(X|E) reduces the complexity of the problem.
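The deletion-method measures above (the 1-norm distance M, the necessity factor η and the sufficiency factor φ) can be sketched in a few lines; the distributions below are invented for illustration, not taken from Tables 1–3:

```python
# Sketch of the paper's deletion-based measures:
#   M(P, Q) = sum_i |p_i - q_i|                       (1-norm distance)
#   eta(L, X) = M(P(X|E), P(X|E-L)) / M(P(X|E), P(X)) (necessity factor)
#   phi(L, X) = 1 - eta(E-L, X)                       (sufficiency factor)
# All distributions below are invented.

def one_norm(p, q):
    return sum(abs(a - b) for a, b in zip(p, q))

def necessity_factor(post_full, post_reduced, prior):
    return one_norm(post_full, post_reduced) / one_norm(post_full, prior)

prior        = [0.25, 0.25, 0.25, 0.25]   # P(X)
post_full    = [0.70, 0.15, 0.10, 0.05]   # P(X | E)
post_minus_L = [0.40, 0.25, 0.20, 0.15]   # P(X | E - L), L deleted
post_only_L  = [0.50, 0.20, 0.20, 0.10]   # P(X | L), only L kept

eta = necessity_factor(post_full, post_minus_L, prior)      # eta(L, X)
phi = 1.0 - necessity_factor(post_full, post_only_L, prior) # phi(L, X)
```

Comparing eta and phi against the threshold τ = 1/(|E| + 1) then classifies L according to Table 3.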
It is not difficult to prove that a node N is irrelevant to computing P(X|E) if it satisfies any one of the following three conditions: (1) N is neither X nor a predecessor of X, and is neither an evidence node nor a predecessor of any evidence node; (2) N and X are d-separated [7,8] by E or by elements of E; (3) every path connecting N and X contains a node irrelevant to computing P(X|E). Deleting from the network structure all nodes irrelevant to computing P(X|E) generates a new Bayesian network model, called in this paper the Bayesian network generated by P(X|E); its network structure is the simplified structure, and its conditional probability distributions are the corresponding distributions in the original Bayesian network model.

3.2. Generating the Effect Paths
We now find, in the structure of the Bayesian network generated by P(X|E), all directed paths (directed chains) connecting L and X. A directed path connecting L and X is a sequence of distinct nodes {X1, X2, …, XK} satisfying four conditions: (1) the start and end of the sequence correspond to L and X, i.e. X1 = L and XK = X; (2) for Xi and Xi+1, there is a directed edge from Xi to Xi+1 or from Xi+1 to Xi; (3) every node in {X1, X2, …, XK} is relevant to computing P(X|E); (4) the path contains no evidence node other than L. A depth-first exhaustive search algorithm is designed to find the directed paths connecting L and X.

For a singly connected network structure, there is a unique path between any two nodes, and an explanation of that path can be generated directly by analyzing the changes in the probability distributions of the nodes on it. For a multiply connected structure, there may be several directed paths connecting L and X; their degrees of effect on P(X|E) are not necessarily the same, and the effect paths may overlap. The connecting paths are therefore divided into a number of sub-chains, the degree of effect of each sub-chain on the conclusion P(X|E) is analyzed quantitatively, and the effect sub-chains are generated. The partitioning algorithm is simple and is omitted here. To pick out, from the candidate sub-chains, those through which L actually affects X, further quantitative analysis is needed.

3.3. Explaining the Effect Paths
If a sub-chain lies on every directed path connecting L and X, it is obviously a necessary passage through which L affects X; such a sub-chain is called a key sub-chain. For the non-key sub-chains, the deletion method can be used to analyze their degree of effect on the conclusion: to examine the effect of a non-key sub-chain on P(X|E), delete the part between its two end nodes from the structure of the Bayesian network generated by P(X|E), and then generate the explanation of the chain by measuring and analyzing the change in P(X|E). Two problems must be handled when using the deletion method to analyze a sub-chain's effect: (1) deleting the part between the two end nodes of the chain changes the network structure, so the number of parents of an end node may decrease and its conditional probability distribution changes; how is the new conditional distribution computed? (2) deleting a sub-chain amounts to adding conditional independence assumptions and changes the structure of the knowledge base, which may change the prior probability P(X) of node X; how is this change of the prior handled?

Suppose node A in the Bayesian network has parents B0, B1, …, BK, and parent B0 is deleted. By the normalization of probability,

P(A | B1, …, BK) = Σ_{B0} P(A, B0 | B1, B2, …, BK) = Σ_{B0} P(B0 | B1, B2, …, BK) P(A | B0, B1, B2, …, BK),  (3)

so the conditional distribution of A after deleting B0 can be computed by this formula, where P(A | B0, B1, B2, …, BK) is the conditional distribution of A before the deletion, and P(B0 | B1, B2, …, BK) can be computed from the joint distribution determined by the Bayesian network:

P(B0 | B1, B2, …, BK) = P(B0, B1, B2, …, BK) / P(B1, B2, …, BK).

After the part between the two end nodes of an effect sub-chain C is deleted from the Bayesian network generated by P(X|E), if the number of parents of an end node changes, the new conditional distributions can be computed by Equation (3), forming a new Bayesian network model. From this model, the new prior distribution P*(X) of node X and the new posterior distribution P*(X|E) serving as the conclusion can be computed, and further the sufficiency factor φ*(L, X) of L for X based on this model.

Table 4. Explanation rules for the effect paths of evidence L on the inference conclusion
d(L,X)  M(P,P*)  Color        Explanation                                                                                                       Type
> λ     < μ      Bulky black  The effect of the change of the knowledge-base structure is not obvious, so the difference d(L,X) is mainly the probabilistic information propagated by the sub-chain; the sub-chain transfers the main probabilistic information   Belongs to main path
> λ     > μ      Black        M(P,P*) being obvious, the difference d(L,X) contains part of the effect of the change of the knowledge-base structure; it cannot be assured that the sub-chain transfers most of the information   Belongs to main path
< λ     < μ      Bulky gray   This sub-chain transfers only minor probabilistic information                                                     Does not belong to main path
< λ     > μ      Colorless    The effect of the knowledge-base change is obvious while d(L,X) is minor; this situation is peculiar               —

The 1-norm M(P(X), P*(X)) of the difference between P(X) and P*(X) is used to measure their difference. Let d(L, X) = φ(L, X) − φ*(L, X). According to the explanation and representation rules shown in Table 4, by examining the magnitudes of d(L, X) and M(P(X), P*(X)), the explanation of the effect sub-chain C is generated and the chain is given the corresponding color. A directed path composed of sub-chains of the same color is an effect path, whose type is determined by its color. The thresholds λ and μ in Table 4 need to be chosen from experience.

4. Application Example: Explaining the Effect Paths in the ALARM Model
The ALARM model, constructed by Beinlich et al. [9], monitors the physical condition of an anesthetized patient and the working state of the related medical equipment. It has 46 edges and 37 nodes, including 8 diagnostic nodes (the output nodes of interest), 16 evidence nodes and 13 intermediate nodes. Figure 2 shows the network structure of the model; in this section, Nj denotes the j-th node in the figure. The medical meaning, states, and conditional or prior probability distributions of each node can be obtained from the web pages [10,11].

Let L = "N13 = 0". First, all nodes relevant to computing P(X|E) are found, composing the Bayesian network generated by P(X|E) (as shown in Figure 3(a)). Then all directed paths connecting L and X are found; there are three: {N13, N36, N24}; {N13, N22, N35, N36, N24}; {N13, N23, N35, N36, N24}. These directed paths compose a network structure about L and X, as shown in Figure 3(b). The structure in Figure 3(b) is divided into five sub-chains of the directed paths: C1 = {N36, N24}; C2 = {N13, N36}; C3 = {N35, N36}; C4 = {N13, N22, N35}; C5 = {N13, N23, N35}. The probability distribution of node N23 is almost unchanged, satisfying the condition of Proposition 2, so C5 is not an effect sub-chain. The other four are all the possible effect sub-chains, and C1 is a key sub-chain. Finally, the (d(L,X), M(P(X), P*(X))) values of the effect sub-chains C2, C3 and C4 are computed; the results are (48.82%, 0.0213), (2.35%, 0.0324) and (2.12%, 0.0532), respectively. The explanations of the effect paths are generated according to Table 4, as shown in Figure 3(c).

The evidence "N13 = 0" affects node N24 through two effect paths: {N13, N36, N24} and {N13, N22, N35, N36, N24}. Among them, {N13, N36, N24} is the main effect path and {N13, N22, N35, N36, N24} is a secondary effect path. The evidence N13 affects N24 mainly through N36, while nodes N22 and N35 have little effect on N24. In fact, node N36 represents the ventilation state of the oxygen tube, and its causal relation with node N24 is obviously very close.

Fig. 2. Network structure of the ALARM model.
Fig. 3. Finding and explaining the effect paths. (a) Nodes related to the computation of P(X|E); (b) directed paths connecting L and X; (c) the effect paths of L on X.

5. Conclusion
Handling uncertain information and knowledge with classical probability theory faces two main difficulties [4]: (1) the trade-off between the amount of computation and the precision of the probabilistic model is hard to make; (2) probability theory is based on an axiomatic system whose mode of inference differs considerably from human thinking, making it hard to build an explanation mechanism for probability-based intelligent systems.
Therefore, people generally use generalized probabilistic methods (such as the subjective Bayesian method and certainty factors) or other heuristic methods (Dempster-Shafer evidence theory, fuzzy theory, etc.) to handle uncertain information and knowledge in intelligent information processing. The Bayesian network model overcame the first difficulty through the ingenious use of conditional independence [1], which restored confidence in classical probability theory and methods and has led, over the past decade or so, to their revival in intelligent information processing. The research in this paper shows that an effective explanation mechanism can be constructed for intelligent information systems based on Bayesian networks, explaining the degree and the paths of the effect of evidence on inference conclusions.

References
[1] J. Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. San Mateo, CA: Morgan Kaufmann, 1988.
[2] S. L. Lauritzen, D. J. Spiegelhalter. Local computations with probabilities on graphical structures and their application to expert systems. Journal of the Royal Statistical Society, Series B, 1988, 50(2): 157–224.
[3] C. Elsaesser, et al. Explanation of probabilistic inference. In: Proc. Conf. Uncertainty in Artificial Intelligence. Amsterdam, Holland: Elsevier Science Publishers, 1989. 319–328.
[4] G. Shafer, J. Pearl, eds. Readings in Uncertain Reasoning. San Mateo, CA: Morgan Kaufmann, 1990.
[5] D. Madigan, et al. Graphical explanations in belief networks. Journal of Computational and Graphical Statistics, 1997, 6(2): 160–181.
[6] U. Chajewska, J. Y. Halpern. Defining explanation in probabilistic systems. In: Proc. Conf. Uncertainty in Artificial Intelligence. San Francisco: Morgan Kaufmann, 1997. 62–71.
[7] D. Geiger, et al. d-Separation: From theorems to algorithms. The 5th Workshop on Uncertainty in Artificial Intelligence, Windsor, Ontario, 1989.
[8] D. Geiger, T. Verma, J. Pearl. Identifying independence in Bayesian networks. Networks, 1990, 20(2): 507–534.
[9] I. A. Beinlich, et al. The ALARM monitoring system: A case study with two probabilistic inference techniques for belief networks. In: Proc. the 2nd European Conf. Artificial Intelligence in Medical Care. Berlin: Springer-Verlag, 1989. 247–256.
[10] I. A. Beinlich. A logical alarm reduction mechanism. http://ww netlib alarm.htm, 2002-10.
[11] I. A. Beinlich. A medical diagnostic alarm message system. http://ww netlib ALARM.dnet, 2002-10.

Wang Ronggui, born in 1966. Received his Ph.D. degree in 2004. His current research interests include intelligent information processing, knowledge engineering, Bayesian networks, and image
understanding. He is now an associate professor.

Zhang Yousheng, born in 1941. Professor and Ph.D. supervisor at Hefei University of Technology since 1994. His current research interests include artificial intelligence and its application in image recognition and understanding.

Gao Jun, born in 1963. Professor and Ph.D. supervisor at Hefei University of Technology since 2000. His current research interests include image processing, pattern recognition, neural network theory and applications, optoelectronic information processing, and intelligent information processing.

Peng Qingsong, born in 1975. Received his Ph.D. degree in 2004. His current research interests include artificial intelligence and its application in image recognition and understanding.

Research Background
Bayesian networks can be used to process uncertain information and knowledge. The joint probability distribution is used to represent knowledge, which enhances consistency and improves the reasonability of inference conclusions, while the conditional independence contained in graphical models is used to decrease the complexity of the joint probability distribution. However, the inference conclusion of a Bayesian network takes the form of a posterior probability distribution and, more seriously, cannot be explained directly by translating an inference chain or inference network. The explanation mechanism of Bayesian networks is therefore a research topic worthy of study. We have researched and realized a kind of explanation mechanism, providing a new solution for the study of explanation mechanisms for Bayesian networks. The research is supported by the National Natural Science Foundation of China (No. 60175011, No. 60375011).
Bayesian Confirmation Theory and Its Limitations

Chapter 1: Introduction
- Background information about Bayes' Theorem
- Purpose of the paper
- Overview of the main points discussed in the paper

Chapter 2: Understanding Bayes' Theorem
- Explanation of Bayes' Theorem
- Use of Bayes' Theorem in decision-making and scientific research
- Comparison to the frequentist approach

Chapter 3: Limitations of Bayes' Theorem
- Assumptions made in Bayes' Theorem
- Probability estimates and subjectivity
- Small sample sizes
- Dependence on prior knowledge

Chapter 4: Applications and Criticisms
- Examples of how Bayes' Theorem has been applied in various fields
- Criticisms of Bayes' Theorem by scholars
- Counterarguments to criticisms

Chapter 5: Conclusion and Future Directions
- Summary of key points in the paper
- Implications of the limitations of Bayes' Theorem
- Potential areas for future research on Bayes' Theorem and related topics

Chapter 1: Introduction
Bayes' Theorem is a statistical concept widely applied in decision-making, artificial intelligence, machine learning, and scientific research. It provides a framework for updating probabilities based on new evidence and prior knowledge. It was named after the eighteenth-century English statistician, Reverend Thomas Bayes, who developed the theory to solve the problem of inverse probability. Since its inception, Bayes' Theorem has been applied in numerous fields to make predictions and inferences and to uncover causality.

The purpose of this paper is to provide an overview of Bayes' Theorem, its applications, and its limitations. The paper will explore the assumptions behind Bayes' Theorem, the subjective nature of probability estimates, and the dependence on prior knowledge. Furthermore, the paper will examine the criticisms leveled at Bayes' Theorem and how these criticisms have been countered.
Finally, the paper will dispel some common misconceptions about Bayes' Theorem and suggest areas for future research.

Chapter 2: Understanding Bayes' Theorem
Bayes' Theorem is a mathematical formula used to update the probability of an event occurring based on new evidence. It is often used in decision-making, scientific research, and artificial intelligence systems. The fundamental concept behind Bayes' Theorem is conditional probability: the probability of an event occurring given that some other event has occurred. For instance, the probability of heavy rain is higher given that there has been an increase in cloud cover.

Bayes' Theorem is expressed mathematically as:

P(A|B) = P(B|A) P(A) / P(B)

where:
- P(A) is the prior probability of event A.
- P(B) is the probability of observing evidence B.
- P(B|A) is the probability of observing evidence B given that event A has occurred.
- P(A|B) is the posterior probability of event A after observing evidence B.

The numerator of the equation is the probability that the evidence occurs together with the event (the likelihood times the prior), while the denominator is the total probability of observing the evidence across all possibilities. Bayes' Theorem thus allows us to update our beliefs about the probability of an event after observing new evidence, given what we already know. The advantages of using Bayes' Theorem include:
- It allows the incorporation of new data that may render previous conclusions obsolete, providing a more accurate estimation.
- It is useful in making predictions and inferences about the future.
- It can handle complex situations with multiple causes, as long as prior probabilities can be established.

Chapter 3: Limitations of Bayes' Theorem
Despite its numerous applications, Bayes' Theorem suffers from limitations that are important to understand.
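As a minimal numeric illustration of the update rule above (with invented numbers: a 1% prior, a 90% hit rate, and a 5% false-alarm rate), P(B) can be expanded by the law of total probability:

```python
# Illustration of P(A|B) = P(B|A) * P(A) / P(B), where
# P(B) = P(B|A) * P(A) + P(B|not A) * (1 - P(A)).
# The three input probabilities are invented for the example.

def posterior(prior, p_b_given_a, p_b_given_not_a):
    p_b = p_b_given_a * prior + p_b_given_not_a * (1.0 - prior)
    return p_b_given_a * prior / p_b

p = posterior(prior=0.01, p_b_given_a=0.90, p_b_given_not_a=0.05)
```

Even with a 90% hit rate, the small prior keeps the posterior modest, which is exactly the base-rate effect discussed later in the paper.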
The following are key limitations of Bayes' Theorem:

Assumptions made in Bayes' Theorem
The practical use of Bayes' Theorem depends on several assumptions about the nature of the problem being analyzed:
- Independence: it is assumed that the occurrence of one event does not affect the occurrence of another event. Yet in real-world scenarios, variables are often interrelated.
- Stationarity: it is assumed that the probabilities remain constant over time. In some instances, however, the probabilities may change over time.
- Normality: the results of a study that uses Bayes' Theorem may only be valid if the population distribution is normal, which is not always the case.

Subjectivity in probability estimates
Bayes' Theorem requires the estimation of prior and posterior probabilities. In some instances the prior probability may be subjective, particularly when knowledge and data are limited or unavailable. In such instances, the subjective nature of the prior probability may cause bias in the final estimation.

Small sample sizes
Bayes' Theorem may produce unreliable outcomes if the sample size is too small. In such cases, the estimation is subject to uncontrollable statistical fluctuations, which often result in an over-reliance on subjective reasoning.

Dependence on prior knowledge
Bayes' Theorem is heavily reliant on prior knowledge for the estimation of probabilities. If prior knowledge is biased or insufficient, the resulting estimation may be inaccurate. Updating prior probabilities is also subjective, and it may lead to inconsistent outcomes.

In conclusion, Bayes' Theorem is a powerful tool that has been widely used in various fields. Its application depends on certain assumptions that may not always be valid. Additionally, the estimation of probability is often subjective, leading to possible inaccuracies in outcomes.
Despite its limitations, Bayes' Theorem remains a fundamental concept in statistics and decision-making. The next chapter will delve into the applications of Bayes' Theorem in various fields.

Chapter 4: Applications of Bayes' Theorem
Bayes' Theorem has been applied in various fields, particularly in decision-making, scientific research, and artificial intelligence. The following are some of its key applications:

Medical diagnosis
Bayes' Theorem is employed in medical diagnosis to calculate the probability of a patient having a particular condition based on their symptoms and medical history. The theorem is used to update physicians' prior knowledge of the patient's condition with new diagnostic test results. For instance, if a patient is hospitalized for chest pains, their chances of having a heart attack can be estimated using Bayes' Theorem after considering other significant risk factors such as age, smoking status, and cholesterol levels.

Machine learning
Bayes' Theorem is an essential element in creating machine learning models. Through its application, developers can create statistical models that help systems predict behaviors, classify data, and recognize patterns, since many algorithms rely on Bayesian updating to make predictions based on previous data.

Stock market prediction
Bayes' Theorem provides traders with a framework for making investment decisions. By predicting future market conditions using past performance and analyzing current trends, traders can calculate the probability of future market prices and make more informed decisions.

Natural phenomena
The theorem has been used in the prediction of natural phenomena such as earthquakes, floods, and storms by estimating probabilities from historical data.
Through the analysis of such data, scientists can identify patterns and make informed predictions.

Chapter 5: Criticisms and Controversies of Bayes' Theorem
Bayes' Theorem has generated numerous criticisms and controversies despite its widespread use. These criticisms focus primarily on its relevance to real-world scenarios and how it is applied in practice.

Misapplication of Bayes' Theorem
One of the most significant criticisms of Bayes' Theorem is its incorrect application in real-world situations. The theorem depends on certain assumptions that are not always valid or are difficult to estimate correctly, which leads to misuse. This problem is most prevalent in the fields of law enforcement, social science, and medicine, where misuse of Bayes' Theorem can lead to wrongful conviction, inappropriate treatment, and misjudgment of evidence.

Subjectivity in probability estimates
Another significant criticism is that the estimation of probabilities is subjective and relies heavily on prior beliefs. In some cases, these priors may be biased or based on limited knowledge, leading to inaccurate results.

Dependence on prior knowledge and sample size
Critics of Bayes' Theorem also argue that the theorem is heavily reliant on prior knowledge. If prior knowledge is biased or insufficient, this diminishes the credibility of the resulting estimate. Additionally, the theorem requires significant sample sizes to generate accurate results; when sample sizes are too small, the outcomes are less reliable.

Controversies surrounding Bayesian statistics
Apart from criticisms of the theorem itself, there are some controversies surrounding Bayesian statistics caused by philosophical differences between the Bayesian and non-Bayesian camps. The Bayesian camp argues that probability is subjective and that prior probabilities must depend on the available evidence.
The non-Bayesian camp, on the other hand, believes in objective probability and prefers to use only observable data and tests.

Conclusion
Bayes' Theorem is a powerful tool with a range of applications in various fields. Despite being a widely accepted system, it has received significant criticism, particularly regarding its relevance and accuracy in real-world scenarios. Nonetheless, Bayes' Theorem remains a fundamental concept in statistics and decision-making, and its use will continue to grow in the years ahead. Future research should focus on developing methods for improving how Bayes' Theorem is applied and the quality of the prior knowledge used in estimation.
Bayes' Theorem, Explained Simply

Bayes' theorem is a theorem about the conditional probabilities (or marginal probabilities) of random events A and B, where P(A|B) is the probability of A occurring given that B has occurred. Bayes' theorem is also known as Bayesian inference. As early as the 18th century, the English scholar Bayes (1702–1763) proposed a formula for computing conditional probabilities to solve the following kind of problem: suppose H[1], H[2], …, H[n] are mutually exclusive and constitute a complete set of events, with known probabilities P(H[i]), i = 1, 2, …, n; some event A is observed to occur at random together with H[1], H[2], …, H[n], and the conditional probabilities P(A|H[i]) are known; find P(H[i]|A).
1. Research Significance
To reason and make decisions from uncertain information, people need to estimate the probabilities of various conclusions; this kind of reasoning is called probabilistic reasoning. Probabilistic reasoning is an object of study of probability theory and logic as well as of psychology, but from different angles: probability theory and logic study the formulas and rules for calculating objective probabilities, whereas psychology studies the cognitive processes by which people estimate subjective probabilities. Bayesian reasoning is a problem of conditional probability reasoning, and research in this area has important theoretical and practical significance for revealing how people cognitively process probabilistic information and for guiding people in effective learning, judgment and decision-making.
2. Definition of the Theorem
Bayes' theorem, also known as Bayesian inference, goes back to the 18th century, when the English scholar Bayes (1702–1763) proposed a formula for computing conditional probabilities to solve the following kind of problem: suppose H[1], H[2], …, H[n] are mutually exclusive and constitute a complete set of events, with known probabilities P(H[i]), i = 1, 2, …, n; some event A is observed to occur at random together with H[1], H[2], …, H[n], and the conditional probabilities P(A|H[i]) are known; find P(H[i]|A). The Bayes formula (published in 1763) is:

P(H[i]|A) = P(H[i]) · P(A|H[i]) / {P(H[1])·P(A|H[1]) + P(H[2])·P(A|H[2]) + … + P(H[n])·P(A|H[n])}

This is the famous "Bayes' theorem". In some literature, P(H[1]) and P(H[2]) are called the base rates, P(A|H[1]) the hit rate, and P(A|H[2]) the false-alarm rate.
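A short sketch of this formula, with invented priors and likelihoods for three hypotheses:

```python
# P(H[i]|A) = P(H[i]) * P(A|H[i]) / sum_j P(H[j]) * P(A|H[j]),
# for mutually exclusive, exhaustive hypotheses H[1..n].
# The priors and likelihoods below are invented for illustration.

def posteriors(priors, likelihoods):
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)                  # P(A), by total probability
    return [j / total for j in joint]

post = posteriors(priors=[0.5, 0.3, 0.2], likelihoods=[0.2, 0.6, 0.9])
```

The denominator is the same for every hypothesis, so the posteriors always sum to one.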
Sources: Zhang Lianwen and Guo Haipeng, Introduction to Bayesian Networks, Science Press, 2006; Thomas Leonard and John J. Hsu, Bayesian Methods (English edition), China Machine Press, 2004.

The Bayes Rule

Bayesian statistics rests on a basic tool called the Bayes rule (Bayes' theorem / Bayesian law). Although it is a mathematical formula, its principle can be understood without numbers: if you see a person always doing good deeds, that person is probably a good person. That is, when the essence of a thing cannot be known exactly, one can judge how likely it is to have a given nature from how often the specific attributes of that nature appear. Expressed in mathematical language: the more events supporting a property occur, the more likely it is that the property holds.

Basic concept

The Bayes rule, also called the Bayes theorem, is the standard method in probability and statistics for using observed phenomena to correct a subjective judgment about a probability distribution (that is, the prior probability). According to the Bayes rule, as the number of samples analyzed approaches the size of the whole population, the probability of an event in the sample approaches its probability in the population. Behavioral economists have found, however, that people often do not follow the Bayes rule in decision-making: they give recent events and the latest experience more weight, so that recent events count too heavily in their judgments and decisions. Faced with complex and general questions, people often take shortcuts, deciding on the basis of typicality rather than probability; this classic deviation is known as representativeness bias. Because of such psychological biases, investors are not absolutely rational when making decisions; their behavior deviates, and this affects price changes in capital markets.
For a long time, however, lacking a strong alternative tool, economists have had to adhere to the Bayes rule in their analysis.

Principle

Usually, the probability of event A given event B is not the same as the probability of event B given event A; however, there is a definite relationship between the two, and the Bayes rule is the statement of that relationship. As a normative theory, the Bayes rule is valid under every interpretation of probability; however, frequentists and Bayesians disagree about how probabilities should be assigned in applications: frequentists assign probabilities according to the frequencies of random events or to proportions in a total sample, while Bayesians assign probabilities to unknown propositions. As a result, Bayesians have more occasions to use the Bayes rule. For random events A and B, the rule relates their conditional and marginal probabilities:

Pr(A|B) = Pr(B|A) Pr(A) / Pr(B) ∝ L(A|B) Pr(A),

where L(A|B) is the likelihood of A given B. In the Bayes rule, each term has a name: Pr(A) is the prior (or marginal) probability of A; it is called "prior" because it does not take any information about B into account. Pr(A|B) is the conditional probability of A once B is known; because it is derived from the value of B, it is called the posterior probability of A. Pr(B|A) is the conditional probability of B once A is known; because it is derived from the value of A, it is likewise called the posterior probability of B. Pr(B) is the prior (or marginal) probability of B, and also serves as the normalizing constant. In these terms the Bayes rule can be expressed as:

posterior probability = (likelihood × prior probability) / normalizing constant,

that is to say, the posterior probability is proportional to the product of the prior probability and the likelihood.
In addition, the ratio Pr(B|A)/Pr(B) is sometimes called the standardized likelihood, so the Bayes rule can also be expressed as:

posterior probability = standardized likelihood × prior probability.

Example analysis

Case 1: In a monopoly market, only firm A provides products and services, and firm B is now considering whether to enter. Firm A, of course, will not sit back and watch B enter with indifference. B knows that whether it can enter depends entirely on how costly it is for A to block entry. The challenger B does not know whether the incumbent monopolist A is a high-blocking-cost type or a low-blocking-cost type, but B knows this much: if A is the high-blocking-cost type, the probability that A blocks B's entry is 20% (even with high monopoly profits at stake, fighting regardless of cost is rare); if A is the low-blocking-cost type, the probability that A blocks B's entry is 100%.

When the game begins, B believes the probability that A is a high-blocking-cost firm is 70%. B therefore estimates the probability of being blocked upon entering the market as

0.7 × 0.2 + 0.3 × 1 = 0.44.

Here 0.44 is the probability, under B's prior over A's type, that A will use blocking behavior. When B enters the market, A does block. Using the Bayes rule, and conditioning on the observed blocking behavior, B revises the probability that A is a high-blocking-cost firm:

P(high cost | blocked) = 0.7 (prior probability that A is high cost) × 0.2 (probability that a high-cost firm blocks a new entrant) ÷ 0.44 ≈ 0.32.

With this new probability, B estimates the probability of being blocked on the next entry attempt as

0.32 × 0.2 + 0.68 × 1 = 0.744.

Suppose B enters the market again, and A blocks once more.
Using the Bayes rule, and conditioning on the observed blocking behavior, B again revises the probability that A is a high-blocking-cost firm:

P(high cost | blocked again) = 0.32 (updated probability that A is high cost) × 0.2 (probability that a high-cost firm blocks a new entrant) ÷ 0.744 ≈ 0.086.

Thus, as A blocks entry again and again, B's judgment of A's type gradually changes, tilting more and more toward judging A to be a low-blocking-cost firm.

The example above shows that in a dynamic game of incomplete information, a party's actions serve to transmit information. Although firm A may in fact be a high-blocking-cost firm, A's repeated deterrence of market entry gives firm B the impression that A is a low-blocking-cost firm, and so firm B abandons its attempts to enter the market. It should be pointed out that information-transmitting behavior is costly. If such behavior were costless, anyone could imitate it, and it could not serve the purpose of transmitting information. Only when the behavior requires a considerable cost, so that others cannot easily imitate it, can it play the role of transmitting information.

The cost of transmitting information arises from incomplete information. But we cannot say that incomplete information is necessarily a bad thing. Research shows that in a finitely repeated prisoner's dilemma game, incomplete information can lead to cooperation between the two sides. The reason is that when information is incomplete, participants avoid exposing their true nature prematurely in order to obtain the long-term gains that cooperation brings. That is to say, in a long-term relationship, whether a person does good or bad often depends not on whether his nature is good or bad, but to a large extent on the degree to which other people believe he is good.
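The two rounds of updating in Case 1 can be sketched in a few lines of Python. The numbers come from the example; the arithmetic here is exact, so the figures differ slightly from the rounded 0.32 and 0.086 in the text:

```python
def update(prior_high, p_block_high=0.2, p_block_low=1.0):
    """One round of Bayesian updating after B observes A blocking entry."""
    p_blocked = prior_high * p_block_high + (1 - prior_high) * p_block_low
    posterior_high = prior_high * p_block_high / p_blocked
    return posterior_high, p_blocked

belief = 0.7                 # B's prior that A is the high-blocking-cost type
belief, p1 = update(belief)  # p1 = 0.44; belief falls to 0.14/0.44 ≈ 0.318
belief, p2 = update(belief)  # after a second observed block, belief ≈ 0.085
```

Each observed block feeds the previous posterior back in as the new prior, which is exactly the sequential character of the updating in the example.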
If the other party does not know his true colors, a bad person may do good deeds for quite a long time in order to disguise himself.

Case 2: Consider a medical diagnosis problem with two possible hypotheses: (1) the patient has cancer; (2) the patient does not have cancer. The sample data come from a laboratory test with two possible results: positive and negative. Suppose we have the prior knowledge that only 0.008 of the whole population has the disease. In addition, the laboratory test returns a positive result for 98% of patients who have the disease and a negative result for 97% of people who do not. These data can be written as:

P(cancer) = 0.008, P(no cancer) = 0.992
P(positive | cancer) = 0.98, P(negative | cancer) = 0.02
P(positive | no cancer) = 0.03, P(negative | no cancer) = 0.97

If a new patient's test comes back positive, should the patient be diagnosed with cancer? We can compute the maximum a posteriori hypothesis:

P(positive | cancer) P(cancer) = 0.98 × 0.008 ≈ 0.0078
P(positive | no cancer) P(no cancer) = 0.03 × 0.992 ≈ 0.0298

Therefore, the diagnosis should be "no cancer."

The Harsanyi transformation and the Bayes rule

In 1967 John Harsanyi pointed out that every game of incomplete information under the old definition can, without changing its essence, be transformed into a game of complete but imperfect information; all that is needed is to add Nature as a player that chooses an initial action according to a given set of rules. Under the old definition, game theorists often held that games of incomplete information could not be analyzed, and Harsanyi's idea changed all of that. The old definition ran as follows: in a game of complete information, all players know the rules of the game; otherwise the game is a game of incomplete information.
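The arithmetic in Case 2 is easy to verify; this sketch just compares the two unnormalized posterior scores and then normalizes them:

```python
# Priors and test characteristics from Case 2.
p_cancer, p_no_cancer = 0.008, 0.992
p_pos_cancer, p_pos_no_cancer = 0.98, 0.03

# Unnormalized posteriors P(positive|h) * P(h) for each hypothesis.
score_cancer = p_pos_cancer * p_cancer           # 0.98 * 0.008 = 0.00784
score_no_cancer = p_pos_no_cancer * p_no_cancer  # 0.03 * 0.992 = 0.02976

diagnosis = "cancer" if score_cancer > score_no_cancer else "no cancer"
# Normalizing shows P(cancer | positive) is only about 0.21.
p_cancer_given_pos = score_cancer / (score_cancer + score_no_cancer)
```

Note that even though the test is quite accurate, the low base rate of 0.008 keeps the posterior probability of cancer well below one half, which is why the MAP decision is "no cancer."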
Although Harsanyi did not claim the old definition was wrong, people's point of view did change: what the original definition treated as a game of incomplete information is now converted into a game of complete but imperfect information. In such a game, the players' payoffs may not be known exactly, but something is known about them; in general, this information is represented by a subjective probability distribution. Grouping the possible payoffs according to their probabilities yields a specific payoff set. For example, when two players, call them A and B, choose strategies, one may suppose that once A selects a certain strategy, B may select any of several strategies, and these strategies are grouped according to the probability that B plays each of them. Usually a game tree expresses all of this well. The key assumption of Harsanyi's doctrine is that all the participants have a common understanding: the probabilities with which the strategies occur are common knowledge. The implication is that the participants' assumptions are at least partly disclosed to one another; the information structure of the game is laid out over time, rather than by trying to work out what can be inferred from the other participants' actions. The prior probabilities are given as part of the rules of the game: each player must hold prior beliefs about the other players' types and, after observing their actions, update those beliefs under the assumption that the others follow equilibrium behavior.

Related principles

Prior and posterior probability. P(h) denotes the initial probability that hypothesis h holds, before any training data are seen; it is called the prior probability of h. The prior probability reflects background knowledge about the chance that h is a correct hypothesis; in the absence of such prior knowledge, one can simply assign every candidate hypothesis the same prior probability. Similarly, P(D) denotes the prior probability of the training data D, and P(D|h) denotes the probability of observing D given that hypothesis h holds.
In machine learning we are interested in P(h|D), the probability that h holds given the observed data D, called the posterior probability of h.

Maximum a posteriori hypothesis. The learner considers the candidate hypotheses in a set H and seeks the most probable hypothesis h given the data D; such an h is called the maximum a posteriori (MAP) hypothesis. The MAP hypothesis is determined by computing the posterior probability of each candidate hypothesis with the Bayes rule:

h_MAP = argmax_{h ∈ H} P(h|D) = argmax_{h ∈ H} P(D|h) P(h) / P(D) = argmax_{h ∈ H} P(D|h) P(h).

In the last step, P(D) is dropped because it does not depend on h.

Maximum likelihood hypothesis. In some cases one can assume that every hypothesis in H has the same prior probability, and the formula simplifies further: it suffices to consider P(D|h) to find the most probable hypothesis:

h_ML = argmax_{h ∈ H} P(D|h).

P(D|h) is often called the likelihood of the data D given h, and the hypothesis that maximizes P(D|h) is called the maximum likelihood hypothesis.

Characteristics

(1) Bayesian classification does not assign an object outright to a single class; it computes the probability that the object belongs to each class, and the class with the highest probability is the one to which the object is assigned.
(2) In Bayesian classification all attributes generally play a potential role: it is not one or a few attributes that determine the classification, but all attributes participate in it.
(3) The attributes of the objects being classified can be discrete, continuous, or mixed.

Bayes' theorem gives the optimal minimum-error solution and can be used for classification and prediction. In theory it looks perfect, but in practice it cannot be applied directly, because it requires the exact probability distribution of the evidence, which in fact we cannot supply.
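The MAP and ML rules above can be sketched over a finite hypothesis set; the hypotheses and numbers here are invented for illustration:

```python
# Candidate hypotheses with priors P(h) and data likelihoods P(D|h).
priors = {"h1": 0.70, "h2": 0.20, "h3": 0.10}
likelihoods = {"h1": 0.10, "h2": 0.50, "h3": 0.90}

# MAP: argmax over h of P(D|h) * P(h); P(D) is dropped since it is constant in h.
h_map = max(priors, key=lambda h: likelihoods[h] * priors[h])

# ML: with a uniform prior, the rule reduces to argmax of P(D|h) alone.
h_ml = max(likelihoods, key=lambda h: likelihoods[h])
```

With these numbers the two rules disagree (the MAP hypothesis is "h2", the ML hypothesis is "h3"), which illustrates how a non-uniform prior can change the selected hypothesis.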
In many classification methods we therefore make simplifying assumptions in order to approximate the Bayesian ideal.
Subjective Expected Utility Theory with CostlyActionsEdi Karni¤Johns Hopkins UniversityJuly23,2003AbstractThis paper explores alternative axiomatizations of subjective expected utility theory for decision makers with direct preferences over actions;including a general subjective expected utility representation with action-dependent utility,and separately additive representations.In the contextof the state-space formulation of agency theory the results of this paperconstitute axiomatic foundations of the agent’s behavior.1IntroductionAgency theory admits alternative formulations,including the state-space formu-lation,parameterized distribution formulation,and general distribution formu-lation.1In all these formulations the agent’s preferences are de…ned over action -acts(that is,contingent payo¤)pairs.Despite the obvious interest in agency theory and the huge volume of research that it inspired,almost no attention has been paid to the development of axiomatic foundations of the agent’s decision problem.This omission may re‡ect a belief that,because the agent is presumed to be an expected utility maximizer,subjective expected utility theory provides such a foundation.According to this belief actions are labels of the acts they induce and do not enter the agent’s preferences directly.A careful examination reveals that this belief,while containing an important element of truth,is not fully justi…ed.Situations requiring decision making under uncertainty in which,prior to the resolution of uncertainty,the decision maker may take costly measures that ¤My thinking on the issues discussed in this paper was inspired and stimulated by conver-sations with Steven Matthews,for which I am deeply grateful.I also bene…ted from Steve’s comments on an earlier version of this paper.I am particularly grateful to Larry Epstein and Peter Wakker for their insightful comments and suggestions and to John Quiggin for calling my attention to some of the references.Suggetions of Simon 
Grant,Matthew Jackson,Mark Machina,and Zvi Safra help improve the exposition.1See Hart and Holmstrom(1987)and Chambers and Quiggin(2000).1a¤ect the consequences associated with some states do not arise exclusively in the context of principal-agent relationships.Indeed,such situations are com-mon in economic analysis and often arise even when the decision maker acts in isolation.For instance,Robinson Crusoe may have to choose between building his hut on a‡ood plane or on high grounds and thereby,at a cost,a¤ect the consequences of‡ooding.Robinson Crusoe’s choice is not adequately captured by traditional subjective expected utility theory because in that theory(e.g., Savage(1954))the choice set includes all acts(that is,all assignment of con-sequences to states)and,more importantly,acts do not enter the preferences directly.This framework is inadequate to describe the choice facing Crusoe. Building his hut on high grounds involves a larger expense of time and e¤ort. Moreover,because the time and e¤ort to be spent must be determined in ad-vance,they cannot vary with the state of nature(e.g.,they cannot depend on the amount of precipitation).Put di¤erently,to…t into Savage’s framework, the time and e¤ort spent building the hut must be incorporated into the de-scription of the consequences and the set of acts must include all assignments of consequences to states.The problem is that the decision of where to locate the hut must be made prior to the realization of the state yet acts such as f described below must be contemplated:States=ActsfPut in more general terms,let X be the set of outcomes and denote by A the set of actions(e.g.,the hut damaged by a‡ood is a consequence and time and e¤ort spent is an action).If consequences are de…ned as action-outcome pairs then Savage’s set of acts is given by(A£X)S;where S denotes the set of states of nature.But,as the above example suggests,this set contains acts that are incredible.To suppose that the decision maker is able to 
express meaningful preferences as regards such acts creates a conceptual di¢culty with the theory.2 This di¢culty is averted if,for instance,the only cost of actions is the …nancial cost and the consequences represent…nancial rewards then actions are indeed labels of the corresponding acts.To see this,let c(a)denote the…nancial cost of implementing the action a and denote by x(s;a)the state-contingent action-dependent monetary payo¤,then the state-contingent payo¤function x a(¢)=x(¢;a)¡c(a)is a Savage-type act and a is its label.In this example the fact that the cost of action and the reward are additive is essential.However, if the action involves spending time(e.g.,while seeking employment)and the state-contingent payo¤is monetary(e.g.,wages or unemployment insurance bene…ts depending on the eventual state of employment)then,not being perfect substitutes,actions must be separated from acts.2I am indebted to Steve Matthews for suggesting this discussion of Savage’s theory.2It seems,therefore,that an examination of the axiomatic foundations of sub-jective expected utility theory for decision makers with direct preferences over actions is warranted.As in other…elds,this study is intended to uncover andallow clearer understanding of the behavioral premises underlying the theory,thereby opening the way to a critical examination of its basic tenets.In a recent paper(Karni[2003a])I developed an axiomatic theory of Bayesiandecision making founded on the notion that,by taking appropriate actions, the decision maker can prevent certain events from occurring.In other words,through his choice of action the decision maker might delimit the relevant state-space.This approach is at variance with most formulations of subjective ex-pected utility theory and the state-space formulation of agency theory,in whichthe state-space is unalterable.In this paper I pursue an alternative,more tra-ditional,approach in which decision makers are supposed to have preference relations on M(A)£F 
j A j;where M(A)is the set of probability measure on A; and F is the subset of X S containing all the simple acts:The resulting model …ts naturally situations in which the eventual outcome is determined by thestate of nature and state-independent actions taken by the decision maker,suchas the decisions of the agent in agency theory.I will argue below(see Section 3.2)that the two approaches are complementary and that there are situations in which taking a combined approach is appropriate.2Preference Structures and Representations 2.1PreliminariesLet S be an in…nite set whose elements are states of nature and whose subsets are events.Consequences,or outcomes,are real numbers representing monetary payo¤s.Let the set of outcomes,X;be a compact interval in the real line. Following Savage(1954),an act is a function from the set of states to the set of outcomes.Let F denote the set of simple acts,namely,acts under which the image of S is…nite.Assume that F is endowed with the product topology,then it is a compact set.Denote by f a generic element of F and by x the constant act f(s)=x for all s2S:Let A=f a1;:::;a n g;2·n<1be a set of actions and denote by M(A) the set of probability measures on A:The choice space is the product set D= M(A)£F n endowed with the product topology.Elements of D are alternatives and are denoted by(®;f).An alternative(®;f)is a risky prospect that assignsto the action-act pair(a i;fi )2A£F the probability®i;i=1;:::;n:A preference relation is a complete and transitive binary relation,<;on D: The strict preference relation,Â;and the indi¤erence relation,»,are de…ned from<as usual.Assume that preference relations satisfy the properties described by the fol-lowing axioms which were introduced,in a di¤erent context,by Karni and Safra (2000).(A.1)(Continuity)For all(¯;g)2D the sets f(®;f)j(®;f)<(¯;g)g and3f(x;®)j(¯;g)<(®;f)g are closed.The second axiom states that every action matters.Formally,let e i be the degenerate lottery in M(A)that assigns a i 
the unit probability mass(that is, e i is the unit vector in R n whose i¡th coordinate is1and the other coordinates are all0)then(A.2)(Essentiality)For all a i2A;there are f;g;2F n such that¡e i;f¢Â¡e i;g¢:The next axiom requires that the evaluation of coordinates is independent in the sense that preferences among alternatives of the form(e i;(f;f¡i));where the n¡tuple of acts(h;f¡i)is h if i and agrees f on i2f1;:::;i¡1;i+1;:::;n g; depends solely on the i¡th coordinate of f.Formally,(A.3)(Certainty Principle)For all f;g;h;l2F n;(e i;(f;f¡i))<(e i;(g;g¡i))if and only if(e i;(f;h¡i))<(e i;(g;l¡i)):In view of the certainty principle,to simplify the notations,I shall henceforth write(a i;f)<(a j;f0)instead of(e i;(f;f¡i))<(e i;(f0;g¡j)):To introduce the next axiom de…ne partial mixture operation on D:Let (®;f)and(¯;f)in D then(¸®+(1¡¸)¯;f);¸2[0;1]is an element of D such that(¸®+(1¡¸)¯)i=¸®i+(1¡¸)¯i:This may be interpreted as a two-stage lottery in which in the…rst stage the alternatives(®;f)and(¯;f)are obtained with probabilities¸and1¡¸;respectively.In the second stage the coordinate of f is selected by the lottery,®or¯,that corresponds to the outcome of the…rst stage.With this interpretation in mind assume that the decision maker prefers (®;f)over(®0;g)and(¯;f)over(¯0;g):Moreover,assume that if a decisionmaker faces a choice between the alternatives d=(¸®+(1¡¸)¯;f)and d0=¡¸®0+(1¡¸)¯0;g¢he reasons as follows:If the event whose probability is¸is realized then he participates in the lottery(®;f)if he has chosen d and in the lottery(®0;g)if he has chosen d0:Conditional on the realization of this event,he is better o¤with d:By the same logic he would also prefer d over d0conditional on the realization of the event whose probability is1¡¸:Consequently,he prefers d over d0unconditionally.Formally,(A.4)Constrained Independence-For all(®;f),(¯;f),(®0;g),(¯0;g)inD and¸2[0;1)if(®;f)»(®0;g)then(¯;f)<(¯0;g)if and only if (¸®+(1¡¸)¯;f)<¡¸®0+(1¡¸)¯0;g¢:Constrained independence is 
weaker than the independence axiom of ex-pected utility theory(see discussion in Karni and Safra[2000]).A real valued function V on D is said to represent<if for all(®;f)and (¯;g)in D;(®;f)<(¯;g)if and only if V(®;f)¸V(¯;g):Theorem(Karni and Safra[2000])Let<be a binary relation on D.Then the following conditions are equivalent:4(a)<is a preference relation satisfying(A.1)-(A.4).(b)There exist continuous non-constant functions V:D!R and U(¢;a i):F!R;a i2A,such that V represents<and,for all(®;f)2D;V(®;f)=nX i=1®i U(f i;a i):Moreover,if there are functions W:D!R and W(¢;a i):F!R;a i2 A;such that W represents<and W(®;f)=P n i=1®i W(f i;a i);then W(¢;a i)=BU(¢;a i)+C i;B>0;a i2A.2.2The main resultFor each a2A de…ne a conditional preference relation,<a on F;as follows:For all f;g2F;f<a g if and only if(a;f)<(a;g).Assume that,for every given action,the decision maker’s conditional preference relation over the set of acts satis…es Savage’s(1954)postulates.Formally,(A.5)(Subjective expected utility preferences on acts)For every a2A;<a on F satis…es all of Savage’s(1954)axioms.In addition to being a weak order satisfying the Savage(1954)postulates, as required by(A.5),the decision maker’s preferences also satisfy two action-independence conditions.The…rst asserts that the ordinal ranking of the con-sequences(that is,constant acts)is independent of the action.Formally: (A.6)(Action-independent consequences ordering)For all a;a02A andconstant acts x;x02F;x<a x0if and only if x<a0x0:Note that,since the consequences are real numbers representing the decision maker’s wealth it is natural to assume that the preferences are monotonic in x: In this case(A.6)may be restated as follows:For every a2A;x<a x0if and only if x¸x0:Moreover,in view of(A.6)de…ne x<x0if x<a x0for some a2A:An act f is said to agree with another act f0on an event E if f(s)=f0(s) for all s2E:For every given event E denote by f E h the act that agrees with f on E and agrees with h on S¡E:An event,E;is null if for 
all a2A and f;f0;h2F;(a;f E h)»(a;f0E h):An event is nonnull if it is not null.The second assumption requires that the decision maker’s betting preferences be independent of the action.In other words,the agent prefers to bet on an event E rather than on G when taking the action a if and only if he prefers to bet on E rather than on G when taking any other action,a0:Formally:(A.7)(Action-independent betting preferences)For any given events,Eand G;for all a;a02A and x;x02X such that xÂx0;x E x0<a x G x0if and only if x E x0<a0x G x0:5Note that only the action-independence conditions(A.6)and(A.7)are im-posed explicitly.The independence of the consequence ordering and betting preferences across events is a consequence of the assumed subjective expected utility preferences.Theorem1below asserts that a preference relation on action-act pairs that satis…es(A.1)-(A.7)has a subjective expected utility representation with unique,action-independent,probability on the state-space and action-dependent utility function on the set of consequences.Theorem1Let<be a binary relation on D;then the following conditions are equivalent:(i)<is a preference relation satisfying(A.1)-(A.7).(ii)There exist nonatomic probability measure,¼;on S,and continuous real-valued functions,u on X andªon A£X;such that V represents<and, for all(®;f)2D;V(®;f)=nX i=1®i Z Sª(a i;u(f(s))d¼(s);(1)whereª(a;¢)is monotonic increasing transformation.Moreover,¼is unique and u is unique up to positive linear transformation.Remark:In reality decision makers choose among action-act pairs.Theo-rem1implies that the restriction of the preference relation<to A£F has the following representation:For all(a;f)and(a0;f0)in A£F; (a;f)<(a0;f0),Z Sª(a;u(f(s))d¼(s)¸Z Sª(a0;u(f0(s)))d¼(s):(2) Hence the representation in(2)is applicable.Proof.The proof that(ii)!(i)is straightforward.I shall prove that (i)!(ii):Axiom(A.5)and Savage’s(1954)theorem imply that,for every given a2A; there exist a nonatomic probability measure¼(¢;a)on S 
and a real-valued function,u;on X such that,U(f;a)=Z S u(f(s);a)d¼(s;a):(3)where¼(¢;a)is unique and u(¢;a)is unique up to positive linear transformation. Thus,by the theorem of Karni and Safra(2000),for all f;f02F;f<a f0,X S u(f(s);a)d¼(s;a)¸Z S u(f0(s);a)d¼(s;a):(4)6Let x;x02X;E;G22S and a;a02A be as in(A.7).Then,by the representation in(4),[u(x;a)¡u(x0;a)][¼(E;a)¡¼(G;a)]¸0(5) if and only if[u(x;a0)¡u(x0;a0)][¼(E;a0)¡¼(G;a0)]¸0;(6) where,for each EµS;¼(E;a)=R E¼(s;a)ds:But,by(A.6),u(x;a)¡u(x0;a)>0if and only if u(x;a0)¡u(x0;a0)>0:Hence¼(E;a)¡¼(G;a)¸0,¼(E;a0)¡¼(G;a0)¸0:(7) But equation(7)implies that,for every EµS and all a;a02A;¼(E;a)=¼(E;a0)=¼(E):To prove the last assertion suppose,by way of negation, that for some E½S and a;a02A;¼(E;a)>¼(E;a0):Because the measures ¼(¢;a);a2A are nonatomic,by Liaponov theorem there exist K½S such that ¼(E;a)>¼(K;a)=¼(K;a0)>¼(E;a0):Substituting K for G in equation (7)leads to a contradiction.Hence¼(¢;a)=¼(¢)for all a2A:Equation(3)together with the Theorem of Karni and Safra(2000)imply thatV(®;f)=nX i=1®i Z S u(f i(s);a i)d¼(s):But,by(A.6),u(¢;a)is ordinally equivalent to u(¢;a0)for all a;a02A:Hence, for all a i2A;u(¢;a i)=ª(a i;u(¢;a i))whereª(a i;¢)is monotonic increasing continuous transformation.Let u(¢)=u(¢;a1)to obtain the representation in (ii):The uniqueness follows from Savage’s theorem.2.3Action-a¢ne utility representationInsofar as the dependence of the utility function on the action is concerned, the result in Theorem1is quite general.In some applications the e¤ect of the choice of action on the decision maker’s well-being calls for a more speci…c functional form of the utility function.For example,when actions correspond to levels of e¤ort,it is commonplace to assume some form of separability between the utility of money and the utility of e¤ort.I explore next the axiomatic underpinnings of two alternative,more speci…c,representations of the decision maker’s preferences.The…rst,action-a¢ne utility representation,was 
used in Grossman and Hart (1983). The next axiom requires that, for any given event $E$, the "intensity of preferences" between consequences on the complementary event $S \setminus E$ be independent of the action. The formal statement of this idea is an adaptation of Wakker's (1987) cardinal consistency axiom. (See Karni (2003, 2003a) for different adaptations of the same idea.)

(A.8) (Action-independent intensity of preferences) For all $f, g, h, l \in F$, $a, a' \in A$, $E \subseteq S$, and $w, x, y, z \in X$, if $f_E w \succsim_a g_E x$, $g_E y \succsim_a f_E z$ and $h_E x \succsim_{a'} l_E w$, then $h_E y \succsim_{a'} l_E z$.

To grasp the meaning of this axiom, think of the preferences $f_E w \succsim_a g_E x$ and $g_E y \succsim_a f_E z$ as indicating that, given the action $a$, the "intensity" of the preference for $y$ over $z$ on the event $S \setminus E$ is sufficiently greater than that of $w$ over $x$ as to reverse the order of preference between $f$ and $g$ on the event $E$. Then (A.8) requires that these intensities of preferences not be contradicted when the action $a$ is replaced by another action $a'$.

Action-independent intensity of preferences, in conjunction with the preceding axiomatic structure, is equivalent to the requirement that the utility functions corresponding to different actions be positive affine transformations of one another. Formally,

Theorem 2. Let $\succsim$ be a binary relation on $D$. Then the following conditions are equivalent:

(i) $\succsim$ is a preference relation satisfying (A.1)-(A.8).

(ii) There exist a nonatomic probability measure $\pi$ on $S$, a real-valued function $u$ on $X$, real-valued functions $\lambda$ and $\mu$ on $A$, where $\lambda$ is nonnegative, and a real-valued continuous function $V$ representing $\succsim$, such that for all $(\alpha, f) \in D$,

$$V(\alpha, f) = \sum_{i=1}^{n} \alpha_i \Big[ \lambda(a_i) \int_S u(f(s))\,d\pi(s) + \mu(a_i) \Big]. \quad (8)$$

Moreover, $\pi$ is unique and $u$ is unique up to positive affine transformation.

Remark: The restriction of $\succsim$ to $A \times F$ implies that, for all $(a, f)$ and $(a', f')$,

$$(a, f) \succsim (a', f') \iff \lambda(a) \int_S u(f(s))\,d\pi(s) + \mu(a) \;\geq\; \lambda(a') \int_S u(f'(s))\,d\pi(s) + \mu(a'). \quad (9)$$

Proof of Theorem 2: Theorem 2 is a corollary of Theorem 1 and the following lemma.

Lemma 3. Let $\succsim$ be a binary relation on $D$ satisfying (A.1)-(A.7). Then the following conditions are equivalent:

(i) $\succsim$ satisfies (A.8).

(ii) There exists a real-valued function $u$ on $\mathbb{R}$ such that, for all $a' \in A$, $u(\cdot, a') = \lambda(a')u(\cdot) + \mu(a')$, where $\lambda(a') \geq 0$ and $u(\cdot, a')$ is the utility function that figures in Theorem 1.

Proof. (i) $\Rightarrow$ (ii): Fix a nonnull event $E \subseteq S$ and $a \in A$; then, by Theorem 1, there exist $f, g \in F$ such that

$$\int_E [u(f(s), a) - u(g(s), a)]\,d\pi(s) = \zeta > 0. \quad (10)$$

For every given $a' \in A$, by Theorem 1, there exist acts $h, l \in F$ that satisfy

$$\int_E [u(h(s), a') - u(l(s), a')]\,d\pi(s) = \varepsilon > 0. \quad (11)$$

By continuity of the functions $u(\cdot, a)$ and $u(\cdot, a')$, for every $\hat{\zeta} \in [-\zeta, \zeta]$ and $\hat{\varepsilon} \in [-\varepsilon, \varepsilon]$, there exist $f', g', h', l' \in F$ such that

$$\int_E [u(f'(s), a) - u(g'(s), a)]\,d\pi(s) = \hat{\zeta} \quad (12)$$

and

$$\int_E [u(h'(s), a') - u(l'(s), a')]\,d\pi(s) = \hat{\varepsilon}. \quad (13)$$

Define $\phi_{a'}$ by $u(\cdot, a') = \phi_{a'} \circ u(\cdot, a)$ for all $a' \in A$. Then $\phi_{a'}$ is a continuous function. To show that $\phi_{a'}$ is positive affine or constant, take $\alpha, \beta, \gamma, \delta \in \mathbb{R}$ such that $-\zeta \leq \alpha - \beta = \gamma - \delta \leq \zeta$ and $-\varepsilon \leq \phi_{a'}(\alpha) - \phi_{a'}(\beta) \leq \varepsilon$. Let $w, x, y, z \in \mathbb{R}$ satisfy $u(w, a)\pi(S \setminus E) = \alpha$, $u(x, a)\pi(S \setminus E) = \beta$, $u(y, a)\pi(S \setminus E) = \gamma$ and $u(z, a)\pi(S \setminus E) = \delta$. Take $\hat{f}, \hat{g} \in F$ such that

$$\int_E \big[ u(\hat{f}(s), a) - u(\hat{g}(s), a) \big]\,d\pi(s) = \alpha - \beta. \quad (14)$$

Thus $\hat{f}_E w \sim_a \hat{g}_E x$ and $\hat{f}_E y \sim_a \hat{g}_E z$. Take $\hat{h}, \hat{l} \in F$ such that

$$\int_E \big[ u(\hat{h}(s), a') - u(\hat{l}(s), a') \big]\,d\pi(s) = \phi_{a'}(\alpha) - \phi_{a'}(\beta). \quad (15)$$

Since $u(\cdot, a') = \phi_{a'} \circ u(\cdot, a)$, this implies $\hat{h}_E w \sim_{a'} \hat{l}_E x$. Applying (A.8) twice yields $\hat{h}_E y \sim_{a'} \hat{l}_E z$. Thus

$$\phi_{a'}(\gamma) - \phi_{a'}(\delta) = \int_E \big[ u(\hat{h}(s), a') - u(\hat{l}(s), a') \big]\,d\pi(s) = \phi_{a'}(\alpha) - \phi_{a'}(\beta). \quad (16)$$

By Wakker (1987), Lemma 4.4, this implies that $\phi_{a'}$ is affine. But $\phi_{a'}$ is nondecreasing. (To see this, let $f_E w \succsim_a f_E x$. But $f_E w \succsim_a f_E w$, $f_E w \succsim_a f_E x$, and $f_E w \succsim_{a'} f_E w$. Thus, by (A.8), $f_E w \succsim_{a'} f_E x$. The conclusion is implied by the representation of $\succsim$.) Hence $\phi_{a'}$ is constant or positive affine. Let $u(\cdot, a) = u(\cdot)$ to obtain (ii).

(ii) $\Rightarrow$ (i): Assume that there exist positive affine or constant transformations $\phi_{a'}$ such that $u(\cdot, a') = \phi_{a'} \circ u(\cdot, a)$ for all $a' \in A$. Fix a nonnull $E \subseteq S$ and suppose that $f_E w \succsim_a g_E x$, $g_E y \succsim_a f_E z$ and $h_E x \succsim_{a'} l_E w$. By Theorem 1, $f_E w \succsim_a g_E x$ if and only if

$$\int_E u(f(s), a)\,d\pi(s) + u(w, a)\pi(S \setminus E) \geq \int_E u(g(s), a)\,d\pi(s) + u(x, a)\pi(S \setminus E), \quad (17)$$

and $g_E y \succsim_a f_E z$ if and only if

$$\int_E u(g(s), a)\,d\pi(s) + u(y, a)\pi(S \setminus E) \geq \int_E u(f(s), a)\,d\pi(s) + u(z, a)\pi(S \setminus E). \quad (18)$$

Hence

$$[u(x, a) - u(w, a)]\pi(S \setminus E) \leq \int_E [u(f(s), a) - u(g(s), a)]\,d\pi(s) \leq [u(y, a) - u(z, a)]\pi(S \setminus E). \quad (19)$$

By positive affineness or constancy of $\phi_{a'}$ these inequalities imply

$$u(x, a') - u(w, a') \leq u(y, a') - u(z, a'). \quad (20)$$

By Theorem 1, $h_E x \succsim_{a'} l_E w$ if and only if

$$\int_E u(h(s), a')\,d\pi(s) + u(x, a')\pi(S \setminus E) \geq \int_E u(l(s), a')\,d\pi(s) + u(w, a')\pi(S \setminus E). \quad (21)$$

Thus

$$[u(x, a') - u(w, a')]\pi(S \setminus E) \geq \int_E [u(l(s), a') - u(h(s), a')]\,d\pi(s). \quad (22)$$

But $u(x, a') - u(w, a') \leq u(y, a') - u(z, a')$ implies

$$\int_E u(h(s), a')\,d\pi(s) + u(y, a')\pi(S \setminus E) \geq \int_E u(l(s), a')\,d\pi(s) + u(z, a')\pi(S \setminus E). \quad (23)$$

Hence, by Theorem 1, $h_E y \succsim_{a'} l_E z$.

2.4 Separately additive representation

In many applications of agency theory it is assumed that the agent's preferences over actions and payoffs are additively separable in income and effort (e.g., Holmstrom [1979]).

Two actions $a, a' \in A$ are elementarily linked if there are $x, x', y, y' \in X$ such that $(a, x) \succ (a, y)$, $(a, x) \sim (a', x')$ and $(a, y) \sim (a', y')$. Two actions $a$ and $a'$ are said to be linked if there exists a finite sequence $a = a_0, a_1, \ldots, a_n = a'$ such that every $a_j$ is elementarily linked with $a_{j+1}$.

The axiomatic underpinning of the separately additive specification consists of the preceding axioms and, in addition, the following independence condition:

(A.9) (Independence) For all $E \subseteq S$, $a, a' \in A$ and $x, x', y, y' \in X$, $(a, x_E y) \succsim (a', x'_E y)$ if and only if $(a, x_E y') \succsim (a', x'_E y')$.

The next theorem characterizes the separately additive representation.
Theorem 4. Let $\succsim$ be a binary relation on $D$ and suppose that every pair of actions $a, a' \in A$ is linked. Then the following conditions are equivalent:

(i) $\succsim$ is a preference relation satisfying (A.1)-(A.9).

(ii) There exist a nonatomic probability measure $\pi$ on $S$, real-valued functions $u$ on $X$ and $\mu$ on $A$, and a real-valued continuous function $V$ representing $\succsim$, such that for all $(\alpha, f) \in D$,

$$V(\alpha, f) = \sum_{i=1}^{n} \alpha_i \Big[ \int_S u(f(s))\,d\pi(s) + \mu(a_i) \Big]. \quad (24)$$

Moreover, $\pi$ is unique and $u$ is unique up to positive affine transformation.

Proof. Let $x, x', y \in X$ and $E \subseteq S$, such that neither $E$ nor its complement in $S$ is null, satisfy $(a, x_E y) \sim (a', x'_E y)$. (The existence of such outcomes and such an event follows from continuity and the fact that $a$ and $a'$ are linked.) Then, by Theorem 2 and (A.9),

$$\pi(E)[\lambda(a)u(x) - \lambda(a')u(x') + \mu(a) - \mu(a')] = [\lambda(a') - \lambda(a)](1 - \pi(E))u(y) \quad (25)$$

if and only if

$$\pi(E)[\lambda(a)u(x) - \lambda(a')u(x') + \mu(a) - \mu(a')] = [\lambda(a') - \lambda(a)](1 - \pi(E))u(y') \quad (26)$$

for all $y' \in X$. Hence

$$[\lambda(a') - \lambda(a)](1 - \pi(E))[u(y) - u(y')] = 0 \quad (27)$$

for all $y, y' \in X$. But $u$ is nonconstant, hence $\lambda(a') = \lambda(a)$ for all $a, a' \in A$. The representation in (24) and the uniqueness follow from Theorem 2.

3 Discussion

3.1 Formulations of Principal-Agent Problems

To begin with, consider the application of the models developed in the preceding section to agency theory. A technology $t$ is a mapping from the set $A \times S$ to the set of consequences. The agent's choice of an action $a \in A$ and nature's "choice" of a state $s \in S$ jointly determine the outcome $t(a, s) \in \mathbb{R}$. Thus, given a technology $t$ and an action $a$, $t(a, \cdot)$ is an act in $F$. In other words, $t_a := t(a, \cdot) \in F$ is a random variable whose value $t_a(s)$ is the monetary payoff associated with the action $a$ in the state $s$. The independent support assumption often invoked in the formulation of agency problems requires that the range of possible payoffs be uniform across actions, that is, $T = t_a(S)$ for all $a \in A$. (See Shavell (1979).)

An incentive contract is a function $w: T \to \mathbb{R}$, where $w(x)$ is the payoff to the agent as a function of the realized outcome $x$. Let $W$ denote the set of all incentive contracts. Note that, given a technology $t$, each action $a$ and incentive contract $w$ correspond to the act $\hat{w}(a, w) \in F$ defined by $\hat{w}(a, w)(s) = w(t_a(s))$ for all $s \in S$. Similarly, the residual payoff $t_a - \hat{w}(a, w)$ is also an act in $F$.

Suppose that the principal's preference relation $\succsim_P$ on $F$ satisfies Savage's (1954) axioms and that the agent has a preference relation $\succsim_A$ on $M(A) \times F^{|A|}$ satisfying the axioms in Theorem 1, 2, or 4.

Assume that the production technology is common knowledge, the level of output is verifiable, and the choice of action is the private information of the agent. Under these assumptions the information regarding the state becomes a focal issue in the formulation of the agency problem. In decision theory it is generally assumed that the state of nature becomes known ex post: presumably, in order to verify that he is correctly rewarded according to the act that he chose, the decision maker must be able to observe the state. However, if the state and the consequences are observable and the technology is known, each action the agent may choose can be inferred on some events (subsets of states). Hence the principal is in a position to implement the first-best solution by imposing a sufficiently large penalty on all the event-output combinations that, under the known technology, indicate that the agent chose an action other than that required to attain the first-best solution. Thus, for the agency problem to be nontrivial, it is necessary to assume that the states are not directly observable. (See, for example, Quiggin and Chambers (1998).) This is analogous to the independent support assumption.

Within this framework the principal's problem may be stated as follows: choose an action-contract pair $(a^*, w^*)$ such that

$$\big(t_{a^*} - \hat{w}(a^*, w^*)\big) \succsim_P \big(t_a - \hat{w}(a, w)\big) \quad \text{for all } (a, w) \in A \times W, \quad (28)$$

subject to the incentive compatibility constraints

$$\big(a^*, \hat{w}(a^*, w^*)\big) \succsim_A \big(a, \hat{w}(a, w^*)\big) \quad \text{for all } a \in A. \quad (29)$$
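With finite sets of states, actions, and candidate contracts, the principal's problem (28) subject to the incentive compatibility constraints (29) can be solved by direct enumeration. The sketch below is a hypothetical toy instance of my own construction, not an example from the text: it assumes a risk-neutral principal, a square-root agent utility of income, and an agent who ranks action-contract pairs by the separately additive representation of equation (24), with the effort cost entering through a negative $\mu(a)$.

```python
import itertools

# Illustrative, hypothetical primitives (not from the paper):
# two states with action-independent subjective probabilities pi,
# and a technology t[a][s] giving the output of action a in state s.
states = [0, 1]
pi = [0.5, 0.5]
t = {"low": [4.0, 4.0], "high": [4.0, 16.0]}   # technology t_a(s)
mu = {"low": 0.0, "high": -1.0}                # mu(a): effort cost enters negatively
u = lambda x: x ** 0.5                         # agent's utility of income

# Candidate incentive contracts w: realized output -> agent's pay.
outputs = sorted({x for row in t.values() for x in row})
pay_grid = [0.0, 1.0, 2.0, 4.0]
contracts = [dict(zip(outputs, p))
             for p in itertools.product(pay_grid, repeat=len(outputs))]

def agent_value(a, w):
    # Separately additive representation (24): E[u(w(t_a(s)))] + mu(a).
    return sum(pi[s] * u(w[t[a][s]]) for s in states) + mu[a]

def principal_value(a, w):
    # Risk-neutral principal: expected residual payoff t_a - w(t_a).
    return sum(pi[s] * (t[a][s] - w[t[a][s]]) for s in states)

best = None
for a, w in itertools.product(t, contracts):
    # Incentive compatibility (29): a must be a best reply of the agent to w.
    if all(agent_value(a, w) >= agent_value(b, w) for b in t):
        cand = (principal_value(a, w), a, w)
        if best is None or cand[0] > best[0]:
            best = cand

print(best[1], best[2], round(best[0], 3))
# -> high {4.0: 0.0, 16.0: 4.0} 8.0
```

In this toy instance the optimum pays the agent only when the high output is realized, which is exactly the kind of incentive scheme the nontriviality discussion above anticipates: with states unobservable, the principal can reward only the verifiable output.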
bayesian data analysis: Overview and Explanation

1. Introduction

1.1 Overview

Bayesian data analysis is a statistical methodology, grounded in Bayes' theory, for data modelling, parameter estimation, and inference. By combining prior knowledge with observed data, the method draws probabilistic inferences about unknown parameters and thereby provides more accurate and reliable statistical conclusions.

1.2 Structure of this article

This article first introduces the basic principles and methods of Bayesian data analysis, including an overview of Bayesian statistics and the fundamentals of data analysis. It then explains the core concepts of Bayesian data analysis in detail: prior knowledge and posterior inference; Bayes' formula and Bayes' theorem; and methods for parameter estimation and model selection. Next, worked examples show how Bayesian data analysis is applied to practical problems, covering the experimental design and data collection stage, the model building and parameter estimation stage, and the interpretation of results and decision inference stage. Finally, the main observations and findings are summarised, and future trends in Bayesian data analysis are discussed.

1.3 Purpose

This article aims to introduce readers to the concepts, principles, and application areas of Bayesian data analysis, together with examples that make the method easier to understand and apply. After reading it, readers should grasp the basic principles and core concepts of Bayesian data analysis and its use in practical problems, providing a useful reference for their own research or practice. The article also discusses future trends in Bayesian data analysis, so as to promote its application and development in a wider range of fields.
2. Introduction to Bayesian Data Analysis

2.1 Overview of Bayesian statistics

Within statistics, Bayesian statistics is an approach to inference based on Bayes' formula. It uses prior knowledge together with observed data to update the probability distribution of unknown quantities. Unlike traditional frequentist statistics, Bayesian statistics allows subjective judgement to enter the inference process and quantifies uncertainty as probability distributions.

2.2 Basic principles and methods of data analysis

Data analysis is the process of handling and interpreting collected data in order to obtain meaningful information. Bayesian data analysis combines known prior information with new observations to perform parameter estimation, model selection, and prediction.
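The prior-plus-data workflow described above has a particularly clean form in conjugate models. The following minimal sketch (my own illustration, with made-up numbers) updates a Beta prior on a success probability with binomial data, for which the posterior is again a Beta distribution.

```python
from math import isclose

# Conjugate Beta-Binomial updating: prior Beta(a, b) plus k successes
# in n trials yields posterior Beta(a + k, b + n - k).
def update(a, b, k, n):
    return a + k, b + (n - k)

a0, b0 = 2.0, 2.0            # prior belief: success rate around 0.5
k, n = 7, 10                 # observed data: 7 successes in 10 trials
a1, b1 = update(a0, b0, k, n)

prior_mean = a0 / (a0 + b0)          # 0.5
post_mean = a1 / (a1 + b1)           # (2 + 7) / (4 + 10) = 9/14
print(prior_mean, post_mean)
```

The posterior mean (9/14, about 0.64) sits between the prior mean (0.5) and the sample frequency (0.7), showing how the prior and the data are blended.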
Submission to the 2006 Annual Conference of the Chinese Economics Association. Research area: mathematical economics and econometrics.

Bayesian Econometrics: From Priors to Conclusions

Liu Leping [1]

Abstract (Chinese): This paper discusses the basic principles of Bayesian econometric modelling from the perspective of modern Bayesian analysis and modern Bayesian inference. A concrete application example is used to introduce the main steps of WinBUGS, a computational package commonly used in Bayesian econometrics, in the hope that more econometric researchers in China will turn their attention to an important direction of modern econometric research: Bayesian econometrics.

Keywords: Bayesian econometrics, MCMC, WinBUGS

Abstract: Basic principles of Bayesian econometrics, with modern Bayesian statistical analysis and Bayesian statistical inference, are reviewed. The MCMC computation method and the Bayesian software WinBUGS are introduced through an application example.

KEYWORDS: Bayesian Econometrics, MCMC, WinBUGS
JEL Classifications: C11, C15

[1] Professor, School of Statistics, Tianjin University of Finance and Economics; adjunct professor, Center for Applied Statistics, Renmin University of China. Email: liulp66@ . Supported by the Tianjin 2005 social science research planning project [TJ05-TJ001] and a major project of the Center for Applied Statistics, Renmin University of China (05JJD910152).

1. Introduction

The American Economic Association conferred its 2002 Distinguished Fellow Award on Professor Arnold Zellner of the University of Chicago, in recognition of his outstanding contributions to econometrics through Bayesian methods.
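The MCMC machinery behind packages such as WinBUGS can be illustrated in a few lines. The sketch below is my own toy example, not taken from the paper: a random-walk Metropolis sampler targeting a standard normal log-posterior, which stands in for the intractable posterior of a real econometric model.

```python
import math
import random

random.seed(1)

def log_post(theta):
    # Toy log-posterior: standard normal, up to an additive constant.
    return -0.5 * theta * theta

def metropolis(n_iter=20000, step=1.0):
    theta, out = 0.0, []
    for _ in range(n_iter):
        prop = theta + random.gauss(0.0, step)   # random-walk proposal
        # Accept with probability min(1, post(prop) / post(theta)).
        if math.log(random.random()) < log_post(prop) - log_post(theta):
            theta = prop
        out.append(theta)
    return out

draws = metropolis()
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
print(round(mean, 2), round(var, 2))
```

The sample mean and variance should be close to the target's values of 0 and 1; Gibbs sampling, the scheme WinBUGS is built around, replaces the accept/reject step with draws from full conditional distributions.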
Bayesian Analysis (2006) 1, Number 3, pp. 403-420

Subjective Bayesian Analysis: Principles and Practice

Michael Goldstein∗

Abstract. We address the position of subjectivism within Bayesian statistics. We argue, first, that the subjectivist Bayes approach is the only feasible method for tackling many important practical problems. Second, we describe the essential role of the subjectivist approach in scientific analysis. Third, we consider possible modifications to the Bayesian approach from a subjectivist viewpoint. Finally, we address the issue of pragmatism in implementing the subjectivist approach.

Keywords: coherency, exchangeability, physical model analysis, high reliability testing, objective Bayes, temporal sure preference

1 Introduction

The subjective Bayesian approach is based on a very simple collection of ideas. You are uncertain about many things in the world. You can quantify your uncertainties as probabilities for the quantities you are interested in, and conditional probabilities for observations you might make given the things you are interested in. When data arrives, Bayes theorem tells you how to move from your prior probabilities to new conditional probabilities for the quantities of interest. If you need to make decisions, then you may also specify a utility function, given which your preferred decision is that which maximises expected utility with respect to your conditional probability distribution.

There are many compelling accounts explaining how and why this view should form the basis for statistical methodology; see, for example, Lindley (2000) and the accompanying discussion. Careful treatments of the Bayesian approach are given in, for example, Bernardo and Smith (1994), O'Hagan and Forster (2004) and Robert (2001). In particular, Lad (1996) provides an excellent introduction to the subjectivist viewpoint, with a wide-ranging collection of references to the development of this position.

Moving from principles to practice can prove very challenging and so there are many flavours of Bayesianism
reflecting the technical challenges and requirements of different fields. In particular, a form of Bayesian statistics, termed "objective Bayes", aims to gain the formal advantages arising from the structural clarity of the Bayesian approach without paying the "price" of introducing subjectivity into statistical analysis. Such attempts raise important questions as to the role of subjectivism in Bayesian statistics. This account is my subjective take on the issue of subjectivism.

My treatment is split into four parts. First, the subjectivist Bayes approach is the only feasible method for tackling many important practical problems, and in Section 2 I'll give examples to illustrate this. Next, in Section 3, I'll look at scientific analyses, where the role of subjectivity is more controversial, and argue the necessity of the subjective formulation in this context. In Section 4, I'll consider how well the Bayes approach stands up to scrutiny from the subjective viewpoint itself. In Section 5, I'll discuss the issue of pragmatism in implementing the subjectivist approach. In conclusion, I'll comment on general implications for developing the full potential of the subjectivist approach to Bayesian analysis.

∗Department of Mathematical Sciences, University of Durham, UK, :8000/stats/people/mg/mg.html
© 2006 International Society for Bayesian Analysis ba0003

2 Applied subjectivism

Among the most important growth areas for Bayesian methodology are those applications that are so complicated that there is no obvious way even to formulate a more traditional analysis. Such applications are widespread; for many examples, consult the series of Bayesian case studies volumes from the Carnegie Mellon conference series.
Here are just a couple of areas that I have been personally involved in, with colleagues at Durham, chosen so that I can discuss, from the inside, the central role played by subjectivity.

2.1 High reliability testing for complex systems

Suppose that we want to test some very complicated system; a large software system would be a good example of this. Software testing is a crucial component of the software creation cycle, employing large numbers of people and consuming much of the software budget. However, while there is a great deal of practical expertise in the software testing community, there is little rigorous guidance for the basic questions of software testing, namely how much testing a system needs, and how to design an efficient test suite for this purpose. Though the number of tests that we could, in principle, carry out is enormous, each test has non-trivial costs, both in time and money, and we must plan testing (and retesting given each fault we uncover) to a tight time/money budget. How can we design and analyse an optimal test suite for the system?

This is an obvious example of a Bayesian application waiting to happen. There is enormous uncertainty and we are forced to extrapolate beliefs about the results of all the tests that we have not carried out from the outcomes of the relatively small number of tests that we do carry out. There is a considerable amount of prior knowledge carried by the testers who are familiar with the ways in which this software is likely to fail, both from general considerations and from testing and field reports for earlier generations of the software. The expertise of the testers therefore lies in the informed nature of the prior beliefs that they hold. However, this expertise does not extend to an ability to analyse, without any formal support tools, the conditional effect of test observations on their prior beliefs, still less to an ability to design a test system to extract optimum information from this extremely complex and interconnected probabilistic
system.

A Bayesian approach proceeds as follows. First, we construct a Bayesian belief net. In this net, the ancestor nodes represent the various general reasons that the testers may attribute to software failure, for example incorrectly stripping leading zeroes from a number. The links between ancestor nodes show relationships between these types of failure. The child nodes are the various test types, where the structuring ensures that all tests represented by a given test node are regarded exchangeably by the testers. Second, we quantify beliefs as to the likely levels of failure of each type and the conditional effects of each failure type on each category of test outcome. Finally, we may choose a test suite to optimise any prespecified criterion, either based on the probability of any faults remaining in the system or on the utility of allowing certain types of failure to pass undetected at software release. This optimality calculation is tractable even for large systems. This is because what concerns us, for any test suite, is the probability of faults remaining given that all the chosen tests are successful, provided any faults that are detected will be fixed before release. In principle, this methodology, by combining Bayesian belief networks with optimal experimental design, is massively more efficient and flexible than current approaches.
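The key quantity above, the probability of faults remaining given that every chosen test passes, can be sketched in miniature. This is not the model of the paper, just a toy illustration under strong independence assumptions; the fault types, prior probabilities and detection probabilities below are all invented.

```python
# Toy sketch: P(some fault remains | all chosen tests pass), assuming
# fault types independent and tests conditionally independent given faults.
fault_prior = {"leading_zero": 0.10, "rounding": 0.05}   # P(fault present)
# Each test maps fault type -> P(test fails | that fault is present);
# a test never fails if the faults it targets are absent (an assumption).
tests = [
    {"leading_zero": 0.8},
    {"leading_zero": 0.8, "rounding": 0.5},
]

def p_fault_remains(fault_prior, tests):
    """Return (P(at least one fault present | all tests passed),
    per-fault posterior probabilities)."""
    p_all_absent = 1.0
    post = {}
    for fault, prior in fault_prior.items():
        # P(all tests pass | fault present) = product of miss probabilities
        p_pass_given_present = 1.0
        for t in tests:
            p_pass_given_present *= 1.0 - t.get(fault, 0.0)
        num = prior * p_pass_given_present      # fault present AND tests pass
        den = num + (1.0 - prior)               # tests surely pass if absent
        post[fault] = num / den
        p_all_absent *= 1.0 - post[fault]
    return 1.0 - p_all_absent, post

p_remain, post = p_fault_remains(fault_prior, tests)
```

Scoring each candidate suite by this posterior (or by an expected utility built from it) and searching over suites is what turns the belief net into a design tool; the real networks are far larger, with dependent fault types.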
Is the approach practical? From our experiences working with an industrial partner, I would say definitely yes. A general overview of the approach that we developed is given in Wooff et al. (2002). As an indication of the potential increase in efficiency, we found, in one case study, that Bayesian automatic design provided eight tests which together were more efficient than 233 tests designed by the original testing team, and identified additional tests that were appropriate for checking areas of functionality that had not been covered by the original test suite. This is not a criticism of the testers, who were very experienced, but simply illustrates that optimal multi-factor probabilistic design is very difficult. The value of the subjectivist approach lies in translating the complicated but informal generalised uncertainty judgements of the experts into a language which allows for precise and rigorous analysis. In system testing, the careful use of this language offers enormous potential for clarity and efficiency gains.

Of course, there are many issues that must be sorted out before such benefits can be realised, from the construction of user-friendly interfaces for building the models to (a much larger obstacle!) the culture change required to recognise and routinely exploit such methods. However, the subjective Bayes approach does provide a complete framework for quantifying and managing the uncertainties of high-reliability testing. It is hard to imagine any other approach which could do so.

2.2 Complex physical systems

Many large physical systems are studied through a combination of mathematical modelling, computer simulation and matching against past physical data, which can, hopefully, be used to extrapolate future system behaviour; for example, this accounts for much of what we claim to know about the nature and pace of global climate change.
Such analysis is riddled with uncertainty. In climate modelling, each computer simulation can take between days and months, and requires many input parameters to be set, whose values are unknown. Therefore, we may view computer simulations with varied choices of input parameters as a small sample of evaluations from a very high dimensional unknown function. The only way to learn about the input parameters is by matching simulator output to historical data, which is, itself, observed with error. Finally, and often most important, the computer simulator is just a model, and we need to consider the ways in which the model and reality may differ.

Again, the subjectivist Bayesian approach offers a framework for specifying and synthesising all of the uncertainties in the problem. There is a wide literature on the probabilistic treatment of computer models; a good starting point with a wide collection of references is the recent volume Santner et al. (2003). Our experience at Durham started with work on oil reservoir simulators, which are constructed to help with all the problems involved in efficient management of reservoirs. Typically, these are very high dimensional computer models which are very slow to evaluate. The approach that we employed for reservoir uncertainty analysis was based on representing the reservoir simulator by an emulator. This is a probabilistic description of our beliefs about the value of the simulator at each input value. This is combined with statements of uncertainty about the input values, about the discrepancy between the model and the reservoir and about the measurement uncertainty associated with the historical data. This completely specified stochastic system provides a formal framework allowing us to synthesise expert elicitation, historical data and a careful choice of simulator runs. While there are many challenging technical issues arising from the size and complexity of the system, this specification does allow us to
identify "correct" settings for simulator inputs (often termed history matching in the oil industry), see Craig et al. (1996), and to assess uncertainty for forecasts of future behaviour of the physical system, see Craig et al. (2001). Our approach relies on a Bayes linear foundation (which I'll discuss in Section 4) to handle the technical difficulties involved with the high dimensional analysis; for a full Bayes approach for related problems, see Kennedy and O'Hagan (2001).

Our approach has been implemented in software employed by users in the oil industry, through our collaborators ESL (Energy SciTech Limited). This means that we get to keep track, just a little, of how the approach works in practice. Here's an example of the type of success which ESL has reported to us. They were asked to match an oilfield containing 650 wells, based on one million plus grid cells (for each of which permeability, porosity, fault lines, etc. are unknown inputs). Finding the previous best history match had taken one man-year of effort. Our Bayesian approach, starting from scratch, found a match using 32 runs (each lasting 4 hours and automatically chosen by the software), with a fourfold improvement according to the oil company measure of match quality. This kind of performance is impressive, although, of course, these remain very hard problems and much must still be done to make the approach more flexible, tractable and reliable.

Applications such as these make it clear that careful representation of subjective beliefs can give much improved performance in tasks that people are already trying to do. There is an enormous territory where subjective Bayes methods are the only feasible way forward. This is not to discount the large amount of work that must often be done to bring an application into Bayes form, but simply to observe that for such applications there are no real alternatives. In such cases, the benefits from the Bayesian formulation are potentially very great and clearly demonstrable. The only remaining
issue, therefore, is whether such benefits outweigh the efforts required to achieve them. This "pain to gain" ratio is crucial to the success of subjective Bayes applications. When the answer really matters, such as for global climate change, the pain threshold would have to be very high indeed to dissuade us from the analysis.

By explicitly introducing our uncertainty about the ways in which our models fall short of reality, the subjective Bayes analysis also does something new and important. Only technical experts are concerned with how climate models behave, while everybody has an interest in how global climate will actually change. For example, the Guardian newspaper leader on Burying Carbon (Feb 3, 2005) tells us that "the chances of the Gulf Stream - the Atlantic thermohaline circulation that keeps Britain warm - shutting down are now thought to be greater than 50%." This sounds like something we should know. However, I am reasonably confident that no climate scientist has actually carried out an uncertainty analysis which would be sufficient to provide a logical bedrock for such a statement. We can only use the analysis of a global climate model to guide rational policy towards climate change if we can construct a statement of uncertainty about the relation between analysis from the climate model and the behaviour of the real climate.
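The emulator and history-matching ideas of Section 2.2 can be sketched at toy scale. Everything here is invented for illustration: the "simulator" is a cheap function standing in for an expensive code, the emulator is a crude nearest-run predictor rather than the Bayes linear or Gaussian process emulators used in practice, and the discrepancy and observation variances are arbitrary numbers.

```python
import math

# Stand-in for an expensive simulator (each real evaluation may take days).
def simulator(x):
    return math.exp(x / 2.0)

# A handful of design runs plays the role of the small ensemble of evaluations.
design = [0.0, 1.0, 2.0, 3.0, 4.0]
runs = {x: simulator(x) for x in design}

def emulate(x):
    """Crude emulator: predict by the nearest design run, with variance
    growing with distance from that run (purely illustrative numbers)."""
    nearest = min(design, key=lambda d: abs(d - x))
    mean = runs[nearest]
    var = (0.5 * abs(x - nearest) * mean) ** 2 + 1e-6
    return mean, var

z = 4.3                  # historical observation of the real system
obs_var = 0.05 ** 2      # measurement error variance
disc_var = 0.2 ** 2      # model discrepancy variance: model is not reality

def implausibility(x):
    """Standardised distance between observation and emulator prediction,
    inflated by emulator, discrepancy and observation uncertainty."""
    mean, emu_var = emulate(x)
    return abs(z - mean) / math.sqrt(emu_var + disc_var + obs_var)

# History matching: retain only inputs that are not clearly implausible.
not_ruled_out = [i / 10.0 for i in range(41) if implausibility(i / 10.0) < 3.0]
```

The surviving set of inputs (here, a band around the value whose simulator output matches z) is what would be carried forward for further runs and, ultimately, forecasts; note that the discrepancy variance keeps the match honest about the gap between model and system.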
To further complicate the assessment, there are many models for climate change in current use, all of whose analyses should be synthesised as the basis of any judgements about actual climate change. Specifying beliefs about the discrepancy between models and reality is unfamiliar and difficult. However, we cannot avoid this task if we want our statements to carry weight in the real world. A general framework for making such specifications is described in Goldstein and Rougier (2005).

3 Scientific subjectivism

3.1 The role of subjectivism in scientific enquiry

In the kind of applications we've discussed so far, the only serious issues about the role of subjectivity are pragmatic ones. Each aspect of the specification, whether part of the "likelihood function" or the "prior distribution", encodes a collection of subjective judgements. The value of the Bayesian approach lies first in providing a language within which we can express all these judgements and second in providing a calculus for analysing these judgements.

Controversy over the role of subjectivity tends to occur in those areas of scientific experimentation where we do appear to have a greater choice of statistical approaches.
Laying aside the obvious consideration that any choice of analysis is the result of a host of subjective choices, there are, essentially, two types of objections to the explicit use of subjective judgements: those of principle, namely that subjective judgements have no place in scientific analyses; and those of practice, namely that the pain to gain ratio is just too high.

These are deep issues which have received much attention; a good starting place for discussion of the role of Bayesian analysis in traditional science is Howson and Urbach (1989). Much of the argument can be captured in simple examples. Here's one such, versions of which are often used to introduce the Bayesian idea to people who already have some familiarity with traditional statistical analysis.

First, we can imagine carrying out Fisher's famous tea-tasting experiment. Here an individual, Joan say, claims to be able to tell whether the milk or the tea has been added first in a cup of tea. We perform the experiment of preparing ten cups of tea, choosing each time on a coin flip whether to add the milk or tea first. Joan then tastes each cup and gives an opinion as to which ingredient was added first. We count the number, X, of correct assessments. Suppose, for example, that X = 9.

Now compare the tea-tasting experiment to an experiment where an individual, Harry say, claims to have ESP as demonstrated by being able to forecast the outcome of fair coin flips. We test Harry by getting forecasts for ten flips. Let X be the number of correct forecasts. Suppose that, again, X = 9.

Within the traditional view of statistics, we might accept the same formalism for the two experiments, namely that, for each experiment, each assessment is independent with probability p of success. In each case, X has a binomial distribution with parameters 10 and p, where p = 1/2 corresponds to pure guessing. Within the traditional approach, the likelihood is the same, the point null is the same if we carry out a test for whether p = 1/2, and confidence
intervals for p will be the same.

However, even without carrying out formal calculations, I would be fairly convinced of Joan's tea-tasting powers while remaining unconvinced that Harry has ESP. You might decide differently, but that is because you might make different prior judgements. This is what the Bayesian approach adds. First, we require our prior probability, g say, that Harry or Joan is guessing. Then, if not guessing, we need to specify a prior distribution q over possible values of p. Given g and q, we can use Bayes theorem to update our probability that Harry or Joan is just guessing and, if not guessing, we can update our prior distribution over p. We may further clarify the Bayesian account by giving a more careful description of our uncertainty within each experiment based on our judgements of exchangeability for the individual outcomes. This allows us to replace our judgements about the abstract model parameter p with judgements about observable experimental outcomes as the basis for the analysis.

Therefore, the Bayes approach shows us exactly how and where to input our prior judgements. We have moved away from a traditional view of a statistical analysis, which attempts to express what we may learn about some aspect of reality by analysing an individual data set. Instead, the Bayesian analysis expresses our current state of belief based on combining information from the data in question with whatever other knowledge we consider relevant.

The ESP experiment is particularly revealing for this discussion. I used to use it routinely for teaching purposes, considering that it was sufficiently unlikely that Harry would actually possess ESP that the comparison with the tea-tasting experiment would be self-evident. I eventually came to realise that some of my students considered it perfectly reasonable that Harry might possess such powers. While writing this article, I tried googling "belief in ESP" over the net, which makes for some intriguing reading.
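The updating with g and q described above is a short application of Bayes theorem, and makes the Joan/Harry asymmetry concrete. The prior values below are invented for illustration: an open-minded prior probability of guessing for Joan, a sceptical one for Harry, and a discrete distribution q over skill levels; the likelihood is the Binomial(10, p) model from the text.

```python
from math import comb

def binom_lik(x, n, p):
    """Binomial likelihood of x successes in n trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def posterior_guessing(g, q, x=9, n=10):
    """P(just guessing | x successes out of n), for prior guessing
    probability g and a discrete prior q (dict: p -> weight) over
    skill levels p if not guessing."""
    lik_guess = binom_lik(x, n, 0.5)
    lik_skill = sum(w * binom_lik(x, n, p) for p, w in q.items())
    return g * lik_guess / (g * lik_guess + (1 - g) * lik_skill)

q = {0.7: 0.3, 0.8: 0.4, 0.9: 0.3}        # invented skill distribution
joan = posterior_guessing(g=0.5, q=q)      # open-minded about tea tasting
harry = posterior_guessing(g=0.999, q=q)   # sceptical about ESP
# Identical data (X = 9) and likelihood, very different conclusions:
# joan is small, so tea-tasting skill is now plausible; harry stays large.
```

The two posteriors differ only because g differs, which is exactly the point: the likelihood is silent about where the analyses should diverge.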
Here's a particularly relevant discussion from an article in the September 2002 issue of Scientific American, by Michael Shermer, titled "Smart People Believe Weird Things". After noting that, for example, around 60% of college graduates appear to believe in ESP, Shermer reports the results of a study that found "no correlation between science knowledge (facts about the world) and paranormal beliefs." The authors, W. Richard Walker, Steven J. Hoekstra and Rodney J. Vogl, concluded: "Students that scored well on these [science knowledge] tests were no more or less sceptical of pseudo-scientific claims than students that scored very poorly. Apparently, the students were not able to apply their scientific knowledge to evaluate these pseudo-scientific claims. We suggest that this inability stems in part from the way that science is traditionally presented to students: Students are taught what to think but not how to think." Shermer continues as follows: "To attenuate these paranormal belief statistics, we need to teach that science is not a database of unconnected factoids but a set of methods designed to describe and interpret phenomena, past or present, aimed at building a testable body of knowledge open to rejection or confirmation."

The subjective Bayesian approach may be viewed as a formal method for connecting experimental factoids. Rather than treating each data set as though it has no wider context, and carrying out each statistical analysis just as though this were the first investigation that had ever been carried out of any relevance to the questions at issue, we consider instead how the data in question adds to, or changes, our beliefs about these questions. If we think about the ESP experiment in this way, then we should expand the problem description to reflect this requirement. Here is a minimum that I should consider.
First, I would need to assess my probability for E, the event that ESP is a real phenomenon that at least some people possess. This is the event that joins my analysis of Harry's performance with my generalised knowledge of the scientific phenomenon at issue. Conditional on E, I should evaluate my probability for J, the event that Harry possesses ESP. Conditional on J and on J complement, I should evaluate my probabilities for G, the event that Harry is just guessing, and C, the event that either the experiment is flawed or Harry is, somehow, cheating; for example, the coin might be heads biased and Harry mostly calls heads. This is the event that captures my generalised knowledge of the reliability of experimental procedures in this area. If there is either cheating or ESP, I need a probability distribution over the magnitude of the effect.

What do we achieve by this formalism? First, this gives me a way of assessing my actual posterior probability for whether Harry has ESP. Second, if I can lay out the considerations that I use in a transparent way, it is easy for you to see how your conclusions might differ from mine. If we disagree as to whether Harry has ESP, then we can trace this disagreement back to differing probabilities for the general phenomenon, in this case ESP, or different judgements about particulars of the experiment, such as Harry's possible ability at sleight of hand. More generally, by considering the range of prior judgements that might reasonably be made, I can distinguish between the extent to which the experiment might convince me as to Harry's ESP, and the effect it might have on others. I could even determine how large and how stringently controlled an experiment would need to be in order to have a chance of convincing me of Harry's powers. More generally, how large would the experiment need to be to convince the wider community?

The above example provides a simple version of a general template for any scientific Bayesian analysis. There are
scientific questions at issue. Beliefs about these issues require prior specification. Then we must consider the relevance of the scientific formulation to the current experiment along with all the possible flaws in the experiment which would invalidate the analysis. Finally, a likelihood must be specified, expressing data variability given the hypotheses of interest.

There are two versions of the subsequent analysis. First, you may only want to know how to revise your own beliefs given the data. Such private analyses are quite common. Many scientists carry out at least a rough Bayes assessment of their results, even if they never make such analysis public. Second, you may wish to publish your results, to contribute to, or even to settle, a scientific issue. It may be that you can construct a prior specification that is very widely supported. Alternately, it may be that, as with the ESP experiment, no such generally agreed prior specification may be made. Indeed, the disagreement between experts may be precisely what the experiment is attempting to resolve. Therefore, our Bayesian analysis of an experiment should begin with a probabilistic description whose qualitative form can be agreed on by everyone. This means that all features, in the prior and the likelihood, that cause substantial disagreement should have explicit form in the representation, so that differing judgements can be expressed over them. There is a rich literature on elicitation, dealing with how generalised expert knowledge may be converted into probabilistic form; for a recent overview, see Garthwaite et al. (2005).
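The earlier question of how large an experiment would need to be to convince a sceptic can also be answered inside this formalism, at least in sketch form. The numbers here are invented: a sceptic whose prior probability of guessing is 0.999, a subject who answers every trial correctly, and an assumed success probability of 0.9 per trial if the skill is genuine.

```python
def posterior_guessing_all_correct(n, g, p_skill=0.9):
    """P(just guessing | n correct out of n), for prior guessing
    probability g and assumed per-trial success probability p_skill
    when the claimed skill is genuine."""
    lik_guess = 0.5 ** n          # guessing: each trial succeeds w.p. 1/2
    lik_skill = p_skill ** n      # genuine skill: succeeds w.p. p_skill
    return g * lik_guess / (g * lik_guess + (1 - g) * lik_skill)

# Smallest flawless run that moves a sceptic (g = 0.999) below a 5%
# posterior probability of pure guessing:
n = 1
while posterior_guessing_all_correct(n, g=0.999) >= 0.05:
    n += 1
```

Sweeping g over the range of priors held in the community turns this into a planning calculation: an experiment is decisive only if it is large enough to convince the most sceptical reasonable prior.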
As with each other aspect of the scientific argument, such elicitation has two aims: first, to obtain sensible prior values and second, to make clear the scientific basis for assigning these values. Statistical aspects of the representation may employ standard data sharing methodologies such as meta-analysis, multi-level modelling and empirical Bayes, provided all the relevant judgements are well sourced. We can then produce the range of posterior judgements, given the data, which correspond to the range of "reasonable" prior judgements held within the scientific community. We may argue that a scientific case is "proven" if the evidence should be convincing given any reasonable assignment of prior beliefs. Otherwise, we can assess the extent to which the community might still differ given the evidence. We should make this analysis at the planning stage in order to design experiments that can be decisive for the scientific community or to conclude that no such experiments are feasible.

All of this is clear in principle, though implementation of the program may be difficult in individual cases. Each uncertainty statement is a well sourced statement of belief by an individual. If individual judgements differ and if this is relevant, then such differences are reflected in the analysis. In practice, it is unusual to find such a subjectivist approach within scientific analysis. Let us therefore consider objections and alternatives to the subjective Bayesian approach.

3.2 Objections and alternatives to scientific subjectivism

The principled objection to Bayesian subjectivism is that the subjective Bayesian approach answers problems wrongly, because of unnecessary and unhelpful appeals to arbitrary prior assumptions, which should have no place in scientific analyses. Individual subjective reasoning is inappropriate for reaching objective scientific conclusions, which form the basis of consensus within the scientific community. This objection would have more force if there was a
logically acceptable alternative. I do not here want to dwell on the difficulties in interpretation of the core concepts of more traditional inference, such as significance and coverage properties: a valid confidence interval may be empty, for example when constructed by the intersection of a series of repeated confidence intervals; a statistically significant result obtained with high power may be almost certainly false, and so forth. Further, I do not know of any way to construct even the basic building blocks of the inference, such as the relative frequency probabilities that we must use if we reject the subjective interpretation, that will stand up to proper logical scrutiny.

Instead, let us address the principled objection directly. We cannot consider whether the Bayes approach is appropriate without first clarifying the objectives of the analysis. When we discussed the analysis of physical models, we made the fundamental distinction between analysis of the model and analysis of the physical system. Analysing various models may give us insights, but at some point these insights must be integrated into statements of uncertainty about the system itself. Analysing experimental data is essentially the same. We must be clear as to whether we are analysing the experiment or the problem. In the ESP experiment, the question is whether Harry has ESP, or, possibly, whether ESP exists at all. If we analyse the experimental data as part of a wider effort to address our uncertainty about these questions, then external judgements are clearly relevant.
As described above, the beliefs that are analysed may be those of an individual, if that individual can make a compelling argument for the rationality of a particular belief specification, or instead we may analyse the collection of beliefs held by informed individuals in the community. The Bayes analysis is appropriate for this task, as it is concerned to evaluate the relevant kinds of uncertainty judgements, namely the uncertainties over the quantities that we want to learn about, given the quantities that we observe, based on careful foundational arguments using ideas such as coherence and exchangeability to show why this is the unavoidable way to analyse our actual uncertainties.

On the other hand, suppose that, for now, we only want to analyse the data from this individual experiment. Our goal, therefore, cannot be to consider directly the basic question about the existence of ESP. Indeed, it is hard to say exactly what our goal is, which is why there often is so much confusion in discussions between proponents of different approaches. All that we can say informally is that the purpose of such analysis is to provide information which will be helpful at some future time for whoever does attempt to address the real questions of interest. We are now in the same position as the modeller; we have great freedom in carrying out our analyses but we must be modest in the claims that we make for them.