Artificial Intelligence 09: Bayesian Networks

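All probabilistic inference and learning amounts to repeated application of the sum rule and the product rule
Outline
• Graphical models
• Bayesian networks
  – Syntax
  – Semantics
• Inference in Bayesian networks
What is a graphical model?
A graphical representation of a probability distribution
– a marriage of probability theory and graph theory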
Independence / Conditional Independence
A and B are independent iff P(A | B) = P(A) or P(B | A) = P(B) or P(A, B) = P(A) P(B)
A is conditionally independent of B given C: P(A | B, C) = P(A | C)
Local semantics: each node is conditionally independent of its nondescendants given its parents
Theorem: Local semantics ⇔ global semantics
• In a probabilistic graphical model
– each node represents a random variable (or a group of random variables)
– edges represent probabilistic relationships between the variables
Graphical Models in CS
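• A natural tool for dealing with uncertainty and complexity
– two problems that occur throughout applied mathematics and engineering
• The most important idea in graphical models is modularity
– a complex system is built by combining simpler parts.
• Also called probabilistic graphical models
• They augment analysis instead of using pure algebra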
What is a Graph?
• Consists of nodes (also called vertices) and links (also called edges or arcs)
Causal Chains
• A basic configuration: X → Y → Z
– Is X independent of Z given Y?
– Evidence along the chain “blocks” the influence
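The blocking effect in a chain can be checked numerically. The sketch below assumes a chain X → Y → Z over Boolean variables with made-up CPT values (none of these numbers come from the slides) and compares P(z | x, y) with P(z | y), and P(z | x) with P(z).

```python
from itertools import product

# Hypothetical CPTs for a chain X -> Y -> Z (illustrative numbers only).
p_x = {True: 0.3, False: 0.7}
p_y_given_x = {True: {True: 0.8, False: 0.2}, False: {True: 0.1, False: 0.9}}
p_z_given_y = {True: {True: 0.6, False: 0.4}, False: {True: 0.25, False: 0.75}}


def joint(x, y, z):
    """Joint probability of one complete assignment under the chain."""
    return p_x[x] * p_y_given_x[x][y] * p_z_given_y[y][z]


def prob(**fixed):
    """Sum the joint over all assignments consistent with the fixed values."""
    total = 0.0
    for x, y, z in product([True, False], repeat=3):
        assignment = {"x": x, "y": y, "z": z}
        if all(assignment[name] == value for name, value in fixed.items()):
            total += joint(x, y, z)
    return total


# Given Y, additionally knowing X does not change the belief in Z.
p_z_given_xy = prob(x=True, y=True, z=True) / prob(x=True, y=True)
p_z_given_y_only = prob(y=True, z=True) / prob(y=True)

# Without evidence on Y, X does influence Z.
p_z_given_x = prob(x=True, z=True) / prob(x=True)
p_z = prob(z=True)

print(p_z_given_xy, p_z_given_y_only)
print(p_z_given_x, p_z)
```

The first pair of values agrees, which is the conditional independence of X and Z given Y; the second pair differs, so X and Z are not independent when Y is unobserved.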
Common Cause
• Another basic configuration: two effects of the same cause
– Are X and Z independent?
– Are X and Z independent given Y?
• Yes: remember the ballgame and the rain causing traffic, no correlation?
– Are X and Z independent given Y?
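3. The complex computations required for inference and learning can be expressed as graphical manipulations
Directionality of graphs
• Directed graphical models
– direction is given by the arrows
• Bayesian networks
– express causal relationships between random variables
• More popular in AI and statistics
• Undirected graphical models
– edges have no arrows
• Markov random fields
– better suited to expressing soft constraints between variables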
Example contd.
Compactness
A CPT for Boolean Xi with k Boolean parents has 2^k rows for the combinations of parent values, so the table holds 2^k independently specifiable probabilities
Each row requires one number p for Xi = true (the number for Xi = false is just 1 - p)

• More popular in Vision and physics
Bayesian networks
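A simple, graphical data structure for representing the dependencies among variables (conditional independence), giving a concise specification of any full joint probability distribution.
Syntax:
– a set of nodes, one per variable
– a directed, acyclic graph (link ≈ "directly influences")
– a conditional distribution for each node given its parents: P(Xi | Parents(Xi)), which quantifies the effect of the parents on the node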
However, in many situations repeated trials are impossible: what is the probability that a third world war will break out?
Bayesian: degree of belief. It is a measure of the plausibility of an event given incomplete knowledge.
Probability
Queries can be answered by summing the entries of the atomic events that correspond to the query proposition
For nontrivial domains, we must find a way to reduce the joint size. Independence and conditional independence provide the tools.
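Answering a query by summing over atomic events can be sketched as follows. The joint table over Toothache, Catch and Cavity uses illustrative values that are assumed here, not taken from the slides; the helper `probability` is a hypothetical name.

```python
# Full joint distribution over (toothache, catch, cavity).
# The eight entries are illustrative assumptions and sum to 1.
NAMES = ("toothache", "catch", "cavity")
joint = {
    (True, True, True): 0.108,
    (True, False, True): 0.012,
    (False, True, True): 0.072,
    (False, False, True): 0.008,
    (True, True, False): 0.016,
    (True, False, False): 0.064,
    (False, True, False): 0.144,
    (False, False, False): 0.576,
}


def probability(**evidence):
    """Sum the entries of all atomic events consistent with the evidence."""
    index = {name: i for i, name in enumerate(NAMES)}
    return sum(
        p
        for event, p in joint.items()
        if all(event[index[name]] == value for name, value in evidence.items())
    )


# Marginal query: sum the four entries where toothache is true.
p_toothache = probability(toothache=True)

# Conditional query: ratio of two such sums.
p_cavity_given_toothache = probability(cavity=True, toothache=True) / p_toothache

print(p_toothache, p_cavity_given_toothache)
```

The table has 2^3 = 8 entries for three Boolean variables; with n variables it would have 2^n, which is why the joint size has to be reduced for nontrivial domains.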
Why are Graphical Models useful?
• Probability theory provides the glue whereby
– the parts are combined, ensuring that the system as a whole is consistent
– models can be connected to data.
• Graph theory provides:
– an intuitive interface by which humans can model highly-interacting sets of variables
– a data structure that lends itself naturally to designing efficient general-purpose algorithms
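Probability theory can be expressed in terms of two simple equations
– Sum Rule
• the probability of a variable is obtained by marginalizing (summing) over the other variables: p(X) = Σ_Y p(X, Y)
– Product Rule
• expresses the joint probability in terms of conditionals: p(X, Y) = p(Y | X) p(X)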
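A small numeric check of the two rules may help. The joint table over two discrete variables below is made up for illustration, and the function names are hypothetical.

```python
# Illustrative joint distribution p(X, Y); the values are assumptions.
joint = {
    ("x0", "y0"): 0.10,
    ("x0", "y1"): 0.30,
    ("x1", "y0"): 0.20,
    ("x1", "y1"): 0.40,
}


def marginal_x(x):
    """Sum rule: p(X = x) is the sum of p(X = x, Y = y) over all y."""
    return sum(p for (xv, _), p in joint.items() if xv == x)


def conditional_y_given_x(y, x):
    """Conditional p(Y = y | X = x), defined as joint / marginal."""
    return joint[(x, y)] / marginal_x(x)


# Product rule: p(X, Y) = p(Y | X) p(X) holds for every entry.
for (x, y), p_xy in joint.items():
    rebuilt = conditional_y_given_x(y, x) * marginal_x(x)
    assert abs(rebuilt - p_xy) < 1e-12

print(marginal_x("x0"), conditional_y_given_x("y1", "x0"))
```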
If each variable has no more than k parents, the complete network requires O(n · 2^k) numbers
I.e., grows linearly with n, vs. O(2^n) for the full joint distribution
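Graphical models: a unifying framework
• View classical multivariate probabilistic systems as instances of a common underlying formalism
– mixture models, factor analysis, hidden Markov models, Kalman filters, etc.
– used in systems engineering, information theory, pattern recognition and statistical mechanics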
In the simplest case, the conditional distribution is represented as a conditional probability table (CPT) giving the distribution over Xi for each combination of parent values
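One possible way to hold a CPT in code is a mapping from each combination of parent values to P(Xi = true). The sketch below assumes a Boolean node with two Boolean parents; the node name, parent names and numbers are placeholders chosen for illustration.

```python
from itertools import product

# Hypothetical CPT for a Boolean node "Alarm" with parents
# (Burglary, Earthquake). One stored number per row: P(Alarm = True | parents).
parent_names = ("Burglary", "Earthquake")
cpt_alarm = {
    (True, True): 0.95,
    (True, False): 0.94,
    (False, True): 0.29,
    (False, False): 0.001,
}


def distribution(cpt, parent_values):
    """Return the full distribution over the node for one row of the CPT."""
    p_true = cpt[tuple(parent_values)]
    return {True: p_true, False: 1.0 - p_true}


# Every combination of parent values must have exactly one row.
rows = list(product((True, False), repeat=len(parent_names)))
assert set(rows) == set(cpt_alarm)

print(len(rows), distribution(cpt_alarm, (False, True)))
```

Storing only the probability of `True` reflects the fact that the probability of `False` is determined by it; a node with non-Boolean values would need a full distribution per row instead.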
Example
Topology of network encodes conditional independence assertions:
Weather is independent of the other variables
Toothache and Catch are conditionally independent given Cavity
– Observing the cause blocks influence between effects.
Common Effect
• The last configuration: two causes of one effect (v-structures)
– Are X and Z independent?
For burglary net, 1 + 1 + 4 + 2 + 2 = 10 numbers (vs. 2^5 - 1 = 31)
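The count can be reproduced from the network structure alone. The sketch below assumes all five variables are Boolean and encodes the burglary network as a mapping from each variable to its parents; the variable names `network_numbers` and `full_joint_numbers` are hypothetical.

```python
# Parents of each Boolean variable in the burglary network.
parents = {
    "Burglary": [],
    "Earthquake": [],
    "Alarm": ["Burglary", "Earthquake"],
    "JohnCalls": ["Alarm"],
    "MaryCalls": ["Alarm"],
}

# A Boolean node with k Boolean parents needs 2^k numbers (one per CPT row).
network_numbers = sum(2 ** len(ps) for ps in parents.values())

# The full joint over n Boolean variables has 2^n entries that sum to 1,
# so 2^n - 1 of them are free.
full_joint_numbers = 2 ** len(parents) - 1

print(network_numbers, full_joint_numbers)
```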
Global semantics
The full joint distribution is defined as the product of the local conditional distributions:
P(x1, ..., xn) = Π_i P(xi | parents(Xi)), for i = 1, ..., n
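The product form can be sketched for the burglary network. The CPT numbers below are assumptions (values often used with this textbook example, but not present in the extracted slides), and `joint_probability` is a hypothetical helper.

```python
from itertools import product

# Structure of the burglary network: each variable lists its parents.
PARENTS = {
    "Burglary": (),
    "Earthquake": (),
    "Alarm": ("Burglary", "Earthquake"),
    "JohnCalls": ("Alarm",),
    "MaryCalls": ("Alarm",),
}

# P(variable = True | parent values). All numbers are assumed for illustration.
CPT = {
    "Burglary": {(): 0.001},
    "Earthquake": {(): 0.002},
    "Alarm": {
        (True, True): 0.95,
        (True, False): 0.94,
        (False, True): 0.29,
        (False, False): 0.001,
    },
    "JohnCalls": {(True,): 0.90, (False,): 0.05},
    "MaryCalls": {(True,): 0.70, (False,): 0.01},
}


def joint_probability(assignment):
    """Multiply one CPT entry per node, each selected by its parents' values."""
    result = 1.0
    for var, parents in PARENTS.items():
        p_true = CPT[var][tuple(assignment[p] for p in parents)]
        result *= p_true if assignment[var] else 1.0 - p_true
    return result


# Both neighbours call, the alarm rings, but no burglary and no earthquake.
example = {
    "Burglary": False,
    "Earthquake": False,
    "Alarm": True,
    "JohnCalls": True,
    "MaryCalls": True,
}
p_example = joint_probability(example)

# Sanity check: the 32 atomic events together have probability 1.
total = sum(
    joint_probability(dict(zip(PARENTS, values)))
    for values in product((True, False), repeat=len(PARENTS))
)

print(p_example, total)
```

For the example assignment the product is P(j | a) P(m | a) P(a | ¬b, ¬e) P(¬b) P(¬e), one factor per node.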
Local semantics
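In most cases, the use of conditional independence reduces the size of the representation of the full joint distribution from exponential in n to linear in n.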
Conditional independence is our most basic and robust form of knowledge about uncertain environments.
Probability Theory
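Bayesian networks
Frequentist vs. Bayesian
Objective vs. subjective
Frequentist: probability is the long-run expected frequency of occurrence.
P(A) = n/N, where n is the number of times event A occurs in N opportunities. "The probability of something is 0.1" means that 0.1 is the fraction that would be observed in the limit of infinitely many samples.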
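• Advantages:
– specialized techniques developed in one field can be transferred to other fields and exploited more fully
– provides a natural framework for designing new systems
The role of graphical models in machine learning
1. A simple way to visualize the structure of a probabilistic model
2. Insights into properties of the model: conditional independence properties can be obtained by inspecting the graph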
Probability
Probability is a rigorous formalism for uncertain knowledge
The joint probability distribution specifies the probability of every atomic event, i.e. of every complete assignment of values to the random variables
Example
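I am at work in the evening when my neighbor John calls to say that my home alarm is ringing, but my neighbor Mary does not call. Sometimes the alarm is also set off by minor earthquakes. Has my home really been burgled?
Variables: Burglary, Earthquake, Alarm, JohnCalls, MaryCalls
Network topology reflects causal knowledge:
– A burglar can set the alarm off
– An earthquake can set the alarm off
– The alarm can cause Mary to call
– The alarm can cause John to call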
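The question in this story (John calls, Mary does not) can be answered by enumeration: sum the joint over the unobserved variables Earthquake and Alarm, then normalize. This is a minimal sketch; the CPT numbers are assumptions not found in the extracted slides, and `posterior_burglary` is a hypothetical helper.

```python
from itertools import product

# Assumed CPT values: probability that each variable is True given its parents.
P_B = 0.001
P_E = 0.002
P_A = {  # keyed by (burglary, earthquake)
    (True, True): 0.95,
    (True, False): 0.94,
    (False, True): 0.29,
    (False, False): 0.001,
}
P_J = {True: 0.90, False: 0.05}  # keyed by alarm
P_M = {True: 0.70, False: 0.01}  # keyed by alarm


def bern(p_true, value):
    """Probability of a Boolean value given the probability of True."""
    return p_true if value else 1.0 - p_true


def joint(b, e, a, j, m):
    """Joint probability as the product of the five local conditionals."""
    return (
        bern(P_B, b)
        * bern(P_E, e)
        * bern(P_A[(b, e)], a)
        * bern(P_J[a], j)
        * bern(P_M[a], m)
    )


def posterior_burglary(j, m):
    """P(Burglary | JohnCalls=j, MaryCalls=m), summing out Earthquake and Alarm."""
    unnormalized = {
        b: sum(joint(b, e, a, j, m) for e, a in product((True, False), repeat=2))
        for b in (True, False)
    }
    total = unnormalized[True] + unnormalized[False]
    return {b: p / total for b, p in unnormalized.items()}


belief = posterior_burglary(j=True, m=False)
print(belief[True])
```

With these assumed numbers the belief in a burglary rises above its prior of 0.001 but stays small, since a single call without the second one is weak evidence. Enumeration touches every combination of the hidden variables, so it is only practical for small networks such as this one.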