Maximum entropy production principle in physics, chemistry and biology

格式：pdf
大小：562.11 KB
文档页数：45

下载文档原格式

Maximum Entropy Principle for the Microcanonical Ensemble

I.
INTRODUCTION
The seminal works of Jaynes [1, 2] presents the information theory approach to statistical physics using the Maximum Entropy Principle (MEP). In the original papers, Jaynes maximized the Shannon information entropy using constraints of normalization and average energy to obtain the canonical ensemble. Later on, Tsallis [3] maximized generalized information entropies, like the R´ enyi and Tsallis entropies, using constraints of normalization and average energy to obtain deformed exponential distributions that describe the behavior of nonextensive systems. In this paper we show that there is also a special information entropy associated with the microcanonical ensemble. This microcanonical information entropy is the phase-space volume entropy, originally due to P. Hertz [4] (see also [5]) that satisﬁes the heat theorem [6, 7, 8]). Using this entropy in the MEP with constraints of normalization and average energy, we obtain the condition that the energy distribution is a delta function, i.e., we derive the microcanonical ensemble from the MEP. In Section 2 we review the traditional application of the MEP to the microcanonical ensemble. The quantum statistical application of the MEP with discrete probabilities using the volume entropy is treated in Section 3. The classical statistical application is given in Section 4, which employs integration and functional diﬀerentiation with continuous probability distribution functions. The conclusion is given in Section 5.

浅谈最大熵原理和统计物理学

浅谈最大熵原理和统计物理学摘要在本文中我们将分别从物理和信息论角度简单讨论熵的意义并介绍由E.T.Jaynes所奠立基础的最大熵原理的原始理解。

透过研究理想气体，我们将阐述如何运用最大熵原理研究真实问题。

同时藉由简短分析统计物理学研究方法的问题，本文会给出最大熵原理更深层涵义及其应用。

我们将称之为最大熵原理第二延伸。

最后透过真实气体的研究，我们将描绘出如何运用第二延伸来帮助我们思考及研究热力学系统。

一、前言长时间以来人们对于熵有物理上的理解也有二、最大熵原理(Information theory) 上的理解。

物理上l、什么是最大熵原理信息论的熵可以说明热力学系统的演化方向、热平衡的达相信物理系学生和物理研究人员都很熟悉成与否亦或是代表系统的混乱程度等[1-3]。

在信Clausius的经验准则-热力学第二定律[1,2]。

该定息论里，信息熵则代表量测信息系统的可信度或者律说明当一个热力学系统达到最后热平衡状态时，是忽略度[3,4]。

然而不管物理或是信息论上对熵该系统的熵会达到最大值。

进一步的研究指出当系的理解，实际上仍局限于将熵视为一个量测的工统的熵最大时，其自由能将会成为最小。

在此一具。

正如我们可藉由系统能量的量测来了解系统状特性的影响下人们惯性的倾向于将熵视为类似能态稳定与否。

然而由于E.T.Jaynes的贡献，熵可量的巨观物理量。

此一物理量成为描述系统乱度的依据。

此后由于 Gibbs 引入 ensemble 观念，开视为一种研究问题的推理工具，这一层意义才为人所知[5,6]。

时至今日，我们虽然仍无法全盘了解启微观角度的研究方法因而奠立近代统计力学理熵的真正意含，但是我们也渐渐掌握熵在物理学尤解熵的理论基础。

在统计力学的观念中，观察者所其是统计物理中所能扮演的角色。

通过本文浅显的量测到该系统热力学性质之巨观物理量诸如系统介绍，我们将从过去Jaynes对于熵的认识到今日内能或压力，基本上只能以平圴值来表现。

Principle of Maximum Entropy

↑H↑
Figure 10.1: Dipole moment example. (Each dipole can be either up or down.) the system, “up-up,” “up-down,” “down-up,” and “down-down.” The energy of a dipole is proportional to the applied ﬁeld and depends on its orientation, and the energy of each state is the sum of the eቤተ መጻሕፍቲ ባይዱergies of the two dipoles. State A B C D Alignment up-up up-down down-up down-down Energy −md H 0 0 md H
Author: Paul Penﬁeld, Jr. Version 1.0.2, April 4, 2003. Copyright c 2003 Massachusetts Institute of Technology URL: /Courses/6.050/notes/chapter10.pdf TOC: /Courses/6.050/notes/index.html
10.1
Problem Setup
Before the Principle of Maximum Entropy can be used the problem domain needs to be set up. In cases involving physical systems, this means that the various states in which the system can exist need to be identiﬁed, and all the parameters involved in the constraints known. For example, the energy, electric charge, and other quantities associated with each of the quantum states is assumed known. It is not assumed in this step which particular state the system is actually in (which state is occupied) indeed it is assumed that we do not know and cannot know this with certainty, and so instead we deal with the probability of each of the states being occupied. In applications to nonphysical systems, again the various possible events have to be enumerated and the properties of each state known, particularly the values associated with each of the constraints. In these notes we will apply the general mathematical derivation to two examples, one a crude business model, and the other a crude model of a physical system.

最大熵模型简介

P {p | Ep( f j ) E~p( f j ),1 j k}
H ( p) p(x) log2 p(x)
x
p* arg max H ( p)
最大熵模型
❖ 例如：给定一个词
假定已知存在四种词性：名词、动词、介词、指代词 ❖ 如果该词在语料库中出现过，并且属于名词的概率为70%，则判断
Generative Model vs. Discriminative Model
❖ Generative Model (GM): P(Y|X)=P(X|Y)P(Y)/P(X)，通过求解P(X|Y)和P(Y)来求解P(Y|X)
❖ Discriminative Model (DM): 对P(Y|X)直接建模
纲要
❖ 最大熵原理 ❖ 最大熵模型定义 ❖ 最大熵模型中的一些算法 ❖ 最大熵模型的应用 ❖ 总结 ❖ 思考题
最大熵模型(Maximum Entropy
Model)
❖
假设有一个样本集合 (x1, x2 ,... xn )
特征(j对f1, pf2的...制fk )约可以表示为
，我们给出k个特征 , Ep( f j ) E~p( f j )
p(X=3)=p(X=4)=p(X=5)=p(X=6)=0.1
最大熵原理
❖ 最大熵原理：1957 年由E.T.Jaynes 提出。 ❖ 主要思想：
在只掌握关于未知分布的部分知识时，应该选取符合这些知识但熵值最大的概率分布。
❖ 原理的实质：
前提：已知部分知识关于未知分布最合理的推断＝符合已知知识最不确定或最随机的推断。这是我们可以作出的唯一不偏不倚的选择，任何其它的选择都意味着我们增加了其它的约束和假设，这些约束和假设根据我们掌握的信息无法作出。

熵的定义与熵增定律

熵的定义与熵增定律一、引言熵是热力学中一个重要的物理量，最初由德国物理学家鲁道夫·克勒齐于19世纪提出。

熵的概念在热力学、信息理论以及其他许多领域中都有广泛的应用。

本文将介绍熵的定义及其在自然界和科学研究中的应用，并探讨相应的熵增定律。

二、熵的定义熵是描述系统无序混乱程度的物理量，也可以理解为系统的不可逆性度量。

在热力学中，熵被定义为系统的总体微观状态的统计量。

具体而言，对于一个宏观系统，可以定义其熵为：S=−k∑p i lnp i其中，S表示系统的熵，p i表示系统处于第i个微观状态的概率，k为玻尔兹曼常数。

该公式表明，系统的熵与微观状态的概率分布有关，概率分布越均匀，系统的熵越大，系统越无序。

三、熵的应用熵在自然界和科学研究中具有广泛的应用。

以下是熵应用的几个方面：1. 热力学熵在热力学领域中起着至关重要的作用。

根据热力学第二定律，孤立系统的熵总是增加的。

当系统接受热量时，系统的熵增加；当系统排放热量时，系统的熵减少。

通过熵的概念和熵增定律，我们可以对热力学过程进行描述和分析。

2. 信息理论熵在信息理论中也有着重要的应用。

根据信息论的定义，信息熵用来描述信息的不确定性。

信息熵越大，信息越不确定，包含的信息量越大。

信息熵在数据压缩、通信和密码学等领域有广泛的应用，帮助我们理解信息的传输和处理。

3. 生态系统熵的概念也可以应用于生态系统的研究。

生态系统是一个开放的非平衡系统，其内部有能量和物质的交换。

通过熵的概念，我们可以量化生态系统中能量流和物质循环的无序程度，从而促进对生态系统的研究和管理。

4. 复杂系统熵的概念也被广泛应用于复杂系统的研究。

复杂系统由许多相互作用的组件组成，具有非线性、不确定性和自组织的特点。

通过熵的概念，我们可以研究和描述复杂系统中的相互作用、演化和行为模式，帮助我们理解和预测复杂系统的行为。

四、熵增定律根据熵的定义和熵增定律，我们可以得出以下结论：1.孤立系统的熵总是增加的：根据热力学第二定律，孤立系统的熵增加，鲁棒系统的不可逆性逐渐增强；2.受控系统的熵可以减少：在外部施加控制和约束时，系统的熵可以减少，系统趋于有序；3.系统的无序性趋于最大：在演化过程中，系统的无序性（熵）趋于最大，系统趋于平衡态。

一文搞懂HMM（隐马尔可夫模型）

⼀⽂搞懂HMM（隐马尔可夫模型）什么是熵(Entropy)简单来说，熵是表⽰物质系统状态的⼀种度量，⽤它⽼表征系统的⽆序程度。

熵越⼤，系统越⽆序，意味着系统结构和运动的不确定和⽆规则；反之，，熵越⼩，系统越有序，意味着具有确定和有规则的运动状态。

熵的中⽂意思是热量被温度除的商。

负熵是物质系统有序化，组织化，复杂化状态的⼀种度量。

熵最早来原于物理学. 德国物理学家鲁道夫·克劳修斯⾸次提出熵的概念，⽤来表⽰任何⼀种能量在空间中分布的均匀程度，能量分布得越均匀，熵就越⼤。

1. ⼀滴墨⽔滴在清⽔中，部成了⼀杯淡蓝⾊溶液2. 热⽔晾在空⽓中，热量会传到空⽓中，最后使得温度⼀致更多的⼀些⽣活中的例⼦:1. 熵⼒的⼀个例⼦是⽿机线，我们将⽿机线整理好放进⼝袋，下次再拿出来已经乱了。

让⽿机线乱掉的看不见的“⼒”就是熵⼒，⽿机线喜欢变成更混乱。

2. 熵⼒另⼀个具体的例⼦是弹性⼒。

⼀根弹簧的⼒，就是熵⼒。

胡克定律其实也是⼀种熵⼒的表现。

3. 万有引⼒也是熵⼒的⼀种(热烈讨论的话题)。

4. 浑⽔澄清[1]于是从微观看，熵就表现了这个系统所处状态的不确定性程度。

⾹农，描述⼀个信息系统的时候就借⽤了熵的概念，这⾥熵表⽰的是这个信息系统的平均信息量(平均不确定程度)。

最⼤熵模型我们在投资时常常讲不要把所有的鸡蛋放在⼀个篮⼦⾥，这样可以降低风险。

在信息处理中，这个原理同样适⽤。

在数学上，这个原理称为最⼤熵原理(the maximum entropy principle)。

让我们看⼀个拼⾳转汉字的简单的例⼦。

假如输⼊的拼⾳是"wang-xiao-bo"，利⽤语⾔模型，根据有限的上下⽂(⽐如前两个词)，我们能给出两个最常见的名字“王⼩波”和“王晓波 ”。

⾄于要唯⼀确定是哪个名字就难了，即使利⽤较长的上下⽂也做不到。

当然，我们知道如果通篇⽂章是介绍⽂学的，作家王⼩波的可能性就较⼤；⽽在讨论两岸关系时，台湾学者王晓波的可能性会较⼤。

熵与流

一条最快的路径（熵产生最大）会被
熵
重复最多（流动最大）从非平衡态A到任意状态B （可以为非
平衡），按照同样的推理也可以得到
一条“最可能”（重复最多）的路径，
它是熵产生最大的路径
任何时候，我们开始一个从A点的演化，将最有可能观察到熵产生最大的那条演化路径
Dewar的原思路
对于任意一条演化路径r，定义概率pr
有关热力学与进化论
Survival of the likeliest /thesis/detail.as
p?id=209&type=0
有关热力学与生物
请关注：
Eric Smith /profiles/?pid=75
统计解释
A
C D
E F B
给定一个网络从A到B有3条路径：
ACEFB ACFB ADB
每1秒都有一个粒子从A流入每个粒子在每个节点进行随机
选择一条边转移粒子到B点就消失
最短的路径具有最大的流动是因为观察者时间尺度的跳跃！
C E
F
C E
F
C E
总结
热力学第二定律（熵最大化）作为一种系统的目标会作用到不同对象上（可以是分子，可以是流动）
最大化信息熵可作为一种元优化而导出各种具体的优化（如最大流、最大熵产生）
熵来源于观察者观察客观世界的缺陷（忽略信息）
统计物理、最大熵原理蕴含着达尔文的自然选择原理（最适者生存最可能者生存）
到当N非常大的时候，观察者只能得到结论N/2，其他结
果“消失了”
解释
000000 000001 0…00…010 000111 001101 …… 111111

最大熵增益

最大熵增益最大熵增益（Maximum Entropy Gain）是一种常用的特征选择方法，常用于构建决策树和进行信息增益量化。

下面是关于最大熵增益的相关参考内容：1. 信息熵（Information Entropy）：在介绍最大熵增益前，需要先了解信息熵的概念。

熵的概念最早由香农提出，用于描述信息的不确定性。

在信息论中，信息熵常用于度量一个随机变量的不确定性，可以用以下公式表示：H(X) = -Σp(x)log2p(x)，其中p(x)为随机变量X取某个值x的概率。

2. 信息增益（Information Gain）：信息增益是用来度量特征对于决策问题的区分能力的指标。

在决策树的特征选择中，通常使用信息增益来选择最优特征。

信息增益可以用以下公式表示：Gain(D,A) = H(D) - Σ(Dv/D)H(Dv)，其中D表示数据集，A表示特征，Dv表示根据特征A的取值v划分的子数据集，H(D)是数据集D的信息熵。

3. 最大熵原理（Maximum Entropy Principle）：最大熵原理是一个基于最大熵原则的概率模型。

最大熵原理认为，在已知的一些有限信息下，应选择熵最大的概率模型作为预测模型。

最大熵原理通过最大熵模型来表示不确定性，可以通过最优化问题来求解模型参数。

4. 最大熵增益的计算方法：最大熵增益是基于最大熵原理的特征选择方法。

最大熵增益的计算方法包括以下几个步骤：首先，计算初始数据集的信息熵H(D)；然后，对于特征A的每个取值v，计算根据特征A的取值v划分后的数据集的信息熵H(D|A=v)；接着，计算特征A的信息增益Gain(D,A) = H(D) - Σ(Dv/D)H(Dv)；最后，选择信息增益最大的特征作为最优特征。

5. 最大熵增益的优缺点：最大熵增益是一种常用的特征选择方法，具有一定的优点和缺点。

优点是最大熵增益考虑了各个特征的不确定性，能够在一定程度上提高特征选择的准确性；缺点是最大熵增益的计算复杂度较高，需要计算每个特征的信息熵和条件熵，对于大规模数据集和高维特征空间的情况可能计算困难。

最大熵值法

~ LP (P ) ≡ log ∏ P(w | h ) h,w ~ P (h , w )
~ = ∑ P (h, w) log P(w | h )
h,w
指数型代入
~ ~ ) ~ LP (P ) = ∑ P (h, w) ∑ λi f i (h, w) ∑ P (h, w)log ∑ exp ∑ λi f i (h, w h ,w w i h ,w i ~ ~ = ∑ P (h, w)∑ λi f i (h, w) ∑ P (h )log ∑ exp ∑ λi f i (h, w) h ,w i h w i
h ,w
~ P( f ) = P ( f )
(任意的机率分布)
– 称为限制(方程式)
特徵与限制
~ ∑ P(h, w) f (h, w) = ∑ P (h, w) f (h, w)
h,w h,w
P(h, w) P (w | h ) = P(h )
~ P (h ) ≈ P (h )
~ ~ P (h )P(w | h ) f (h, w) = ∑ P (h, w) f (h, w) ∑
∵ (A) h, ∑ P(w | h ) = 1
w
γ
γ ∑ P(w | h ) = ∑ exp ∑ λi f i (h, w) exp ~ 1 = 1 P (h ) w w i exp exp γ ~ 1 ∑ exp ∑ λi f i (h, w) = 1 P (h ) w i γ ~ 1 = P (h ) 1 ∑ exp ∑ λ f (h, w)
简介
差补法(interpolation)与最大熵值法的差别
– 个别与整体训练
掷骰子问题
P(1) + P(2) + P(3) + P(4) + P(5) + P(6) = 1 1 P (1) + P(2 ) = 2

基于遥感的科尔沁沙地蒸散发时空动态

第 28 卷第 3 期 2021 年 6 月
水土保持研究 ResearchofSoiland WaterConservation
Vol.28,No.3 Jun.,2021
基于遥感的科尔沁沙地蒸散发时空动态
张鹏1,4,张圣微1,2,3,徐冉1,高露1,高文龙1,杜银龙1
(1.内蒙古农业大学水利与土木建筑工程学院,呼和浩特 010018; 2.内蒙古自治区水资源保护与利用重点实验室,呼和浩特 010018;3.内蒙古自治区农牧业大数据研究与应用重点实验室,呼和浩特 010018;4.内蒙古自治区鄂尔多斯水文勘测局,内蒙古鄂尔多斯 017020)
(ET)的时空变化以及不同土地覆被类型下 ET 影响因素,并得出以下结论:(1)研究区地表参数和 ET 估算值均与实
测值拟合良好,表明 MEP 模型与 SEBAL 模型可以为半干旱地区提供合理的 ET 估算值;(2)对比 SEBAL 模型与
MEP 模型发现:春季差异较小,夏季差异较大,主要集中在沙丘地:(3)ET 的时空分布为:时间上在 5—6 月呈上升趋
势,7—8月初保持较高状态,9—10月呈显著降低趋势,空间上呈现出湖泊较高,农田与草甸次之,沙丘最低的分布情
况,影响 ET 空间分布的主要因素是土壤类型、土壤含水量以及植被状况。
关键词:蒸散发;MEP 模型;SEBAL 模型;土地覆被;时空分布
中图分类号 :TV11;TP79
摘要:蒸散发(evapotranspiration,ET)的精准估算对于干旱半干旱地区的水资源规划以及荒漠化防治具有重要意

最大熵模型核心原理

最大熵模型核心原理一、引言最大熵模型(Maximum Entropy Model, MEM)是一种常用的统计模型，它在自然语言处理、信息检索、图像识别等领域有广泛应用。

本文将介绍最大熵模型的核心原理。

二、信息熵信息熵(Entropy)是信息论中的一个重要概念，它可以衡量某个事件或信源的不确定度。

假设某个事件有n种可能的结果，每种结果发生的概率分别为p1,p2,...,pn，则该事件的信息熵定义为：H = -∑pi log pi其中，log表示以2为底的对数。

三、最大熵原理最大熵原理(Maximum Entropy Principle)是指在所有满足已知条件下，选择概率分布时应选择具有最大信息熵的分布。

这个原理可以理解为“保持不确定性最大”的原则。

四、最大熵模型最大熵模型是基于最大熵原理建立起来的一种分类模型。

它与逻辑回归、朴素贝叶斯等分类模型相似，但在某些情况下具有更好的性能。

五、特征函数在最大熵模型中，我们需要定义一些特征函数(Function)，用来描述输入样本和输出标签之间的关系。

特征函数可以是任意的函数，只要它能够从输入样本中提取出有用的信息，并与输出标签相关联即可。

六、特征期望对于一个特征函数f(x,y)，我们可以定义一个特征期望(Expected Feature)，表示在所有可能的输入样本x和输出标签y的组合中，该特征函数在(x,y)处的期望值。

特别地，如果该特征函数在(x,y)处成立，则期望值为1；否则为0。

七、约束条件最大熵模型需要满足一些约束条件(Constraints)，以保证模型能够准确地描述训练数据。

通常我们会选择一些简单明了的约束条件，比如每个输出标签y的概率之和等于1。

八、最大熵优化问题最大熵模型可以被看作是一个最优化问题(Optimization Problem)，即在满足约束条件下，寻找具有最大信息熵的概率分布。

这个问题可以使用拉格朗日乘子法(Lagrange Multiplier Method)来求解。

普里高津最小熵产原理

普里高津最小熵产原理1. 简介普里高津最小熵产原理（Prigogine Minimum Entropy Production Principle）是由比利时物理学家伊利亚·普里高津（Ilya Prigogine）于20世纪60年代提出的一个重要物理原理。

该原理指出，开放系统在达到稳态时，其熵产率将趋向于最小值。

在热力学中，熵是一个衡量系统无序程度的物理量，而熵产则是指系统中熵的变化率。

根据第二定律热力学，封闭系统中的熵总是趋向于增加，直到达到平衡态。

然而，在开放系统中，通过与环境进行物质和能量交换，系统可以维持非平衡态，并且表现出自组织和自发性行为。

普里高津最小熵产原理通过对开放系统的分析，揭示了自然界中许多现象的基本规律，并在生命科学、复杂系统等领域得到广泛应用。

2. 基本原理普里高津最小熵产原理可以通过以下几个基本原理来解释：2.1 系统开放性首先，普里高津最小熵产原理要求系统是一个开放系统，即与外部环境进行物质和能量交换。

这种交换可以是通过输入和输出流的方式进行的。

对于生物系统而言，典型的开放系统包括细胞、器官和生物群体等。

对于非生物系统，例如大气环境、地球等也可以被视为开放系统。

2.2 稳态条件普里高津最小熵产原理适用于开放系统达到稳态时的情况。

稳态是指系统处于一个相对稳定的状态，其内部各项指标保持在一定范围内波动，而不会出现剧烈变化。

在稳态条件下，系统中各项参数的变化率趋向于零。

这意味着系统处于动态平衡状态，并且能够维持自身结构和功能。

2.3 熵产率最小根据普里高津最小熵产原理，当一个开放系统达到稳态时，其熵产率将趋向于最小值。

熵产率是指单位时间内系统中熵的增加量。

在达到稳态后，开放系统会通过输入和输出流与环境进行物质和能量交换。

这种交换使得系统能够维持非平衡状态，并且表现出自组织和自发性行为。

普里高津最小熵产原理认为，系统通过最小化熵产率来提高其自组织和自发性行为的效率。

通过减少熵产，系统能够更好地利用输入流中的能量，并将其转化为有用的工作。

熵产生原理与不可逆过程热力学简介

熵产生原理与不可逆过程热力学简介一、熵产生原理(Principl ｅ o ｆ E ｎｔropy —P ｒo ｄuction)熵增加原理是热力学第二定律的熵表述。

而这个原理用于判断任一给定过程能否发生,仅限于此过程发生在孤立体系内。

而对于给定的封闭体系中，要判断任一给定的过程是否能够发生,除了要计算出体系内部的熵变,同时还要求出环境的熵变，然后求总体的熵变。

这个过程就相当于把环境当成一个巨大的热源，然后与封闭体系结合在一起当成孤立体系研究。

但是一般来说,绝对的孤立体系是不可能实现的。

就以地球而言，任何时刻，宇宙射线或高能粒子不断地射到地球上。

另外，敞开体系也不能忽视，就以生物体为例，需要不停地与环境进行物质交换,这样才能保证它们的生存。

19４5年比利时人I. Ｐｒig ｏgine 将热力学第二定律中的熵增加原理进行了推广，使之能够应用于任何体系（封闭的、敞开的和孤立的）.任何一个热力学体系在平衡态时,描述系统混乱度的状态函数S 有唯一确定值,而这个状态函数可以写成两部分的和，分别称为外熵变和内熵变。

外熵变是由体系与环境通过界面进行热交换和物质交换时进入或流出体系的熵流所引起的.熵流(entrop ｙ fl ｕｘ）的概念把熵当作一种流体,就像是历史上曾经把热当作流体一样。

内熵变则是由于体系内部发生的不可逆过程（例如，热传导、扩散、化学反应等）所引起的熵产生(en ｔr ｏpy-production)。

由上述的概念，可以得到在任意体系中发生的一个微小过程，有：S d S d dS i e sys +=＝S d T Qi +δ （1—1）,式中S d e 代表外熵变,S d i 代表内熵变。

这样子就将熵增加原理推广到了熵产生原理.而判断体系中反应的进行，与熵增加原理一致,即0≥S d i （〉不可逆过程；= 可逆过程）（1—2）而文字的表述就是：“体系的熵产生永不为负值，在可逆过程中为0，在不可逆过程中大于0"。

maxent原理

maxent原理maxent原理（Maximum Entropy Principle）是一种概率模型的学习方法，它是根据已知的一些约束条件，通过最大熵原理来确定概率模型的参数。

maxent原理在自然语言处理、信息检索、机器学习等领域被广泛应用。

maxent原理的提出源于统计物理学中的热力学原理，即给定一些已知的约束条件下，选择概率模型时应该尽可能地减少对未知的偏见。

在自然语言处理中，maxent原理可以通过最大熵模型来解决分类问题。

最大熵模型是一种判别模型，其目标是找到一个在已知约束条件下，对未知数据分布偏见最小的模型。

最大熵模型的基本思想是，在已知约束条件下，选择一个概率分布，使得该分布的熵最大。

熵在信息论中表示随机事件的不确定性，熵越大表示不确定性越大。

maxent原理认为，当我们对未知数据分布的了解不足时，应该选择那个不带有任何偏见的分布，即熵最大的分布。

在应用最大熵模型进行分类时，我们首先需要确定一组特征函数，这些特征函数描述了输入数据与输出标签之间的关系。

然后，通过最大熵原理确定模型的参数，使得模型在训练数据上满足已知的约束条件。

最大熵模型的训练过程可以通过迭代算法来实现，常用的算法有改进的迭代尺度法（Improved Iterative Scaling，IIS）和改进的迭代尺度法（Generalized Iterative Scaling，GIS）。

这些算法通过迭代的方式不断调整模型的参数，直到满足约束条件为止。

maxent原理在自然语言处理领域的应用非常广泛。

例如，在文本分类任务中，可以通过最大熵模型来实现文本的自动分类。

在信息检索任务中，可以使用最大熵模型来对查询和文档进行匹配。

在机器翻译任务中，最大熵模型可以用于对句子的翻译做出最合理的选择。

最大熵模型具有很好的灵活性和扩展性，可以通过增加新的特征函数来提高模型的性能。

此外，最大熵模型的学习过程是一种无监督学习方法，不需要人工标注的训练数据，可以从大规模的无标注数据中学习。

极值风速风向的联合概率密度函数

极值风速风向的联合概率密度函数楼文娟;段志勇;庄庆华【摘要】基于最大熵原理,构建极值风速风向的联合概率密度函数,并与Copula函数建立相互关联.以我国某地的极值风数据为例,建立极值风速的Gumbel分布模型以及对应风向的二阶混合von Mises分布模型;使用非线性参数优化算法确定极值风速风向的联合分布模型.采用该模型计算各风向角下不同重现期的基本风速值,并与建筑结构荷载规范值(GB 50009-2012)进行对比.结果表明,联合分布模型能够有效表征实际风速风向的概率分布特征.分别采用Spearman秩相关系数和线性-角度变量相关系数对模型的相关性予以验证,探究模型的有效性.%A joint probability density function for representing both extreme wind direction and speed was constructed based on the maximum entropy principle and established relationship with Copula function.Taking the extreme wind records of somewhere in China as an example, the Gumbel distribution model for extreme wind speed and the second order mixture von Mises distribution model for corresponding wind direction were established respectively;then the joint probabilistic distribution model was determined using nonlinear optimization algorithm.Reference wind speeds of different recurrence intervals in all directions were calculated by applying the model, which were compared to the code values of building structure load (GB 50009-2012).Results show that the proposed joint probabilistic model describes the characteristics of the distribution of actual extreme wind speed and direction effectively.The correlation of joint probabilistic model wasverified by checking the Spearman rank and the linear-angular correlation coefficient respectively, which proves its validity.【期刊名称】《浙江大学学报（工学版）》【年(卷),期】2017(051)006【总页数】7页(P1057-1063)【关键词】极值风速;风向;联合概率密度函数;基本风速;相关系数【作者】楼文娟;段志勇;庄庆华【作者单位】浙江大学结构工程研究所,浙江杭州 310058;浙江大学结构工程研究所,浙江杭州 310058;温州瓯江口产业集聚区管理委员会,浙江温州 325026【正文语种】中文【中图分类】TU318在建筑设计中,风荷载是计算风振响应、确定抗风设计的基础.对于高层建筑、大跨度结构等柔性结构,风荷载是主要的控制性荷载,合理的风荷载值关乎工程建设的安全性和经济性.作为典型的随机动力荷载,风荷载通常是以极值风速为变量通过建立风速的概率密度函数而确定的,如我国建筑结构荷载规范(GB 50009-2012)[1]规定,以极值I型分布作为极值风速的概率分布模型确定建筑设计风荷载.国内外学者开展了大量关于极值风速概率密度分布数学模型和计算理论的研究[2-4],提出了有效的样本筛选、模型建立和参数优化方法.在样本筛选方面,提出了年最大值法、跨阈法和独立风暴法等抽样方法;在数学模型方面,建立了Gumbel分布、Frechet分布、Weibull分布、广义Pareto分布[5]等极值分布理论;基于数学模型和参数特征,发展了矩法、极大似然法、粒子群算法等参数优化方法.尽管这些方法理论不一,但在以极值风速为单值变量的基本风速预测中,均具有较好的计算优度.然而,风向角也是极值风速的重要特征参数,仅考虑风速大小而忽略风向角是不合理的.Simiu等[6]研究了飓风区的极值风荷载,指出若不考虑风向的影响,50 a重现期的极值风荷载明显偏保守.Goyal等[7]研究了风向对混合住宅群的结构可靠性影响,结果表明忽略风向将高估计算值.王钦华等[8]研究了风向对某超高层建筑等效静力风荷载的影响,结果表明不考虑风向影响的建筑物等效风荷载偏保守,应考虑风向对基本风压的影响.日本风荷载规范考虑风向对建筑物效应的影响,给出了建筑结构设计的风向折减因子[9].因此,以极值风速和风向角为双变量,建立极值风速风向的联合概率密度(joint probabilistic density function,JPDF)模型,对更精确、合理地计算结构风荷载具有重要的现实意义.有关极值风速风向的JPDF模型研究较少,主要原因是风向角变量是周期性的,且观测记录多为非连续的方位角.Johnson等[10]基于谐波函数建立了离散角变量的连续概率密度结构,能有效地拟合方向角变量的概率密度直方图,但该模型采用多个三角函数拟合,形式繁琐,且无法给出固定的概率分布.目前,比较常见的角变量分布有均匀分布、心形分布、包柯西分布、缠绕正态分布和von Mises分布等;其中,von Mises分布和混合von Mises分布被认为是最有效的描述角变量统计特性的分布[11],在图像分析[12]、大气污染防治[13]、风能评估[14]等领域得到了广泛的应用. 在风速风向的JPDF建模方面,陈隽等[15]采用谐波函数模拟风向角分布,并基于不同风向间风速分布相互独立的假定建立了风速风向的JPDF分析方法;范文亮等[16]基于乘法定理导出了离散-连续混合联合分布模型,并建立了风向风速的二维连续联合分布模型.这些模型需要对每个方位角的极值风速样本一一给出概率分布,并对不同风向下的极值风速分布相关性做出假定.然而,由于极值风速样本数量稀少,分布在某些方位角下的极值风速样本量更少,一般较难获得各方位角下的极值风速分布.事实上,风速和风向的联合分布属于角度-线性分布.Johnson等[10]从理论上推导得出,当给定角度变量和线性变量的边缘分布,可以根据最大熵原理导出2个变量的JPDF 结构.目前国外已有学者将该理论应用于风能预测[17]领域.本文根据我国某地的月极值风速和对应风向记录,分别建立极值风速的Gumbel分布和风向的二阶混合von Mises分布模型;在此基础上,使用最大熵原理构建极值风速风向的联合概率密度结构,并与Copula函数建立相互关联;采用该联合分布模型计算不同重现期下的基本风速,与建筑结构荷载规范(GB 50009-2012)中的规范值进行对比;最后,分别采用Spearman秩相关系数和线性-角度变量相关系数对模型的相关性予以验证,探究模型的有效性.1.1 极值风速分布在不同的抽样方法下,学者分别发展了相适应的数学模型和参数估计理论用于重现期下的基本风速计算[4].目前,运用最多的2种抽样方法分别是年最大值法和跨阈法[5].一般认为,以年最大值形成的风速样本服从极值I型(即Gumbel)分布,我国建筑结构荷载规范(GB 50009-2012)即采用该方法计算基本风速.跨阈法通过设置特定的阈值,建立由超越阈值风速形成的极值风速样本,并假定该极值风速样本服从广义帕累托(generalized pareto distribution,GPD)分布.为准确计算极值风速的边缘分布,采用上述2种模型进行对比.Gumbel分布和GPD分布的数学模型如下:Fv(v)=exp[-exp(-y)].(1) Gv(v)=1-(1+by)-1/b,y=(v-u)/a.(2)式中:a、u和b分别为尺度参数、位置参数和形状参数,v为极值风速度量.对Gumbel分布,采用极大似然法确定a、u的参数估计;对GPD分布,采用核拟合优度统计量法确定u,再根据极大似然函数法计算a、u的最佳近似值.1.2 风向圆周分布气象站一般以有限方位角形式记录风向,如国家基本站和一般站采用16个方位角整编风向.由于风向角的非连续性,通常基于风向概率密度直方图对风向分布进行建模.假设有一组离散风向角数据,分散于单位圆(0≤θ<2π)的不同角度区间.将单位圆等分为m份,集中在第k(1≤k≤m)区间的观测点数为nk,则各区间的风向角频度Pk和概率密度f(θk)分别为式中:x=Fv(v)、y=Fθ(θ).Johnson等[10]建议采用谐波函数对风向角直方图进行拟合,为防止概率密度函数出现负值或旁瓣过大,建议同时引入Bartlett窗函数作为权重函数.这种方法需要将与方位角数相等的谐波进行叠加,因此,其数学形式复杂,且不具有对风向分布的普适性.实际上,对风向的圆周分布,目前应用更为广泛的是混合von Mises分布[14].von Mises分布常被称为圆周正态分布,是最主要的用于描述方向数据的一种模型,其概率密度函数为式中:-π<μ≤π,κ>0;I0(κ)为零阶修正贝塞尔函数,计算式为研究表明,风向角变量的概率密度分布一般具有多峰值,单一von Mises分布不能完整描述风向圆周分布特性,通常采用多阶混合von Mises分布表示风向角变量的分布[18],其具体计算式为:式中:c为混合von Mises分布的阶数,根据风向峰值分布特征确定;ωi为各阶von Mises分布的权重系数;φi={ωi,μi,κi}为各阶参数.目前,关于von Mises分布的参数估计方法比较多,如MSBC算法[19]、SU算法[18]等.然而,这些迭代算法计算过程均较为繁琐,且涉及手动查表,实用性较差.因此,本文采用高效简捷的LM算法(Levenberg-Marquardt algorithm)进行参数估计.为保证良好的收敛性,需要对各参数进行初始赋值,具体赋值参见文献[14].1.3 基于最大熵原理的联合分布极值风速风向的联合分布属于角度-线性分布(angular-lineardistribution,AL),Johnson等[10]基于最大熵原理导出了AL分布的概率密度结构,可有效用于描述风速风向的联合分布.假设fv(v)和fθ(θ)分别为极值风速和相应风向的概率密度函数,对应的分布函数为Fv(v)和Fθ(θ),则根据最大熵原理可导出极值风速风向的JPDF形式如下式所示:式中:g(·)是角变量ξ的函数,本文采用二阶混合von Mises函数形式予以表示.从式(8)可以看出,最大熵原理实际上是将风速风向的联合分布函数与其各自的边缘分布函数连接在一起.早在1959年,Sklar指出:可以将一个联合分布分解为多个边缘分布和一个Copula函数,这个Copula函数描述了变量间的相关性[21].根据该理论,极值风速变量与风向变量间的Copula函数存在如下关系:式中:C=C[Fv(v),Fθ(θ)],为极值风速风向变量间的Copula函数.1.4 拟合优度检验为检验文中JPDF的拟合优度,采用确定系数计算理论值与实际值的差异:式中:Tk为理论累积频度值;T为Tk的平均值.η2介于0~1,其值越大,拟合效果越好. 本研究的极值风速风向样本源自我国某地1971年1月1日——2000年12月31日全部360 m的月最大风速和相应风向记录(以正北方向为0°,顺时针为正).月最大风速指每个月内10 min风速样本中的最大值,风向记录包含16个方位角.极值风速频度分布和风向玫瑰分别如图1、2所示.从图中可以看出,月极值风速主要集中在0~20 m/s以内,对应风向记录主要在NNE和SSW方向,具有典型的双峰值分布特征.因此,本文采用二阶混合von Mises拟合风向分布函数,即c=2.根据实测数据,计算得到实测风速风向的联合概率分布,结果如图3所示(图中θ表示风向).从图中可以看出,联合概率分布函数主要集中在10~15 m/s风速区间,风向峰值在25°和200°附近.对极值风速样本分别使用Gumbel分布和GPD分布建立模型,拟合参数如表1所示,分布曲线如图4所示,图中Fv为极值风速的累积分布值.可以看出,Gumbel分布与月极值风速样本的实测分布拟合得很好,计算拟合结果的确定系数接近于1,说明由极大似然估计法确定的Gumbel分布与实际累积分布的差异很小;GPD分布风速分位值略大于Gumbel分布,且与实测累积分布差异较大.因此,采用Gumbel模型建立极值风速风向联合分布.分别采用谐波函数和二阶混合von Mises分布建立风向圆周分布模型,各参数估计值如表2所示,拟合曲线如图5所示.可以看出,在峰值风向处,混合von Mises分布的计算结果略大于谐波函数值.由确定系数计算得到的两类函数拟合优度分别为R2F=0.998和R2von=0.988,说明这2种模型都能较好地表征风向圆周分布.下文将采用二阶混合von Mises分布模型建立极值风速风向联合分布.根据极值风速和风向的边缘分布模型,基于最大熵理论建立联合概率分布JPDF,其中g(ξ)仍采用二阶von Mises函数,拟合参数如表2所示,联合分布如图6所示.对比图3和图6可以看出,联合分布模型呈现2个明显的峰值,与实测分布吻合良好,但模型的连续、光滑处理导致实测数据局部“毛刺”现象消失.我国建筑结构荷载规范(GB 50009-2012)规定采用极值I型分布确定不同重现期下的基本风速.设重现期为N,则基本风速UGN的计算式为式中:a、u的具体取值参见表1.由式(12)计算得到的是不考虑风向角分布的基本风速.采用联合分布模型同样可计算得到全风向角下的基本风速UJN,其计算公式为理论上,由式(13)与式(12)计算获得的解析解应满足UJN=UGN;在特定风向角下,联合分布模型计算的极值风速边缘分布应与由相应风向角的极值风速Gumbel分布模型一致,即Fv,θ(v|θ)=G(v|θ).采用上述原理可检验联合分布模型建立过程是否正确.如图7、8所示分别为全风向角(即不考虑风向角的概率分布)下不同重现期的基本风速曲线,及风向角为NNE时的概率分布模型对比.从图中可以看出:1)由联合分布模型JPDF计算得到的基本风速-重现期曲线与由Gumbel分布模型计算得到的基本风速-重现期曲线完全吻合,说明参数优化过程正确有效;2)当风向角为NNE时,联合分布模型JPDF确定的极值风速边缘分布与实测结果及Gumbel分布模型几乎一致.上述检验结果表明:JPDF模型能够有效表征实际风速风向的概率分布特征. 现采用极值风速风向的JPDF分布模型计算不同风向角的基本风速.由定义确定不同风向角下基本风速U的计算式如下:式中:fv,θ(v|θ)表示风向角为θ时,极值风速的条件分布函数.式(14)较复杂,难以获得解析解.本文采用数值迭代法求解基本风速. 如图9所示为当重现期N=100 m时基本风速随风向角的变化规律.可以看出,在不同风向角下,基本风速具有一定波动,其中风向角θ=200°附近时,基本风速值最大,该结果与图3所呈现处的现象一致.如图10所示为当重现期分别为N=5,10,50, 100,600 m时的基本风速玫瑰图,图中同时列出由式计算得到的基本风速以示对比.从图中可知,在30°~210°区间(顺时针),由JPDF分布模型(式(14))计算得到基本风速值大于按规范计算得到的结果(式(12));而在区间210°~30°(顺时针)内,现象与之相反.值得注意的是,虽然由联合分布模型JPDF计算得到的基本风速具有波动性,但波动的幅度比较小.计算表明对50 a重现期即N=600 m时,波动幅度小于5%.这说明,计算样本中极值风速与风向的相关性较差.为验证上述结论,下文对极值风速风向的相关性进行计算.根据Sklar提出的Copula理论,Copula函数(式(10))表征2个变量间的相关性.基于Copula函数计算2个变量间的相关性测度,采用Spearman秩相关系数ρ进行衡量,计算式[21]为式中:x=Fv(v)、y=Fθ(θ).根据Copula函数的拟合结果,计算得到Spearman秩相关系数|ρ|=0.066 8,说明极值风速和风向正向变化的一致性较差.Spearman秩相关系数仅对变量间的变化方向一致性进行度量,并不能衡量变量间相关性程度. Mardia[22]提出采用线性-角度变量相关系数计算极值风速风向的相关性:式中:rvc=fcorr(v,cosθ),rvs=fcorr(v,sinθ),rvc= fcorr(cosθ,sinθ),fcorr(·)表示序列间的相关系数.计算结果表明,文中采用的极值风速与对应风向间的相关系数为r2=0.001 3,说明极值风速与对应风向不仅正向变化一致性较差,其相关性程度也较弱.这一结论与由联合分布模型计算得到的基本风速风向间的相关性表现一致.因此,极值风速风向联合分布模型能够较好地表征实际样本间的相关性.总体而言,该模型能比较有效地预测极值风速风向的概率分布.(1)二阶混合von Mises分布能较好地表征风向角的圆周分布,基于最大熵原理建立的极值风速风向联合概率密度函数与实测值吻合良好.(2)采用本文所建立的极值风速风向联合概率密度函数JPDF可以较有效地计算不同重现期下的基本风速,且在特定风向角下的极值风速分布与建筑结构荷载规范(GB 50009-2012)的分布规律相一致.(3)由Spearman秩相关系数和线性-角度变量相关系数计算结果表明,本文采用的极值风速与风向序列相关性较差,符合极值风速风向联合概率密度函数确定的变量相关性,验证了模型的有效性.需要说明的是,本文在建立极值风速风向的联合概率分布模型时,由于非主风向角和高风速值的数据量较少,对高保证率(如50年一遇、100年一遇)的基本风速值精度尚有待考证,后续工作应对数据量更丰富的风速风向样本进行研究.【相关文献】[1]中华人民共和国国家标准.建筑结构荷载规范:GB 50009-2012[S].北京:中华人民共和国住房和城乡建设部,2011.[2]FIELD C ing the GH distribution to model extreme wind speeds[J].Journal of Statistical Planning and Inference,2004,122(1):15-22.[3]段忠东,欧进萍,周道成.极值风速的最优概率模型[J].土木工程学报,2002,35(5):11-16.DUAN Zhong-dong,OU Jin-ping,ZHOU Dao-cheng. The optimal probabilistic distribution for extreme wind speed[J].China Civil Engineering Journal,2002, 35(5):11-16.[4]李宏男,王杨,伊廷华.极值风速概率方法研究进展[J].自然灾害学报,2009,18(2):15-26.LI Hong-nan,WANG Yang,YI Ting-hua.Advance in research on extreme wind speed models[J].Journal of Natural Disasters,2009,18(2):15-26.[5]SIMIU E,HECKERT N A.Extreme wind distribution tails:aƶpeaks over threshold approach[J].Journal of Structural Engineering,2014,122(5):539-547.[6]SIMIU E,HECKERT N A.Ultimate wind loads and direction effects in non-hurricane and hurricane-prone regions[J].Environmetrics,1998,9(4):433-444.[7]GOYAL P K,DATTA T K.Effect of wind directionality on the vulnerability of rural houses due to cyclonic wind[J].American Society of Civil Engineers,2014, 14(4):258-267.[8]王钦华,石碧青,张乐乐.风向对某超高层建筑等效静风荷载的影响[J].汕头大学学报:自然科学版,2012, 27(2):48-53.WANG Qing-hua,SHI Bi-qing,ZHANG Le-le.Influence of wind direction on equivalent static wind loads of a super high-rise building[J].Journal of Shantou University:Natural Science,2012,27(2):48-53.[9]TAMURA Y,OHKUMA T,KAWAI H,et al.Revision of AIJ recommendations for wind loads on buildings[C]∥Structures Congress.Portland:[s.n.],2015:1-10.[10]JOHNSON,RICHARD A,WEHRLY,et al.Some angular-linear distributions and related regression models[J].Journal of the American Statistical Association, 1978,73(363):602-606.[11]KAMISAN N A B,HUSSIN A G,ZUBAIRI Y Z. Finding the best circular distribution for southwesterly monsoon wind direction in Malaysia[J].Sains Malaysiana,2010,39(3):387-393.[12]CALDERARA S,CUCCHIARA R,PRATI A.Detection of abnormal behaviors using a mixture of von Misesdistributions[C]∥Proceedings of the 2007 IEEE Conference on Advanced Video and Signal Based Surveillance.London:IEEE,2007:141-146.[13]ALLEN C T,YOUNG G S,HAUPT S E.Improving pollutant source characterization by better estimating wind direction with a genetic algorithm[J].Atmospheric Environment,2007,41(11):2283-2289.[14]CARTA J A,BUENO C,RAMIREZ P.Statistical modelling of directional wind speeds using mixtures of von Mises distributions:Case study[J].Energy Conversion and Management,2008,49(5):897-907.[15]陈隽,赵旭东.总体样本风速风向联合概率分析方法[J].防灾减灾工程学报,2009,29(1):63-70. CHEN Jun,ZHAO Xu-dong.Analytical method of joint probability density function of wind speed and direction from parent population[J].Journal of Disaster Prevention and Mitigation Engineering,2009, 29(1):63-70.[16]范文亮,李正良,张培.风向风速的联合概率结构建模[J].土木工程学报,2012,45(4):81-90.FAN Wen-liang,LI Zheng-liang,ZHANG Pei.Modeling of the joint probabilistic structure ofwind direction and speed[J].China Civil Engineering Journal, 2012,45(4):81-90.[17]CARTA J A,RAMIREZ P,BUENO C.A joint probability density function of wind speed and direction for wind energy analysis[J].Energy Conversion andManagement,2008,49(6):1309-1320.[18]ERDEM E,SHI parison of bivariate distribution construction approaches for analysing wind speed and direction data[J].Wind Energy,2011,14(1): 27-41.[19]CHANG-CHIEN S J,HUNG W L,YANG M S.On mean shift-based clustering for circular data[J].Soft Computing,2012,16(6):1043-1060.[20]HUNG W L,CHANG-CHIEN SJ,YANG M S.Selfupdating clustering algorithm for estimating the parameters in mixtures of von Mises distributions[J].Journal of Applied Statistics,2012,39(10):2259-2274.[21]韦艳华,张世英.Copula理论及其在金融分析上的应用[M].北京:清华大学出版社,2008.[22]MARDIA K V.Linear-circular correlation coefficients andrhythmometry[J].Biometrika,1976,63(2):403- 405.。

托卡马克等离子体的弛豫态分析

• 各类电流分布模式可以通过调节等离子体电阻、纵场强度、边界电场等可控参数来实现。特别重要的研究结果是发现存在一个决定弛豫态模式的关键参数E0/B0，当此参数增大到高于其临界值时，等离子体将由典型实验电流分布突变到其他形态。相反的过程也会在此关键参数由高于到低于其临界值时出现。
2
分析NSTX电流分布及其突变现象
DC-HICD以直流电压来维持一个持续的螺旋注入，偏压线圈电流形
成的角向磁通贯穿于两个电极之间，电压加于电极，持续注入与角向磁通交连的环向磁通，实现持续的螺旋注入。
基于湍动等离子体的弛豫性质，由最小能量原理，等离子体弛豫过程将使J/B比值趋向于空间均匀化，因而在等离子体内部区域得以产生并维持一个环向的驱动电流。即螺旋注入电流驱动的理论
and magnetic field (dash line) on mid-plane for =10.5.
There exists the reversion of both j and B in central part of plasmas
17
ASIPP
Different State () Achieved by Adjusting
摘要
应用最小耗散原理，以磁螺旋平衡和能量平衡作为限制条件，研究任意纵横比托卡马克等离子体的弛豫态特性。
首先应用变分原理得到体系的欧拉-拉格朗日方程，然后解析求解以对等离子体弛豫态性质进行分析，并进一步应用数值方法，对给定的参数和边界条件，得到欧拉-拉格朗日方程组及磁螺旋平衡和能量平衡的自洽解以及一些重要的等离子体参数。
(3) For > 9.65 -- There exists the reversion of both
j and B in the central part of plasmas. Their reversion points are quite close to each other when

最大熵原理的实际应用

最大熵原理的实际应用1. 简介最大熵原理（Maximum Entropy Principle）是一种基于信息论的数学模型，其主要思想是在满足已知约束条件的情况下，选择一个最平均、最中立的概率分布。

该原理广泛应用于概率模型、机器学习和自然语言处理等领域。

本文将介绍最大熵原理的核心概念，并探讨其在实际应用中的具体情况。

2. 最大熵原理的核心概念最大熵原理源自于热力学中的熵概念，熵可以衡量一个系统的不确定性。

在概率论和信息论中，熵被定义为表示随机变量不确定性的度量。

最大熵原理认为，在所有满足已知约束条件的概率分布中，熵最大的概率分布是最中立、最平均的分布。

3. 实际应用案例3.1 语言模型在自然语言处理中，语言模型是评估一段文字或句子的概率的模型。

最大熵原理可以用于语言模型的建模，通过已知的约束条件，找到一个最平均的概率分布。

以文本分类为例，已知一些文本的特征和类别，可以使用最大熵模型来建立分类器，通过最大化熵来提高分类的准确性。

3.2 信息检索在信息检索中，最大熵原理可以应用于构建查询模型。

已知用户的查询和文档的特征，可以使用最大熵模型来计算查询与文档的相关性，从而实现精准的文档检索。

3.3 自然语言处理在自然语言处理领域，最大熵原理可以用于解决多个问题，如词性标注、命名实体识别和句法分析等。

通过最大熵模型，可以根据已知的语言特征和标记约束，预测未知的词性、实体或句法结构，提高自然语言处理任务的准确性和效率。

3.4 机器学习最大熵原理在机器学习中也得到了广泛的应用。

它可以用于分类、回归和聚类等任务。

通过最大熵模型，可以从有限的标记样本中学习出一个最平均、最中立的分类器，提高分类的准确性。

4. 总结最大熵原理作为一种基于信息论的数学模型，在概率模型、机器学习和自然语言处理等领域具有广泛的应用。

本文介绍了最大熵原理的核心概念，并针对语言模型、信息检索、自然语言处理和机器学习等领域的实际应用，进行了详细的阐述。

1、下载文档前请自行甄别文档内容的完整性，平台不提供额外的编辑、内容补充、找答案等附加服务。
2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
3、如文档侵犯您的权益，请联系客服反馈,我们会尽快为您处理(人工客服工作时间：9:00-18:30)。

Physics Reports 426(2006)1–45/locate/physrepMaximum entropy production principle in physics,chemistryand biologyL.M.Martyushev a ,b ,∗,V .D.Seleznev ba Institute of Industrial Ecology,Russian Academy of Sciences,S.Kovalevskoi 20A,Ekaterinburg,620219,Russiab Ural State Technical University,Ekaterinburg,620002,RussiaAccepted 5December 2005Available online 23January 2006editor:H.OrlandAbstractThe tendency of the entropy to a maximum as an isolated system is relaxed to the equilibrium (the second law of thermodynamics)has been known since the mid-19th century.However,independent theoretical and applied studies,which suggested the maximization of the entropy production during nonequilibrium processes (the so-called maximum entropy production principle,MEPP),appeared in the 20th century.Publications on this topic were fragmented and different research teams,which were concerned with this principle,were unaware of studies performed by other scientists.As a result,the recognition and the use of MEPP by a wider circle of researchers were considerably delayed.The objectives of the present review consist in summation and analysis of studies dealing with MEPP.The ﬁrst part of the review is concerned with the thermodynamic and statistical basis of the principle (including the relationship of MEPP with the second law of thermodynamics and Prigogine’s principle).Various existing applications of the principle to analysis of nonequilibrium systems will be discussed in the second part.©2005Elsevier B.V .All rights reserved.PACS:05.70.Ln;05.20.−y;05.65.+bKeywords:Maximum entropy production principle (MEPP);Ziegler’s and Prigogine’s principles;MEPP applicationsContents0.Introduction ...........................................................................................................21.MEPP in nonequilibrium thermodynamics .................................................................................41.1.Fundamentals of linear nonequilibrium thermodynamics .................................................................41.2.Critical review and development of Ziegler’s approach ..................................................................61.2.1.Formulation of Ziegler’s principle .............................................................................61.2.2.Some arguments for substantiation of Ziegler’s principle ..........................................................91.2.3.Deduction of Onsager’s principle from Ziegler’s principle .........................................................101.2.4.MEPP and the second law of thermodynamics ...................................................................111.2.5.On the possible “paradox”of the use of the variational approach ....................................................111.2.6.The relation of Ziegler’s maximum entropy production principle and Prigogine’s minimum entropy production principle ....121.3.Brief conclusion (13)∗Corresponding author.Tel.:+73433493516;fax:+73433743771.E-mail address:mlm@ecko.uran.ru (L.M.Martyushev).0370-1573/$-see front matter ©2005Elsevier B.V .All rights reserved.doi:10.1016/j.physrep.2005.12.0012L.M.Martyushev,V.D.Seleznev/Physics Reports426(2006)1–452.The MEPP in nonequilibrium statistical physics (13)2.1.MEPP and kinetic theory of gas (13)2.2.MEPP and general statistical theory of nonequilibrium processes(linear response theory,etc.) (16)2.3.The most probable trajectory of the evolution and the information theory (20)2.3.1.The method of the most probable path of the evolution (20)2.3.2.Introduction to Jaynes’method (22)2.3.3.The variational principle for the most probable state (23)2.3.4.The second derivation of MEPP using Jaynes’method (24)2.4.Brief conclusion (25)3.The use of MEPP in different sciences (26)3.1.MEPP in hydrodynamics.Transfer in the atmosphere and oceans (26)3.1.1.Convective transfer.Paltridge’s method (26)3.1.2.Statistical model of turbulence (30)3.2.MEPP in a crystal growth and other solid state transformations (31)3.3.MEPP and transfer of electrical charge,radiation,etc (36)3.4.MEPP in chemistry and biology (37)3.4.1.Berthelot and Shahparonov’s principles (37)ws of biological evolution (39)4.Conclusion (41)References (42)He who gains time,gains everything.G.B.Molière0.IntroductionThe notions of entropy and its production in equilibrium and nonequilibrium processes not only form the basis of modern thermodynamics and statistical physics,but have always been at the core of various ideological discussions concerned with the evolution of the world,the course of time,etc.These issues were raised by many outstanding scientists,including Clausius,Boltzmann,Gibbs and Onsager.As a result,today we have thousands of books,reviews and papers dedicated to properties of the entropy in different systems.The present review deals with the entropy production behavior in nonequilibrium processes.This topic is not new.What was the impetus to this study? Attempts toﬁnd some universal function,whose extremum would determine the development of a system,have been made at all times.A certain success was achieved in optics(Fermat’s principle),mechanics(the principle of least action), and some other disciplines.Entropy,which was assigned a semimystic signiﬁcance in the“world management”nearly at the moment it appeared,has been doomed to be a quantity describing the progress of nonequilibrium dissipative process.Great contribution has been done in this respect by two scientists,namely Clausius,who in1854–1862 introduced the notion of entropy in physics and advanced the concept of the thermal collapse of the Universe,and Prigogine.In1947the latter researcher proved the so-called minimum entropy production principle and then spent a number of years developing and popularizing the apparatus of nonequilibrium thermodynamics and his principle as applied to description of various nonequilibrium processes in physics,chemistry and biology.However,applications of this principle are relatively narrow as it was acknowledged by Prigogine himself and his opponents.Nevertheless, two essentially extreme opinions have been formed in the literature.Some scientists glorify the principle and think it is capable of describing various nonequilibrium processes to a certain extent.Other researchers,who observed weak points of the principle and unceasing efforts by Prigogine and his progeny to generalize it,are very skeptic about the possibility to formulate universal entropy principles,which would govern so diverse and dissimilar nonequilibrium processes.The so-called maximum entropy production principle(MEPP)is known much less(even among specialists in physics of nonequilibrium processes).This antipode,as its name seemingly means,of Prigogine’s principle has been overshadowed by its more famous twin.MEPP was independently proposed and used by several scientists throughout the 20th century when they dealt with general theoretical issues of thermodynamics and statistical physics or solved speciﬁcL.M.Martyushev,V.D.Seleznev/Physics Reports426(2006)1–453 problems.By this principle,a nonequilibrium system develops so as to maximize its entropy production under present constraints.The rigorous formulation,the origin and corollaries of this principle will be analyzed comprehensively in what follows.Here we shall only mention two principal features concerning the relationship between MEPP and the other two known statements about entropy.1.The second law of thermodynamics,as it was worded by Clausius,states that for an arbitrary adiabatic process entropy(S)of theﬁnal state is larger than or equal to entropy of the initial state.In terms of the entropy production ( ),this means that 0.In this case,MEPP obviously represents new additional statement meaning that the entropy production is not just positive,but tends to a maximum.Thus,apart from the direction of the evolution,which follows from the Clausius’s formulation,we have information about the movement rate of a system.However,if we consider the statistical interpretation of entropy,which follows from studies by Boltzmann and Gibbs,entropy not only tends to increase,but will increase to a maximum value,which is allowed by imposed constraints.Here,theﬁnal(equilibrium) state is most probable and is described by a maximum number of microscopic states.On the strength of this statistical interpretation of the second law,MEPP may be viewed as the natural generalization of the Clausius–Boltzmann–Gibbs formulation of the second law and,in some cases,even as a corollary.Indeed,let us take an isolated system in some nonequilibrium state.The system will reach equilibrium with time(of the order of the relaxation time)and,among a great number of possible states,it will be in a state,for which entropy is a maximum.Therefore,the change of entropy within a given time interval will also be a maximum among possible values and,since the system is isolated, the entropy production will be the largest.This relationship between the second law of thermodynamics and MEPP was noted earlier(see,for example[1–3]).2.The relationship between the minimum entropy production principle and MEPP is not simple.It has been the subject of long-standing discussions and will be considered in p.1.2.6.Here we shall note the following points.These variation principles are absolutely different.Although the extremum of one and the same function—the entropy production—is sought,these principles include different constraints and different variable parameters.These principles should not be mutually opposed since they are applicable to different stages of the evolution of a nonequilibrium system.It should be noted also that Prigogine[4](see,[5]too)stated more than once and adduced examples when a nonequilibrium system behaved oppositely to his principle of the minimum(Benard’s effect and structure instability during the biochemical evolution).He thought however that this situation is possible only in systems,which are far from the equilibrium.It will be demonstrated in this review that it is MEPP,rather than Prigogine’s principle,can pretend to be a universal principle governing the evolution of nonequilibrium dissipative systems.The present review is theﬁrst generalizing treatise on this topic.One more factor,which adds signiﬁcance to this study,is that publications on this topic are fragmented and different research teams,which are concerned with this principle,are unaware of studies performed by other scientists.1Therefore,the recognition and a wider use of MEPP in differentﬁelds of science have been considerably delayed.In theﬁrst and second chapters we shall describe in sufﬁcient detail the available thermodynamic and statistical basis of this principle.The third chapter will deal with diverse applications of MEPP.The most interesting and signiﬁcant thing is that many researchers,who used MEPP,were unaware of general theoretical approaches,which are presented in theﬁrst two chapters.They were guided exclusively by the common sense and utility of the principle for settling of some speciﬁc problems existing in physics and interdisciplinary sciences.A feature in common of those studies was that MEPP proved to be the missing link,which allowed understanding the evolution of a nonequilibrium system.On account of the diversity of the reading matter and the unwillingness to change traditional well-established designations of quantities,we recommend to read each section attentively so as not to misunderstand a particular symbol.In closing, we shall present conclusions and formulate problems,which,in our opinion,require attention in later studies.Since one of the objectives of this review was to attract attention of a wide circle of people working in various domains of science,we had to proceed,ﬁrst and foremost,from a maximum accessibility and a relative simplicity of the presentation of the material.In some cases,therefore,the rigor is impaired and some details were excluded from the review.2In our opinion,such presentation of the material,which refers to different areas of knowledge,is most reasonable today.Notice also that in this review we deliberately concentrated on evolution criteria for nonequilibrium systems,which are directly connected with entropy and MEPP.This is because we wanted to show that the idea about the relationship 1Important results were published not only in English,but also in other languages(Russian,German and French).2These drawbacks can be eliminated if the reader refers to original papers,which are cited in the review.4L.M.Martyushev,V.D.Seleznev/Physics Reports426(2006)1–45between entropy and the evolution of nonequilibrium systems,which has been known for over150years,still deserves attention and is very useful.We think also that MEPP is relatively simple,elegant and evident as compared to other plentiful hypotheses-criteria,most of which represent just some mathematical structures that are not obvious in physical terms and are not illustrative.1.MEPP in nonequilibrium thermodynamicsTheﬁrst part of this section presents necessary notions and some existing variational principles of nonequilibrium thermodynamics,including Prigogine’s principle.In the second part we shall consider Ziegler’s approach,which is based on the deductive formulation of nonequilibrium thermodynamics using MEPP.1.1.Fundamentals of linear nonequilibrium thermodynamicsTheﬁrst law of thermodynamics or the law of conservation of energy states that the heat Q,which is put into a system,is consumed for the work W done by the system over external bodies and the change of the internal energy of the system,d U[5,6]:Q= W+d U.(1.1) The second law of thermodynamics,which has many wordings[5,6],states that any equilibrium thermodynamic system has a unique state function or entropy S:T d S= Q,(1.2)where T is the temperature.As applied to nonequilibrium processes in an isolated system,the second law states that entropy cannot decrease. The entropy is one of main notions in thermodynamics and analysis of the behavior of this function or its derivatives is very important as it will be shown below in the text.Most processes around us are nonequilibrium processes and a local equilibrium is assumed for the use of a relationship like(1.2).This assumption is valid if it is possible to distinguish two characteristic times,that is,the time required to reach the equilibrium in the entire system and the time required to reach the equilibrium in some volume,which is small as compared to the size of a system under study.Theﬁrst time should be much longer than the second time.A relationship like(1.2)can be applied to each of these small elements.The assumption on the local equilibrium is, of course,very strong,but it holds for many systems[5,6].The variation of entropy with time t in a local unit volume can be written in the form[5–8]j s= −div(j s),(1.3) j t=X i J i,(1.4) iwhere s is the entropy density; is the entropy production density3(on the strength of second law of thermodynamics, the entropy production is always larger than or equal to zero);j s is the entropyﬂux density,which additively depends on thermodynamicﬂux densities J i;X i is thermodynamic forces(depending on particular problem conditions,the subscript i refers to differentﬂuxes(forces)and vector components).In1931–1932Onsager,who theoretically generalized empirical laws established by Fourier,Ohm,Fick and Navier, proposed a linear relationship between thermodynamicﬂuxes and forces[5–9]in the formL ik X k,(1.5) J i=kL ik=L ki,(1.6) where L ik is the matrix of kinetic coefﬁcients,which are independent of J i and X k.3Further in the text referred to as“the entropy production”for brevity.L.M.Martyushev,V .D.Seleznev /Physics Reports 426(2006)1–455The relationships (1.5)and (1.6)form the basis of the apparatus of linear nonequilibrium thermodynamics.They close the system of equations for the transfer of energy,momentum and mass,making it possible to solve the system.It should be emphasized that these expressions are valid only when thermodynamic forces are relatively small (in this case,a nearly linear relationship may be assumed between thermodynamic ﬂuxes and forces).However,the spectrum of problems,which can be solved in terms of this simple formalism,is sufﬁciently wide.At the same time,examples are known when calculations by the linear relationships (1.5)and (1.6)may prove to be incorrect (speciﬁcally when chemical reactions are concerned).In 1931Onsager proposed a variational principle,which can yield relationships (1.5)and (1.6)of linear nonequilib-rium thermodynamics [6,9].Thus,he gave the ﬁrst deductive formulation of linear nonequilibrium thermodynamics.This principle is worded as follows [9]:if values of irreversible forces X i are assigned,true ﬂuxes J i maximize the expression [ (X i ,J k )− (J i ,J k )],i.e.,J [ (X i ,J k )− (J i ,J k )]X =0,(1.7) (J i ,J k )=12 i,k R ik J i J k .(1.8)Here and henceforth J [...]X denotes the variation of the expression in square brackets with respect to J at constant X , is the dissipative function (it is postulated that >0),and R ik is the coefﬁcient matrix (it will be shown later in the text that this matrix is inverse to the L ik matrix).Let us deduce Eqs.(1.5)and (1.6)from Eqs.(1.7)and (1.8).We shall show ﬁrst that the tensor R ik may be viewed as symmetric.Since an arbitrary tensor R ik can be presented at all times as the sum of symmetric S ik and antisymmetric A ik tensors,expression (1.8)takes the form=12 i,k(S ik J i J k +A ik J i J k ).(1.9)If we take the second term,rearrange its members,and consider that A ii =0and A ik =−A ki for the antisymmetric tensor,it is easy to see that the antisymmetric part of the tensor R ik does not contribute to the dissipative function.Therefore,the tensor R ik is symmetric (R ik =R ki ).Substituting Eqs.(1.4)and (1.8)into Eq.(1.7)and replacing the variance by the derivative with respect to the corresponding ﬂuxes,we have j j J j ⎡⎣ i X i J i −12 i,k R ik J i J k ⎤⎦X=0.(1.10)The differentiation of Eq.(1.10)transforms it toX j =12 k (R jk +R kj )J k = k R jk J k .(1.11)Since is a positively deﬁned quadratic form,the solution to the (1.11)for unknown J k is existent and unique:J j =k R −1jk X k ≡ kL jk X k .(1.12)The designation R −1jk ≡L jk has been introduced.Since R ik is a symmetric matrix,the matrix R −1jk ,which is inverseto it,is symmetric too.Thus,it was shown that relationships (1.5),(1.6)can be obtained from the variational formulation (1.7),(1.8).It was demonstrated in the foregoing that the expression (X i ,J k )− (J i ,J k )has the only extremum point Eq.(1.11)or Eq.(1.12).Since is a homogeneous quadratic positively deﬁned function of ﬂuxes,the obtained point is the maximum point.Onsager’s variational principle was formulated in the space of thermodynamic ﬂuxes.An alternative formulation of this principle in the force space was proposed by Gyarmati [9,10].His formulation was worded as follows:if values of thermodynamic ﬂuxes J i are assigned,actual irreversible forces X i maximize the expression (X i ,J i )−Y (X i ,X k ),6L.M.Martyushev,V.D.Seleznev/Physics Reports426(2006)1–45i.e.X[ (X i,J i)−Y(X i,X k)]J=0,(1.13)Y(X i,X k)=12i,kL ik X i X k,(1.14)where Y(X i,X k)>0is the dissipative function in the force representation.Since entropy production is a symmetric bilinear form,then,as Gyarmati demonstrated,the formulations(1.7),(1.8) and(1.13),(1.14)are equivalent.Let us turn to Prigogine’s principle(theorem)(or the minimum entropy production principle),which is formulated as follows[8].Let Eqs.(1.5)and(1.6)be fulﬁlled in a system,irreversible forces X i(i=1,...,j;j<n,n being the number of forces in the system)are kept constant,and entropy production in the system is a minimum.In this case,ﬂuxes having the numbers i=j+1,...,n,which are conjugate to unﬁxed forces,disappear.This principle can be proved rather easily[8].It is necessary to substitute the expressions(1.5)and(1.6)into Eq.(1.4)and differentiate it with respect to unﬁxed forces.The obtained expressions are equal to within constants to ﬂuxes having the numbers j+1,...,n and,therefore,are zero,because entropy production is a minimum under the theorem condition.The corollary of this statement is that Prigogine’s system is stationary.A slightly different formulation of Prigogine’s principle is frequently considered and proved[6,10].It states that in the stationary nonequilibrium state,which is consistent with external restrictions(constant irreversible forces X i, where i=1,...,j;j<n,n being the number of forces in the system),entropy production in the system is a minimum if Eqs.(1.5)and(1.6)are fulﬁlled.It is seen from the above reasoning that Prigogine’s theorem is a simple corollary of Onsager–Gyarmati’s principle, because this principle is used to deduce the relationships(1.11)and(1.12).Prigogine’s result is obtained from these relationships using additional restrictions.It should be noted that other variational formulations of linear nonequilibrium thermodynamics are available(see, for example,[6,10,11]),which are equivalent to some extent to the aforementioned formulations.The apparatus of linear nonequilibrium thermodynamics is used extensively for the following reasons:(1)The relationship(1.5)allow solving a system of equations for the transfer of mass,momentum and energy, because in this case the number of equations is equal to the number of unknowns.(2)It is possible to describe cross-couplingﬂuxes(by means of off-diagonal coefﬁcients L ik)of chemical,electrical and other kinetic processes,which are characteristic of a wide range of physical,chemical and biological systems. (3)Additional information about values of kinetic coefﬁcients can be obtained using,for example,the circumstance that entropy production is a positively deﬁned quadratic form,or reciprocal relation(1.6).(4)Quantities(for example,entropy production),which have extremum values in the nonequilibrium state,also provide additional information about a system.Thus,linear nonequilibrium thermodynamics is very useful within its applications.It should be noted that linear nonequilibrium thermodynamics described the majority of engineering problems encountered in theﬁrst half of the 20th century and,therefore,the progress and the popularization of nonlinear equilibrium thermodynamics were very slow and were viewed by many researchers as a theoretical whim.As it is known,for a system to be described in terms of linear nonequilibrium thermodynamics,thermodynamic forces should be small.However,sometimes this condition proves to be too rough,speciﬁcally,with respect to chemical reactions.As a result,linear nonequilibrium thermodynamics cannot explain and describe many,and often principal,problems encountered today.They include self-organization,oscillation processes,etc.The possible generalization of Onsager’s thermodynamics(linear thermodynamics)to a nonlinear case using the maximum entropy production principle will be discussed in the section below.1.2.Critical review and development of Ziegler’s approach1.2.1.Formulation of Ziegler’s principleLet us work in theﬂux space{J k}and assume that we know the form of expression(1.4)for entropy production in this space.Our main task is toﬁnd X k as a function of{J k}.A local equilibrium is assumed for the system.L.M.Martyushev,V .D.Seleznev /Physics Reports 426(2006)1–457ss Fig.1.1.Variation of entropy s with time t for two possible trajectories of the development (two thermodynamic ﬂuxes at a preset force are possible).In this case,expression (1.4)transforms to(J i )=k X k (J i )J k .(1.15)To ﬁnd X k (J k )in the explicit form,Ziegler [1,12–15]proposed the maximum entropy production principle:If irreversible force X i is prescribed ,the actual ﬂux J i ,which satisﬁes the condition (J i )= i X i J i ,maximizes the entropy production (1.15).A similar principle originally appeared and was widely used in the theory of plasticity where it is called the principle of the maximum dissipation rate of mechanical energy or Mises’s principle [12,16,17].The wording of this principle is that the dissipation rate of mechanical energy in a unit volume during plastic deformation is a maximum in a truly stressed state among all stressed states allowed by the given condition of plasticity (the deformation rate is assumed to be constant)[17].Ziegler essentially extended this principle of the theory of plasticity to all nonequilibrium thermodynamics [1,13].In isolated systems d s/d t = and the maximum entropy production principle reﬂects the fact that an isolated system tends to the state with maximum entropy along the shortest possible path (by the fastest means).Fig.1.1gives an example illustrating the possible behavior of a system.The maximum entropy production principle can be presented mathematically asJ (J k )−(J k )−i X i J i X =0,(1.16)where is a Lagrangean multiplier.It was already mentioned in the foregoing that the subscripts may denote either different thermodynamic forces (ﬂuxes)or their spatial components.This principle can be also formulated in the force space when entropy production depends only on X k and ﬂuxes are prescribed,i.e.analogs of expressions (1.15),(1.16)are written simply substituting J for X .Expression (1.16)is interpreted geometrically as described below.Entropy production in the ﬂux space represents some surface (J k ).This surface is intersected by the plane i X i J i and ﬂuxes,which maximize (J k ),are chosen on their intersection line.The simplest form of entropy production is shown in Fig.1.2for two thermodynamic ﬂing the formulated principle,we ﬁnd the explicit expression for the thermodynamic force.For this purpose,transform Eq.(1.16):⎧⎪⎪⎪⎨⎪⎪⎪⎩j j J i (J k )− (J k )− i X i J i X, =0,j j (J k )− (J k )− i X i J i X,J=0.(1.17)8L.M.Martyushev,V.D.Seleznev/Physics Reports426(2006)1–45J1Fig.1.2.Simplest dissipative surface during a linear irreversible process.Here J1and J2are two thermodynamicﬂuxes and0denotes the point corresponding to the equilibrium state.Since forces are assumed to be assigned,Eq.(1.17)may be written in the form⎧⎨⎩j (J k)j J i−j (J k)j J i−X i=0,(J i)=iX i J i.(1.18)Let us denote =( −1)/ and rearrange theﬁrst expression in Eq.(1.18)asX i= j /j J i,(1.19) where the proportionality coefﬁcient is obtained using the second expression in Eq.(1.18):=ijj J i J i−1.(1.20)Expressions(1.19)and(1.20)are known as orthogonality conditions,because their geometrical meaning is that the thermodynamic force X i,which corresponds to theﬂux J i,is orthogonal to the surface (J k)=const(the line A in Fig.1.2).The analog of the orthogonality condition in the theory of the plasticity is called the associated law[17]. The relationship betweenﬂuxes and forces,which is given by Eqs.(1.19)and(1.20),is not unique.In a particular case,when Eqs.(1.19)and(1.20)have a unique solution(only one vector J corresponds to a preset vector X),the orthogonality condition determines only one extremum point.One can prove that this point corresponds to the maximum of entropy production[1,13].However,this is clear from geometrical considerations4too( >0and,if one takes into account the imposed restrictions,entropy production cannot be inﬁnitely large;it is reasonable to expect therefore that the only extremum point corresponds just to the maximum).In this particular case,the maximum entropy production principle and the orthogonality condition are equivalent.In the general case,the orthogonality condition obviously determines all extremum points in the corresponding area of the surface (J k),whereas the variational principle selects the point,at which the entropy production has the largest value.4See,for example,Fig.1.2.。