Numerical Approximation of a quasi-Newtonian Stokes Flow Problem with Defective Boundary Co
- 格式:pdf
- 大小:523.40 KB
- 文档页数:22
矩阵逆学习资料woodbury公式关于矩阵逆的补充学习资料:Woodbury matrix identity本⽂来⾃维基百科。
the Woodbury matrix identity, named after Max A. Woodbury[1][2] says that the inverse of a rank-k correction of some matrix can be computed by doing a rank-k correction to the inverse of the original matrix. Alternative names for this formula are the matrix inversion lemma, Sherman–Morrison–Woodbury formula or just Woodbury formula. However, the identity appeared in several papers before the Woodbury report.[3]The Woodbury matrix identity is[4]where A, U, C and V all denote matrices of the correct size. Specifically, A is n-by-n, U is n-by-k, C is k-by-k and V is k-by-n. This can be derived using blockwise matrix inversion.In the special case where C is the 1-by-1 unit matrix, this identity reduces to the Sherman–Morrison formula. In the special case when C is the identity matrix I, the matrix is known innumerical linear algebra and numerical partial differential equations as the capacitance matrix.[3]Direct proofJust check that times the RHS of the Woodbury identity gives the identity matrix:Derivation via blockwise eliminationDeriving the Woodbury matrix identity is easily done by solving the following block matrix inversion problemExpanding, we can see that the above reduces to and, which is equivalent to .Eliminating the first equation, we find that , which can be substituted into the second to find. Expanding and rearranging, we have, or . Finally, we substitute into our , and we have. Thus,We have derived the Woodbury matrix identity.Derivation from LDU decompositionWe start by the matrixBy eliminating the entry under the A (given that A is invertible) we getLikewise, eliminating the entry above C givesNow combining the above two, we getMoving to the right side giveswhich is the LDU decomposition of the block matrix into an upper triangular, diagonal, and lower triangular matrices.Now inverting both sides givesWe could equally well have done it the other way (provided that C is invertible) i.e.Now again inverting both sides,Now comparing elements (1,1) of the RHS of (1) and (2) above gives the Woodbury formulaApplicationsThis identity is useful in certain numerical computations where A?1has already been computed and it is desired to compute (A+ UCV)?1. With the inverse of A available, it is only necessary tofind the inverse of C?1+ VA?1U in order to obtain the result using the right-hand side of the identity. If C has a much smaller dimension than A, this is more efficient than inverting A+ UCV directly. A common case is finding the inverse of a low-rank update A+ UCV of A (where U only has a few columns and V only a few rows), or finding an approximation of the inverse of the matrix A+ B where the matrix can be approximated by a low-rank matrix UCV, for example using the singular value decomposition.This is applied, e.g., in the Kalman filter and recursive least squares methods, to replace the parametric solution, requiring inversion of a state vector sized matrix, with a condition equations based solution. In case of the Kalman filter this matrix has the dimensions of the vector of observations, i.e., as small as 1 in case only one new observation is processed at a time. This significantly speeds up the often real time calculations of the filter.See also:Sherman–Morrison formulaInvertible matrixSchur complementMatrix determinant lemma, formula for a rank-k update to a determinantBinomial inverse theorem; slightly more general identity. Notes:1.Jump up ^ Max A. Woodbury, Inverting modified matrices, MemorandumRept. 42, Statistical Research Group, Princeton University,Princeton, NJ, 1950, 4pp MR381362.Jump up ^ Max A. Woodbury, The Stability of Out-Input Matrices.Chicago, Ill., 1949. 5 pp. MR325643.^ Jump up to: a b Hager, William W. (1989). "Updating the inverse of amatrix". SIAM Review31 (2): 221–239. doi:10.1137/1031049.JSTOR2030425. MR997457.4.Jump up ^Higham, Nicholas (2002). Accuracy and Stability ofNumerical Algorithms (2nd ed.). SIAM. p. 258. ISBN978-0-89871-521-7. MR1927606.Press, WH; Teukolsky, SA; Vetterling, WT; Flannery, BP (2007), "Section 2.7.3. Woodbury Formula", Numerical Recipes: The Artof Scientific Computing (3rd ed.), New York: CambridgeUniversity Press, ISBN978-0-521-88068-8External links:Some matrix identitiesWeisstein, Eric W., "Woodbury formula", MathWorld.src="///doc/2742834010661ed9ad51f3e6.html /wiki/Special:CentralAutoLogin/start?type=1x1" alt="" title="" width="1" height="1" style="border: none; position: absolute;" />Retrieved from"/doc/2742834010661ed9ad51f3e6.html /w/index.php?title=Woodbury_matrix_identity&oldid=627695139"Categories:Linear algebraLemmasSherman–Morrison formula(来⾃维基百科。
统计学习理论的本质:英中文术语对照表来源:张学工译, VN Vapnik原著, 统计学习理论的本质, 清华大学出版社, 2000使用范围:南京师范大学计算机科学与技术学院研究生。
声明:任何人在其出版物使用或者上载到互连网都必须得到译者及出版社的许可。
AdaBoost algorithm (AdaBoost(自举)算法)163admissible structure (容许结构) 95algorithmic complexity (算法复杂度) 10annealed entropy (退火熵) 55ANOVA decomposition (ANOVA分解) 199a posteriori information (后验信息) 120a priori information (先验信息) 120approximately defined operator (近似定义的算子) 230 approximation rate (逼近速率) 98artificial intelligence (人工智能) 13axioms of probability theory (概率理论的公理) 60back propagation method (后向传播方法) 126basic problem of probability theory (概率论的基本问题) 62basic problem of statistics (统计学的基本问题) 63Bayesian approach (贝叶斯方法) 119Bayesian inference (贝叶斯推理) 34bound on the distance to the smallest risk (与最小风险的距离的界) 77 bound on the values of achieved risk (所得风险值的界) 77bounds on generalization ability of a learning machine (学习机器推广能力的界) 76canonical separating hyperplanes (标准分类超平面) 132capacity control problem (容量控制问题) 116cause-effect relation (因果关系) 9choosing the best sparse algebraic polynomial (选择最佳稀疏多项式)117choosing the degree of polynomial (选择多项式阶数) 116 classification error (分类错误) 19codebook (码本) 106complete (Popper's) nonfalsifiability (完全(波普)不可证伪性) 52 compression coefficient (压缩系数) 107consistency of inference (推理的一致性) 36constructive distribution-independent bound on the rate of convergence (构造性的不依赖于分布的收敛速度界) 69convolution of inner production (内积回旋) 140criterion of nonfalsifiability (不可证伪性判据) 47data smoothing problem (数据平滑问题) 209decision-making problem (决策选择问题) 296decision trees (决策树) 7deductive inference (演绎推理) 47density estimation problem (密度估计问题):parametric(Fisher-Wald) setting(参数化(Fisher-Wald)表示) 20nonparametric setting (非参数表示) 28discrepancy (差异) 18discriminant analysis (判别分析) 24discriminant function (判别函数) 25distribution-dependent bound on the rate of convergence (依赖于分布的收敛速度界) 69distribution-independent bound on the rate of convergence (不依赖于分布的收敛速度界) 69ΔΔ-margin separating hyperplane (间隔分类超平面) 132 empirical distribution function (经验分布函数) 28empirical processes (经验过程) 40empirical risk functional (经验风险泛函) 20empirical risk minimization inductive principle (经验风险最小化归纳原则) 20ensemble of support vector machines (支持向量机的组合) 163 entropy of the set of functions (函数集的熵) 42entropy on the set of indicator functions (指示函数集的熵) 42 equivalence classes (等价类) 292estimation of the values of a function at the given points (估计函数在给定点上的值) 292expert systems (专家系统) 7ε-insensitivity (ε不敏感性) 181ε-insensitive loss function (ε不敏感损失函数) 181feature selection problem (特征选择问题) 118function approximation (函数逼近) 98function estimation model (函数估计模型) 17Gaussian (高斯函数) 26generalized Glivenko-Cantelli problem (广义Glivenko-Cantelli问题)66generalized growth function (广义生长函数) 85generator random vectors (随机向量产生器) 17Glivenko-Cantelli problem (Glivenko-Cantelli问题) 66growth function (生长函数) 55Hamming distance (汉明距离) 104handwritten digit recognition (手写数字识别) 146hard threshold vicinity function (硬限邻域函数) 103hard vicinity function (硬领域函数) 269hidden Markov models (隐马尔可夫模型) 7hidden units (隐结点) 101Huber loss function (Huber损失函数) 183ill-posed problems (不适定问题): 9solution by variation method (变分方法解) 236solution by residual method (残差方法解) 236solution by quasi-solution method (拟解方法解) 236 independent trials (独立试验) 62inductive inference (归纳推理) 50inner product in Hilbert space (希尔伯特空间中的内积) 140 integral equations (积分方程):solution for exact determined equations (精确确定的方程的解)237solution for approximately determined equations (近似确定的方程的解) 237kernel function (核函数) 27Kolmogorov-Smirnov distribution (Kolmogorov-Smirnov分布) 87 Kulback-Leibler distance (Kulback-Leibler距离) 32Kuhn-Tücker conditions (库恩-塔克条件) 134Lagrangian multiplier (拉格朗日乘子) 133Lagrangian (拉格朗日函数) 133Laplacian (拉普拉斯函数) 277law of large number in the functional space (泛函空间中的大数定律)41law of large numbers (大数定律) 39law of large numbers in vector space (向量空间中的大数定律) 41 Lie derivatives (Lie导数) 20learning matrices (学习矩阵) 7least-squares method (最小二乘方法) 21least-modulo method (最小模方法) 182linear discriminant function (学习判别函数) 31linearly nonseparable case (线性不可分情况) 135local approximation (局部逼近) 104local risk minimization (局部风险最小化) 103locality parameter (局部性参数) 103loss-function (损失函数):for AdaBoost algorithm (AdaBoost算法的损失函数) 163for density estimation (密度估计的损失函数) 21for logistic regression (逻辑回归的损失函数) 156for pattern recognition (模式识别的损失函数) 21for regression estimation (回归估计的损失函数) 21 madaline(Madaline自适应学习机) 7main principle for small sample size problems (小样本数问题的基本原则) 28maximal margin hyperplane (最大间隔超平面) 131maximum likehood method (最大似然方法) 24McCulloch-Pitts neuron model (McCulloch-Pitts神经元模型) 2 measurements with the additive noise (加性噪声下的测量) 25 metric ε-entropy (ε熵度量) 44minimum description length principle (最小描述长度原则) 104 mixture of normal densities (正态密度的组合) 26National Institute of Standard and Technology (NIST) digit database (美国国家标准技术研究所(NIST)数字数据库) 173neural networks (神经网络) 126non-trivially consistent inference (非平凡一致推理) 36 nonparametric density estimation (非参数密度估计) 27normal discriminant function (正态判别函数) 31one-sided empirical process (单边经验过程) 40optimal separating hyperplane (最优分类超平面) 131overfitting phenomenon (过学习现象) 14parametric methods of density estimation (密度估计的参数方法) 24 partial nonfalsifiability (部分不可证伪性) 51Parzen's windows method (Parzen窗方法) 27pattern recognition problem (模式识别问题) 19perceptron (感知器) 1perceptron's stopping rule (感知器迭代终止规则) 6polynomial approximation of regression (回归的多项式逼近) 116 polynomial machine (多项式机器) 143potential nonfalsifiability (潜在不可证伪性) 53probability measure (概率测度) 59probably approximately correct (PAC) model (可能近似正确(PAC)模型) 13problem of demarcation (区分问题) 49pseudo-dimension (伪维) 90quadratic programming problem (二次规划问题) 133quantization of parameters (参数的量化) 110quasi-solution (拟解) 112radial basis function machine (径向基函数机器) 144random entropy (随机熵) 42radnom string (随机串) 10randomness concept (随机性概念) 10regression estimation problem (回归估计问题) 19regression function (回归函数) 19regularization theory (正则化理论) 9regularized functional (正则化泛函) 9reproducing kernel Hilbert space (再生核希尔伯特空间) 244 residual principle (残差原则) 236rigorous (distribution-dependent) bounds (严格(依赖于分布的)界) 85 risk functional (风险泛函) 18risk minimization from empirical data problem (基于经验数据最小化风险的问题) 20robust estimators (鲁棒估计) 26robust regression (鲁棒回归) 26Rosenblatt's algorithm (Rosenblatt算法) 5set of indicators (指示器集合) 73set of unbounded functions (无界函数集合) 77σ-algebra (σ代数) 60sigmoid function (S型(sigmoid)函数) 125small samples size (小样本数) 93smoothing kernel (平滑核) 100smoothness of functions (函数的平滑性) 100soft threshold vicinity function (软阈值领域函数) 103soft vicinity function (软领域函数) 269soft-margin separating hyperplane (软间隔分类超平面) 135spline function (样条函数):with a finite number of nodes (有限结点的样条函数) 194with an infinite number of nodes (无穷多结点的样条函数) 195 stochastic approximation stopping rule (随机逼近终止规则) 34 stochastic ill-posed problems (随机不适定问题) 113strong mode estimating a probability measure (强方式概率度量估计)63structural risk minimization principle (结构风险最小化原则) 94 structure (结构) 94structure of growth function (生长函数的结构) 79supervisor (训练器) 17support vector machines (支持向量机) 137support vectors (支持向量) 134support vector ANOVA decomposition (支持向量ANOVA分解) 199 SVM n approximation of the logistic regression (逻辑回归的SVM n逼近) 155SVM density estimator (SVM密度估计) 246SVM conditional probability estimator (SVM条件概率估计) 257 tails of distribution (分布的尾部) 78tangent distance (切距) 149training set (训练集) 18transductive inference (转导推理) 293Turing-Church thesis (Turing-Church理论) 177two layer neural networks machine (两层神经网络机器) 145two-sided empirical process (双边经验过程) 40U.S. Postal Service digit database (美国邮政数字数据库) 173 uniform one-sided convergence (一致单边收敛) 39uniform two-sided convergence (一致双边收敛) 39VC dimension of a set of indictor functions (指示函数集的VC维) 79 VC dimension of a set of real functions (实函数集的VC维) 81VC entropy (VC熵) 44VC subgraph (VC子图) 90vicinal risk minimization method(领域风险最小化) 268vicinity kernel(领域核):273one-vicinal kernel (单领域核) 273two-vicinal kernel (双领域核) 273VRM method (VRM方法):for pattern recognition (模式识别的VRM方法) 273for regression estimation (回归估计的VRM方法) 282for density estimation (密度估计的VRM方法) 284for conditional probability estimation (条件概率估计的VRM方法) 285for conditional density estimation (条件密度估计的VRM方法)286weak mode estimating a probability measure (弱方式概率度量估计)63weight decay procedure (权值衰减过程) 102。
非线性Klein-Gordon方程的最低阶混合元超收敛分析新模式樊明智;王芬玲【摘要】针对一类非线性Klein-Gordon方程利用最简单的双线性元Q11及Q01×Q10元建立了最低阶且自然满足Brezzi-Babuska条件的混合元逼近格式.基于双线性元的积分恒等式结果,建立了插值与Riesz投影之间的超收敛估计,再结合Q01×Q10元的高精度分析结果和插值后处理技术,在半离散和全离散格式下,导出了关于原始变量u和流量p分别在H1模和L2模意义下单独利用插值或Riesz投影所无法得到的超逼近性和超收敛结果.【期刊名称】《许昌学院学报》【年(卷),期】2016(035)005【总页数】9页(P1-9)【关键词】非线性Klein-Gordon方程;超逼近性和超收敛结果;混合有限元新模式;半离散和全离散格式【作者】樊明智;王芬玲【作者单位】许昌学院数学与统计学院,河南许昌461000;许昌学院数学与统计学院,河南许昌461000【正文语种】中文【中图分类】O242.21本文考虑如下的非线性Klein-Gordon方程其中Ω⊂R2为有界矩形区域,∂Ω为Ω的边界,X=(x,y),f(X,t)∈L2(Ω),γ是正常数,u0(X),u1(X)是已知充分光滑的函数,假设a(u),g(u)满足如下条件:(i)a(u)关于u一致有界,即存在正常数a0,a1满足,a0≤a(u)≤a1;(ii)a(u)和g(u)对变量满足Lipschitz条件,即存在正常数L使得Klein-Gordon方程具有丰富的实际背景和物理意义,它用于描述相对论量子力学和量子场论中自旋为零的粒子的最基本方程和Schrödinger方程的相对论形式,对于它的研究值得物理学家和数学家的高度关注.关于Klein-Gordon方程已有研究, 例如文献[1]对无界区域上一维Klein-Gordon方程建立一个显式差分格式, 并给出该格式的稳定性和收敛性结果;文献[2]和[3]研究了一维情形下的数值解;文献[4]讨论了二维Klein-Gordon方程存在唯一的整体解.不难看出,当a(u)=1和g(u)=sinu时,问题(1)变成了sine-Gordon方程, 因此sine-Gordon方程是问题(1)的特殊情况, 并得到一些有价值的成果[5~8], 由于Klein-Gordon方程比sine- Gordon方程复杂, 使问题的处理更为困难. 因此据我们所知到目前为止尚未见到有关问题(1)混合元格式的高精度分析.混合有限元方法与传统Galerkin有限元方法相比具有对空间的光滑度要求较低、并能同时得到原始变量和流量的误差估计等突出优势, 已成为一种常用的数值逼近方法.对于经典的混合有限元格式来说,混合元空间要满足Brezzi-Babuška条件[9,10] (简记为B-B条件),给构造合适的空间对带来一定的困难. 最近文献[11、12]给出了二阶椭圆问题新的混合有限元逼近格式, 具有自由度小且当逼近空间对满足包含关系时, B-B条件成立,同时又能避开散度算子带来的困扰等特点,文献[13]将此方法应用到线性抛物方程, 给出关于时间半离散混合格式和全离散化混合有限元格式,但仅仅得到了最优误差估计,文献[14]及[15]进一步研究了二阶椭圆问题和线弹性问题在新格式下的超收敛性,文献 [16]将其推广到线性Sobolev方程得到了非协调混合元格式半离散格式下的超收敛性和向后欧拉全离散格式下关于空间步长的超逼近性.该文目的是将Riesz投影的优势和插值的高精度分析方法的特色有机的结合起来, 利用Q11及Q01×Q10元针对方程(1)给出了最低阶混合元超收敛分析新模式. 借助于双线性元的高精度分析结果[17],得到了精确解的双线性插值与其Riesz投影之间的超收敛估计. 进一步的结合插值后处理技巧, 在半离散和全离散格式下,分别导出了原始变量u和分别在H1模和L2模意义下单独利用Riesz投影和插值技巧所无法得到的超逼近性和超收敛结果.设Th为Ω上的一族矩形剖分,∀K∈Th定义单元K的中心为(xK,yK),其边长分别为2hx,K,2hy,K,取}.单元K的顶点为a1(xK-hx,K,yK-hy,K),a2(xK+hx,K,yK-hy,K),a3(xK+hx,K,yK+hy,K),a4(xK-hx,K,yK+hy,K),单元K的边为(mod4). 定义有限元空间:其中Qij=span{xrys,0≤r≤i,0≤s≤j}.设和分别为和所诱导的插值算子,且满足这里是对应边∂K的单位切向量.引理1[17] 若u∈H3(Ω),则进一步地,若u∈H4(Ω),则若,则利用文献[17]中类似的方法可得如下的结论:引理2 若u∈H3(Ω)则证明为了得到高精度估计引进误差函数[17]:令u-Ihu=φ,对ω1∈Q01做Taylor展开有于是注意到‴,根据分部积分公式得∫Kφxdxdy=∫KF″(y)φxdxdy=(∫l3-∫l1)F′(y)φxdxdy-∫KF′(y)φxydx= F′(yK+hy,K)(φ(xK+hx,K,yK+hy,K)-φ(xK-hx,K,yK+hy,K))-F′(yK-hhy,K)(φ(xK+hx,K,yK-hy,K)-φ(xK-hx,K,yK-hy,K))-(F(yK+hy,K)∫l3φxydx-F(yK-hy,K)∫l1φxydx)+∫KF(y)φxyy=∫KF(y)uxyy,∫Kφx(y-yK)dxdy=‴y.根据式(6)~(8)和逆不等式可知∫Kφxω1dxdy=.同理可证利用式(9)和(10)该引理2得证.设的Riesz(或椭圆)投影算子,即对,满足并且Rh具有性质:设,有接下来,我们给出插值和投影之间的超收敛估计.引理3 若,则.证明根据式(2)和(11)得((Rhu-Ihu),(Rhu-Ihu))=((Rhu-u),(Rhu-Ihu))+((u-Ihu),(Rhu-Ihu))≤ch2|u|3|Rhu-Ihu|1.从而引理3得证.令=-u,则方程(1)可改写为令则问题(13)的变分形式为:求,使得其满足变分问题(14)的有限元逼近方程为:求满足定理1 设和分别是(14)和(15)的解,则有其中.证明首先令ξ.∀,根据式(14)和(15)有如下误差方程在(19a)和(19b)中分别令vh=ξt和ξt,并将(19a)+γ(19b)得(ξ,(η,.由Cauchy和Young不等式及式(12)可知基于式(11)得由假设(i)和(ii)可知,,根据式(21)~(24)将(20)变形为对(25)从0到t积分,并注意到ξ(X,0)=ξt(X,0)=0得将Gronwall引理应用于(26)可知借助(27)和引理3有,从而(16)式得证.另一方面,利用引理2和引理3得在 (19b)中利用引理2和式(4)、(26)及(27)得,即,从而式(17)成立,定理1证毕.注1 若将式(15)和(18)中的Rh换成插值Ih时可得如下结论:(1)结合式(3)可得半离散超逼近结果此时,u∈H4(Ω)的要求比本文定理1中的u∈H3(Ω)的光滑度要高.(Ⅱ)借助于文献[5~7]中的导数转移技巧有显然与定理1相比对ut的光滑度要求稍高.注2 若仅用投影时虽然可以得到关于空间步长的超逼近性,但如何构造关于投影的后处理算子仍然是悬而未决的问题.因此,到目前为止无法直接得到关于投影的超收敛结果.为了得到整体超收敛,我们先把Th相邻的四个小单元合并构成一个大单元(如图1).并设Zi(i=1,2,…,9)为四个小单元的所有顶点.在上根据文献[17]构造具有如下性质的插值后处理算子:其中)为上的连续函数空间,则算子分别满足如下性质:定理2 在定理1的条件下,有如下的整体超收敛结果证明利用定理1和(29)可知,).即式(31)得证.同理借助定理1和式(30)可证式(32),定理2证毕.在本节中我们将主要讨论全离散格式下的误差估计,仅讨论a(u)=a(X)的情形.设0=t0≤t1≤…tN-1≤tN=T是上步长为τ=T/N的剖分,tn=nτ, n=0,1,2,…,N,Un 代表t=tn时u(tn)在Vh中的逼近.为了方便起见,我们引入下面一些记号:定义(14)全离散逼近格式为:求满足其中utt(X,0)=-a(X)u1(X)+γΔu0(X)-g(u0(X))+f(X,0).定理和分别是(14)和(33)的解,则有其中.证明为了进行误差估计,记∀由(14)和(33)可导出如下误差方程其中根据(37b)可得在(37a)和(38)中分别取,并将(37a)+γ(38)有,不难看出,(39)的右端各项分别变形为接下来我们给出(39)式右端的估计,注意到借助于(43)有根据(11)可知利用假设(i)和插值理论得G3=.利用假设(i)和(43)将G4估计为.借助泰勒展开式直接计算有利用式(48)有.综合式(40)~(42)、(44)~(47)及(49)并取得其中.对式(50)关于j从1到n-1求和得).由初始条件和泰勒展开式得因此,再结合U1的定义可知再借助于式(52)并注意到ξ0=0得利用式(53)将式(51)变形为其中(ξn,ξn-1).根据Young不等式得,(ξn,,得选择适当小的τ,使1-cτ>0,再根据离散的Gronwall引理得.利用引理3和三角不等式有即式(34)得证.在式(37b)中令h=θn,再利用(4)和引理2及1的估计结果得,定理3得证.注3 若将式(36)中的Rh换成插值Ih时,结合(3)可得如下结论:其中.此时式(55)和式(56)中对解的光滑度要求比定理3偏高.注4 本文方法对抛物方程、双曲方程、抛物积分微分方程、双曲积分微分方程均使用.【相关文献】[1] Han H D. Zhang Z W. An analysis of the finite difference method for one-dimensional Klein- Gordon equation on unbounded domain[J]. Appl Numer Math, 2009, 59(7): 1 568-1 583.[2] Khalifa M E, Elgamal M. A numerical solution to Klein-Gordon equation with Dirichlet doundary condition[J]. Applied Mathematics Computation, 2005, 160(2): 451-475.[3] Wang Q F, Cheng D Z. Numerical solution of damped-nonlinear Klein-Gordon equations using variational method and finite element approach[J]. Applied Mathematics Computation, 2005, 162(1): 381-401.[4] Nakao H, Pavel I N. Wave operators to a quadraticnon nonlinear Klein-Gordon equation in two space dimensions[J]. Nonlinear Analysis TMA, 2009, 71(9): 3 826-3 833. [5] Shi D Y, Zhang D. Approximation of nonconforming quasi-Wilson element for sine-Gordon equa-tion[J]. Journal of Computational Mathematics, 2013,31(3):271-282.[6] 石东洋,张斐然.Sine-Gordon方程的一类非协调有限元分析[J].计算数学,2011,33(3):289- 297.[7] 王芬玲,石东洋.非线性sine-Gordon方程Hermite型有限元新的超收敛分析及外推[J].应用数学学报,2012,35(5):777-788.[8] 石东洋,王芬玲,赵艳敏.非线性sine-Gordon方程的各向异性线性元高精度分析新模式[J].计算数学,2014,36(3):245-256.[9] Babuska I. Error-bounds for finite element method[J]. Numerical Mathematics, 1971,16: 322- 333.[10] Brezzi F.On the existence,uniqueness and approximation of saddle-point problems arising from Lagrangian multipliers[J]. SIAM Journal on Numerical Analysis, 1974, 13: 185-197.[11] 陈绍春,陈红如.二阶椭圆问题新的混合元格式[J].计算数学,2010,32(2):213-218.[12] 史峰,于佳平,李开泰.椭圆型方程的一种新型混合有限元格式[J].工程数学学报,2011,28(2):231-236.[13] 李磊,孙萍,罗振东.抛物方程一种新混合有限元格式及误差分析[J].数学物理学报,2012,32A(6):1 158-1 165.[14] 石东洋,李明浩.二阶椭圆问题一种新格式的高精度分析[J].应用数学学报,2014,37(1):45-58.[15] Shi D Y, Li M H. superconvergence anaalysis of a stable conforming rectangular mixed finite ele-ments for the linear elasticity problem[J]. Journal of Computational Mathematics, 2014, 32(2): 205-214.[16] Shi D Y, Zhang Y D. High accuracy analysis of a new nonconforming mixed finite element sch-eme for Sobolev equation[J]. Applied Mathematics Computation, 2011, 218(7): 3 176-3 186.[17] 林群,严宁宁.高效有限元构造与分析[M].保定:河北大学出版社,1996.。
SIAM J.S CI.C OMPUT.2012Society for Industrial and Applied Mathematics Vol.34,No.2,pp.A1027–A1052A NEW TRUNCATION STRATEGY FOR THEHIGHER-ORDER SINGULAR V ALUE DECOMPOSITION∗NICK VANNIEUWENHOVEN†,RAF VANDEBRIL†,AND KARL MEERBERGEN†Abstract.We present an alternative strategy for truncating the higher-order singular value decomposition(T-HOSVD).An error expression for an approximate Tucker decomposition with orthogonal factor matrices is presented,leading us to propose a novel truncation strategy for the HOSVD,which we refer to as the sequentially truncated higher-order singular value decomposi-tion(ST-HOSVD).This decomposition retains several favorable properties of the T-HOSVD,while reducing the number of operations required to compute the decomposition and practically always improving the approximation error.Three applications are presented,demonstrating the effective-ness of ST-HOSVD.In thefirst application,ST-HOSVD,T-HOSVD,and higher-order orthogonal iteration(HOOI)are employed to compress a database of images of faces.On average,the ST-HOSVD approximation was only0.1%worse than the optimum computed by HOOI,while cutting the execution time by a factor of20.In the second application,classification of handwritten digits, ST-HOSVD achieved a speedup factor of50over T-HOSVD during the training phase,and reduced the classification time and storage costs,while not significantly affecting the classification error.The third application demonstrates the effectiveness of ST-HOSVD in compressing results from a nu-merical simulation of a partial differential equation.In such problems,ST-HOSVD inevitably can greatly improve the running time.We present an example wherein the2hour45minute calculation of T-HOSVD was reduced to just over one minute by ST-HOSVD,representing a speedup factor of 133,while even improving the memory consumption.Key words.tensor,sequentially truncated higher-order singular value decomposition,higher-order singular value decomposition,multilinear singular value decomposition,multilinear orthogonal projectionAMS subject classifications.15A03,15A69,15A72,65F99,65Y20DOI.10.1137/1108360671.Introduction.In this paper,we seek a good multilinear rank-(r1,r2,...,r d) approximation to the order-d tensor A∈R n1×n2×···×n d,using only numerical linear algebra tools.Formally,we are interested in an approximate solution tomin B A−B 2F(1.1)with B∈R n1×···×n d restricted to be of rank(r1,r2,...,r d)and r i≤n i.Here,we deal with the multilinear rank,as defined by Hitchcock[22].The above approximation problem is well-posed[13,Corollary4.5],but does not exhibit a known closed solution. This motivated the search for reliable numerical algorithms for solving problem(1.1).∗Submitted to the journal’s Methods and Algorithms for Scientific Computing section June1, 2011;accepted for publication(in revised form)January26,2012;published electronically April 3,2012.This work presents research results of the Belgian Network DYSCO(Dynamical Systems, Control,and Optimization)and was supported by the Interuniversity Attraction Poles Programme, initiated by the Belgian State Science Policy Office,and partially supported by the Research Council KU Leuven grants CoE OPTEC(optimization in engineering),OT/10/038(multi-parameter model order reduction and its applications),OT/11/055(spectral properties of perturbed normal matri-ces and their applications),and the Research Foundation—Flanders(Belgium)project G034212N (reestablishing smoothness for matrix manifold optimization via resolution of singularities).The scientific responsibility rests with its author(s)./journals/sisc/34-2/83606.html†Numerical Approximation and Linear Algebra Group,Department of Computer Science, KU Leuven,Leuven,Belgium(nick.vannieuwenhoven@cs.kleuven.be,raf.vandebril@cs.kleuven.be, karl.meerbergen@cs.kleuven.be).Thefirst author’s research was supported by a Ph.D.fellowship of the Research Foundation—Flanders(FWO).A1027A1028N.VANNIEUWENHOVEN,R.VANDEBRIL,AND K.MEERBERGEN Early approaches to this problem,as well as applications,originated in psycho-metrics.Tucker considered a decomposition of a third-order tensor[54,55,56],now known as the Tucker decomposition,into a set of three factor matrices and a third-order core tensor.Formally,we writea i,j,k=n1i =1n2j =1n3k =1s i ,j ,k x i,i y j,j z k,k(1.2)for a third-order tensor A=a i,j,k∈R n1×n2×n3.Here X=x i,i ∈R n1×n1, Y=y j,j ∈R n2×n2,and Z=z k,k ∈R n3×n3are the factor matrices,and S=s i ,j ,k ∈R n1×n2×n3is the core tensor.In[56],Tucker proposed algorithms for computing a Tucker decomposition(1.2).Of particular interest is the Tucker1 algorithm[56],which was later refined by De Lathauwer,De Moor,and Vande-walle[10].The refined algorithm is called the higher-order singular value decom-position(HOSVD)[10,56],and is particularly useful for constructing an approximate solution to problem(1.1).A rank-(r1,r2,r3)approximation can be obtained simply by restricting the factor matrices X,Y,and Z to thefirst r1,r2,and r3columns,respectively,and by restricting the core tensor to S =s i,j,k r1,r2,r3i,j,k=1.This truncatedHOSVD(T-HOSVD)has traditionally been the method of choice for obtaining a cheap approximate solution to problem(1.1).In this paper,we investigate a new technique for truncating the HOSVD,which is cheaper still to compute and often improves the approximation error with respect to the T-HOSVD.The problem we consider,in this paper,is different from the problem of deter-mining the best approximation of a given multilinear rank to A.This problem has been subjected to extensive research.In general,only locally optimal solutions[23] can be computed efficiently,and there is a wide variety of algorithms to accomplish this.Kroonenberg and de Leeuw were thefirst to propose such an(iterative)al-gorithm in[30].It is the popular alternating least squares(ALS)algorithm,called TUCKALS3in[30]but now more commonly known as the higher-order orthogonal iteration(HOOI)algorithm[2,3,11,30].HOOI alternates between the modes,com-puting the factor matrix in one mode byfixing the other factor matrices and solving a least squares problem.A new iteration is started if all factor matrices have been updated in this manner.This is repeated until convergence.Optimization-based algo-rithms to tackle problem(1.1)have been investigated only very recently and include Newton–Grassmann[15,26],quasi–Newton–Grassmann[51],and trust-region[24,25] methods on manifolds.The HOSVD has been applied in numerous application domains[29],such as image processing[9,32,39,61],pattern recognition[49,50,59,60,62],data mining [33,34,52,53],signal processing[12,19,36,37,38,45],psychometrics[54,55,56], chemometrics[5],and biomedicine[16,40,41].Aside from its use in applications,the HOSVD is also of considerable theoretical importance.For instance,the T-HOSVD is used to initialize iterative algorithms to compute the best rank-1approximation [7,28,63]or best approximation of a specified multilinear rank[11,15,24,26,51]. It is a building block for other tensor decompositions,such as the hierarchical Tucker decomposition[18,21],the Tucker tree decomposition[43,44],and the tensor-train decomposition[42].Another use is dimensionality reduction,in order to compute the canonical polyadic decomposition,or CANDECOMP/PARAFAC[29],more efficiently [6].In this paper,we present a new truncation strategy for the HOSVD.Our inves-tigation was spurred by an interesting remark made by Andersson and Bro in[2].HIGHER-ORDER SINGULAR VALUE DECOMPOSITION A1029 In proposing techniques to improve the computational efficiency of the HOOI algo-rithm,they briefly point out a different initialization scheme.Instead of initializing the factor matrices by the T-HOSVD,they essentially propose initializing it with thesequentially truncated HOSVD—the decomposition we study in this paper.However,the decomposition was not formalized,nor were its properties investigated in[2].The main contribution of this paper is a new truncation strategy for the HOSVDwhich always reduces the number offloating operations to compute the decomposition, and simultaneously improves the approximation error in many cases.Furthermore,the approximation error can be expressed in terms of the singular values that arecomputed as a byproduct of the approximation process.This allows for accurate numerical thresholding techniques withoutfirst computing the full HOSVD.The paper is structured as follows.In the next section,we state some basicdefinitions.Special emphasis is put in section3on multilinear orthogonal projections. In section4,we briefly present the HOSVD[10].Thereafter,in section5,we presentan expression for the error of any approximate orthogonal Tucker decomposition. Based on this error expression,a new truncation strategy is proposed in section6.The relationship between the error of the T-HOSVD and that of the ST-HOSVD isinvestigated in section7.In section8,we illustrate numerically that the ST-HOSVD often outperforms T-HOSVD in terms of approximation error and computation time.Finally,in section9,we summarize our main results.2.Preliminaries.In this section,some necessary preliminaries are discussed.First,some notational conventions are established.Tensors are typeset in an upper-case calligraphic font(A,S),matrices as uppercase letters(A,U),vectors as boldfacelowercase letters(u,v),and scalars as lowercase letters(a,b).I denotes the identity matrix of suitable dimensions.The scalar d denotes the order of the tensor.Thescalar k denotes an integer between1and d.A multilinear orthogonal projector thatprojects along mode k is denoted asπk;see also section3.Projectors,matrices,and tensors which are typeset with a bar(¯π1,¯U1,¯A)are related to the T-HOSVD,with a hat(ˆπ1,ˆU1,ˆA)they are related to the ST-HOSVD,and with a breve(˘π1,˘U1,˘A) they are related to any orthogonal Tucker approximation(including T-HOSVD andST-HOSVD).A permutation vector is denoted by square brackets,p=[1,2,3],anda set is denoted by curly brackets,p={1,2,3}.The tensor product is denoted by , and the Kronecker product by⊗.Tensor algebra.This paragraph is based on[8,13];for more details,the readeris referred to those references.A tensor is an element of the tensor product of a set of vector spaces.We are interested in tensor products of real vector spaces.That is,A∈R n1 R n2 ··· R n dis a tensor of order d over the real numbers.More practically,such an object can be represented as a d-array of numbers with respect to a given tensor basis.That is,A=a i1,i2,...,i d n1,n2,...,n di1,i2,...,i d=1∈R n1×n2×···×n d.In the above,A is specified with respect to the standard tensor basis of order d,E d={e i1 e i2··· e id}n1,n2,...,n di1,i2,...,i d=1,A1030N.VANNIEUWENHOVEN,R.VANDEBRIL,AND K.MEERBERGENwhere e i is the i th standard basis vector of suitable dimension.We may thus write A as a multilinear combination of these basis vectors,as follows:A=n1i1=1n2i2=1···n di d=1a i1,i2,...,i de i1e i2··· e id.(2.1)In this paper,we will identify the tensor with the d-array representing its coordinates with respect to a suitable basis.A tensor may be multiplied in each of its modes with a(different)matrix.LetA=a i1,i2,...,i d∈R n1×n2×···×n d and H(k)=h(k)p,q∈R m k×n k.Then,A may be transformed into a tensor B=b j1,j2,...,j d ∈R m1×m2×···×m d via themultilinear multiplication of A by H(k),k=1,...,d,b j1,j2,...,j d =n1i1=1n2i2=1···n di d=1a i1,i2,...,i dh(1)j1,i1h(2)j2,i2···h(d)jd,i d.We will write this more concisely[13]asB=(H(1),H(2),...,H(d))·A.Unfolding.Given a tensor A∈R n1×n2×···×n d,a mode-k vector v is defined as the vector that is obtained byfixing all indices of A and varying the mode-k index:v=A i1,...,i k−1,:,i k+1,...,i d with i j afixed value.We refer to the set of all mode-k vectorsof A as the mode-k vector space.The mode-k unfolding,or matricization[15],of A, denoted by A(k),is an n k×Πi=k n i matrix whose columns are all possible mode-k vectors.The specific order of the mode-k vectors within this unfolding is usually not important,as long as it is consistent.We assume the canonical order,as presented in[15].The column space of A(k)is the mode-k vector space;hence its name.Multilinear rank.The multilinear rank[22]of a tensor A∈R n1×···×n d is a d-tuple(r1,r2,...,r d),wherein r k is the dimension of the mode-k vector space.In other words,r k is the column rank of A(k).Inner product and norm.The Frobenius inner product of two tensors A,B∈R n1×···×n d is defined asA,B F:=n1i1=1···n di d=1a i1,...,i db i1,...,i d,and furthermore[15,Lemma2.1],for every1≤k≤d,A,B F=traceB T(k)A(k)=traceA T(k)B(k)= B,A F,(2.2)where trace(A):=ia ii is the trace of A.The induced Frobenius norm isA 2F:= A,A F= A(k) 2F,for any mode-k unfolding of A.The Frobenius norm for matrices is unitarily invariant. That is,if A∈R m×n and U∈R r×m,with r≥m,is a matrix with orthonormal columns,and V∈R s×n,with s≥n,is a matrix with orthonormal columns,then UAV T 2F= A 2F;see,e.g.,[58].HIGHER-ORDER SINGULAR VALUE DECOMPOSITION A1031 Multilinear multiplication.In this paragraph,we assume that the dimensions of the matrices involved in the multilinear multiplications are compatible.Multilinear multiplication in one mode,say1≤k≤d,with a matrix M can be interpreted as multiplying the mode-k vectors by M.That is,if M is at position k in the tuple, then[(I,...,I,M,I,...,I)·A](k)=M A(k).In general,the unfolding of a multilinear multiplication is given by(see[10,15])[(M1,M2,...,M d)·A](k)=M k A(k)(M1⊗M2⊗···⊗M k−1⊗M k+1⊗···⊗M d)T. Two multilinear multiplications can be transformed into one,as follows[13]: (L1,L2,...,L d)·[(M1,M2,...,M d)·A]=(L1M1,L2M2,...,L d M d)·A.3.Multilinear orthogonal projections.Key to this paper is the use of mul-tilinear orthogonal projections.An orthogonal projector[46,47]is a linear transfor-mation P that projects a vector u∈R n onto a vector space U⊆R n such that the residual u−P u is orthogonal to U.Such a projector can always be represented in matrix form as P=UU T,assuming that the columns of U form an orthonormal basis for the vector space U.De Silva and Lim[13]state that ifφk is an orthogonal projector from the vector space V k⊆R n k onto U k⊆V k,thenΦ=(φ1,φ2,...,φd)is a multilinear orthogonal projection from the tensor space V:=V1 V2 ··· V d onto the tensor subspace U1 U2 ··· U d⊆V.In this paper,we deal with an orthog-onal projector from R n1 ··· R n k−1 R n k R n k+1 ··· R n d onto the subspace R n1 ··· R n k−1 U k R n k+1 ··· R n d exclusively.This multilinear orthogonal projection is given byπk A:=(I,...,Ik−1matrices ,U k U T k,I,...,I)·A with A∈R n1×···×n d,(3.1)where we assume that the columns of U k form an orthonormal basis of the vector space U k.The subscript of the projectorπk indicates that it projects orthogonally along mode k.The projector satisfies the following properties.Every projectorπk is idempotent,πkπk A=πk A,and any two projectors commute,πiπj A=πjπi A.The orthogonal complement ofπk can be characterized explicitly by(1−πk)A=(I,...,I,I−U k U T k,I,...,I)·A,due to the multilinearity of the multilinear multiplication[13,eq.(2.9)].Finally,the projector is orthogonal with respect to the Frobenius norm[13,eq.(2.20)],A 2F= πk A 2F+ (1−πk)A 2F.(3.2)The above projector is used extensively in this paper.4.T-HOSVD.De Lathauwer,De Moor,and Vandewalle proved the following theorem in[10,section3].Theorem 4.1(HOSVD[10]).Every tensor A∈R n1×···×n d admits a higher-order singular value decomposition:A=(U1,U2,...,U d)·S,(4.1)A1032N.VANNIEUWENHOVEN,R.VANDEBRIL,ANDK.MEERBERGEN where the factor matrix U k is an orthogonal n k ×n k matrix,obtained from the SVD of the mode-k unfolding of A ,A (k )=U k Σk V T k ,(4.2)and the core tensor S ∈R n 1×···×n d can be obtained from S =(U T 1,U T 2,...,U T d )·A .The HOSVD can be employed to construct a low multilinear rank approximation to a tensor.Suppose that we want to approximate A by a rank-(r 1,r 2,...,r d )tensor ¯A ,with r k ≤n k for all 1≤k ≤d .The factor matrix ¯U k of the truncated HOSVD is obtained from a truncated SVD of the mode-k unfolding of the tensor [10,29],A (k )=U k Σk V T k = ¯U k ˜U k ¯Σk ˜Σk ¯V T k ˜V T k ,with ¯U k ∈R n k ×r k .(4.3)The approximation is then obtained by an orthogonal projection onto the tensor basis,represented by these factor matrices.That is,¯A :=¯π1¯π2···¯πd A :=(¯U 1¯U T 1,¯U 2¯U T 2,...,¯U d ¯U T d )·A =:(¯U 1,¯U 2,...,¯U d )·¯S ≈A ,wherein the truncated core tensor is defined as (¯U T 1,¯U T 2,...,¯U T d )·A =:¯S ∈R r 1×r 2×···×r d .In the remainder,we will always denote the T-HOSVD projector onto mode k by ¯πk A :=(I,...,I,¯U k ¯U T k ,I,...,I )·A .5.Error of a truncated orthogonal Tucker decomposition.In this section,we present an explicit formula for the error of an approximate Tucker decomposition with orthogonal factor matrices.This formula provides insights into the structure ofthe optimization problem.In the next section,we propose a new greedy optimization algorithm,based on the error expansion presented in the next theorem.It is well known (see,e.g.,[29])that problem (1.1)may be rewritten as min ˘S∈R r 1×···×r d ˘U i ∈R n i ×r i A −(˘U 1,˘U 2,...,˘U d )·˘S F ,with ˘U i ,i =1,2,...,d ,a matrix with orthonormal columns.Furthermore,if the fac-tor matrices {˘U k }k are fixed,then the core tensor ˘S that minimizes the approximation error is given by ˘S =(˘U T 1,˘U T 2,...,˘U T d )·A ,as proved in [11].Therefore,given fixed factor matrices,the optimal approximation to A is obtained by a multilinear orthog-onal projection onto the tensor basis spanned by the columns of the factor matrices .The error of this optimal approximation is investigated in the next theorem.Theorem 5.1(error of a truncated orthogonal Tucker decomposition ).Let A ∈R n 1×···×n d .Let A be approximated by ˘A :=˘π1˘π2···˘πd A :=(˘U 1˘U T 1,˘U 2˘U T 2,...,˘U d ˘U T d )·A ≈A ,in which the factor matrices ˘U k ∈R n k ×r k have orthonormal columns.The approxi-mation error is then given,for any permutation p of {1,2,...,d },by A −˘A 2F = ˘π⊥p 1A 2F + ˘π⊥p 2˘πp 1A 2F + ˘π⊥p 3˘πp 1˘πp 2A 2F +···+ ˘π⊥p d ˘πp 1···˘πp d −1A 2F ,HIGHER-ORDER SINGULAR VALUE DECOMPOSITIONA1033312(a)(b)(c)Fig.5.1.Theorem 5.1states that the squared approximationerror of the truncated orthogonal Tucker decomposition is given by the sum of the squared norms of the areas shaded in grey in the three tensor representations.This particular shading of the areas corresponds to the permutation[3,2,1].and it is bounded by A −˘A 2F ≤d k =1 ˘π⊥p k A 2F ,(5.1)wherein ˘π⊥k :=1−˘πk .Proof .Assume w.l.o.g.that p =[1,2,...,d ].We introduce a telescoping sum:A −˘A =(A −˘π1A )+(˘π1A −˘π2˘π1A )+···+(˘πd −1···˘π1A −˘πd ···˘π1A )=˘π⊥1A +˘π⊥2˘π1A +˘π⊥3˘π1˘π2A +···+˘π⊥d ˘π1···˘πd −1A .(5.2)Consider any two distinct terms in the above.Let,for i <j ,˘B (i ):=(I −˘U i ˘U T i )A (i )(˘U 1˘U T 1⊗···⊗˘U i −1˘U T i −1⊗I ⊗···⊗I ),˘C (i ):=˘U i ˘U T i A (i )(˘U 1˘U T 1⊗···⊗˘U i −1˘U T i −1⊗˘U i +1˘U T i +1⊗···⊗˘U j −1˘U T j −1⊗I −˘U j ˘U T j ⊗I ⊗···⊗I );then we notice that ˘C T (i )˘B (i )=0,because ˘U i ˘U T i is a projector.From (2.2),trace ˘C T (i )˘B (i ) = ˘π⊥i ˘π1···˘πi −1A ,˘π⊥j ˘π1···˘πi ···˘πj −1A F =0.This entails that ˘π⊥i ˘π1···˘πi −1A ⊥˘π⊥j ˘π1···˘πi ···˘πj −1A ;i.e.,these projected tensors are orthogonal with respect to the Frobenius norm.As the above holds for any i <jand orthogonality is reflexive,all the terms in (5.2)are orthogonal with respect to one another,in the Frobenius norm.That completes the first part of the proof.The second part follows readily from the observation that an orthogonal projection onto a subspace can only decrease the Frobenius norm,due to (3.2).In Figure 5.1we visualize the above theorem for a third-order tensor and for the permutation p =[3,2,1].The cube is partitioned into octants.The shaded area in Figure 5.1(a)corresponds to π⊥3A ,in Figure 5.1(b)it corresponds to π⊥2π3A ,and in Figure 5.1(c)to π⊥1π3π2A .Other permutations result in different octants to be summed,but the resulting error is clearly the same.The following error bounds of the T-HOSVD are an immediate corollary of The-orem 5.1.The upper bound was already stated in [10,Property 10].Here,a new elegant proof based on the previous theorem is presented.11The proof in [10]unfortunately contains an error,as its second equality does not hold.A1034N.VANNIEUWENHOVEN,R.VANDEBRIL,ANDK.MEERBERGEN Corollary 5.2(T-HOSVD error bounds ).Let A ∈R n 1×···×n d ,and let ¯A be the rank-(r 1,...,r d )T-HOSVD of A .Let the SVD of A (k )be as in (4.3).The error of the T-HOSVD approximation ¯A to A is then bounded by max k ˜Σk 2F ≤ A −¯A 2F ≤d k =1 ˜Σk 2F .(5.3)Proof .The lower bound is proved by noticing that Theorem 5.1holds for any permutation p of {1,2,...d },the independence of the error on the permutation p ,and the positivity of the terms.The upper bound follows readily from (5.1)and the definition of the T-HOSVD factor matrices in (4.3).The T-HOSVD can be interpreted as an algorithm that minimizes the upper bound in Theorem 5.1,providing a strong rationale for the T-HOSVD approximation.Indeed,minimizing the upper bound yields min π1,...,πd A −π1···πd A 2F ≤min π1,...,πd d k =1 π⊥k A 2F =d k =1min πk π⊥k A 2F ,(5.4)where the last equality follows from noticing that every term is minimized over a different projector,and that the T-HOSVD projectors are determined independently from one another.The solution to this minimization problem is given by choosing πk to project onto the dominant subspace of the mode-k vector space,as in the T-HOSVD.Coincidentally,this also minimizes the lower bound in Corollary 5.2.6.Sequentially truncated HOSVD.In this section,we propose an alterna-tive truncation strategy for the HOSVD.Contrary tothe T-HOSVD,the order in which the modes are processed is relevant and leads to different approximations.The order in which modes are processed is denoted by a sequence p .Throughout this section,we present our results only for the processing order p =[1,2,...,d ],as this significantly simplifies the notation.It should be stressed,however,that many of the results depend on the permutation p .For instance,the approximation error of our algorithm depends on the order in which the modes are processed.6.1.Definition.Optimization problem (1.1)can be expressed asmin π1,...,πd A −π1π2···πd A 2F =min π1,...,πd π⊥1A 2F + π⊥2π1A 2F +···+ π⊥d π1···πd −1A 2F =min π1 π⊥1A 2F +min π2 π⊥2π1A 2F +min π3 ···+min πd π⊥d π1···πd −1A 2F ,by applying Theorem 5.1.Consider the minimization over π1.It is not unreasonable to assume that the final error depends more strongly on the term π⊥1A 2F than on theother terms.The subsequent terms will be minimized over other projectors,thereby diminishing the importance of a single projector.Therefore,ˆπ1:=arg min π1 π⊥1A 2F might be a good approximation to the optimal projector.By repeating the above argument for the projector π2,and then π3,and so on,we arrive at min π1,...,πd A −π1π2···πd A 2F ≤ ˆπ⊥1A 2F + ˆπ⊥2ˆπ1A 2F +···+ ˆπ⊥d ˆπ1···ˆπd −1A 2F = A −ˆπ1···ˆπd A 2F .(6.1)HIGHER-ORDER SINGULAR VALUE DECOMPOSITIONA1035Herein,the hat-projectors are defined recursively by ˆπk :=arg min πk π⊥k ˆπ1···ˆπk −1A 2F =arg max U k ∈R n k ×r k U k U T k [ˆπ1···ˆπk −1A ](k ) 2F .(6.2)Every processing order p of the modes yields a different optimization problem,andsolution,of the above form.The truncation strategy that we propose consists of computing the solution to optimization problem (6.2).The solution can be obtained,for some 1≤k ≤d ,from a truncated SVD of the mode-k unfolding of ˆπ1···ˆπk −1A .It is a tensor of the same size as A ,namely n 1×n 2×···×n d .However,ˆU k =arg max U k ∈R n k ×r k U k U T k A (k )(ˆU 1ˆU T 1⊗ˆU 2ˆU T 2⊗···⊗ˆU k −1ˆU T k −1⊗I ⊗···⊗I )T 2F =arg max U k ∈R n k ×r k U k U T k A (k )(ˆU 1⊗ˆU 2⊗···⊗ˆU k −1⊗I ⊗···⊗I ) 2F .Consequently,it is possible to compute the optimal projector by means of a truncated SVD of (ˆU T 1,...,ˆU T k −1,I,...,I )·A ,which is a tensor of size r 1×r 2×···×r k −1×n k ×···×n d .Whenever the tensor is strongly truncated,r i is much smaller than n i ,thereby significantly improving the computational performance.The above is summarized in Algorithm 1,which computes the solution of optimization problem (6.2).2Algorithm puting the sequentially truncated HOSVD.input :Tensor A ,truncation rank (r 1,r 2,...,r d ),and processing order p .output :Truncated core tensor ˆS and factor matrices {ˆU k }k .ˆS ←A for k ←p 1,p 2,...,p d do %Compute the compact singular value decomposition of ˆS (k )ˆS (k )= U 1U 2 Σ1Σ2 V T 1V T 2 ,with U 1∈R n k ×r k ˆU k ←U 1ˆS (k )←Σ1V T 1end Note that,in Algorithm 1,we compute the compact 3SVD,not the full SVD.The compact SVD is then truncated to the appropriate rank.Clearly,it is possible to replace this with an iterative algorithm that computes the desired singular triplets.In practice,the latter approach can be more efficient if the multilinear rank with which to approximate the tensor is very small.Definition 6.1(ST-HOSVD ).A rank-(r 1,...,r d )sequentially truncated higher-order singular value decomposition (ST-HOSVD)of a tensor A ∈R n 1×···×n d ,corre-sponding to the processing order p =[1,2,...,d ],is an approximation of the formA ≈(ˆU 1,ˆU 2,...,ˆU d )·ˆS =:ˆA p ∈R n 1×n 2×···×n d ,whose truncated core tensor is defined as (ˆU T 1,ˆU T 2,...,ˆU T d )·A =:ˆS ∈R r 1×r 2×···×r d ,2A MATLAB implementation using the Tensor Toolbox [4]can be found in [58].3The compact SVD of A ∈R n ×m is a decomposition such that A =USV T and U ∈R n ×r ,S ∈R r ×r ,and V ∈R m ×r ,with r the rank of A .A1036N.VANNIEUWENHOVEN,R.VANDEBRIL,AND K.MEERBERGEN312ˆS3Fig.6.1.A graphical depiction of the sequential truncation of a third-order tensor,correspond-ing to the processing order[3,2,1]of the modes.and where every factor matrixˆU k∈R n k×r k has orthonormal columns.In terms of orthogonal multilinear projectors,one writesˆA p:=ˆπ1ˆπ2···ˆπdA:=(ˆU1ˆU T1,ˆU2ˆU T2,...,ˆU dˆU T d)·A.The k th partially truncated core tensor is defined as(ˆU T1,ˆU T2,...,ˆU T k,I,...,I)·A=:ˆS k∈R r1×···×r k×n k+1×···×n d,(6.3)withˆS0:=A andˆS d=ˆS.The rank-(r1,...,r k,n k+1,...,n d)partial approximation to A is defined as(ˆU1,ˆU2,...,ˆU k,I,...,I)·ˆS k=:ˆA k∈R n1×n2×···×n d,withˆA0:=A andˆA d=ˆA.The factor matrixˆU k,1≤k≤d,is the matrix of the r k dominant left singular vectors of the mode-k vector space ofˆS k−1.It is obtained from the rank r k trun-cated singular value decomposition of the(k−1)th partially truncated core tensor,as follows:ˆS k−1 (k)=U kΣk V T k=ˆUk˜UkˆΣk˜ΣkˆV Tk˜V Tk,(6.4)whereinˆΣk=diag(ˆσk,1,ˆσk,2,...,ˆσk,rk )and˜Σk=diag(˜σk,rk+1,˜σk,r k+2,...,˜σk,n k).The ST-HOSVD computes a sequence of approximations,ˆA0,ˆA1,...,ˆA d,such that the multilinear rank ofˆA k equals,in thefirst k modes,the desired dimension of the corresponding vector space.We term our approach“sequential”in that the mode-k projector depends on the previously computed projectors.In the remainder,we denote the ST-HOSVD projector onto mode k byˆπk A:=(I,...,I,ˆU kˆU T k,I,...,I)·A.In Figure6.1,we illustrate the operation of Algorithm1.It represents the trun-cation of a third-order tensor A=ˆS0to the ST-HOSVD core tensorˆS=ˆS3,whereby the modes are processed in the order p=[3,2,1].First,the truncated SVD of the mode-3vector space is computed.By projecting onto the span of the matrix of left singular vectorsˆU3,the“energy”in the tensors is reordered.The Frobenius norm of the area shaded in gray is ˜Σ3 2F,whereas the white area has a Frobenius norm equal to ˆΣ3 2F.By projecting onto the dominant subspace of the mode-3vector space, we retain only the nonshaded area ofˆS0,resulting in the approximationˆS1.In the next step,mode2is processed.The SVD is computed,and by projecting onto the space spanned by the left singular vectors,the energy is reordered.To obtain the next approximation,the shaded area ofˆS1is set to zero.The procedure proceeds analogously in the last step.In the end,the ST-HOSVD core tensor is obtained.。
With y y x x R J R J 1,1== and ()y x y x x yk k k J k J Q −+=22, the coefficients i a in the equation (9’) are:()()()()y x y x x y y x y x y x y x y x k k k J k J k k J J J J J J Q k k g a +−−+−++−=2222209991418()()()()(()()()()()())y y x x y x y x x y y x y x y x y x x y y x y x y x y x x y y x y x x y y x y x x y y x y y x x y x k J k J k k k J k J k k J J k k k J k J J J k k J J k J k J J J k J k J J J k J k J J J k J k J J J Q g a ++++++−+−+−+−−+++++=593576286332323223322222/31()()()(()()()())y x y x x y y x y x x y y x y x y x y x y x x y y x y x y x k k k k k J k J k J k J k k J J J J k k k J k J J J k k J J Q g a +++++−−−+−+++=4422222222222330141813()()()()()x y y x y x y x y x x y y x k J k J J J k k k J k J J J Q g a 332/13233+++−+=14=a1. Lord Rayleigh, “Investigations of the character of the equilibrium of an incompressible heavy fluid of variable density” Proc.Lond.Math.Soc. 14, 170 (1883)2. R.M.Davies, G.I.Taylor, “The mechanics of large bubbles rising through extended liquids and through liquids in tubes” Proc.Roy.Soc. A200, 375 (1950)3. D.H.Sharp, “An overview of Rayleigh-Taylor instability” Physica D 12, 3 19844. S.Chandrasekhar, Hydrodynamic and hydromagnetic stability, 3d ed. (Dover Publications Inc., New York, 1981), pp.428-477; K.O.Mikaelian, “Effect of viscosity on Rayleigh-Taylor and Richtmeyer-Meshkov instabilities” Phys.Rev.E 47, 375 (1993)5. K.I.Read, “Experimental investigation of turbulent mixing by Rayleigh-Taylor instability” Physica D 12, 45 (1984); Yu.A.Kucherenko at al, “Experimental study into the asymptotic stage of separation of the turbulized mixtures in gravitationally stable mode” Laser Part. Beams 15, 17 (1997); Izv. Akad. Nauk SSSR, Mekh. Zhidk. Gaza 6, 157 (1978)6. G.Dimonte M.Schneider, “Density ratio dependence of Rayleigh-Taylor mixing for sustained and impulsive acceleration histories” Phys.Fluids 12, 304 (2000);M.Schneider, G.Dimonte, B.Remington, “Large and small scale structures in Rayleigh-Taylor mixing” Phys.Rev.Lett. 80, 3507 (1998)7. F.H.Harlow, J.E.Welch, “Numerical study of large-amplitude free-surface motion” Phys.Fluids 9,842 (1966); D.L.Youngs, “Numerical simulations of turbulent mixing by Rayleigh-Taylor instability” Physica D 12, 32, (1984); R.Menikoff, C.Zemach,“Rayleigh-Taylor instability and the use of conformal maps for ideal fluid flow”p.Phys. 51, 28 (1983); G.R.Baker, D.I.Meiron, S.A.Orszag, “Vortexsimulations of the Rayleigh-Taylor instability” Phys.Fluids 23, 1485 (1980); G.Gardner, J.Glimm, O.McBryan, R.Menikoff, D.H.Sharp, Q.Zhang, “The dynamics of bubble growth for Rayleigh-Taylor instability” Phys.Fluids 31, 447 (1988)8. X.L.Li, “Study of three-dimensional Rayleigh-Taylor instability” Phys.FluidsA5, 1904 (1993); X.L.Li, “A numerical study of three-dimensional bubble merger in the Rayleigh-Taylor instability” Phys.Fluids 8, 336 (1996)9. D.L.Youngs, “Three-dimensional numerical simulations of turbulent mixing by Rayleigh-Taylor instability” Phys.Fluids A 3, 1312 (1991)10. U.Alon, J.Hecht, D.Offer, D.Sharts, “Power-laws and similarity of Rayleigh-Taylor and Richtmyer-Meshkov mixing fronts at all density ratios” Phys.Rev.Lett.74, 534 (1995)11. S.I.Abarzhi, “Stable steady flows in Rayleigh-Taylor instability”Phys.Rev.Lett.89, 1332 (1998); “Non-linear three-dimensional Rayleigh-Taylor instability” Phys.Rev.E, 59, 1729 (1999)12. A.V.Shubnikov and V.A.Koptsik, Symmetry in science and art, Plenum New York, 197413. yzer, “On the instability of superposed fluids in a gravitational field”Astrophys.Jour. 122, 1 (1955); P.R.Garabedian, “On steady-state bubbles generated by Taylor instability” Proc.Roy.Soc. A241, 423 (1957)14. S.I.Abarzhi, “The stationary spatially periodic flows in Rayleigh-Taylor instability: solutions multitude and its dimension” Physica Scripta T66, 238 (1996);“Stationary solutions of the Rayleigh-Taylor instability for spatially periodic flows:questions of uniqueness, dimensionality and universality” Zh.Eksp.Teor.Fiz. 110,1841 (1996) (Sov.Phys.JETP 83, 1012)15. S.I.Abarzhi and N.A.Inogamov, “Stationary solutions in the Rayleigh-Taylor instability for spatially periodic flow” Zh.Eksp.Teor.Fiz. 107,245 (1995) (Sov.Phys.JETP 80,132 (1995)); N.A.Inogamov, “Stationary solutions in the theory of the hydrodynamic Rayleigh-Taylor instability” Pis’ma Zh.Eksp.Teor.Fiz. 55, 505 (1992) (Sov.Phys.JETP Lett. 55, 521 (1992))16. A.Oparin, “Numerical simulations of Rayleigh-Taylor instability using quasi-monotonic grid-characteristic approach” private communication, 1998; M.F.Ivanov,A.M.Oparin, V.G.Sultanov, V.E.Fortov, “Certain features of development of the Rayleigh-Taylor instability in 3D geometry” Doklady Physics 44, 491 (1999) (Doklady Akademii Nauk 367, 464 (1999))17. R.J.Taylor, J.P.Dahlburg, A.Iwase, J.J.Gardner, D.E.Fyfe O.Willi,“Measurement and Simulation of Laser Imprinting and Consequent Rayleigh-Taylor growth” Phys.Rev.Lett.76, 1643 (1996); M.M.Marinak, S.G.Gledinning, R.J.Wallace, B.A.Remington, et al, “Non-linear evolution of a three-dimensional multimode perturbation” Phy.Rev.Lett. 80, 4426 (1998);18. Dalziel SB, Linden PF, Youngs DL, “Self-similarity and internal structure of turbulence induced by Rayleigh-Taylor instability” Jour.Fluid Mechanics, 399, 1 (1999)19. V.E.Zakharov, “Stability of periodic waves of finite amplitude on deep fluid surface” Prikl.Mat.Teor.Fiz 2, 86 (1968) (Sov.Math. Appl. Math. and Theor. Phys. 2, 86 (1968)); E.Fermi and J.von Newman, “Taylor instability of an incompressible liquid”(1951) in book: E.Fermi, Collected papers, The University of Chicago Press, Chicago 1962, v.2, 81620. J.W.Jacobs and I.Catton, “Three-dimensional Rayleigh-Taylor instability. Part I. Weakly nonlinear theory” Jour. Fluid Mech. 187, 329 (1988)21. X.Y.He, R.Y.Zhang, S.Y.Chen, G.D.Doolen, “On three-dimensional Rayleigh-Taylor instability” Phys.Fluids 11, 1143 (1999)22. J.Hecht, U.Alon, D.Shvarts, "Potential flow models of Rayleigh-Taylor and Richtmeyer-Meshkov bubble fronts" Phys.Fluids 6, 4019 (1994)23. V.S.Arpachi, A.Asmaeeli, “On interface dynamics” Phys.Fluids 12, 1244 (2000)Figure captionsFIG.1. Bubbles generated by the Rayleigh-Taylor instability. Flow symmetry group is p2mm , λx,y are spatial periods in the plane (x, y ), rotation axis 2 and mirror planes m are perpendicular to the plane of translations. Black circle marks the positions of bubble top.FIG.2.(a). Plane of parameters R x,y of steady solutions family, finite values of aspect ratio x y k k . Physical region is bounded by the flattened bubbles (“solitary spikes”) ∞→y x R ,and by the edge solutions cr y x R ,. Dashing line relates to bubbles with circular contour ()()2x y y x k k R R=. Bubbles are y -elongated at ()()2x y y x k R R < and x -elongated at ()()2x y y x k k R R >.(b). Low-symmetric steady solutions under the dimensional crossover 0→x y k k : the solutions with ()2x y y x k k R R ≥ are restricted to region of flattened bubbles (“solitary spikes”), while solutions with ()2x y y x k k R R < become solutions of the 2D flow.FIG.3. The real parts of the Lyapunov exponents for low-symmetric steady solutions.(a). Aspect ratio 2/1=x y k k and the radius of curvature ∞≡y R .(b). Limiting case of “square flow” p4mm , 1=x y k k and x y R R =.FIG.4. Stability region for the low-symmetric bubbles in first approximation (dashing area). Finite values of aspect ratio x y k k , dashing line relates to the bubbles with circular contour, ()()2x y y x k k R R =.FIG.5. Stability of steady solutions under dimensional crossover 0→x y k k .Dashing lines are real parts of the Lyapunov exponents of the 2D flow, unstable mode D D w 23− is plotted at 1.0=x y k k and 001.0=δ. cr c R k R ≈=2. Black points are the Hopf bifurcations, which bound the region of stability of the 2D flow at N =2.FIG.6. Secondary instabilities of the low-symmetric bubbles under spatial modulations: merging and splitting.。
Numerical Approximation of a quasi-Newtonian Stokes Flow Problem with Defective Boundary ConditionsVincent J.Ervin∗Hyesuk Lee∗Abstract.In this article we study the numerical approximation of a quasi-Newtonian Stokesflow problem where only theflow rates are specified at the inflow and outflow boundaries.A variational formulation of the problem,using Lagrange multipliers to enforce the statedflow rates,is given. Existence and uniqueness of the solution to the continuous,and discrete,variational formulations is shown.An error analysis for the numerical approximation is also given.Numerical computations are included which demonstrate the approximation scheme studied.Key words.quasi-Newtonian Stokesflow;Defective boundary condition;Power lawfluidAMS Mathematics subject classifications.65N301IntroductionIn this paper we investigate the numerical approximation of a quasi-Newtonian Stokesflow problem with defective boundary conditions.For well-posedness of a Newtonianfluidflow problem suitable boundary conditions are required to uniquely define the solution.Perhaps the simplest of these is to specify the velocity at each point on the boundary of the domain.Often what is assumed is that the flow is fully developed at the inflow and outflow boundaries,which justifies a parabolicflow profile at these boundaries.Typically a no slip(i.e.velocity=0)is assumed along the other portions of the boundary of the domain.However,in many physical problems the assumption of fully developed flow at the inflow and outflow is either unreasonable or highly ually what is know in physicalfluidflow problems are the various inflow and outflowflow rates.In[6]Formaggia,Gerbeau,Nobile,and Quarteroni discuss the defective boundary condition problem for the time dependent Navier-Stokes equation.They introduce a Lagrange multiplier approach to enforceflow constraints at the inflow and outflow portions of the boundary.For the steady-state Stokes problem,they show existence and uniqueness of the solution forflow rates imposed using the Lagrange multiplier formulation.Herein we extend this work to analyze a quasi-Newtonian Stokesflow problem subject to specified inflow and outflowflow rates.We establish existence and uniqueness of solution for the continuous and discrete variational problems,and present an error analysis for the numerical approximation.∗Department of Mathematical Sciences,Clemson University,Clemson,SC,29634-0975,USA.email: vjervin@ hklee@.Partially supported by the NSF under grant no.DMS-0410792.1Initially it is,perhaps,somewhat perplexing to note that for the uniqueness of solution to the variational problem for:(i)the Dirichlet problem we require´d(the dimension of the space)conditions be specified at each point on the boundary,whereas(ii)the defective boundary condition problem only requires a single scalar be specified at inflow and outflow boundaries(and´d conditions at other boundary points).This seeming anomaly is explained in Lemma2.1(see also[6]Proposition2.1, [11]pg.341).Specifically,the variational formulation for the defective boundary condition problem implicity imposes that across each of the inflow and outflow boundaries the total stress normal to the boundaries is a constant,and the extra stress lying in the surface of the inflow and outflow boundaries is zero.In[11]Heywood,Rannacher,and Turek also investigated the defective boundary condition problem for the time dependent Navier-Stokes equations.They considered both the case of specifiedflow rates at the inflow and outflow boundaries,and also the case of the mean specified pressure at the inflow and outflow boundaries.For the specifiedflow rate problem,the formulation they considered (and proved existence of a steady state solution)involved the construction of suitableflux-carrier vector functions.The numerical approximation of the quasi-Newtonian Stokesflow problem,with homogeneous boundary conditions has been previously studied in several papers[2,5,7,13,16].This paper is organized as follows.In Sections2.1and2.2we describe the model problem,state our assumptions on the model,and introduce appropriate mathematical notation.We show in Section 2.3that the corresponding variational formulation,in which theflow rate boundary conditions are weakly imposed using Lagrange multipliers,is well posed.A numerical approximation scheme is presented in Section3,and its solution shown to exist.A priori error estimates for the numerical approximation are derived in Section4.Numerical results are presented in Section5.2Mathematical ModelMotivated by physical considerations we consider the numerical approximation of a threefield, quasi-Newtonian Stokesflow problem withfixedflow-rate boundary conditions.2.1Problem SpecificationLetΩdenote a bounded domain in I R´d,´d=2or3,whose boundary∂Ωis decomposed into the union ofΓand several disjoint sections S1,S2,...,S m,m≥2.We are interested in the numerical approximation ofσ=g(u),inΩ,(2.1)−∇·σ+∇p=f,inΩ,(2.2)∇·u=0,inΩ,(2.3)u=0,onΓ,(2.4)2GammaOmegamS 3S 2S 1S Figure 2.1:Illustration of flow domain.subject to the specified flow rates across the surfaces S iS iu ·n ds =Q i ,for i =1,...,m.(2.5)We use n to denote the outward (from Ω)normal to the surface.Because of the incompressibility condition (2.3)it follows thatm i =1Q i =0.(2.6)Note that equations (2.1)-(2.5),can only determine the pressure,p ,up to an arbitrary constant.Below we fix this constant by requiring p to have mean value 0over Ω.The general form of the (algebraic)constitutive equation we assume in our analysis (see A1,A2,A3,in Section 2)is motivated by the study of fluids having a power law constitutive equation,i.e.σ=ν0|d (u )|r −2d (u ),ν0>0,1<r <2,(2.7)where σdenotes the extra stress tensor,u the fluid velocity,and d (u ):=(∇u +∇u T )/2the rate of deformation tensor.The power law model has been used to model the viscosity of many polymeric solutions and melts over a considerable range of shear rates [10].Other constitutive equations having a similar form to the power law model include [3,13,14]:Ladyzhenskaya Law [12]:σ=ν0+ν1|∇u |r −2d (u ),ν0≥0,ν1>0,r >1,(2.8)used in modeling fluids with large stresses,Carreau Law :σ=ν0 1+|d (u )|2 (r −2)/2d (u ),ν0>0,r ≥1,(2.9)3used in modeling visco-plasticflows and creepingflow of metals.2.2Notation/AssumptionsWe made the following assumptions regarding the constitutive equation(2.1)for the stressσ.A1:g(u)is(formally)uniquely invertible to obtaind(u)=˘g(σ)σ,(or∇u=ˇg(σ)σ)and the inverse is continuous.For G(σ):=˘g(σ)σ,A2:(G(s)−G(t)):(s−t)≥c|s−t|r ,∀s,t∈I R´d×´d,(2.10) A3:|G(s)−G(t)|≤M(|s|+|t|)r −2|s−t|,∀s,t∈I R´d×´d.(2.11)For r∈I R,r>1,we denote its unitary conjugate by r ,satisfying r−1+r −1=1.For problems of physical interest,1<r≤2,e.g.,shear thinningfluids.We therefore assume that 1<r≤2,and consequently,2≤r <∞.Properties A2and A3imply that G(·)is strongly monotone,and Lipschitz continuous for bounded arguments[4].Used in the analysis below are the following function spaces and norms.T:=L r (Ω)´d×´dsym=τ=(τij);τij=τji;τij∈L r (Ω);i,j=1,...,´d,with norm τ T:=Ω|τ|r dΩ1/r.X:=v∈W1,r(Ω)´d:v|Γ=0,with W k,p(Ω)denoting the usual Sobolev space notation.We take for the norm on X, v X:=Ω|d(v)|r dΩ1/r,which is equivalent to the usual · W1,r norm by the Poincar´e-Friedrichs lemma.P:=L r 0(Ω)=q∈L r (Ω):Ωq dΩ=0,with norm q P:=Ω|q|r dΩ1/r.We use V X to denote the subspace of X defined byV X:={v∈X:Ωq∇·v dΩ+mi=1βiS iv·n ds=0,∀(q,β)∈P×I R m},and letV T:={τ∈T:Ωτ:d(v)dΩ=0,∀v∈V X}.For a Banach space Y,Y denotes its dual space with associated norm · Y .Forσ,τtensors and u,v vectors,we use ·,· to denote the scalar quantities σ,τ :=σ:τ,and u,v :=u·v.42.3Lagrange Multiplier ApproachWe consider the following variational formulation to(2.1)-(2.5):Given f∈X ,Q∈I R m,determine (σ,u,p,λ)∈T×X×P×I R m,such thata(σ,τ)−b(τ,u)=0,∀τ∈T,(2.12) b(σ,v)−s(v,(p,λ))=<f,v>,∀v∈X,(2.13)s(u,(q,β))=mi=1Q iβi,∀(q,β)∈P×I R m,(2.14)wherea(σ,τ):=Ω˘g(σ)σ:τdΩ,(2.15)b(τ,u):=Ωτ:d(u)dΩ,(2.16)s(v,(p,λ)):=Ωp∇·v dΩ+mi=1λiS iv·n ds.(2.17)The Lagrange multiplierλ∈I R m is introduced to include theflow constraints(2.5)in the variational formulation,see[1],[6],[9].Equivalence of the Differential Equations and Variational FormulationsThe variational formulation is obtained by multiplying the differential equations by sufficiently smooth functions,integrating over the domain and,where appropriate,applying Green’s theorem. The constraint equations are imposed weakly using Lagrange multipliers.For a smooth solution the steps used in deriving the variational equations can be reversed to show that equations(2.1)-(2.5) are satisfied.In addition we have that a smooth solution of(2.12)-(2.14)satisfies the following additional boundary conditions(see[6]).For n the outward normal on S i,express the extra stress vector on S i,σ·n,asσ·n=s n n+s T,where s n=(σ·n)·n and s T=σ·n−s n n.The scalar s n represents the magnitude of the extra stress in the outward normal direction to S i,and s T the component of the extra stress vector which lies in the plane of S i.Lemma2.1Any smooth solution of(2.12)-(2.14)satisfy the additional boundary conditions−p+s n|Si =λi and s T|Si=0,i=1,...,m.(2.18)Proof:The proof follows as in[6].Remark:The equations(2.1)-(2.5)do not uniquely define a solution,but rather a set of solutions. The variational formulation(2.12)–(2.14)chooses a solution from the solution set.Specifically,5(2.12)–(2.14)chooses the solution which satisfies(2.18).A different variational formulation may result in a different selection for the solution from the solution set.(See,for example,[6].) Unique Solvability of(2.12)–(2.14)There are two main steps in showing that(2.12)–(2.14)is uniquely solvable.Step1involves showing that the(2.12)–(2.14)can be reduced to an equivalent problem involving onlyσ.Step2demonstrates that the stress is uniquely ed in Step1is the following lemma.Lemma2.2([8],Remark4.2,pg.61)Let(X, · X)and(M, · M)be two reflexive Banach spaces.Let(X , · X )and(M , · M )be their corresponding dual spaces.Let B:X→M be a linear continuous operator and B :M →X the dual operator of B.Let V=ker(B)be the kernel of B;we denote by V o⊂X the polar set of V:V o={x ∈X ,<x ,v>=0,∀v∈V} and˙B:X/V→M the quotient operator associated with B.The following three properties are equivalent:(i)∃c>0,such thatinf q∈M supv∈X<Bv,q>q M v X≥c,(ii)B is an isomorphism from M onto V o andB q X ≥C B q M ∀q∈M ,(iii)˙B is an isomorphism from X/V onto M and˙B˙v M ≥C B ˙v X/V∀˙v∈X/V.As thefirst part of Step1,we show that(p,λ)can be eliminated from(2.12)–(2.14).To do this we use the following inf-sup condition.(See also[17].)Lemma2.3There exists C P RX>0such thatinf (q,β)∈P×I R m supu∈Xs(u,(q,β))u X (q,β)P×I R m≥C P RX,(2.19)where (q,β)P×I R m:= q P+ β I R m. Proof:Fix(q,β)∈P×I R m and letˆq=|q|r /r−1qq r −1P,ˆβ=ββ I R m.(2.20)Note that(q,ˆq)= q P, ˆq P =1,and<ˆβ,β>= β I R m and ˆβ I R m=1.Next,we introduceδ∈I R and h∈W1−1/r,r(∂Ω),a piecewise constant function,defined byh=ˆβi/meas(S i)on S i,i=1,...,m0on∂Ω\∪m i=1S i,(2.21)δ=∂Ωh ds−Ωˆq dΩ/meas(Ω).(2.22) 6Consider the Neumann problem:Given f∈W k,p(Ω),g∈W k+1−1/p,p(∂Ω),1<p<∞,determine ˙φ∈W k,p(Ω)/I R(the quotient space)satisfying−∆˙φ=f,inΩ,(2.23)∂˙φ∂n=g,on∂Ω.(2.24)From[8],pg.15,with ˙φW k+2,p(Ω)/I R :=infφ∈˙φφ W k+2,p(Ω),we have the existence and uniqueness of˙φand a constant C=C(k,p,Ω)satisfying˙φW k+2,p(Ω)/I R ≤Cf W k,p(Ω)+g W k+1−1/p,p(∂Ω).(2.25)Additionally,forφ∈˙φwe have that ˙φW k+2,p(Ω)/I R≡|φ|W k+1,p(Ω).For the choices f=ˆq+δ,g=h,k=0,p=r,let˙φbe given by(2.23)-(2.24). Remark:The choice of the constantδguarantees that the compatibility conditionΩf dΩ=∂Ωg dsis satisfied.We have thatˆq W0,r(Ω)=1,(by construction),(2.26)h W1−1/r,r(∂Ω)≤C1 ˆβ I R m=C1,(2.27) as h is piecewise constant(and the equivalence offinite dimensional norms).Also,Ωˆq dΩ≤ ˆq P 1 P=C2,(2.28)∂Ωh ds≤ ˆβ I R m 1 I R m=C3,(2.29) and thus δ W0,r(Ω)≤C4.Letting u=−∇φ,φ∈˙φ,we have that u∈W1,r(Ω)´dand from(2.25)–(2.29)u W1,r(Ω)≤ ˙φ W2,r(Ω)≤C(1+C4+C1)≤C5.(2.30)Also,from(2.23)-(2.24),∇·u=ˆq+δ,andS iu·n ds=S ih ds=ˆβi,i=1,...,m.Hence,s(u,(q,β))=(∇·u,q)+mi=1βiS iu·n ds=(ˆq+δ,q)+<ˆβ,β> = q P+ β I R m= (q,β)P×I R m,7as(δ,q)=0for q∈P(=L r 0(Ω)).Thus,supu∈W1,r(Ω)s(u,(q,β))(q,β)P×I R mu W1,r(Ω)≥1C5,from which(2.19)directly follows.We now state and prove the existence and uniqueness of the solution to(2.12)–(2.14).Theorem2.1Given f∈X and Q∈I R m,there exists a unique(σ,u,p,λ)∈T×X×P×I R m satisfying(2.12)–(2.14).Proof:From Lemma2.3and Lemma2.2(i),(iii),with the associations X=X,M=P×I R m, B:X→(P×I R m) defined byB(v):=s(v,(·,·)),V=ker(B),we have that there exists˙u∈X/V,such thats(˙u,(q,β))=mi=1Q iβi,∀(q,β)∈P×I R m,with ˙u X/V≤1/C s Q I R m.Note: ˙u X/V:=inf v∈˙u v X.As the cosets in X/V are closed,we can choose u s∈˙u such thatu s X= ˙u X/V≤1/C s Q I R m.(2.31) Let u=˜u+u s.Then,solving(2.12)–(2.14)is equivalent to:Findσ∈X,˜u∈V X,such thata(σ,τ)−b(τ,˜u)=(τ,u s),∀τ∈T,(2.32)b(σ,v)=<f,v>,∀v∈V X.(2.33)Now,note that for v∈X,andτ=|d(v)|r−2d(v)∈T, τ T= v r/rX,andb(τ,v) τ T =v rXvX= v X.Thusinf v∈X supτ∈Tb(τ,v)τ T v X≥1,(2.34)i.e.b(τ,v)satisfies an inf-sup condition over X×T.As above,there existsσb∈T such thatb(σb,v)=<f,v>,∀v∈X,with σb T≤1C bf X .(2.35) 8Let σ=˜σ+σb .Then,solving (2.32),(2.33)is equivalent to:Find ˜σ∈V T ,such thata (˜σ+σb ,τ)=(τ,u s )∀τ∈V T .(2.36)From assumptions A2and A3we have that G (τ):V T →V Tis a continuous,coercive,strictly monotone operator on a real,separable,reflexive Banach space [15].Hence,there exists a unique ˜σ∈V T satisfying (2.36).This then also uniquely determines σ∈T .The inf-sup condition (2.34),together with (2.32),uniquely determine ˜u∈V X and hence also u =˜u+u s ∈X .Finally,the inf-sup condition (2.19)and equation (2.13)uniquely determine p ∈P and λ∈I R m .We now establish a bound for σ T which we use below in Section 4in deriving a priori estimates for the numerical approximation.Estimates for u ,p ,and λcan also be derived.Corollary 2.1For σ∈T satisfying (2.12)–(2.14)we have that there exists C >0such thatσ T ≤C ( f X + Q r/rI Rm ).(2.37)Proof :From (2.36),with the choice τ=˜σ,we havea (˜σ+σb ,˜σ)=(˜σ,u s )≤2−r ˜σ rT +C u s rX≤2−r( ˜σ+σb T + σb T )r+C u s r X≤ ˜σ+σb rT + σb rT +C u s r X .(2.38)Using assumption (2.10),a (˜σ+σb ,˜σ)=Ω˘g (˜σ+σb )(˜σ+σb ):˜σd Ω=Ω˘g (˜σ+σb )(˜σ+σb ):(˜σ+σb )−Ω˘g (˜σ+σb )(˜σ+σb ):σb d Ω≥Ωc |˜σ+σb |rd Ω− G (˜σ+σb ) L r σb T≥c ˜σ+σb rT −c 2M rG (˜σ+σb ) r L r −C σb rT .(2.39)We use (2.11)to estimate the second term on the RHS of (2.39).G (˜σ+σb ) r L r = Ω|G (˜σ+σb )|r d Ω≤M r Ω(|˜σ+σb |+0)r −2|˜σ+σb −0| rd Ω=M r Ω|˜σ+σb |(r−1)r d Ω=M r Ω|˜σ+σb |rd Ω=M r˜σ+σb rT .(2.40)9Combining(2.38)-(2.40)we have thatc2−˜σ+σb r T≤Cσb r T+ u s r X.Asσ=˜σ+σb,and using the estimates(2.31)and(2.35),we obtain(2.37).3Discrete ApproximationWe now describe the discrete approximation problem corresponding to(2.12)-(2.14)and show that the problem is well defined.Analogous to the continuous problem existence and uniqueness for the discrete problem relies on the approximating spaces satisfying suitable inf-sup conditions.We begin by describing thefinite element approximation framework used in the analysis.LetΩ⊂I R´d(´d=2,3)be a polygonal domain and let T h be a triangulation ofΩmade of triangles (in I R2)or tetrahedrals(in I R3).Thus,the computational domain is defined byΩ=∪K;K∈T h.We assume that there exist constants c1,c2such thatc1h≤h K≤c2ρKwhere h K is the diameter of triangle(tetrahedral)K,ρK is the diameter of the greatest ball(sphere)included in K,and h=max K∈Th h K.Let P k(A)denote the space of polynomials on A of degree nogreater than k.Then we define thefinite element spaces as follows.T h:=τ∈T∩C(¯Ω)2×2:τ|K∈P l(K),∀K∈T h,(3.1)X h:=v∈X∩C(¯Ω)2:v|K∈P k(K),∀K∈T h,(3.2)P h:=q∈P∩C(¯Ω):q|K∈P n(K),∀K∈T h.(3.3)We assume that the velocity-stress and the pressure-velocity spaces satisfy the following(typical) discrete inf-sup condition:There exists constants C XT h,C P Xh>0,such thatinf v∈X h supτ∈T hb(τ,v)τ T v X≥C XT h,(3.4)inf q∈P hsupv∈X hΩq∇·v dAq P v X≥C P Xh.(3.5)Discrete Approximation Problem:Given f∈X ,and Q∈I R m,determine(σh,u h,p h,λh)∈T h×X h×P h×I R m such thata(σh,τh)−b(τh,u h)=0,∀τh∈T h,(3.6) b(σh,v h)−s(v h,(p h,λh))=<f,v h>,∀v h∈X h,(3.7)s(u h,(q h,βh))=mi=1Q iβi,∀(q h,βh)∈P h×I R m.(3.8) 10For the analysis a more general inf-sup condition than that given in(3.5)is needed.This is estab-lished using the following two lemmas.(See also[17].)Lemma3.1There exists C RXh>0such thatinf β∈I R msupv h∈X hmi=1βiS iv h·n dsv h X β I R m≥C RXh.(3.9)Outline of Proof:From inspection of(3.9)we see that we would like to choose v h∈X h such that v h·n=βi on each S i,and v h X≤c β I R m.This is done by constructing a suitable v h,i,withv h,i|Sj =0,j=i and then letting v h=iv h,i.We focus our attention on a single S i.We will assume that on S i n(x)·n(y)≥c>0at all points x,y∈S i,for which n is defined.(That is,on S i the normal n does not vary by more than90 degrees.If the normal does vary by more than90degrees consider the surface as two surfaces.) For ease of explanation,consider S i as a straight line segment from(0,0)to(|S i|,0).Fix a depth d i such that the rectangle R with vertices:(|S i|/6,0),(5|S i|/6,0),(5|S i|/6,d i),(|S i|/6,d i)lies inΩ. Introduce the labelling of the following points:A:=(|S i|/6,0),B:=(5|S i|/6,0),C:=(5|S i|/6,d i), D:=(|S i|/6,d i),E:=(|S i|/3,0),F:=(2|S i|/3,0),G:=(2|S i|/3,d i),and H:=(|S i|/3,d i).Let˜n=n|(|Si|/2,0)and g i the continuous,piecewise bi-linear,function defined by g i|E,F=βi,andg i|A,B,C,D,G,H=0.(See Figure3.1.In Figure3.1xi=x/|S i|,and eta=y/d.).1.0Figure3.1:Plot of g i/βi.We define the function˜v i as˜v i|Ω\R=0,and˜v i|R=g i˜n.Then,βiS i ˜v i·n ds=βiFE(βi˜n)·n ds≥c iβ2i|S i|/3.11Also,˜vi X =R|˜vi |r dA +R|∇˜vi |r dA 1/r=|βi | (r +2)d i |S i |/(3(r +1)2)+6r −22d i /(|S i |r −1(r +1))+(6r +7)|S i |/(18(r +1)d r −1i) 1/r.Now,there exists h 0such that for all h ≤h 0there exists v h,i ∈T h such that ˜v i −v h,i ∞≤c i βi /6and ˜vi −v h,i X ≤|βi |.Then, m i =1βi S i v h ·n ds v h X ≥m i =1βi S i v h,i ·n ds m i =1v h,i X ≥ m i =1 βi S i ˜v i ·n ds −c i β2i |S i |/6 m i =1( ˜v i X + ˜v i −v h,i X )≥ m i =1c i β2i |S i |/6 mi =1ˆc i |βi |≥C β ,from which (3.9)then follows.Lemma 3.2For h sufficiently small,there exists C P RXh >0such thatinf(q h ,β)∈P h ×I Rmsupv h ∈X hs (v h ,(q h ,β))v h X (q,β) P ×I R m≥C P RXh .(3.10)Proof :Let (p h ,β)∈P h ×I R m .From Lemma 3.1,there exists ˆuh ∈X h such that ˆu h X = β I R m and mi =1βi S i ˆuh ·n ds ˆuh X ≥c 1 β I R m .(3.11)Let X 0h :={v h ∈X h :v h |∂Ω=0},and consider the (discrete)power-law problem:Determine˜u h ∈X 0h ,˜p h ∈P h such that(|d (˜u h )|r −2d (˜u h ),d (v ))−(˜p h ,∇·v )=0,∀v ∈X 0h ,(3.12)(q ,∇·˜uh )=(q , p h 1−r /rP |p h |r/r −1p h −∇·ˆuh ),∀q ∈P h .(3.13)Note thatp h 1−r /r P|p h |r/r −1p h −∇·ˆuh ∈L r (Ω).Existence and uniqueness of ˜uh ∈X 0h ,˜p h ∈P h satisfying (3.12),(3.13)follows analogous to the proof of Theorem 2.1.(See also [8,2]).From (3.12),(3.13)with the choices v =˜u h ,and q =˜p h , ˜u h r X =(|d (˜u h )|r −2d (˜u h ),d (˜u h ))=(˜p h ,∇·˜u h )=(˜p h , p h 1−r /rP |p h |r /r −1P p h −∇·ˆuh )≤ ˜p h P p h 1−r /r P |p h |r /r −1P p h L r + ∇·ˆu h L r≤ ˜p h P ( p h P +C ˆuh X )= ˜p h P p h P + β I R m.(3.14)12Also,from the inf-sup condition for spaces X0hand P h we havec ˜p h P≤supv∈X0h (˜p h,∇·v)v X=supv∈X0h (|d(˜u h)|r−2d(˜u h),d(v))v X≤supv∈X0h ( |d(˜u h)|r−2d(˜u h)L rd(v) L rv X= |d(˜u h)|r−2d(˜u h)L r= ˜u h r/rX.(3.15) Combining(3.14)and(3.15)we have the estimate˜u h X≤p h P+C β I R m.(3.16)Let u h=˜u h+ˆu h.Then,using(3.13)and(3.11)s(u h,(p h,β))=Ωp h∇·˜u h dΩ+Ωp h∇·ˆu h dΩ+mi=1βiS i˜u h·n ds+mi=1βiS iˆu h·n ds=Ωp h p h 1−r /rP|p h|r /r−1p h dΩ+mi=1βiS iˆu h·n ds≥cp h 2P+ β 2I R m.(3.17)Thus,using(3.11),(3.16),we havesup v h∈X h s(v h,(p h,β))v h X≥s(u h,(p h,β))u h X≥Cp h P+ β I R m,from which(3.10)immediately follows.We now state and prove the existence and uniqueness of solutions to(3.6)–(3.8).Theorem3.1Given f∈X and Q∈I R m,there exists a unique(σh,u h,p h,λh)∈T h×X h×P h×I R m satisfying(3.6)–(3.8).In addition,σh T≤C( f X + Q r/rI R m).(3.18) Proof:With the inf-sup conditions given in(3.4)and(3.10)the proof of existence follows exactly as for the continuous problem in Theorem2.1.Similarly,the norm estimate forσh follows as that forσgiven in Corollary2.1.134A Priori Error EstimateIn this section we derive an error estimate for the error in the approximation(σh,u h,p h,λh) satisfying(3.6)–(3.8),and(σ,u,p,λ)satisfying(2.12)–(2.14).The proof of the estimates gives in Theorem4.1follow along the same lines as the proofs for the existence and uniqueness,except for the error estimates we work backwards.The procedure to establish existence and uniqueness was to reduce the problem to an equivalent problem forσ(or σh)on a subspace of the solution space.To obtain the error estimates we begin by considering the determining equations forσh,u h,over a ing the coercivity and continuity assumptions (2.10),(2.11),an error estimate for σ−σh over the subspace is constructed.We then show that the estimate over the subspace can be extended to the entire solution space.Useful in the analysis below is the following inf-sup condition which follows from(3.4)and(3.10). Lemma4.1For h sufficiently small,there exists a constant C XT P Rh>0such thatinf v∈X hsup(τ,q,β)∈T h×P h×I R mb(τ,v)−s(v,(q,β))(τ,q,β)T×P×I R mv X≥C XT P Rh,(4.1)where (τ,q,β)T×P×I R m:= τ T+ q P+ β I R m. Proof:For v∈X h,from(3.4)there existsτv such thatb(τv,v)≥C XT h2τv T v X.(4.2)We now consider two cases.Firstly,if s(v,(q,β))=0,∀(q,β)∈P h×I R m,then(4.1)follows immediately from(4.2).Otherwise,from the definition of s(v,(q,β)),there exists(q v,βv)∈P h×I R m such that s(v,(q v,βv))<0and (q v,βv)P h×I R m= τv T.Thus,sup (τ,q,β)∈T h×P h×I R m b(τ,v)−s(v,(q,β))(τ,q,β)T×P×I R m≥b(τv,v)−s(v,(q v,βv))(τv,q v,βv)T×P×I R m≥C XT h τv T v X2( τv T+ τv T),from which(4.2)then follows.Theorem4.1For(σ,u,p,λ)satisfying(2.12)–(2.14)and(σh,u h,p h,λh)satisfying(3.6)–(3.8),for h sufficiently small,we have that there exists a constant C>0such thatσ−σh r T≤Cinfτh∈T hσ−τh r T+ σ−τh r T+infv h∈X hu−v h r X+infq h∈P hp−q h r P,(4.3)u−u h X≤Cσ−σh T+infv h∈X hu−v h X,(4.4)p−p h P+ λ−λh I R m≤Cσ−σh T+infq h∈P hp−q h P.(4.5) 14Proof :We have that (σh ,u h ,p h ,λh )satisfiesa (σh ,τh )−b (τh ,u h )=0,∀τh ∈T h ,(4.6)b (σh ,v h )−s (v h ,(p h ,λh ))=<f ,v h >,∀v h ∈X h ,(4.7)s (u h ,(q h ,βh ))=mi =1Q i βi ,∀(q h ,β)∈P h ×I R m .(4.8)Introduce the affine subspaces ˜Xh ⊂X h ,˜K h defined by ˜Xh :={v h ∈X h :s (v h ,(q h ,β))=m i =1Q i βi ,∀(q h ,βh )∈P h ×I R m },(4.9)˜Kh :={τh ∈T h :b (τh ,v h )−s (v h ,(p h ,λh ))=<f ,v h >,∀v h ∈˜X h }.(4.10)Note that σh ∈˜Kh and u h ∈˜X h .From (2.10),(2.11)we havec σ−σh rT≤a (σ,σ−σh )−a (σh ,σ−σh )=a (σ,σ−τh )−a (σh ,σ−τh )+a (σ,τh −σh )−a (σh ,τh −σh )≤ΩM (|σ|+|σh |)r−2|σ−σh ||σ−τh |d Ω+a (σ,τh −σh )−a (σh ,τh −σh )(4.11)Now,noting that 1<r ≤2,and hence r /r ≥1,ΩM (|σ|+|σh |)r −2|σ−σh ||σ−τh |d Ω≤ ΩM r (|σ|+|σh |)(r−2)r |σ−τh |r d Ω 1/r σ−σh T ≤ σ−σh r T +CM r Ω2(r −2)r |σ|(r −2)r +|σh |(r −2)r|σ−τh |r d Ω≤ σ−σh rT+C Ω|σ|(r −2)r +|σh |(r −2)r r /(r −r )d Ω (r −r )/r Ω|σ−τh |rd Ωr/r≤ σ−σh rT+C σ r T + σh rT (r −r )/rσ−τh r T≤ σ−σh rT +C σ−τh rT .(4.12)With the choice τh ∈˜Kh ,using (2.12)and (4.6)a (σ,τh −σh )−a (σh ,τh −σh )=b (τh −σh ,u )−b (τh −σh ,u h )=b (τh −σh ,u )(since τh and σh are in ˜Kh )=b (τh −σh ,u −v h ),(for v h ∈˜Xh )= Ω(τh −σh ):d (u −v h )d Ω= Ω(σ−σh ):d (u −v h )d Ω+ Ω(τh −σ):d (u −v h )d Ω≤ σ−σh T u −v h X + σ−τh T u −v h X≤ σ−σh r T +C σ−τh r T + u −v h rX.(4.13)15Combining(4.11)–(4.13)gives an error bound for σ−σh T in terms of the best approximations ofσand u in the sets˜K h and˜X h,respectively.Next we show that we can lift these best approximations from˜K h and˜X h to T h×X h.This is done in two steps.Firstly lifting from˜K h to˜W h,and then using the discrete inf-sup condition to go from˜W h to T h×X h.Let˜Wh:={(τh,q h)∈T h×P h:b(τh,v h)−s(v h,(q h,λh))=<f,v h>,∀v h∈X h}.(4.14) Note that if(τh,q h)is in˜W h thenτh is in˜K h.Hence,inf τh∈K h σ−τh T≤inf(τh,q h)∈˜W h(σ,p)−(τh,q h) T×P.(4.15)From the inf-sup conditions(4.1)we have that there exist operatorsΠ1:T→T h,andΠ2:P→P h such thatb(τ−Π1τ,v h)−s(v h,(q−Π2q,λh))=0,∀v h∈X h,(4.16) and(Π1τ,Π2q) T×P≤˜C (τ,q) T×P,∀(τ,q)∈T×P.(4.17) Consider(τh,q h)∈T h×P h and introduce˜σ:=τh−Π1(τh−σ),and˜p:=q h−Π2(q h−p).Then, for all v h∈X hb(˜σ,v h)−s(v h,(˜p,λh))=b(σ,v h)−s(v h,(p,λh))=<f,v h>,which implies(˜σ,˜p)∈˜W h.Also,using(4.17),(˜σ,˜p)−(τh,q h) T×P= (Π1(σ−τh),Π2(p−q h) T×P≤˜C (σ−τh,p−q h) T×P.(4.18) With(˜σ,˜p)as defined above,using(4.17),(4.18)and the triangle inequality,inf (τh,q h)∈˜W h (σ,p)−(τh,q h) T×P≤inf(τh,q h)∈T h×P h(σ,p)−(˜σ,˜p) T×P≤inf(τh,q h)∈T h×P h( (σ,p)−(τh,q h) T×P+ (˜σ,˜p)−(τh,q h) T×P)≤(1+˜C)inf(τh,q h)∈T h×P h(σ,p)−(τh,q h) T×P.(4.19)Using an analogous argument with the inf-sup condition(3.10)it is straight forward to show thatinf v h∈˜X h u−v h X≤C infv h∈X hu−v h X.(4.20)Combining(4.11)–(4.13),(4.15),(4.19),and(4.20)we then haveσ−σh r T≤Cinfτh∈T hσ−τh r T+ σ−τh r T+infv h∈X hu−v h r X+infq h∈P hp−q h r P.16。