A Simple Block-Based Lossless Image Compression Scheme
- 格式:pdf
- 大小:56.32 KB
- 文档页数:5
Lossless Compression of DSA Image Sequence Based on PCMD2 CoderJi Zhen, Mou Xuanqin, Jiang Yifeng, Cai YuanlongImage Processing Center, Xi’an Jiaotong Univ.Xi’an, P.R.C., 710049E-mail:**************.cn,**************.cnABSTRACT:ⅠINTRODUCTIONMedical image compression has been mainly concerned with lossless coding techniques[1], which ensure that all significant information for diagnostic purposes is retained in the reconstructed images with the compression ratio of around 2:1.The Digital Image and Communications in Medicine proposed standard (DICOM3),that adopted by the American College of Radiology and National Electrical Manufacturing Associa-tion(ACR-NEMA), includes lossless coding and compression of medical images. However, recent studies concerning “Visually lossless” and “In-formation preserving” indicate that the reconstruc-tion error is accepted from a medical point of view. ACR-NEMA announced a call for proposals for lossy compression that has been included in DI-COM3.0 in 1995. This class of compression tech-niques is subjective definitions and extreme cau-tion msut be taken into considerance in P.R.C,where the criteria remains ambiguous and many complex legal and regulatory issues would arise.The objective of compressing images is to reduce the data volume and achieve a low bit rate. Compressing a digital medical image can facilitate its storage and transmission. With regard to the legislation problems, physician prefers diagnosing with uncorrupted medical image. The popular lossless coding schemes includes Huffman, arith-metic, run-length encoding(RLE) and LZW[2]. More effective coding method is desired for the fast growth of PACS (Picture Archiving and Communication System) and Teleradiology.Acqusition of 3-D and 4-D medical image sequences are becoming more usual nowadays, especially in the case of dynamic studies with MRI,CT,DSA,PET. A new lossless compression method is proposed for the image sequence gener-ated by DSA ( Digital Subtraction Angiography ) apparatus, which is now common as X-ray, CT. Compressing an image sequence is equal to com-pression of 3D data, which contains both space and time redundance. The well-known MPEG[3] provides the perfect resolution of the lossy codingof image sequence. It has proved effective and proposed many practicable techniques, which could be exploited.Differential pulse code modulation (DPCM)[4] predictive coding method is predomi-nant in lossless compression. In this paper, high order(2-order) DPCM coder is introduced that can exploit the correlation between one-order differen-tial images and two-order ones to benefits of com-pression, which has highly competitive compres-sion performance and remains practical. We pre-sent the description of this proposed lossless image sequence coder in followed sections.ⅡCHARACTERISTICS OF DSA IMAGESImage compression techniques usually consistof concrete mathematical models. In practical ap-plications, the specific images, for which the available priori knowledge could be exploited in order to develop an optimized compression scheme.A typical DSA image sequence is shown in Figure 1,which consists of :Figure 1. a typical DSA image sequence (a) M represents mask image, acquired before in-travenous or intraarterious injection. (b) L(n) means the sequence of live images, acquired after intravenous or intraarterious injection. N is the volume of the whole sequence. (c) S(n) is the sub-traction image. S(n)=L(n)-M (n=0,…,N-1).(d) SD(n) is the differential subtraction image. SD(n)=S(n)-S(n-1) (n=1,…,N-1). It is obvious that the whole sequence could be repre-sented in three equivalent formats: Format 1. M and L(n); Format 2. M and S(n);Format 3. M 、S(0) and SD(n).Firstly, the entropy of whole images is calcu-lated separately, H x pp jj j()log ,=−∑ (p jmeans the probability of the jth gray in a image). The followed table could be acquired. In this table. The image resolution is 1024*1024*10bits. Table 1Image H(x) M 6.65 L(n) 6.61 S(n) 4.68 SD(n) 3.67Secondly, the correlation between two images is defined as {})()()(k n I n I E k R +⋅=. We separately obtain the correlation of S(n) and SD(n). The correlation between S(n) is shown in Figure 2(a) and between SD(n) is in (b).From the above curve, followed conclusions about the characteristic of DSA images could be made:(a) when K<5, the correlation R(k) between S(n) often remains very high.(b) the correlation R(k) in SD(n) decreases quickly while K increases. However, the R(1) could hold 0.60, which is useful and important in the pro-posed compression method.Since the three represent formats are equivalent in the mathematical mean, compressing the whole sequence could be implemented in three ways, which would take use of different tech-niques and get different results. It is evident that compression result would better by adopting the last represent than two others because of followed reasons: 1.the entropy is smallest, which mean that the highest lossless compression ratio is prospec-tive theoretically. 2.the R(k) of common signal decreases distinctly after applying one-order dif-ferential, Which means that the two-order differ-ential operation seems worthless. However, it is not the same for the DSA images. The correlation between S(n) holds high, which results in the cer-tain meanness of two-order differential images SD(n). The compression performance would be improved by taking the full advantage of this characteristic.Ⅲ COMPRESSION FRAMEWORK A. Differential Pulse Code ModulationDifferential pulse code modulation(DPCM) exploits the property that the values of adjacent pixels in an image are often similar and highly correlated. The general block diagram of DPCM is shown in Figure 3.The value of a pixel is predictedas a linear combination of a few neighbor pixel values, which is represented as follows: ∑∈−−=Rj i r e n j m i X n m X ),(),(),(αwhere e X is the predicted value, R meansa. R(k) between S(n)b. R(k) between SD(n)Figure 2. the R(k) curve of S(n) and SD(n)xEncoderDecoderneighbor region, ),(n m αare prediction coeffi-cients specified by the image characteristic. Theprediction error defined as ),(),(),(j i X j i X j i e e−= is then coded and trans-mitted. The Decoder reconstructs the pixels as ),(),(j i e j i X X q e r +=In order to keep lossless compression, only thepredictor would be included in DPCM coder. It isa “Predictor + Entropy Coder” lossless DPCMcoding .[5]B. Two Order Differential Pulse Code ModulationThe D PCM 2 coder is different from thetwo-demensional(2-D)DPCM method andthree-demensional(3-D) one[6].For a pixel ),,(j i n X in S(n), applying the DPCM withthe variable n indicates the its extension ofthree-dimension. The same is for the SD(n,x,y) in Format 3, which means two order differential be-cause SD(n) is derived from one differential op-eration oneself. The D PCM 2 coder includes four steps:1.For the mask image M, apply the RICE[7]coding method.2.Construct an optimal predictor. The ARMR model is adopted according to the image attribute. The equation is : )()1,1()1,1(),1()0,1()1,()1,0(e r X X b n m Xr n m Xr n m Xr X −+−−+−+−=αααThe last component of right part means the two-order differential and there is not quantization er-rors.3.For the every pixel S(n,x,y) in S(n), apply to the linear prediction with the variable n.4.Apply adaptive arithmetic algorithm to the error signal ),,(y x n e p . The whole coding process is shown in Figure4.Figure 4. the scheme of D PCM 2coding The dash rectangle constitute the D PCM 2 lossless coder. The decoding process is obvious . Ⅳ EXPERIMENTAL STUDY Figure 5 shows typical DSA images acquiredin Forth Military Medical Univ. of P.R.C. The fig-ure gives the image M, L(60),S(60),S(59) and SD(60) separately. The compression results through this pro-posed method are given as follows: average image size after compression is 463Kbyte. average compression ratio cr ==102410241046364782261***.:average bit rate pixel bit B /42.41024*10248*579338== average compression efficiency η===H x B ()..44236783% The followed table gives the compari-son with Huffman,Arithmetic and DPCM. The compression procedure is equivalent to still image compression without reducing the correlation between images.Huffman Arithmetic DPCM D PCM2Cr 1.36:1 1.47:1 1.53:1 2.26:1 B7.35 6.76 6.76 4.42From above table, it is obvious that this proposed compression technique performs better than others.ⅤCONCLUSIONWith applying the proposed method to the DSA apparatus, perfect experiment performance is deserved. The compression ratio is better than or-dinary techniques. The computational time and space also prove practicable and robust. Although significant progress Many researches remain.ACKNOWLEDGMENTThe authors wish to thank Dr.Sun Li-jun,FMMU for his cooperation on acquisition of some important DSA images.(a) M (b) L(60)(c) S(60) (d) S(59) (e) SD(60)(enhancement for display)Figure 5. (a) mask image (b) live image (c)and (d) subtraction image (e) differential subtraction image1.S.Wong, L.Zaremba.D.Gooden,and H.K.Huang. “Radiologic image compression-A Reivew”,Proc. IEEE,vol,83,pp194-219,Feb.1995.2.A.K.Jain, Fundamentals of Digital Image Processing. Eaglewood Cliffs,NJ:Prentice-Hall,1989.3.MPEG(ISO/IEC JTC1/SC29/WG11).Coding of Moving Pictures and Associated Audio for Digital Storage Media at up to 1.5MBit/s ISO.CD 11172,Nov,1991.4.G.Bostelmann,“A simple high quality DPCM-Codec for video telephony using 8 Mbit per sec-ond”,NTZ,27(3),pp115-117,1974.5.L.N.Wu, Data Compression & Application, Electrical Industry Publisher,pp116-120,19956.P.Roos and M.A.Viegever, “Reversible 3_D decorrelation of medical images”, IEEE, Medical Im-aging,12(3),pp413-420,1993.7.D.Yeh, ler, “The development of lossless data compression technology for remote sensing application”, pp307-309,In IGRASS’94.。
改进Retinex⁃Net的低光照图像增强算法欧嘉敏1 胡 晓1 杨佳信1摘 要 针对Retinex⁃Net存在噪声较大㊁颜色失真的问题,基于Retinex⁃Net的分解-增强架构,文中提出改进Ret⁃inex⁃Net的低光照图像增强算法.首先,设计由浅层上下采样结构组成的分解网络,将输入图像分解为反射分量与光照分量,在此过程加入去噪损失,抑制分解过程产生的噪声.然后,在增强网络中引入注意力机制模块和颜色损失,旨在增强光照分量亮度的同时减少颜色失真.最后,反射分量和增强后的光照分量融合成正常光照图像输出.实验表明,文中算法在有效提升图像亮度的同时降低增强图像噪声.关键词 低光照图像增强,深度网络,视网膜大脑皮层网络(Retinex⁃Net),浅层上下采样结构,注意机制模块引用格式 欧嘉敏,胡晓,杨佳信.改进Retinex⁃Net的低光照图像增强算法.模式识别与人工智能,2021,34(1):77-86.DOI 10.16451/ki.issn1003⁃6059.202101008 中图法分类号 TP391.4Low⁃Light Image Enhancement Algorithm Based onImproved Retinex⁃NetOU Jiamin1,HU Xiao1,YANG Jiaxin1ABSTRACT Aiming at the problems of high noise and color distortion in Retinex⁃Net algorithm,a low⁃light image enhancement algorithm based on improved Retinex⁃Net is proposed grounded on the decomposition⁃enhancement framework of Retinex⁃Net.Firstly,a decomposition network composed of shallow upper and lower sampling structure is designed to decompose the input image into reflection component and illumination component.In this process,the denoising loss is added to suppress the noise generated during the decomposition process.Secondly,the attention mechanism module and color loss are introduced into the enhancement network to enhance the brightness of the illumination component and meanwhile reduce the image color distortion.Finally,the reflection component and the enhanced illumination component are fused into the normal illumination image to output.The experimental results show that the proposed algorithm improves the image brightness effectively with the noise of enhanced image reduced.Key Words Low⁃Light Image Enhancement,Deep Network,Retinal Cortex Theory⁃Net,Shallow Up⁃per and Lower Sampling Structure,Attention Mechanism ModuleCitation OU J M,HU X,YANG J X.Low⁃Light Image Enhancement Algorithm Based on Improved Retinex⁃Net.Pattern Recognition and Artificial Intelligence,2021,34(1):77-86.收稿日期:2020-05-12;录用日期:2020-09-17 Manuscript received May12,2020;accepted September17,2020国家自然科学基金项目(No.62076075)资助Supported by National Natural Science Foundation of China(No. 62076075)本文责任编委黄华Recommended by Associate Editor HUANG Hua1.广州大学电子与通信工程学院 广州5100061.School of Electronics and Communication Engineering,Guang⁃zhou University,Guangzhou510006 由于低光照环境和有限的摄像设备,图像存在亮度较低㊁对比度较低㊁噪声较大㊁颜色失真等问题,不仅会影响图像的美学㊁人类的视觉感受,还会降低运用正常光照图像的高级视觉任务的性能[1-3].为了有效改善低光图像质量,学者们提出许多低光照图像增强算法,经历灰度变换[4-5]㊁视网膜皮层理论[6-11]和深度神经网络[12-19]三个阶段.早期,通过直方图均衡[4-5]㊁伽马校正等灰度变换方法对低亮度区域进行灰度拉抻,可达到提高暗区亮度的目的.然而,因为未考虑像素与其邻域像素的关系,灰度变换常会导致增强图像缺乏真实感.第34卷 第1期模式识别与人工智能Vol.34 No.1 2021年1月Pattern Recognition and Artificial Intelligence Jan. 2021Land [6]提出视网膜皮层理论(Retinal CortexTheory,Retinex).该理论认为物体颜色与光照强度无关,即物体具有颜色恒常性.基于该理论,相继出现经典的单尺度视网膜增强算法(Single Scale Retinex,SSR)[7]和色彩恢复的多尺度视网膜增强算法(Multi⁃scale Retinex with Color Restoration,MSR⁃CR)[8].算法主要思想是利用高斯滤波器获取低光照图像的光照分量,再通过像素间逐点操作求得反射分量作为增强结果.Wang 等[9]利用亮通滤波器和对数变换平衡图像亮度和自然性,使增强后的图像趋于自然.Fu 等[10]设计用于同时估计反射和光照分量(Simultaneous Reflectance and Illumination Esti⁃mation,SRIE)的加权变分模型,可有效处理暗区过度增强的问题.Guo 等[11]提出仅估计光照分量的低光照图像增强算法(Low Light Image Enhancementvia Illumination Map Estimation,LIME),主要利用局部一致性和结构感知约束条件计算图像的反射分量并作为输出结果.然而,这些基于Retinex 理论模型的算法虽然可调整低光图像亮度,但增亮程度有限.研究者发现卷积神经网络(Convolutional NeuralNetwork,CNN)[12]与Retinex 理论结合能进一步提高增强图像的视觉效果,自动学习图像的特征,解决Retinex 依赖手工设置参数的问题.Lore 等[13]提出深度自编码自然低光图像增强算法(A Deep Auto⁃encoder Approach to Natural Low Light Image Enhan⁃cement,LLNet),有效完成低光增强任务.Lü等[14]提出多边低光增强网络(Multi⁃branch Low⁃Light En⁃hancement Network,MBLLEN),学习低光图像到正常光照图像的映射.Zhang 等[15]结合最大信息熵和Retinex 理论,提出自监督的光照增强网络.Wei 等[16]基于图像分解思想,设计视网膜大脑皮层网络(Retinex⁃Net),利用分解-增强架构调整图像亮度.Zhang 等[17]基于Retinex⁃Net 设计低光增强器.然而,由于噪声与光照水平有关,Retinex⁃Net 提取反射分量后,图像暗区噪声高于亮区.因此,Retinex⁃Net 的增强结果存在噪声较大㊁颜色失真的问题,不利于图像质量的提升.为此本文提出改进Retinex⁃Net 的低光照图像增强算法.以Retinex⁃Net 的分解与增强框架为基础,针对噪声问题,在分解网络采用浅层上下采样结构[15],利用反射分量梯度项[15]作为损失.同时为了改善增强图像的色彩偏差,保留丰富的细节信息,在增强网络中嵌入注意力机制模块[18]和颜色损失[19].实验表明,本文算法在LOL 数据集和其它公开数据集上取得较优的视觉效果和客观结果.1 改进Retinex⁃Net 的低光照图像增强算法改进Retinex⁃Net 的低光照图像增强算法框图如图1所示.S lowS normalR highI highI lowR lowI end图1 本文算法框图Fig.1 Flowchart of the proposed algorithm87模式识别与人工智能(PR&AI) 第34卷 Retinex理论[6]认为彩色图像可分解为反射分量和光照分量:S=R I.(1)其中: 表示逐像素相乘操作;S表示彩色图像,可以是任何具有不同曝光程度的图像;R表示反射分量,反映物体内在固有属性,与外界光照无关;I表示光照分量,不同曝光度的物体光照分量不同.本文算法主要利用2个相互独立训练的子网络,分别是分解网络与增强网络.具体地说,首先,分解网络以数据驱动方式学习,将低光照图像和与之配对的正常光照图像分解为相应的反射分量(R low, R normal)和光照分量(I low,I normal).然后,增强网络以低光图像的光照分量I low作为输入,在结构感知约束下,提升光照分量的亮度.最后,重新组合增强的光照分量I en与反射分量R low,形成增强图像S en,作为网络输出.1.1 分解网络的浅层上下采样结构由于式(1)是一个不适定问题[20],很难设计适用于多场景的约束函数.本文算法以数据驱动的方式进行学习,不仅能解决该问题,还能进一步提高网络的泛化能力.如图1所示,在训练阶段,分解网络以低光照图像S low和与之对应的正常光照图像S normal 作为输入,在约束条件下学习输出它们一致的反射分量R low和R normal,及不同的光照分量I low和I normal.值得注意的是,S low与S normal共享分解网络的参数.区别于常用的深度U型网络(U⁃Net)结构及Retinex⁃Net简单的堆叠卷积层,本文算法的分解网络是一个浅层的上下采样结构,由卷积层与通道级联操作组成,采样层只有4层,网络训练更简单.实验表明,运用此上下采样结构变换图像尺度时,下采样操作一定程度上舍去含有噪声的像素点,达到降噪效果的目的,但同时会引起图像的模糊.因此为了提高分解图像清晰度,减少语义特征丢失,在图像上采样后应用通道数级联操作,可给图像补偿下采样丢失的细节信息,增强清晰度.在浅层上下采样结构中,首先,使用1个9×9的卷积层提取输入图像S low的特征.然后,采用5层以ReLU作为激活函数的卷积层变换图像尺度,学习反射分量与光照分量的特征.最后,分别利用2层卷积层及Sigmoid函数,将学习到的特征映射成反射图R low和光照图I low后再输出.对于分解网络的约束损失,本文算法沿用Retinex⁃Net的重构损失l rcon㊁不变反射率损失l R及光照平滑损失l I.另外为了在分解网络中更进一步减小噪声,添加去噪损失l d.因此总损失如下:l=l rcon+λ1l R+λ2l I+λ3l d,其中,λ1㊁λ2㊁λ3为权重系数,用于平衡各损失分量.对于L1㊁L2范数和结构相似性(Structural Similarity, SSIM)损失的选择,当涉及图像质量任务时,L2范数与人类视觉对图像质量的感知没有很好的相关性,在训练中容易陷入局部最小值,而SSIM虽然能较好地学习图像结构特征,但对平滑区域的误差敏感度较低,引起颜色偏差[21].因此本文算法使用L1范数约束所有损失.在分解网络中输出的结果R low和R normal都可与光照图重构成新的图像,则重构损失如下:l rcon=∑i=low,normalW1Rlow I i-S i1+∑j=low,normal W2R normal I j-S j1,其中 表示逐像素相乘操作.当i为low或j为normal 时,权重系数W1=W2=1,否则W1=W2=0.001.对于配对的图像,使用较大的权重能够使分解网络更好地学习配对图像的特征.对于配对的图像对,使用较大的权重可使分解网络更好地学习配对图像的特征.不变反射率损失l R是基于Retinex理论的颜色恒常性,在分解网络中主要用于约束学习不同光照图像的一致反射率:l R=Rlow-R normal1.对于光照平滑损失l I,本文采用结构感知平滑损失[16].该损失以反射分量梯度项作为权重,在图像梯度变化较大的区域,光照变得不连续,从而亮度平滑的光照图能保留图像结构信息,则l I=ΔIlow exp(-λgΔR low)1+ΔI normal exp(-λgΔR normal)1,其中,Δ表示图像水平和垂直梯度和,λg表示平衡系数.Rudin等[22]观察到,噪声图像的总变分(Total Variation,TV)大于无噪图像,通过限制TV可降低图像噪声.然而在图像增强中,限制TV相当于最小化梯度项.受TV最小化理论[22-23]启发,本文引入反射分量的梯度项作为损失,用于控制反射图像噪声,故称为去噪损失:l d=λΔRlow1.当λ值增加时,噪声减小,同时图像会模糊.因此对于权重参数的选择十分重要,经过实验研究发现,当权重λ=0.001时,图像获得较好的视觉效果.1.2 增强网络的注意力机制如图1所示,增强网络以分解网络的输出I low作97第1期 欧嘉敏 等:改进Retinex⁃Net的低光照图像增强算法为输入,学习增强I low的亮度,将增强结果I en与分解网络另一输出R low重新结合为增强图像S en后输出.在增强网络中,I low经过多个下采样块生成较小尺度图像,使增强网络有较大尺度角度分配光照,从而具有调节亮度的能力.网络采用上采样方式重构局部光照,对亮的区域分配较低亮度,对较暗的区域调整较高亮度.此外,将上采样层的输出进行通道数的级联,在调整不同局部光照的同时,保持全局光照一致性.而且跳过连接是从下采样块引入相应的上采样块,通过元素求和,强制网络学习残差.针对Retinex⁃Net出现的颜色失真问题,在增强网络中嵌入注意力机制模块.值得注意的是,与其它复杂的注意力模块不同,注意力机制模块由简单卷积层和激活操作组成,不要求强大的硬件设备,也不需要训练多个模型和大量额外参数.在光照调整过程中,可减少对无关背景的特征响应,只激活感兴趣的特征,提高算法对图像细节的处理能力和对像素的敏感性,指导网络既调整图像亮度又保留图像结构.由图1可见,注意力模块的输入是图像特征αi㊁βi,输出为图像特征γi,i=1,2,3,表示注意力机制模块的序号.αi为下采样层输出的图像特征,βi为上采样层的输出特征.这2个图像特征分别携带不同的亮度信息,两者经过注意力模块后,降低亮度无关特征(如噪声)的响应,使输出特征γi携带更多亮度信息被输入到下一上采样层,提高网络对亮度特征的学习能力.αi与重建尺度后的βi分别经过一个独立的1×1卷积层,在ReLU激活之前进行加性操作.依次经过1×1卷积层㊁Sigmoid函数,最后与βi通过逐元素相乘后将结果与αi进行通道级联.在此传播过程中,注意机制可融合不同尺度图像信息,同时减少无关特征的响应,增强网络调整亮度能力.独立于分解网络的约束损失,增强网络调整光照程度是基于局部一致性和结构感知[16]的假设.本文算法除了沿用Retinex⁃Net中约束增强网络的损失外,在实验中,针对Retinex⁃Net出现的色彩偏差,增加颜色损失[19],因此增强网络损失:L=L rcon+L I+μL c,其中,L rcon为增强图像的重构损失,L rcon=Snormal-R low I en1,L I表示结构感知平滑损失,L c表示本文的颜色损失,μ表示平衡系数.L rcon定义表示增强后的图像与其对应的正常光照图像的距离项,结构感知平滑损失L I 与分解网络的平滑损失类似,不同的是,在增强网络中,I en以R low的梯度作为权重系数:L I=ΔIen exp(-λgΔR low)1.此外,本文添加颜色损失L c,衡量增强图像与正常光照图像的颜色差异.先对2幅图像采用高斯模糊,滤除图像的纹理㊁结构等高频信息,留下颜色㊁亮度等低频部分.再计算模糊后图像的均方误差.模糊操作可使网络在限制纹理细节干扰情况下,更准确地衡量图像颜色差异,进一步学习颜色补偿.颜色损失为L c=F(Sen)-F(S normal)21.其中:F(x)表示高斯模糊操作,x表示待模糊的图像.该操作可理解为图像每个像素以正态分布权重取邻域像素的平均值,从而达到模糊的效果,S en为增强图像,S normal为对应的正常光照图像,F(x(i,j))=∑k,lx(i+k,j+l)G(k,l),G(k,l)表示服从正态分布的权重系数.在卷积网络中G(k,l)相当于固定大小的卷积核,G(k,l)=0.æèçç053exp k2-l2öø÷÷6.2 实验及结果分析2.1 实验环境本文算法采用LOL训练集[16]和合成数据集[16]训练网络.测试集选取LOL的评估集㊁DICM数据集㊁MEF数据集.在训练过程中,网络采用图像对训练,批量化大小(Batch Size)设为32,块大小(Patch Size)设为48×48.分解网络的损失平衡系数λ1=0.001,λ2=0.1,λ3=0.001.增强网络的平衡系数μ=0.01,λg=10.本文采用自适应矩估计优化器(Adaptive Moment Estima⁃tion,Adam).网络的训练和测试实验均在Nvidia GTX2080GPU设备上完成,实现代码基于TensorFlow框架.为了验证本文算法的性能及效果,采用如下对比算法:Retinex⁃Net,SRIE[10],LIME[11],MBLLEN[14]㊁文献[15]算法㊁全局光照感知和细节保持网络(Global Illumination⁃Aware and Detail⁃Preserving Net⁃work,GLADNet)[24]㊁无成对监督深度亮度增强(Deep Light Enhancement without Paired Supervision, EnlightenGAN)[25].在实验过程中,均采用原文献提供的模型或源代码对图像进行测试.采用如下客观评估指标:峰值信噪比(Peak Signal to Noise Ratio,PSNR)㊁结构相似性(Structural08模式识别与人工智能(PR&AI) 第34卷Similarity,SSIM)[26]㊁自然图像质量评估(NaturalQuality Evaluator,NIQE)[27]㊁通用图像质量评估(Universal Quality Index,UQI)[28]㊁基于感知的图像质量评估(Perception⁃Based Image Quality Evaluator,PIQE)[29].SSIM㊁PSNR㊁UQI 值越高,表示增强结果图质量越优.相反,PIQE㊁NIQE 值越高,表示图像质量越差.2.2 消融性实验为了进一步验证本文算法各模块的有效性,以Retinex⁃Net 为基础设计消融性实验,利用PSNR 衡量噪声水平,采用SSIM 从亮度㊁对比度㊁结构评估图像综合质量.实验结果如表1所示,表中S⁃ULS 表示浅层上下采样结构,l d 表示去噪损失.Enhan _I low 表示增强网络输入仅为光照分量,AMM 表示注意力机制模块,L c 表示颜色损失.参数微调1表示增强网络的平滑损失系数由原Retinex⁃Net 的3设为1;参数微调2是增强网络的平滑损失系数为1,批量化大小由16设为32.表1 各改进模块及损失的消融性实验结果Table 1 Ablation experiment results of improved modules and loss序号基础框架改进方法PSNRSSIM1-Retinex⁃Net 16.7740.55923Retinex⁃Net 添加S⁃ULS,不添加l d 添加S⁃ULS,添加l d17.45217.4940.6890.699456Retinex⁃Net+S⁃ULS+l d添加Enhan _I low ,不添加AMM,不添加L c 添加Enhan _I low ,添加AMM,不添加L c 添加Enhan _I low ,添加AMM,添加L c17.89718.00218.0910.7030.7080.70478Retinex⁃Net+S⁃ULS+l d +AMM+L c参数微调1参数微调218.27218.5290.7190.720 表1中序号2给出以Retinex⁃Net 为基础,采用浅层上下采样结构作为分解网络的结果.相比Retinex⁃Net,PSNR 值显著提高,表明此结构可抑制由图像分解带来的噪声.在此基础上添加去噪损失,进一步降低噪声,见序号3.由此验证浅层上下采样结构与去噪损失的有效性.在本文算法中,由于采用两步训练的方式,即先训练分解网络后训练增强网络,因此在验证浅层上下采样结构和去噪损失的有效性后,以此为基础评估增强网络引入的注意力机制模块和颜色损失的有效性.在Retinex⁃Net 中增强网络的输入为反射分量与光照分量通道级联后的结果.该设置一定程度上会导致反射分量丢失图像结构和细节,同时影响光照分量的亮度提升.为此,先设置序号4的实验验证上述分析.由结果可见:PSNR㊁SSIM 值大幅提高,证明此分析的正确性,表明本文算法的增强网络仅以光照分量作为输入的有效性.另外,从序号5结果看出,利用注意力模块后,图像噪声显著降低,这归功于注意力模块可减少对图像无关特征的响应,集中注意力学习亮度特征,从而降低图像噪声水平.在颜色损失的消融性实验中,尽管客观数值上没有直观体现颜色的恢复,但根据图2和图3可知,该损失是有效的.为了使各模块更好地发挥优势,本文算法对参数进行微调.从序号7㊁序号8的实验结果可见,微调参数后本文算法各模块作用进一步体现,取得更优结果. (a)输入图像 (b)参考图像 (a)Input image (b)Ground truth18第1期 欧嘉敏 等:改进Retinex⁃Net 的低光照图像增强算法 (c)SRIE (d)LIME (e)GLADNet (f)MBLLEN (g)EnlightenGAN (h)Retinex⁃Net (i)文献[15]算法 (j)本文算法 (i)Algorithm in reference[15] (j)The proposed algorithm图2 各算法在LOL数据集上的视觉效果Fig.2 Visual results of different algorithms on LOLdatasetA B C D(a)输入图像(a)Inputimages(b)LIME(c)GLADNet28模式识别与人工智能(PR&AI) 第34卷(d)MBLLEN(e)EnlightenGAN(f)Retinex⁃Net(g)文献[15]算法(g)Algorithm in reference[15](h)本文算法(h)The proposed algorithm图3 各算法在DICM㊁MEF数据集上的视觉效果Fig.3 Visual results of different algorithms on DICM and MEF datasets38第1期 欧嘉敏 等:改进Retinex⁃Net的低光照图像增强算法2.3 对比实验各算法在3个数据集上的客观评估结果如表2所示,表中黑体数字表示最优结果,斜体数字表示次优结果.在LOL 数据集上,SSIM 可从亮度㊁对比度㊁结构度量2幅图像相似度,与人类视觉系统(Human Vision System,HVS)具有较高相关性[21,30],可较全面体现图像质量.从表2可见,在SSIM㊁UQI 指标上,本文算法取得最高数值,表明低光照图像经本文算法增强后图像质量得到明显提升.从图2和图3发现,本文算法也提升视觉效果的表现力.由表2的LOL 数据集上结果可知,在PSNR 指标上,本文算法总体上优于先进算法.根据文献[21]㊁文献[30]和文献[31]的研究,PSNR 指标因为容易计算,被广泛用于评估图像,但其计算是基于误差敏感度,在评估中常出现与人类感知系统不一致的现象,因此与图像主观效果结合分析能更好地体现图像质量.结合图2和图3的分析,GLADNet 的增强图像饱和度较低,存在颜色失真现象.文献[15]算法使图像过曝光.本文算法在Retinex⁃Net 基础上显著降低图像噪声,保留图像丰富结构信息,相比其它方法,视觉效果更佳,符合人类的视觉感知系统.在LOL 数据集上,对比大部分算法,本文取得与参考图像相近的NIQE 数值,表明本文算法的增强结果更接近参考图像.DICM㊁MEF 数据集没有正常光照图作为参照.本文只采用盲图像质量评估指标(NIQE㊁PIQE)评估各算法.在PIQE 指标上,本文算法取得最优值.对于NIQE,虽然未取得较好优势,但相比Retinex⁃Net,本文算法取得更好的增强结果.综上所述,虽然本文算法未在所有指标上取得最优结果,但仍有较高优势.在与人类视觉感知系统有较好相关性的SSIM 指标及噪声抑制和避免过曝能力上,本文算法最优.表2 各算法在3个数据集上的客观评估结果Table 2 Objective evaluation results of different algorithms on 3datasets 算法LOL 数据集SSIMUQI PSNRNIQEDICM 数据集PIQE NIQEMEF 数据集PIQE NIQESRIE0.4980.48211.8557.28716.95 3.89810.70 3.474LIME 0.6010.78916.8348.37815.60 3.8319.12 3.716GLADNet 0.7030.87919.718 6.47514.85 3.6817.96 3.360MBLLEN0.7040.82517.5633.58412.293.27012.043.322EnlightenGAN 0.6580.80817.483 4.68414.613.5627.863.221文献[15]算法0.7120.86019.150 4.79316.21 4.71811.78 4.361Retinex⁃Net 0.5590.87916.7749.73014.16 4.41511.90 4.480本文算法0.7200.88018.5294.49010.11 3.9607.77 3.820参考图像1--4.253---- 由图2可见,SRIE㊁LIME㊁EnlightenGAN 的增亮程度有限,增强结果偏暗.GLADNet㊁MBLLEN 改变图像饱和度,降低图像视觉效果.相比Retinex⁃Net,本文算法的增强结果图噪声水平较低,可保持图像原有的色彩.由图3可见,SRIE 增强程度远不足人类视觉需求,在图3的图像A 中,未展示其增强结果.在图3中,根据人眼视觉系统,首先能判断LIME㊁GLADNet㊁MBLLEN㊁EnlightenGAN 对人脸的增亮程度仍不足,GLADNet㊁MBLLEN 分别存在饱和度过低和过高现象.而文献[15]算法㊁Retinex⁃Net㊁本文算法能较好观察到人脸细节,但同时从左下角细节图可见,Retinex⁃Net 人脸的边缘轮廓对比度过强.从颜色上进一步分析可知,文献[15]算法亮度过度增强,导致图像颜色失真,如天空的颜色对比输入图像色调偏白㊁曝光难以观察远处的景物等.同样观察图3中图像B 右下角细节,经过分析可得,文献[15]算法㊁Retinex⁃Net㊁本文算法增强程度满足人类视觉需求,但文献[15]算法过曝光,从Retinex⁃Net 细节图可见人物服饰失真.对比其它算法,本文算法结果的亮度适中,图像具有丰富细节,避免光照伪影与过曝光现象.另外,从图3中图像C㊁D 可见,LIME㊁GLAD⁃Net㊁MBLLEN㊁EnlightenGAN 没能较好处理局部暗48模式识别与人工智能(PR&AI) 第34卷区,如笔记本㊁右下角拱门窗户区域仍未增亮,导致难以识别边缘细节.而文献[15]算法饱和度发生变化.Retinex⁃Net存在伪影和噪声等问题.本文算法不仅可增强图像的亮度,减小噪声,还保留图像的细节,图像效果更有利于高级视觉系统的识别或检测.综合上述分析发现,本文算法对低光照图像增强效果更优.3 结束语针对Retinex⁃Net噪声较大㊁颜色失真问题,本文提出改进Retinex⁃Net的低光照图像增强算法.算法在分解网络采用浅层上下采样结构及去噪损失,在增强网络嵌入注意力机制模块和颜色损失.实验表明,本文算法不仅能增强图像亮度,而且能显著降低噪声,并取得较优结果.本文算法较好地处理亮度增强过程中无法避免的噪声问题,兼顾提升亮度和降低噪声任务,可给未来研究图像多属性增强提供思路,如低光增强㊁去噪㊁颜色恢复㊁去模糊等多任务同步进行.今后研究重心将是实现图像多属性同步增强.同时扩展该研究网络结构到其它高级视觉任务中,作为图像预处理模块,期望实现网络的端到端训练.参考文献[1]HU X,MA P R,MAI Z H,et al.Face Hallucination from Low Quality Images Using Definition⁃Scalable Inference.Pattern Reco⁃gnition,2019,94:110-121.[2]陈琴,朱磊,后云龙,等.基于深度中心邻域金字塔结构的显著目标检测.模式识别与人工智能,2020,33(6):496-506. (CHEN Q,ZHU L,HOU Y L,et al.Salient Object Detection Based on Deep Center⁃Surround Pyramid.Pattern Recognition and Artificial Intelligence,2020,33(6):496-506.)[3]杨兴明,范楼苗.基于区域特征融合网络的群组行为识别.模式识别与人工智能,2019,32(12):1116-1121. (YANG X M,FAN L M.Group Activity Recognition Based on Re⁃gional Feature Fusion Network.Pattern Recognition and Artificial Intelligence,2019,32(12):1116-1121.)[4]CHENG H D,SHI X J.A Simple and Effective Histogram Equaliza⁃tion Approach to Image Enhancement.Digital Signal Processing, 2004,14(2):158-170.[5]ABDULLAH⁃AI⁃WADUD M,KABIR M H,DEWAN M A A,et al.A Dynamic Histogram Equalization for Image Contrast Enhancement. IEEE Transactions on Consumer Electronics,2007,53(2):593-600.[6]LAND E H.The Retinex Theory of Color Vision.Scientific Ameri⁃can,1977,237(6):108-128.[7]JOBSON D J,RAHMAN Z,WOODELL G A.Properties and Per⁃formance of a Center/Surround Retinex.IEEE Transactions on Im⁃age Processing,1997,6(3):451-462.[8]JOBSON D J,RAHMAN Z,WOODELL G A.A Multiscale Retinex for Bridging the Gap between Color Images and the Human Observa⁃tion of Scenes.IEEE Transactions on Image Processing,1997, 6(7):965-976.[9]WANG S H,ZHENG J,HU H M,et al.Naturalness Preserved En⁃hancement Algorithm for Non⁃uniform Illumination Images.IEEE Transactions on Image Processing,2013,22(9):3538-3548.[10]FU X Y,ZENG D L,HUANG Y,et al.A Weighted VariationalModel for Simultaneous Reflectance and Illumination Estimation// Proc of the IEEE Conference on Computer Vision and Pattern Re⁃cognition.Washington,USA:IEEE,2016:2782-2790. [11]GUO X J,LI Y,LING H B.LIME:Low⁃Light Image Enhance⁃ment via Illumination Map Estimation.IEEE Transactions on Image Processing,2017,26(2):982-993.[12]FUKUSHIMA K.Neocognitron:A Self⁃organizing Neural NetworkModel for a Mechanism of Pattern Recognition Unaffected by Shift in Position.Biological Cybernetics,1980,36:193-202. [13]LORE K G,AKINTAYO A,SARKAR S.LLNet:A Deep Autoen⁃coder Approach to Natural Low⁃Light Image Enhancement.Pattern Recognition,2017,61:650-662.[14]LÜF F,LU F,WU J H,LIM C S.MBLLEN:Low⁃Light Image/Video Enhancement Using CNNs[C/OL].[2020-05-11].http:// /bmvc/2018/contents/papers/0700.pdf. [15]ZHANG Y,DI X G,ZHANG B,et al.Self⁃supervised Image En⁃hancement Network:Training with Low Light Images Only[C/ OL].[2020-05-11].https:///pdf/2002.11300.pdf.[16]WEI C,WANG W J,YANG W H,et al.Deep Retinex Decom⁃position for Low⁃Light Enhancement[C/OL].[2020-05-11].https: ///pdf/1808.04560.pdf.[17]ZHANG Y H,ZHANG J W,GUO X J.Kindling the Darkness:APractical Low⁃Light Image Enhancer//Proc of the27th ACM In⁃ternational Conference on Multimedia.New York,USA:ACM, 2019:1632-1640.[18]AI S,KWON J.Extreme Low⁃Light Image Enhancement for Sur⁃veillance Cameras Using Attention U⁃Net.Sensors,2020,20(2): 495-505.[19]IGNATOV A,KOBYSHEV N,TIMOFTE R,et al.DSLR⁃QualityPhotos on Mobile Devices with Deep Convolutional Networks// Proc of the IEEE International Conference on Computer Vision.Washington,USA:IEEE,2017:3297-3305.[20]TIKHONOV A N,ARSENIN V Y.Solutions of Ill⁃Posed Pro⁃blems.SIAM Review,1979,21(2):266-267. [21]ZHAO H,GALLO O,FROSIO I,et al.Loss Functions for ImageRestoration with Neural Networks.IEEE Transactions on Computa⁃58第1期 欧嘉敏 等:改进Retinex⁃Net的低光照图像增强算法。
Entropy-based image mergingA.German,M.R.Jenkin,Y.Lesp´e ranceDepartment of Computer Science and Engineering and Centre for Vision Research York University,Toronto,Ontario,Canada.{german,jenkin,lesperan}@cs.yorku.caAbstractSpacecraft docking using vision is a challenging task. Not least among the problems encountered is the need to visually localize the docking target.Here we consider the task of adapting the local illumina-tion to assist in this docking.An online approach is developed that combines images obtained under dif-ferent exposure and lighting conditions into a single image upon which docking decisions can be made. This method is designed to be used within an intel-ligent controller that automatically adjusts lighting and image acquisition in order to obtain the“best”possible composite view of the target for further im-age processing.Keywords:Image Entropy,High Dynamic Range. 1IntroductionPerhaps the most interesting vision tasks involve guiding semi-autonomous vehicles such as unmanned underwater vehicles,mining machines and space-craft.Given the widely varying and often poor light-ing conditions encountered in such tasks,the remote video camera is often associated with one or more (typicallyfixed)but controllable light sources.The camera itself often has a variety of controllable pa-rameters such as shutter speed and aperture.Given the controllable intrinsic camera parameters,and the controllable light sources,the remote operator ma-nipulates the various camera parameters and lighting options in order to be able to carry out the required task.This task may be performed directly by a hu-man operator or it may be performed by a software agent with or without human intervention.In either case,the operator manipulates the camera param-eters and the available lighting in order to ensure that those portions of the image that are critical to the task at hand are illuminated appropriately(see Figure1a).Choosing an appropriate illumination for a hu-man operator is an extremely complex problem. Maximizing one illuminant may place portions of the scene in high relief,while at the same time casting shadows over other portions of the image.Interac-tions between the illuminants and gain control within the camera itself complicates the task even further. Perhaps the most common version of this problem is the lighting problem portrait photographers en-counter:How should the various illuminates be lit and the camera controlled in order for the camera to best capture the subject?Note that what“best”is depends significantly on the specific task on hand.In the machine vision domain,the task becomes even more complex.Cameras typically have a lim-ited dynamic range so they often cannot be used to effectively image the whole scene in one acquisition. Unlike natural settings,one simplifying assumption that is often made is that the only active agent in a teleoperated setting is the teleoperated agent.As-suming that the scene is static,then it is possible to illuminate different parts of the image under differ-ent illuminates and camera capture parameters,and then to combine different parts of the image captured under different conditions into a single composite im-age.To consider this illumination problem in its sim-plest form,consider a spacecraft equipped with a camera-light arrangement like that given in Fig-ure1b.If one assumes that the underlying camera capture and scene geometry is static,i.e.,the space-craft are not moving relative to each other and the position of the camera,the lights and object being viewed remain unchanged,then the camera’s intrin-sic parameters and the level of illumination provided by each light can be manipulated.Furthermore,if the aperture,focus and focal length of the camera re-main unchanged,then over a set of images taken un-der different lighting and camera parameters,a given pixel(u,v)in the camera will always image the same scene point and image blur will remain constant.Un-der these conditions,the process of combining mul-tiple images into a single image can be expressed at the pixel level–how should a specific pixel values at (u,v),taken under different illumination and camera(a)A computer graphics rendering of the space shuttle docking procedure.(b)An intelligent controller can manipulate light-ing intensities and camera intrinsics in order to derive an accurate model of the relationships be-tween the spacecrafts involved in a docking pro-cedure.Figure1:Illumination issues in teleoperation.How can the scene be best illuminated and captured in order to dock the two vehicles?parameters,be combined to obtain a composite pixelvalue at(u,v)?1.1Formal Statement of the ProblemGiven a set of images{I1,...,I N},a functionφis de-sired that combines the set into a single image˜I1...N.Notationally,we seek a functionφ()that operates atthe pixel level and that has the following properties:˜I1=φ(I1)˜I1...N=φ(I1,I2,...,I N)In order for the image merging to operate in an ef-ficient,online mannerφshould have the propertythat˜I1...N+1=φ(˜I1...N,I N+1)That is,it should be efficient to compute the N+1th image of˜I given the computation for the N th image. 2Related WorkThe problem of combining multiple images taken un-der varying sensor/lighting conditions has received considerable attention in the literature,although not in the limited scope of the algorithm being consid-ered here.High dynamic range images have many properties in common with the task being consid-ered here(see[1]for an introduction to the prob-lem of high dynamic range images).A commonly considered problem in high dynamic range images is the task of rendering the wide range of data avail-able at a given pixel(u,v)given the limited display range of the intended displays:That is,given the set I how to compute an image˜I that best repre-sents the input images.In the high dynamic range image case,the various images I are typically cap-tured before the image processing takes place and an offline version of the algorithm is appropriate–the display is not updated as new bands of image infor-mation are obtained.Several approaches to this‘ren-dering of high dynamic range images’problem have been devised and implemented in both hardware and software.Cameras like the QinetiQ High Dynamic Range Logarithmic CMOS Cameras1compress the dynamic range of the image using on-board logarith-mic intensity compression.The system described in [1]uses several images under different exposures to recover the camera’s response function and from that is able to fuse the images into a single,high dynamic range radiance map.In[2]a contrast compression al-gorithm using a coarse-to-fine hierarchy is described. In[3]a system is developed that performs gradient attenuation to reduce the dynamic range in the im-age.The algorithm described in this work is based in part on the approach of Goshtasby[4].The basic ap-proach of[4]is to combine images in a manner that maximizes the entropy of the resulting combined im-age,while using a smoothing function to ensure that the resulting image does not exhibit intensity discon-tinuities that were not present in the input images. 3Basic ApproachThe basic approach developed for the online com-bination of images builds upon Goshtasby’s entropy based high dynamic range reduction algorithm[4],(a)(b)(c)(d)(e)(f)(g)(h)Figure2:Top row:(a)Illumination by illuminant1only at100%(b)Illumination by illuminant2only at 100%(c)Illumination by illuminant3only at100%(d)Illumination by illuminant1,2and3all at100%. Bottom row:(e)-(g)The composite image at various stages during the addition of the512source images.(h)Thefinal composite image after all512images have been added.but differs in how the images are combined.In thissystem,the images are merged on a pixel per pixelbasis by weighting the local pixel values by their localentropy estimate.Entropy was chosen as a measure of the detailprovided by each picture.The entropy(see[5])isdefined as the average number of binary symbols nec-essary to code a given input given the probability ofthat input appearing an a stream.High entropy isassociated with a high variance in the pixel values,while low entropy indicates that the pixel values arefairly uniform,and hence little detail can be derivedfrom them.Therefore,when applied to groups ofpixels within the source images,entropy provides away to compare regions from the different source im-ages and decide which provides the most detail.The method developed for this task,though sim-ple,is bothflexible and powerful.Every pixel in thefinal image is computed as the weighted average ofthe corresponding pixels in the source images whereeach value is weighted by the entropy of the sur-rounding region.For each pixel p=(u,v)in thefinalimage there are corresponding pixels p1,p2,...,p N,one for each source image.For each pixel p i in eachimage,the local entropy(measured within afixedwindow)v i is computed,and the weighted average pis computed asp=Ni=1p i v iFigure3:The combined result of all512images after gamma correctionThefinal image pixel p can be computed from G r p,G g p,G b p and I r p,I g p,I b p asp r=G r pI g pp b=G b p(see[6]).Acknowledgments:We would like to acknowledge the support of Mark Obsniuk,Andrew Hogue and Olena Borzenko.Thefinancial support of CITO, MDRobotics and NSERC is greatly appreciated. References[1]Debevec,P. E.and Malik,J.,“RecoveringHigh Dynamic Range Radience Maps from Pho-tographs,”Proceedings of SIGGRAPH1997, ACM Press/ACM SIGGRAPH,369-378,1997.[2]Tumblin,J.and Turk,G.,“LCIS:A Bound-ary Hierarchy for Detail-Preserving Contrast Reduction,”Proceedings of SIGGRAPH,ACM Press/ACM SIGGRAPH,83-90,1999.[3]Fattal,R.,Lischinski, D.,and Werman,M.,“Gradient Domain High Dynamic Range Com-pression,”Proceedings of SIGGRAPH,ACM Press/ACM SIGRAPH,249-256,2002.[4]Goshtasby, A. A.,“High Dynamic RangeReduction Via Maximization of Image In-formation.”,/agosh-tas/hdr.html[5]Shannon,C.E.,“A Mathematical Theory ofCommunication,”Bell System Technical Jour-nal,Vol.27,379-423,623-656,1948.[6]Borzenko,O.,Lesperance,Y.,and Jenkin,M.R.,“Controlling Camera and Lights for Intel-ligent Image Acquisition and Merging.”,IEEE Second Canadian Conference on Computer and Robot Vision(CRV2005),2005.(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)(k)Figure 5:Top row:(a)-(g)Images taken as luminosity increases.Bottom row:(h)-(k)Composites of (a)-(g)images with window sizes 5,11,21,41respectively(a)(b)(c)(d)(e)Figure 6:Shows the effect of random noise as well as blank images on the composite.Top row:The source images (a)an image taken of the object,(b)an image of random noise,(c)a blank image.Bottom Row:(d)A composite comprised of (a)and (b),(e)a composite comprised of (a)and (c)。
基于预测差值的医学图像可逆信息隐藏范鑫惠; 李辉【期刊名称】《《北京化工大学学报(自然科学版)》》【年(卷),期】2019(046)002【总页数】7页(P83-89)【关键词】可逆信息隐藏; 医学图像; 中值边缘预测; 预测差值【作者】范鑫惠; 李辉【作者单位】北京化工大学信息科学与技术学院北京100029【正文语种】中文【中图分类】TP391引言信息隐藏是将受保护的信息嵌入到载体中,可逆信息隐藏技术是其中一个重要的分支,不仅能完整提取嵌入信息,同时可无损恢复载体。
近年来随着医疗数字化进程的推进,海量的医学图像在网络上传播,传输过程中任何的误差都有可能导致误诊,而且如果在医学图像中嵌入了病人的诊断信息,则需要很高的嵌入量。
普通的嵌入算法在嵌入大量信息的同时,会不可避免地使解密后的图像出现一定程度的失真。
因此,对于具有高嵌入量和高精度要求的医学图像,需采用可逆信息隐藏算法,保证原图像和解密后图像的一致性。
目前的可逆信息隐藏方法主要有3 类:无损压缩[1-4]、图像直方图平移[5-7]和差值扩展[8-11]。
基于无压缩的可逆算法实现简单,但在嵌入容量上不具有优势而且容易带来图像失真;基于直方图的可逆算法虽然保证了图像精度,但嵌入量仍达不到令人满意的效果。
因此研究者开发了利用相邻像素之间的相关性嵌入秘密信息的差值扩展技术[8],在传统的差值扩展技术的基础上,提出了基于预测差值的可逆算法。
张丽娜[9]采用平均值预测算法生成预测差值,利用待预测像素点周围的8 个像素进行预测,预测准确度有所提高,但嵌入量减小;Hong等[10]采用中值边缘预测(median edge detection prediction,MPE)算法生成预测差值,再根据预测差值的分类对像素点进行平移或保持不变,虽然提高了嵌入量,但未对像素值为0 和255 的像素点进行操作,仍不能满足医学图像高嵌入量的要求。
本文基于文献[10]中的MPE 算法,结合医学图像的特点,提出一种基于预测差值的医学图像可逆信息隐藏算法。
基于图像块相似性和补全生成的人脸复原算法苏婷婷;王娜【摘要】图像获取过程中,由于成像距离、成像设备分辨率等因素的限制,成像系统难以无失真地获取原始场景中的信息,产生变形、模糊、降采样和噪声等问题,针对上述情况下降质图像的复原问题,提出了适用于低分辨率,低先验知识情况下的人脸复原方法,通过基于图像相似性的期望块1o9相似性EPLL(expected patch log likelihood)框架来构建人脸复原效果的失真函数,利用生成对抗网络的图像补全式生成过程来复原图像.所提算法在加噪率50%以及更高情况下可以保持较好的人脸图像轮廓与视觉特点,在复原加噪20%的降质图像时,相比传统的基于图像块相似性的算法,本文算法复原结果的统计特征峰值信噪比PSNR(peak signal-noise ratio)与结构相似度SSIM(structural similarity)值具有明显优势.【期刊名称】《科学技术与工程》【年(卷),期】2019(019)013【总页数】6页(P171-176)【关键词】图像复原;图像块相似性;生成对抗网络;人脸复原;图像补全【作者】苏婷婷;王娜【作者单位】武警工程大学密码工程学院,西安710086;武警工程大学基础部,西安710086【正文语种】中文【中图分类】TP391.413在图像获取过程中,由于成像距离、成像设备分辨率等因素的限制,成像系统难以无失真地获取原始场景中的信息,通常会受到变形、模糊、降采样和噪声等诸多因素的影响,导致获取图像的质量下降。
因此,如何提高图像的空间分辨率,改善图像质量,一直以来都是成像技术领域亟待解决的问题[1]。
图像复原技术致力于从一定程度上缓解成像过程中各种干扰因素的影响,主要采用的方法是将降质图像建模为原始图像与点扩展函数PSF(point spread function) 的卷积加上噪声的形式,根据PSF是否已知分为传统的定向复原与盲复原。
Image Compression TechniquesAbstract:In this chapter,the basic principle of a commonly used technique for image compression called transform coding will be described.After a short summary of useful image formats,we shall describe two commonly used image coding standards, the JPEG.Keyword:image compression,JPEG standards.To use the computer for image analysis to achieve the required results of the technology.Also known as image processing.The basic content of image processing generally refers to digital image processing. Image processing technology,including the main elements of image compression,and enhanced recovery,match,describe and identify three parts.Digital images are compressed by the image of an enormous amount of data,a typical digital image is usually from500×500or1000×1000pixels composition.If the image is dynamic,is its greater volume of data.Therefore image compression for image storage and transmission are very necessary.There were two types of compression algorithm,yet that is really similar to the methods and means.The most common non-distortion of space or time compression from adjacent pixels on the value of the poor,and then encoded.Run code is such compression code example.Approximate image compression algorithm used to exchange most of the way,such as the fast Fourier transform images or discrete cosine transform.Well-known,as the international image compression standard JPEG and MPEG are similar to a compression algorithm.The former used for still images, which used to moving images.They have been chip.Image enhancement and recovery image enhancement goal is to improve the quality of images,such as increasing contrast,remove the ambiguity and noise,that geometric distortion;image restoration is known in the vague assumption that the model or the noise,trying to estimate the original image of a Technology.Image enhancement by the methods used can be divided into the frequency domain law and space domain law.The former of the two-dimensional image as a signal, based on their two-dimensional Fourier transform the signal enhancement.Low pass filter(that is,only low-frequency signals through),can remove the map of noise,a high pass filter,can enhance the edge of high-frequency signals,and so on,so that the fuzzy picture has become clear.A representative of the space domain algorithm for a local law and the average median filter(from the middle of local jurisdictions adjacent pixels)Act,they can be used for the removal or weakening of noise.Image formatReal world images,such as color images,usually contain different components.For color images represented in the RGB color system,there will be three component images corresponding to the R,G,and B components.Since the RGB color component is relatively uniform in terms of quantization,they are frequently employed in color sensors with each component being quantized to 8bits.From the trichromatic theory of color mixture,most colors can be represented by three properly chosen primary colors.The RGB color primary,which contains the red,green and blue colors,is most popular for illuminating sources.The CMY primary is very common for reflecting light sources and they are frequently employed in printing (the CMYK format).Other than the RGB system,there are a number of color coordinate systems such as YIQ,YUV,XYZ,UVW,U*V*W*,L*a*b*,and L*[236,127].Since human visual system (HVS)is less sensitive to high-frequency chrominance information,the YCbCr color system is commonly used in image coding.The RGB image can be converted to the YCbCr color space using the following formula0.2990.5870.1140.1690.3310.500. 0.5000.4190.081Y R Cb G Cr B ⎡⎤⎡⎤⎡⎤⎢⎥⎢⎥⎢⎥=−−⎢⎥⎢⎥⎢⎥⎢⎥⎢⎥⎢⎥−−⎣⎦⎣⎦⎣⎦Transform coding of imagesFor simplicity,we will consider grey scale images first.For color images,the original image is usually converted to the YCrCb-(4:2:0)format and the same technique for using the Y component image is applied to the Cr and Cb component images.The image to be encoded is first divided into (N*N)non-overlapping blocks,and each block is transformed by a 2D transformation such as the 2D discrete cosine transform (DCT).The basic idea of transform coding is to pack most of the energy of the image block into a few transform coefficients.This process is usually called energy compaction.The transform coefficients are then adaptively quantized.The quantized coefficients and other auxiliary information will be entropy coded and packed according to a certain format into a bit-stream for transmission or storage.At the decoder,the bit-stream is decoded to recover the various information.Since the amplitudes of the transform coefficients usually differ considerably from each other,it is advantageous to use a different number of quantizer levels (i.e.,bits)for each transform coefficients.This problem is called the bit allocation problem.QuantizationThere are a number of methods to encode the transform coefficients.For example,a popular method is to employ scalar quantization followed by run-length and entropycoding.Alternatively,VQ or embedded zero-tree coding can be applied[331].For simplicity,we only describe the first approach,which is employed in the JPEG Baseline coding.Similar methods are also employed in other video coding standards. Most coding standards require the image pixels be preprocessed to have a mean of zero.For RGB color space,all color components have a mean value of128(8-bit /pixel).In YCbCr color space,the Y component has an average value of128,while the chrominance components have an average value of zero.JPEG standardThe JPEG(Joint Photographic Experts Group)standard is an ISO/IEC international standard(10918-1)for Digital compression and coding of continuous-tone still images.It is also an ITU standard known as ITU-T Recommendation T.81.To satisfy different requirements in practical applications,the standard defines four modes of operation:Sequential DCT-based:This mode is based on DCT-based transform coding with a block size of(8x8)for each color component.The transform coefficients are runlength and entropy coded.A subset of this mode is the Baseline Mode,which is an implementation with a minimum set of requirements for a JPEG compliant decoder.Progressive DCT-based:This mode is similar to the sequential DCT-based algorithm,except that the quantized coefficients are transmitted in multiple scans.By partially decoding the transmitted data,this mode allows a rough preview of the transmitted image to be obtained at the decoder having a low transmission bandwidth. Lossless:This mode is intended for lossless coding of digital images.It uses a prediction approach,where the input image pixel is predicted from adjacent encoded pixels.The prediction residual is then entropy-coded.Hierarchical:This mode provides spatial scalability and encodes the input image into a sequence of increasing resolution.The lowest resolution image can be encoded using either the lossy or lossless techniques in other mode,while the residuals are coded using the lossy or DCT-based modes.JPEG supports multiple component images.For color images,the input image is usually in RGB and other formats like luminance and chrominance representation (YUV,YCbCr),etc.The color space conversion process is not part of the standard, but most codecs employ the YCbCr system because the chrominance components can be decimated by a factor of two in the horizontal and vertical dimensions to achieve a better compression performance.Either Huffman or arithmetic coding techniques can be used in the JPEG modes (except the Baseline mode,where Huffman coding is mandatory)for entropy coding. The arithmetic coding techniques usually perform better than the Huffman coding in JPEG,while the latter is simpler to implement.For Huffman coding,up to4AC and2 DC tables can be specified.The input image to JPEG may have from I to65,535lines and from1to65,535pixels per line.Each pixel may have from1to255color components except for progressive mode,where at most four components are allowed. For the DCT modes,each component pixel is an8or12bits unsigned integer,except for the Baseline mode,where8-bit precision is allowed.For the lossless mode,arange from2to16bits is supported.。
轻敲模式下原子力显微镜的能量耗散魏征;孙岩;王再冉;王克俭;许向红【摘要】There are many imaging modes in atomic force microscopy (AFM), in which the tapping mode is one of the most commonly used scanning methods. Tapping mode can provide height and phase topographies of the sample surface, in which phase topography reflects more valuable information of sample surface, such as surface energy, elasticity, hydrophilic hydrophobic properties and so on. According to the theory of vibration mechanics, the phase is related to the energy dissipation of the vibration system. The dissipation energy between the tip and sample in tapping mode of AFM is a very critical key to understanding the image mechanism. It is affected by sample properties and lab environment. The loading and unloading curves of tip and sample interaction are given based on the JKR model while the capillary force is not considered. The unstable position of jump out between the tip and sample is show,and then the energy dissipation in a complete contact and separate process is calculated. The effect of roughness of sample surfaces on energy dissipation is also discussed. It is provided that the extrusion effect is the dominant fact or in liquid bridge formation by characteristic time contrast when capillary force is considered in tapping mode AFM. The effects of relative humidity on energy dissipation are numerically calculated under isometric conditions. Finally, the relationship between phase image of AFM and sample surface energy, Young's modulus, surface roughness andrelative humidity is briefly explained by one-dimensional oscillator model. The analyses show that the difference of surface roughness and ambient humidity can cause phase change, and then they are considered as the cause of artifact images.%原子力显微镜有多种成像模式,其中轻敲模式是最为常用的扫描方式.轻敲模式能获取样品表面形貌的高度信息和相位信息,其中相位信息具有更多的价值,如能反映样品的表面能、弹性、亲疏水性等.依据振动力学理论,相位与振动系统的能量耗散有关.探针样品间的能量耗散对于理解轻敲模式下原子力显微镜的成像机理至关重要,样品特性和测量环境会影响能量耗散.本文在不考虑毛细力影响下,基于JKR接触模型,给出了探针样品相互作用下的加卸载曲线,结合原子力显微镜力曲线实验,给出了探针-样品分离失稳点的位置,从而计算一个完整接触分离过程的能量耗散,进而讨论考虑表面粗糙度对能量耗散的影响.在轻敲模式下考虑毛细力影响,通过特征时间对比,证明挤出效应是液桥生成的主导因素,在等容条件下,用数值方法计算了不同相对湿度对能量耗散的影响.通过一维振子模型,简要说明原子力显微镜相位像与样品表面能、杨氏模量、表面粗糙度、相对湿度之间的关系.分析表明,表面粗糙度和环境湿度均会引起相位的变化,进而认为它们是引起赝像的因素.【期刊名称】《力学学报》【年(卷),期】2017(049)006【总页数】11页(P1301-1311)【关键词】原子力显微镜;相位像;黏附;液桥;能量耗散;毛细力【作者】魏征;孙岩;王再冉;王克俭;许向红【作者单位】北京化工大学机电工程学院,北京 100029;北京化工大学机电工程学院,北京 100029;北京化工大学机电工程学院,北京 100029;北京化工大学机电工程学院,北京 100029;中国科学院力学研究所非线性力学国家重点实验室,北京100190【正文语种】中文【中图分类】TH742.91986年诺贝尔物理学奖授予了电子显微镜和扫描隧道显微镜(scanning tunneling microscope,STM)的发明者.随后一系列的扫描探针显微镜(scanning probe microscope,SPM)面世,这其中就包括原子力显微镜(atomicforcemicroscope,AFM)[1].不同于STM,AFM对扫描样品没有导电的要求,扩大了扫描样品范围,更本质的区别是相对于 STM测量的隧道电流,AFM 测量的是探针与样品间的作用力,因此在本源上 AFM比 STM更具有力学 (机械)本质[2].AFM的核心力学传感部件是一根微悬臂梁,在接触式扫描中,通过微悬臂梁的弯曲或扭转变形而得到样品的表面形貌和力学性质(模量、黏性、摩擦等).在非接触模式中(包括轻敲模式),主要通过微悬臂梁的振幅、相位和频移来反映样品的表面形貌和力学性质[3-4].接触式和轻敲式是AFM的两种最主要形貌成像方式,由于轻敲模式采用的是探针与样品间歇式接触方式,因此这种扫描方式对样品(特别是软物质,如生物组织)的损伤最小.另外通过微悬臂梁的相位变化能提供更多的样品信息,因此轻敲式扫描为最常用的扫描方式.尽管AFM技术取得了巨大的进步,但其仍然存在很严重的缺陷,即使对于非常熟练的操作者,发现扫描形貌中的干扰因素和赝像都是一件很困难的事情[5-6].赝像产生的原因很多,依据噪声来源,可分为探针因素[5]、扫描器因素[7]、样品因素以及探针与样品相互作用因素[8-10].尽管赝像问题非常普遍,但近二十年来仅有有限的文献论述了AFM的赝像问题.关于探针与样品之间作用力引起赝像的论述更少[10].在AFM的测量中,若要解释其成像机理,理解针尖和样品之间的黏附力是必不可少的.不得不强调的是,作为一种探针技术,对针尖样品间作用力的准确控制是获得高分辨率形貌的最为重要的因素.不同的样品表面和探针间距会引起不同的作用力,但针尖与样品之间的黏附力在本质上都是电磁作用.针尖与样品之间的黏附力主要由毛细力、静电力、短程斥力、范德华力等构成[11-16].毛细力会掩盖其他作用力,如在有毛细力存在的情况下,范德华力会降1∼2个量级[17].生物、有机材料或无机材料,由于其亲疏水特性的不同,在不同湿度下,往往会带来扫描图像中高度、相位的差别[12].这些差别都是由于在扫描过程中液桥的生成而非范德华力的作用.因此,在大气环境中,湿度影响液桥的生成、破碎,进而影响毛细力,研究湿度对AFM形貌测量的影响,从而合理地控制毛细力,是避免产生赝像获得高分辨率图像的关键所在.到目前为止,关于湿度对AFM扫描图像的影响的研究还是零星地分布于各文献中[6-7],没有较系统的研究,甚至还没有明确提出湿度会引起赝像的观点.AFM轻敲模式下所得相位图比高度形貌图更能反映样品的材料特性,如黏附、弹性以及黏弹性等,相位图反映的是AFM微悬臂梁响应与压电管激振之间的相差[12].按照Cleveland等的观点,相位差与系统的能量耗散有关,能量耗散存在于探针与样品的机械接触中[18].耗散反映了被测材料的黏弹性[19],利用该特性可以分辨不同种类物质在整个材料中的分布.但不同的作用力使得耗散能不同,从而使相位图像产生变化.探针与样品间的接触能量耗散是引起轻敲模式下相位变化的主要因素,本文拟将作者多年来对此种接触下各种因素引起的能量耗散进行分析、讨论,以期对成像机理和赝像有更一步的认识.对于微纳尺度下的接触,在理想情况下,分离两个接触表面所需要的功等于两个表面相接触时所获取的功.但在实际情况下,即使表面力的作用和接触物体的弹性变形是可逆的,将两个表面分离所需要的功仍大于表面相接触时黏着力所做的功,即接触与分离过程是不可逆的,存在能量耗散.这种现象称为黏着接触滞后[20].另外,黏着接触滞后还有一个表现,就是在接触分离过程中,其加载与卸载的路径不同,即卸载具有滞后性,这也是称之为接触滞后的原因,这种滞后在实际界面现象中非常常见.从能量的观点和加卸载路径的观点分析黏着接触滞后,是处理这类问题的两个基本方法.图1为AFM典型力曲线.当探针从远处向样品接近时,针尖与样品间的作用力很弱,微悬臂梁探针端挠度为0,这一阶段为图中线段ab所示.当探针接近样品一定位置时,探针样品间吸引力越来越大,探针加速撞向样品,此现象为接触突跳(jump in),为图中c点,为接触分离过程中的第一次失稳过程.探针继续向样品方向移动,探针样品间的作用力变为斥力,微悬臂梁向上弯曲,此过程为cd段.此后探针撤离样品,探针样品的斥力逐渐减小,当微悬臂梁的扰度为零时,继续向上抬离探针,由于探针样品间黏附力的存在,探针样品没有发生分离,这时悬臂梁向下弯曲,探针样品的吸引力随探针向上移动一直增加,如图de段所示.最后当黏附力不足对抗弯曲梁中的弹性力时,发生突跳分离(jump out),此为接触分离过程中的第二次失稳.下面将介绍第二次失稳发生的条件.将AFM探针样品作用简化为球、弹簧、样品系统,如图2所示.其接触分离过程中存在两个失稳点,即在针尖趋近基底的过程中,针尖与基底的相互吸引力会越来越强,最终吸引力的梯度大于AFM微悬臂刚度时,进入突跳接触失稳(jump in),当针尖脱离基底时,在某个位置上,同样存在黏着力的梯度大于微悬臂的刚度,进入分离失稳(jump out),图1给出了这两个失稳的位置.因此,这两个失稳都是发生在探针样品间作用力梯度等于或即将大于微悬臂梁刚度的位置,这种失稳属于“力学不稳定性”[20].显然这类力学失稳,会引起加卸载过程的不可逆,产生能量耗散,引起黏着接触滞后.因此,有必要进一步分析探针样品作用力,以期对AFM失稳特性和力曲线测量(或接触分离过程)中的能量耗散有更明确的认识. AFM探针尖端为椎球状,其球形部分半径一般在几纳米到几十纳米之间,因此探针与样品的接触分离为微纳尺度接触问题.经典微尺度黏着接触理论有Bradley理论、DMT理论、JKR理论和M-D理论等[11,13].Johnson和Greenwood利用Maugis理论绘制了弹性接触的黏着作用分布图,也称黏着图[21],如图3所示.图中各边界的意义在文献中有较详尽的表述.实际的接触适用于何种接触理论,由两个无量纲参数决定.一个是载荷参数为外载荷,R是两接触物体的等效半径,w是界面能;另外一个是弹性参数,它和Tabor数µ等价.从黏着图中看出,黏着力与整个载荷比值取0.05是经典弹性接触与黏着弹性接触的临界点,当比值小于0.05时,表明黏着力相对于整个载荷非常小,可以忽略黏着力的影响,而采用Hertz 接触模型.相反,当比值大于0.05时,就必须考虑黏着力的影响而采用黏着接触模型.黏着接触模型的选取由第二个无量纲数(弹性参数)控制.此弹性参数详细的论述可参考Johnson等[21]的文章.Tabor数µ的定义为其中,z0为原子间平衡间距.R=R1R2/(R1+R2)为两接触物体的等效半径,R1,R2为两接触物体的半径,对AFM来说,样品为无限大平面,故R就是AFM针尖半径.为接触区等效弹性模量,Ei,υi(i=1,2)分别为样品和探针的弹性模量和泊松比.AFM探针针尖一般为Si或Si3N4材料,其弹性模量分别为168GPa和310GPa,泊松比为[22]0.22.对于比较刚硬的样品,即弹性模量大于探针材料的,等效弹性模量接近于探针材料的弹性模量,对于比较软的材料,如生物材料、聚乙烯(PE)、聚二甲基硅氧烷(PDMS)等,弹性模量取值在500Pa∼50GPa之间,这时等效弹性模量取值接近样品材料.界面能一般取[20]1∼102mJ·m−2.对 Tabor数进行估计,E*在 103Pa∼102GPa之间取值,z0=0.5nm,R=50nm.这样µ在3.4×10−3∼ 1.6×104之间取值.因此,不同的样品和探针,其微尺度接触模型可取图3中所有理论.根据Greenwood等的研究[14,23],当µ>5时,JKR接触理论模拟微尺度接触时是非常合适的.对于上述各参数,在E*和w取上述变化范围时,给出了JKR适用区域,如图4所示.随着界面能量的增大,特别是样品变软时,JKR模型是合适的.因此下面的微尺度接触模型用JKR理论.相比较于Hertz理论,实际的弹性体接触界面间除了有相互的斥力外,在物理本质上还应当引入分子间的引力作用,如范德华力,其统计学上的表现就是表面能.表面能的引入,势必会增大接触面积,这样也要重新考虑压入量和储存的弹性能.对这个问题,Johnson等[24]提出了JKR接触模型.相应的方程如下其中,a是在外载F作用下的接触半径.当没有黏着力的影响,即w=0时,上述表达式退化为Hertz弹性接触理论.外载F作用下两弹性体的压缩或拉伸量为式(2)和式(3)中F,a和δ如图2(b)和图2(c)所示.外载F可取的最小值为该力为把探针从样品上拉离所需要的最大力,称为黏附力Fad另外如果把接触问题类比于裂纹扩展,则黏附力Fad对应于恒力加载模式下的裂纹失稳问题,同样对应于恒位移模式下的失稳,可得到此时两接触物体拉开的最大位移量为[25]对应的拉力为通过式(2)和式(3),并把外载和位移无量纲化,得到JKR理论加卸载曲线如图5所示[26].从图5可以看到,JKR理论同样存在两个失稳过程,第一个失稳为突跳接触失稳(jump in),如图OA段.第二个失稳为分离失稳(jump out),这个失稳相对复杂些.从图5可以看到两个极限位置C和D,C点力梯度为0,D点力梯度为无限大.由上面力学失稳分析知,当图2(b)中弹簧刚度kc趋近于无穷小时,失稳发生在图5的C点,当图2(b)中弹簧刚度kc趋近于无穷大时,失稳发生在图5的D点.实际AFM微悬臂梁具有确定的刚度,因此探针与样品分离位置处于图5曲线CD间.依据JKR理论,C点为分离时取最大力的位置,D点为分离取最小力的位置,随着图2中微悬臂梁刚度提高,分离力逐渐变小.因此,从图5可以看出,曲线CD是AFM探针与样品分离时的状态区间,探针样品分离点位于CD曲线上力梯度等于微悬臂梁刚度处.探针样品分离时分离力F的范围为AFM力曲线实验证明上述结论,图6是AFM力曲线,样品为疏水硅片和杨氏模量为200MPa的PDMS,悬臂梁刚度分别为 0.06N/m,0.12N/m,图6(a)是刚度为0.06N/m的探针在疏水硅片上的力曲线,其失稳点发生在最大拉力处.图6(b)和图6(c)所用样品为PDMS,微悬臂梁刚度分别为0.06N/m,0.12N/m,可以看出随着微悬臂梁刚度增大,分离时的力逐渐减小,证明了上面对分离失稳判据的分析.AFM轻敲模式下,探针样品间接触分离过程的能量耗散对高度像和相位像都有非常大的影响,特别是相位像直接与耗散能相关.下面用JKR理论讨论探针样品接触分离过程中的能量耗散.由图5可以看出,接触分离过程中的耗散能是指在接触分离一个周期中外力所做的功.如果微悬臂梁较软,分离失稳发生在C点,则外力功为式中A1为图5中的阴影面积.如果微悬臂梁较刚硬,分离失稳发生在D点,则外力功为式中A2为图5中的阴影面积.通过数值计算,A1=1.07,A2=0.47.定义一个参照能量∆E=Fadδc,由式(5)和式(6)得因此,基于JKR接触模型下的能量耗散Ets为(1.07∼1.54)∆E.下面估算耗散能的大小,假定耗散能为∆E.一般情况下,AFM探针半径为50nm,考虑毛细力影响,其黏附力约为4πRγ,值约为40nN(γ取为水的表面张力).在无毛细力作用下,黏附力要降低一个量级,为4nN.接触区域的等效杨氏模量从上述表达式可以看出接近两接触材料中最软材料的杨氏模量,等效弹性模量E*取值在103Pa∼102GPa之间,则耗散能∆Ets在(8×10−20∼ 1.6×10−14)J之间.真实的接触表面都不可能达到原子级光滑,因此有必要考虑粗糙度对AFM加卸载曲线和接触分离过程能量耗散的影响.如图2所示,将AFM探针样品接触简化为半球与半无限大平面的接触模型,如果再考虑球面和半无限大平面的粗糙度,会引起数学处理上的不方便,为方便处理,假定问题为两半无限大平面的接触,其中一个平面假定为光滑平面,另一个为粗糙平面,如图7所示.假设图7下平面粗糙峰高度符合高斯分布,即其中,z为高度,φ(z)为粗糙峰高度的概率密度函数,所有的粗糙峰都假设为半径为R的半球,同图2,σ为高度分布标准差.假设单峰与平面接触分离用JKR模型,其加卸载曲线为图5所示.假设失稳发生在D点,可以得到载荷变形关系的隐式表达[26]如图7所示,先考虑接触过程,当光滑平面压入到距离粗糙平面平均线为d时,如果假设整个粗糙平面有N个尖峰,则此时共有个尖峰与光滑平面接触.同时参照图5和式(12),可得加载方程其中,δ=z− d,∆= δ/σ,∆c= δc/σ,h=d/σ.同理,考虑卸载,当光滑平面与粗糙平面处于图7位置时,卸载方程为从图8可以看出,粗糙度对加卸载曲线的影响较大,随着粗糙比的变大,加卸载路径逐渐趋于重合,接触分离中的黏附性能消失.特别是对加卸载曲线围成的面积,即能量耗散,影响较大.定义能量耗散∆Ets通过对式(15)或图8进行数值积分,得到能量耗散与粗糙度的关系如图9所示.从图9可以看出,当粗糙度变大时,会降低接触分离的耗散能量,因此,在轻敲模式的扫描图象中,表面粗糙度对高度像和相位像都有一定的影响,并可能引起赝像.大气环境下,若样品是亲水的,AFM探针与样品之间的作用力中的主导力为毛细力,它比其他作用力(范德华力、静电力等)大1∼2个量级[17].因此,有必要考虑湿度对探针样品接触分离过程中能量耗散的影响.探针与样品间的毛细力是由探针与样品间的液桥提供的.作者对AFM中液桥的生成和破碎进行过较深入的研究,关于液桥的生成,我们提出了以下模型:挤出模型、毛细凝聚模型和液膜流动模型.下面简要介绍这3个模型[27-33].在大气环境中,亲水样品表面会吸附一层或多层水分子进而形成水膜,在探针接触样品时,探针和样品表面的水膜被挤出,这部分挤出的水形成液桥,由于这部分挤出水体积较小,按照热力学理论,此时的液桥还不是最终平衡态时的液桥,挤出模型形成液桥的特征时间等于探针样品的接触时间.在探针与样品接触后,探针与样品接触区域附近的狭缝区具有极强的吸附能力,这种强吸附势会使得狭缝区空气中的水分子产生凝聚,这个过程很快完成,狭缝区外围的水分子要通过扩散运动到狭缝区再行凝聚,扩散过程相对于凝聚过程需要更长时间,因此毛细凝聚形成液桥的特征时间实际上是由扩散过程控制的.另外,由于液桥中的负压和样品水膜中的分离压作用,液桥远处的水膜向液桥流动,这个流动模型的特征时间由流动过程控制.由热力学关系,水膜厚度h与相对湿度有如下关系[33]式中哈梅克常数AH= −8.7×1021J,水的摩尔体积Vm=1.8×10−5m3/mol,普适气体常数=8.31J/(K·mol),取绝对温度=293K,p为大气蒸汽压,ps为液体饱和蒸汽压.p/ps为相对湿度.在相对湿度为65%时,h=0.2nm.在AFM轻敲模式下,微悬臂梁以接近于自身一阶共振频率(10∼500kHz)的频率振动且每一周期内探针敲击样品一次,针尖与样品每次的接触时间为式中,A为微悬臂梁振幅,h为液膜厚度,T为微悬臂梁振动周期.悬臂梁振动频率取为100kHz,A=10nm,h=0.2nm(相对湿度为 65%).振动周期为10−5s,则得tcontact?0.2µs.文献[32]曾详尽计算了相对湿度为65%的各液桥生成模型对液桥的贡献,在轻敲模式下,由于探针样品接触时间在微秒量级以下,毛细凝聚的特征时间为毫秒量级,液膜流动的特征时间为102µs∼102s,因此毛细凝聚和液膜流动对液桥贡献较小,液膜挤出在轻敲模式下占主导地位.如图10所示,假设探针和样品表面吸附水膜厚度为h,忽略探针样品接触后的弹性变形,则挤出液体体积即液桥体积为式中h?R.则由式(16)和式(18)可得不同湿度下由挤出效应所形成液桥的体积.实验和理论证明,图10(b)液桥在拉断时的临界长度Dcr正比于液桥体积的立方根[33]. 由图10(b)可以看出,液桥毛细力由液桥表面张力和液桥内外的杨--拉普拉斯压力差组成式中,ra,rm为液桥的主曲率半径,则在轻敲模式下,由挤出效应形成的液桥,在极短时间内被拉断,这个过程为等容绝热过程,外力克服毛细力做功即有液桥引起的耗散能为式(19)和式(20)联合求解,就可得到不同湿度下耗散能.由于在轻敲模式下,液桥等容变化,其几何形态复杂,在求解过程中利用了圆弧近似,所求得的毛细力与实验比较吻合.详细求解可参考文献[31-33],其耗散能与相对湿度的关系见图11.从图11可以看出,湿度对探针半径为50nm的AFM来说,在轻敲模式下,耗散能随相对湿度升高而增大,液桥断裂能即耗散能大约在10−18∼8×10−17J量级. 对于图2和图10所示的简化模型,表明无论是否考虑毛细力的影响,在接触分离的过程中都会产生能量耗散.由于轻敲模式的AFM是在高频振动状态下工作,当考虑能量耗散影响后,相当于在振动系统中引入阻尼机制,现在我们知道在一个振动周期内,如果能量耗散为∆E,由于AFM为单频激励单频响应,故将探针样品系统简化为一维阻尼振子系统,如图12所示.弹簧刚度为探针样品间作用,在本文中它既可是式(12)中JKR黏附力也可以是式(19)中的毛细力,这里总弹簧刚度计及了探针样品作用力的贡献.按照阻尼能量等效原则,图12系统在一个周期内,阻尼器所耗散的功就是上述JKR模型或液桥所引起的耗散能.图12振子系统振动方程为其中,m为微悬臂梁的等效质量,因此,由耗散能所引起的相位差φ正切为[12]式中,E为系统总能量,s为激振频率与系统固有频率比,在轻敲模式下,s?1±ε,ε为一小量,因此式(22)简化为式(23)中有正负号是由于轻敲模式下激振频率选在微悬臂梁固有频率附近,依据不同样品和目的,激振频率可小可大,如果激振频率小于固有频率,相位在[0,π/2]区间,如果激振频率大于固有频率,相位在[π/2,π]区间.我们采用JKR模型对微纳尺度下接触分离过程能量耗散进行分析,得到能量耗散∆Ets为(1.07∼由式 (23)可以看出,对于疏水样品或干燥环境下的亲水样品,即不考虑毛细力影响时,轻敲模式下的相位图反映的是探针样品的界面能和样品的软硬程度,因此在样品表面,如果不同区域的物质构成不同,则其界面能和杨氏模量是不同的,在相位图上是能够分辨出来的.如果式(23)中s<1,则界面能越高,能量耗散越大,得到的相位也越大;同样,如果样品杨氏模量比较高,则能量耗散比较小,相位就比较小.从图9可以看出,表面粗糙度越大,针尖样品间的能量耗散越小.如果还是假设s<1,从式(23)可以看出,相位要变小,这种情况下,我们就不能判断这个区域相位变小是什么原因引起的.由上面分析,我们知道相位变小的原因也有可能是界面能变小或样品模量提高.但一般情况下,这种相位的变化,被认为是样品物理化学性质的变化引起的,而不会认为是样品形貌引起相位变化,进而引起扫描图像解读的误差,这也是赝像的一种表现形式.在大气环境下的亲水样品,从图11看出,随着相对湿度的升高,耗散能在升高,从式(23)可看到,当激振频率稍低于梁固有频率时,随湿度增高,相位增大,当激振频率稍高于固有频率时,随湿度增高,相位减小.因此,同一个样品,当我们在不同的实验室环境下扫描时,会得到不同的相位图,但不管相位随湿度怎么变化,真实的样品特性是固定的,因此,我们认为湿度干扰了我们对相位图的正确分析,带来了样品的赝像.在轻敲模式下,由于微悬臂梁的振动频率很高,在样品的一个扫描点,探针与样品接触分离大于103次,因此除了探针样品的第一次接触会在接触区产生塑性变形外,其他后续接触可以忽略塑性效应,因此可以认为在轻敲模式下,塑性变形对耗散没有影响.关于材料黏弹性对耗散的影响,由于探针速度约为2πAf,A为微悬臂梁振幅,取10nm,f为激振频率,取105Hz,则探针速度在10−2m/s量级,在这种低速冲击下,是否考虑黏性的影响,是一个可商榷的问题,留待以后讨论. AFM轻敲模式下相位成像是研究物质表界面特性的重要手段,其相位主要反映的是探针样品作用时的能量耗散.本文主要就两类接触进行了研究.一类是不考虑毛细力影响的微纳尺度接触能量耗散问题,一类是考虑毛细力下的接触能量耗散问题. 在不考虑毛细力存在的情况下,采用JKR接触模型,提出了AFM力曲线中分离失稳与JKR加卸载曲线中的对应位置关系,进而计算轻敲模式下的能量耗散.采用一维振子模型,探讨影响相位的样品因素.并进一步探讨了粗糙度对能量耗散的影响,并指出粗糙度是引起赝像的原因.通过对液桥生成机理分析,对比挤出、毛细凝聚和液膜流动在液桥生成过程所需平衡时间与探针样品接触时间,认为在轻敲模式下,只有挤出效应对液桥的生成有贡献.由于探针样品接触时间极短,因此在等容条件下,计算了不同湿度下的接触能。
A Simple Block-Based Lossless Image Compression SchemeS.Grace Chang University of California,Berkeley Berkeley,CA94720grchang@Gregory S.Yovanof Hewlett-Packard Laboratories, Palo Alto,CA94304yovanof@AbstractA novel low-complexity lossless scheme for continuous-tone images dubbed the PABLO codec(Pixel And Block adaptive LOw complexity coder)is introduced.It comprises a simple pixel-wise adaptive predictor and a block-adaptive coder based on the Golomb-Rice coding method.PABLO is an asymmetric algorithm requiring no coding dictionary and only a small amount of working memory on the en-coder side.Due to the simplistic data structure for the compressed data,the decoder is even simpler lending itself to very fast implementations.Experimental results show the efficiency of the proposed scheme when compared against other state-of-the-art compression systems of considerably more complexity.1Introduction:Predictive CodingAmong the various compression methods,predictive techniques have the advantage of relatively simple imple-mentation.Predictive schemes exploit the fact that adjacent pixel values from a raster image are highly correlated.With a predictive codec,the encoder(decoder)predicts the value of the current pixel based on the value of pixels which have already been encoded(decoded)and compresses the errorsignal.If a good predictor is used,the distribution of the prediction error is concentrated near zero,meaning that the error signal has significantly lower entropy than the original, and hence,it can be efficiently encoded by a lossless coding scheme like the Huffman coding,Rice-coding or Arithmetic coding.The introduced algorithm falls under the category of pre-dictive coding.The main processing steps are(see Fig.1):1) Prediction:predict the current pixel based on the“past”pix-els to allow lossless differential predictive coding;2)Error Preprocessing:map the prediction errors to non-negative2otherwise(1)The parameters and stand for big threshold and small threshold,respectively.The intuition is that if,say,is not close to but is close to,then there is probably a horizontal edge and we take to be the predicted value. The analysis for a vertical edge is similar.1.2Rice CodesEncoding a given sequence of-bit symbols with Rice coding is like using a collection of1different HuffmanA BCXFigure2.The predictor support pixels.X isthe current pixel to be coded.codebooks designed for over a wide entropy range.Foreach entropy value,this allows the encoder to choose the best codebook from the1choices.When the datais Laplacian-distributed,the Rice coder has been shown tobe equivalent to the multiple Huffman codebook approach[1,2],but it does not require a codebook.For differentialpredictive coding,the Laplacian assumption is usually avalid one.Encoding of a given symbol with a Rice code comprisestwo components:the fundamental sequence(FS)and thesample splitting.The FS is a comma code which takes asymbol value and transforms it into“0”s followed by a“1”(the comma).For example,the codeword for3is “0001”.Sample splitting is based on the intuition that thefew least significant bits(LSB)are random and thus non-compressible and should be transmitted as biningthese two ideas,a symbol is encoded by splitting thenon-compressible LSB’s from the MSB’s,and theLSB’s are transmitted as the original bits while the MSB’s are transmitted as a FS code.The variable will be referredto as the splitting factor.Clearly,the codeword length fora given input symbol and splitting factor is given by1(2) where is the integer corresponding to the MSB’s. The default option is;i.e.transmit the originalbits,and thus guarantee that the symbol is not expanded.For each symbol of bits we canfind at least one optimal 012which gives the minimal length. By selecting among the various options,we essentiallyhave multiple encoders to choose from.1.3Pre-processor–the Rice MapperPrediction errors are usually modeled as having a Lapla-cian distribution.For-bit symbols in the range021, the prediction error exists in the range2121, requiring1bits to represent.However,given that we know the current prediction there are only2possible values for the current prediction error.Specifically,if we knowˆ,then the prediction error can only be in the range ˆ21ˆ,which requires bits to code.We use the Rice-mapper in[3]to map the original Laplacian distributed integral error value to a non-negative integer following an approximate geometric distribution.In closed form,the Rice mapper is˜20210otherwise(3)whereˆis the error residual and minˆ21ˆ.1.4Error Modeling–Estimating the StatisticsIn the case of Rice-coding the necessary statistic is the value of the optimal for a processing block.There are several ways tofind the optimal adaptively.One method, as suggested by the original Rice coder based on block-encoding,finds an optimal for each processing block through an exhaustive search among the allowable’s.To find the best splitting factor,a cumulative counter is kept for each allowable keeping track of the total block length if every pixel in the block were coded with this.The optimal is the one yielding the smallest cumulative codelength.There are simpler methods for estimating with only a slight loss in performance.One method is to compare the cumulative sum of each block,1˜,where is the number of symbols in the block,with some decision boundaries derived from assuming random LSB’s[3]. Another method is to note that adjacent blocks are highly correlated,and thus it suffices to search within1,where is the value of the previous block.2Rice-Coding Based Compression Schemes2.1Original Rice AlgorithmThe original Rice coder is a block-based algorithm[3]. The size of the processing block is a one-dimensional16x1 vector.A cumulative counter is updated for each as de-scribed in Section1.4.The best is taken to be the one which yields the least number of compressed bits.At the beginning of each block,ID bits are sent to indicate the value used(in the case of8bpp grayscale images,3ID bits are sent),followed by the encoded output of the entire block using as the splitting factor.In our investigation we have experimented with several variations of the original Rice coding method.2.2Mixed Rice and Binary EncodingNotice that for Rice encoding,even by choosing an opti-mal for each block,there are some codewords within the block that are expanded rather than compressed.Therefore, it would be better to have some symbols coded with theoptimal for that block,but binary encode,i.e.,send as de-fault,the symbols that would be expanded by the chosen value.To accomplish this,we keep a1bit per pixel bitmap indicating whether the pixel is Rice or binary encoded.This 1bpp bitmap is an expensive overhead which we need to decide whether it is worth keeping.To make this decision, we keep two cumulative counts,one for summing the total length if the entire block were Rice encoded with each al-lowable and the other one for summing the total length if some pixels were Rice-encoded and some were binary encoded.If the saving in bits is more than the overhead, then we keep a bitmap and do a mixture of Rice coding and binary encoding;otherwise,we do purely Rice coding for the entire block.There needs to be an indicator for each block telling us whether the block is mix-encoded or purely Rice en-coded.To avoid further overhead,we reserve the value6as a MARKER in the3block-ID bits that are sent indicating the value used.2.3PABLO:Mixed Rice-Binary Encoding withBlock ClassificationThe proposed PABLO scheme builds upon the previously described mixed Rice-Binary encoding method with the pur-pose of improving its performance against images with large flat areas(text,graphics,compound documents).With such images instead of coding every pixel(which can achieve at most8:1compression ratio),better compression would be achieved with a scheme like,say,runlength encoding, which skips over large contiguous areas of a single value. One simple way to do this is by block classification.That is,we use1bit per block indicating whether that block is a FLA T block,meaning that the entire block is of the same value.If it is FLA T,then send the value of the block.If not,then we decide to mix-encode that block as described in Section2.2.From the previous discussion it can be seen that this scheme employs a pixel-by-pixel adaptive predictor and a block-adaptive Rice coder,and hence the name PABLO (Pixel And Block adaptive LOw complexity coder).Obviously,the PABLO scheme is specifically targeted towards textual and graphics images.For regular images, there are seldomly FLA T blocks,and1bit per block would be wasted.However,the overhead incurred by the block classification only results in approximately.5%loss in cod-ing efficiency against natural images and thus is quite in-significant.2.4Hierarchical Block ClassificationThe block-classification algorithm described in Section 2.3could be improved by noticing that in compound images, there are many blocks which are mostlyflat,but have some(i.e. HALF_FLAT)Figure3.The tree-structured scheme for clas-sifying theflat regions of a block.other values in a corner or at the edge.Instead of classifying the entire block as non-flat and using Rice encoding,we can split the block into smaller regions and classify whether it is half-flat or is allflat except on one quadrant,and then mix-encode the non-flat portion.To transmit this information,it naturally incurs more overhead.To avoid too much overhead for pure images,we propose a tree-structured classification as shown in Figure 3.For REGULAR images it incurs1bit per block over-head,same expense as the block-classification scheme.For compound images,2bits per block are needed to specify ENTIRELY-FLA T blocks,and3bits per block are needed to specify3/4-FLA T and HALF-FLA T blocks,plus an ad-ditional2bits to specify which half(top,bottom,left,right) or quadrant is notflat.A block is REGULAR if it is neither ENTIRELY-FLA T,HALF-FLA T,nor3/4-FLA T.The value of theflat region is then sent as original,and the non-flat portion is mix-encoded.3ResultsWe now present some experimental results.Table1 provides a comparison of our schemes with other exist-ing schemes against the USC image database.Column1 shows the compression attained by FELICS[4](with the maximum parameter set to6),which is a low-complexity context-based pixel-wise adaptive algorithm also based on the Rice code.The JPEG data shown corresponds to the independent lossless JPEG function employing the2-pointpredictor no.7and arithmetic coding[5].The third column is the straightforward Rice algorithm with the parameter estimated via exhaustive search and a processing block of size8x8.The column MixEncode is the algorithm described in Section2.2,and PABLO is the algorithm described in Sec-tion2.3,both with88blocks.The next column is the0th order entropy of the entire image,using the predictor in(1). Note that we have converted the bitrate to compression ra-tio.For comparison,we also include the performance of the sequential LZW algorithm,i.e.,the UNIX’compress’.Our algorithms show improvement over the FELICS and JPEG scheme.Notice that PABLO is always worse than MixEn-code for pure images since there is rarely any FLA T block and thus,there is a waste of1bit per block.For the most part,MixEncode performs slightly better than pure RICE. However,in the case of the mand image,it is slightly worse because there are quite a few blocks with optimal6,a value that is not used in our schemes.Table2summarizes the performance of the introduced schemes and the LOCO-I algorithm as described in[6] against a number of images from the JPEG suite of standard test images.The LOCO-I scheme is a pixel-wise adaptive coder employing an adaptive predictor,context-based error modeling with special treatment of long pixel-runs and Rice coding.It is symmetric for both the encoder and the de-coder.For the non-compound images,the HIER scheme (the hierarchical block classification)is about2-7%worse than the LOCO-I.For the compound images,HIER is about 20%worse due to the very simple encoding for blocks with mostly white space and some text.However,the advan-tage that PABLO offers over the LOCO-I scheme is that the decoder is extremely simple,since it does not require any statistical modeling.4ComplexityThe main design objective for all the presented algorithms so far has been low complexity,in terms of both the com-putational complexity as well as the overall system resource requirements.The error modeling part of the block-adaptive algorithms makes them more than1-pass,but only on the block level(typically8x8block size).Only the current block plus the boundary pixels in the adjacent blocks(which are used by the predictor)need to be buffered.The collection of the statistics requires the use of a few counters and the pro-cess merely involves addition and bit-shift operations.The formation of a Rice codeword is extremely simple,and these algorithms use very little working memory and no coding memory at all.The decoder is even simpler and faster than the encoder since it does not have to estimate the value which is trans-mitted as overhead along with the compressed bitstream. Thus,these schemes are ideally suited for asymmetrical ap-plications such as compression in a laserjet printer[7]where the decoder needs to operate at a much faster rate than the encoder.5ConclusionThis paper summarizes the results of an investigation of lossless compression schemes for grayscale images based on the Rice coding method,a low-complexity alternative to the popular Huffman coding.Due to the simplistic data structure of the compressed information,our algorithms have very simple and fast implementations ideally suited for low cost devices like computer peripherals.References[1]S.W.Golomb,“Run-Length Encodings,”IEEE Trans.Inf.Theory,IT-12,399-401,July1966.[2]R.Gallager,D.V an V oorhis,“Optimal Source Codesfor Geometrical Distributed Alphabets,”IEEE Trans.Info.Theory,vol.IT-21,228-230,March1975.[3]R.Rice,P-S.Yeh,and ler,“Algorithms fora very high speed universal noiseless coding mod-ule,”JPL Publication91-1,Jet Propulsion Laboratory, Pasadena,California,Feb.1991.[4]P.Howard,“The design and analysis of efficient loss-less data compression systems,”Ph.D.Thesis,Brown University,Department of Computer Science,June 1993.[5]W.B.Pennebaker,J.L.Mitchell,“JPEG:Still ImageData Compression Standard,”V an Nostrand Reinhold, N.Y ork,1993.[6]M.Weinberger,G.Seroussi,G.Sapiro,“LOCO-I:A Low Complexity,Context-Based,Lossless ImageCompression Algorithm,”Proc.IEEE Data Compres-sion Conf.,Snowbird,Utah,April,1996.[7]G.S.Y ovanof,“Compression In A Printer Pipeline,”IEEE29th Asilomar Conference on Signals,Systems and Computers,Pacific Grove,CA,Oct30-Nov.1, 1995.Images JPEG MixEnc0th Entropy1.82 1.85 1.84 1.33lax 1.30 1.36 1.351.74 1.82 1.81 1.13 man 1.64 1.71 1.642.23 2.37 2.36 1.40 lake 1.48 1.54 1.501.27 1.31 1.31 1.00 milkdrop2.05 2.13 2.111.912.00 2.00 1.45 peppers 1.89 1.68 1.661.28 1.27 1.27 1.46 urban 1.78 1.89 1.79MixEncode HIER LZWfinger 1.350 1.4231.995 1.999 1.39hotel 2.011 2.1183.4044.178 3.44woman 1.857 1.9163.552 5.1334.80cmpnd2 4.536 5.943。