Adaptive image restoration using discrete polynomial transforms
- Format: PDF
- Size: 309.12 KB
- Pages: 8
基于自适应对偶字典的磁共振图像的超分辨率重建 (Super-resolution reconstruction of magnetic resonance images based on an adaptive dual dictionary)
- Author: anonymous (佚名)
- Journal: 光电技术应用 (Electro-Optic Technology Application), 2013(000)004, pp. 55-60
- Language: Chinese; CLC number: R445.2

Abstract: In order to enhance the image quality of magnetic resonance imaging (MRI), a super-resolution denoising reconstruction method based on an adaptive dual dictionary is proposed. A denoising function is introduced into the super-resolution reconstruction process, so that noise in the images is filtered effectively while the image resolution is improved, achieving an organic integration of super-resolution reconstruction and denoising. The method uses a clustering-PCA algorithm to extract the main features of the images and construct a main-feature dictionary, and uses a training method to design a self-learning dictionary that expresses the detailed information of the images; the adaptive dual dictionary formed by combining the two has good sparsity and adaptivity. Experimental results show that, compared with other super-resolution algorithms, the method achieves remarkable super-resolution reconstruction, with improvements in both peak signal-to-noise ratio (PSNR) and mean structural similarity (MSSIM).
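The "main-feature dictionary" idea, building atoms from the principal components of training patches, can be sketched in a few lines. This is a bare-bones illustration of PCA-based dictionary construction and coding on invented toy data, not the paper's clustering-PCA or self-learning dictionary:

```python
import numpy as np

def pca_dictionary(patches, n_atoms):
    """Build a 'main-feature' dictionary from the top principal components
    of training patches (a stand-in for the paper's clustering-PCA step)."""
    X = patches - patches.mean(axis=0)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:n_atoms]            # (n_atoms, patch_dim), orthonormal rows

def sparse_code(x, D):
    """Code a (mean-removed) patch over dictionary D; with orthonormal
    atoms the optimal coefficients are just inner products."""
    return D @ x

rng = np.random.default_rng(0)
train = rng.normal(size=(500, 64))          # 500 toy training patches of 8x8
D = pca_dictionary(train, n_atoms=16)
x = train[0]
coef = sparse_code(x - train.mean(axis=0), D)
approx = D.T @ coef + train.mean(axis=0)    # reconstruction from 16 atoms
```

Because the atoms are orthonormal, the reconstruction is an orthogonal projection, so it can never be worse than the mean patch alone.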
自适应拉曼光谱成像数据去噪及其在植物细胞壁光谱分析中的应用 (Adaptive denoising of Raman spectral imaging data and its application to spectral analysis of plant cell walls)
- Authors: 张逊, 陈胜, 吴博士, 杨桂花, 许凤

Abstract: Two inevitable noise signals in Raman spectral imaging data, baseline drift and cosmic spikes, should be eliminated before data analysis. However, current denoising methods designed for a single spectrum often lead to unstable results with poor reproducibility. In this study, a novel adaptive method for denoising Raman spectral imaging data is proposed to address this issue. Adaptive iteratively reweighted penalized least squares (airPLS) and a principal component analysis (PCA) based despiking algorithm are applied to correct drifting baselines and cosmic spikes, respectively. The method offers advantages such as fewer parameters to set, no spectral distortion, fast computation, and stable results. The method was used to eliminate the noise signals in Raman spectral imaging data of Miscanthus sinensis (9010 spectra), after which PCA and cluster analysis (CA) were employed to distinguish plant spectra from non-plant spectra. In principle, this method could also be used to denoise other spectral imaging data and provide a reliable foundation for stable analysis results.
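The penalized-least-squares idea behind airPLS can be sketched with a simplified asymmetric reweighting scheme: fit a smooth curve under the spectrum, then downweight points that sit above the fit (peaks) and refit. The smoothness weight, asymmetry parameter, and toy spectrum below are illustrative choices, not the paper's settings:

```python
import numpy as np

def asymmetric_baseline(y, lam=1e5, p=0.01, n_iter=10):
    """Estimate a smooth baseline under a spectrum by penalized least
    squares with asymmetric weights: points above the current fit (peaks)
    get weight p, points below get 1 - p. A simplified cousin of airPLS;
    lam, p, and n_iter are illustrative parameter choices."""
    n = len(y)
    # second-order difference matrix D (n-2 x n): penalizes curvature of z
    D = np.diff(np.eye(n), n=2, axis=0)
    P = lam * D.T @ D
    w = np.ones(n)
    for _ in range(n_iter):
        z = np.linalg.solve(np.diag(w) + P, w * y)
        w = np.where(y > z, p, 1.0 - p)    # downweight peak points
    return z

# toy spectrum: linear baseline drift plus one narrow Gaussian peak
x = np.linspace(0.0, 1.0, 200)
drift = 1.0 + 2.0 * x
peak = np.exp(-0.5 * ((x - 0.5) / 0.02) ** 2)
z = asymmetric_baseline(drift + peak)
```

Subtracting `z` from the raw spectrum leaves the peak on a flat baseline; a second-difference penalty is used because it leaves linear drifts unpenalized.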
基于像斑空间关系的遥感图像分类 (Remote sensing image classification based on the spatial relationships of image segments)
- Authors: 李亮, 舒宁, 龚龑, 王凯 (School of Remote Sensing and Information Engineering, Wuhan University, Wuhan 430079)
- Journal: 国土资源遥感 (Remote Sensing for Land and Resources), 2013(025)001, pp. 77-81
- Keywords: Markov random field; class adjacency matrix; iterated conditional modes; image segment; image classification

Abstract: An image classification method based on the spatial relationships of image segments is proposed, with the purpose of exploiting the spatial-relationship information contained in remote sensing images, compensating for the deficiencies of traditional classification methods based only on spectral information, and improving classification accuracy. Image segmentation is used to obtain image segments, and an initial classification is produced with the maximum likelihood (ML) method. A Markov random field (MRF) is then introduced to describe the spatial relationships of the image segments, and a class adjacency matrix (CAM) quantitatively describes the spatial relationships between land-cover classes so as to revise the classification result. Finally, the iterated conditional modes (ICM) algorithm is applied to obtain the final classification, which yields results with higher accuracy. Experimental results show that the method performs well on high-resolution remote sensing images.
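The ICM refinement step can be sketched on a toy label grid: at each site, pick the class that minimizes a data (unary) cost plus a neighborhood-disagreement cost. The snippet uses a plain Potts smoothness term in place of the paper's class adjacency matrix, and all costs and sizes are invented for illustration:

```python
import numpy as np

def icm(labels, unary, beta=1.0, n_iter=5):
    """Iterated conditional modes on a 4-connected grid.
    labels: (H, W) int initial classification (e.g. the ML result);
    unary:  (H, W, K) cost of assigning each of K classes at each site;
    beta:   weight of the Potts smoothness term (a simple stand-in for
            the paper's class adjacency matrix)."""
    H, W, K = unary.shape
    lab = labels.copy()
    for _ in range(n_iter):
        for i in range(H):
            for j in range(W):
                costs = unary[i, j].copy()
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < H and 0 <= nj < W:
                        # Potts penalty: disagreeing with a neighbour costs beta
                        costs += beta * (np.arange(K) != lab[ni, nj])
                lab[i, j] = int(np.argmin(costs))
    return lab

# toy example: one mislabelled pixel inside a homogeneous region
labels = np.zeros((8, 8), dtype=int)
labels[3, 4] = 1
K = 2
unary = np.full((8, 8, K), 0.4)   # mild preference for the observed label
for i in range(8):
    for j in range(8):
        unary[i, j, labels[i, j]] = 0.0
smoothed = icm(labels, unary, beta=1.0)
```

With these costs the isolated pixel is outvoted by its four neighbours and flips back to the majority class, which is exactly the kind of spatial correction the paper applies at the segment level.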
BAYESIAN IMAGE RESTORATION USING A WAVELET-BASED SUBBAND DECOMPOSITION

Rafael Molina (a), Aggelos K. Katsaggelos (b) and Javier Abad (a)
a) Departamento de Ciencias de la Computación e Inteligencia Artificial, Universidad de Granada, 18071 Granada, España.
b) Department of Electrical and Computer Engineering, Northwestern University, Evanston, Illinois 60208-3118
e-mail: rms@decsai.ugr.es, aggk@, abad@decsai.ugr.es

ABSTRACT

In this paper the subband decomposition of a single-channel image restoration problem is examined. The decomposition is carried out in the image model (prior model) in order to take into account the frequency activity of each band of the original image. The hyperparameters associated with each band, together with the original image, are rigorously estimated within the Bayesian framework. Finally, the proposed method is tested and compared with other methods on real images.

(This work has been supported by the "Comisión Nacional de Ciencia y Tecnología" under contract TIC-0989.)

1. INTRODUCTION

A standard formulation of the image degradation model is given in lexicographic form by [1]

g = Hf + n, (1)

where the vectors f, g, and n represent respectively the original image, the available noisy and blurred image, and the noise, with independent elements of variance 1/β, and H represents the known blurring matrix. The restoration problem calls for finding an estimate of f given g, H, and knowledge about the noise and possibly the original image (see Chapter 1 in [7]).

Smoothness constraints on the original image can be incorporated under the form of

p(f) ∝ exp{ −(α/2) ||Cf||^2 }, (2)

where C is the Laplacian operator. Then, following the Bayesian paradigm, it is customary to select as the restoration of f the image defined by

f̂ = arg max_f p(f) p(g | f), (3)

where from Eq. (1) we have

p(g | f) ∝ exp{ −(β/2) ||g − Hf||^2 }. (4)

An important problem arises when α and/or β are unknown. Much interest has centered on the question of how these parameters should be estimated (see [6], [9]). It is widely accepted that the hyperparameter in the image model (α) should be adapted to the local image characteristics. The application of multichannel
techniques to single-channel restoration problems using a subband decomposition was proposed in [2] and [3], using the framework developed in [8]. In this paper we examine the subband decomposition of the quadratic image model given in Eq. (2). Since by performing a subband decomposition we are extracting different frequency regions (channels) of an image, the process of associating a different image hyperparameter with each subband of the image model becomes equivalent to assigning different hyperparameters to different frequency bands of the image. These hyperparameters will then reflect the activity of that band in the original image. We show how the estimation of these parameters can be carried out within the Bayesian image restoration paradigm.

The rest of the paper is organized as follows. In Section 2 the image and noise models are defined in order to apply the Bayesian paradigm. For those image and noise models, the estimation of the hyperparameters and the original image is performed in Section 3. Finally, experimental results are shown in Section 4, and Section 5 concludes the paper.

2. IMAGE AND NOISE MODELS

Figure 1: Four-channel 2-D decomposition.

A simple way to incorporate the smoothness of the object luminosity is to model the distribution of f by Eq. (2). It is important to note that this model is a simultaneous autoregression (SAR) ([10]) and is characterized by

Cf = v, (5)

where the elements of v are independent. A careful examination of Eq. (5) shows that this expression is not true for real images: the spectrum of Cf is not normally flat, and the energy in each frequency is not the same. Obviously the image model is just a simple approximation.

Let us now consider Cf and perform a multichannel decomposition on it. Let h0 and h1 be quadrature mirror filters (QMF) based on the orthonormal wavelet bases with compact support ([4]), so that one set of coefficients may be used to define the other ([11]). Then the subband decomposition of Cf can be calculated as described in Fig. 1. We note that

Cf = Σ_{i=1}^{4} U_i^T U_i Cf, (6)

where U_i, with i = 1, ..., 4, are
the matrices used to obtain the bands (see Fig. 1), and ^T denotes transpose. It is important to observe that U_i Cf now contains information on some part of the spectrum of Cf. Let us consider the quadratic form defining the image model; we have

||Cf||^2 = Σ_{i=1}^{4} ||U_i Cf||^2. (7)

Now, in order to adapt the image model, and therefore have a hyperparameter for each of the decomposed channels, we define the following image model

p(f | α) ∝ exp{ −(1/2) Σ_{i=1}^{4} α_i ||U_i Cf||^2 }, (8)

where α denotes the vector (α_1, α_2, α_3, α_4) and

Q(α) = C^T ( Σ_{i=1}^{4} α_i U_i^T U_i ) C, (9)

so that p(f | α) ∝ exp{ −(1/2) f^T Q(α) f }. Note that the model we have just proposed can be extended to a general multichannel decomposition. However, for notational simplicity, we will only use a 4-channel decomposition. We also note that the image model we are proposing allows the use of the same hyperparameter for several subbands.

Let us now examine how to estimate the unknown parameters and perform the restoration in the coming section.

3. BAYESIAN ANALYSIS

The steps we follow in this paper to estimate the hyperparameters and the original image are:

Step I: Estimation of the hyperparameters. α and β are first selected as

(α̂, β̂) = arg max_{α, β} p(g | α, β), (10)

where p(g | α, β) = ∫ p(f | α) p(g | f, β) df.

Step II: Estimation of the original image. Once the hyperparameters have been estimated, the estimate of the original image, f̂, is selected as the image satisfying

f̂ = arg max_f p(f | α̂) p(g | f, β̂). (11)

Note that we are obtaining the maximum likelihood estimates of the hyperparameters and the maximum a posteriori (MAP) estimate of f. Furthermore, although Steps I and II are separated, the iterative scheme proposed next performs both estimations simultaneously.

The estimation process we are using could be performed within the so-called hierarchical Bayesian approach (see [9]) by including hyperpriors on the unknown hyperparameters. However, the possibility of incorporating additional knowledge on them by means of gamma or other distributions will not be discussed here (see [9]).

Differentiating log p(g | α, β) with respect to α_i and β, so as to find the conditions which are satisfied at the maxima, we have, for i = 1, ..., 4,

1/α_i = (1/N_i) E[ ||U_i Cf||^2 | g, α, β ], (12)

1/β = (1/N) E[ ||g − Hf||^2 | g, α, β ], (13)

where N_i is the number of coefficients in band i. Let us examine the use of the EM algorithm [5], with f as hidden data and g as observed data, to iteratively increase the likelihood. The application of the EM algorithm to our problem
produces Eqs. (12) and (13), where the old values of the hyperparameters are used on the left-hand side of these equations to obtain the new ones on their right-hand side. Unfortunately, these equations are highly nonlinear.

Let us, however, consider first the iterative EM equations corresponding to using one hyperparameter for the image model (α) and one for the noise (β) (see [9]). We have

1/α_new = (1/N) [ ||C f̂_old||^2 + tr( C Σ_old C^T ) ], (14)

1/β_new = (1/N) [ ||g − H f̂_old||^2 + tr( H Σ_old H^T ) ], (15)

where f̂ has been defined in Eq. (3), Σ_old = ( α_old C^T C + β_old H^T H )^{-1}, and the subscripts "new" and "old" denote the evaluation of the expressions for the new and old values of α and β, respectively. We notice that these equations correspond to the application of a gradient descent method on α and β.

Let us adapt this method to the multichannel problem. Multiplying and dividing the right-hand side of Eq. (12) by the corresponding single-channel quantities we obtain Eq. (16) (we have removed the dependency on α and β to simplify the notation); notice that if we have only one image parameter, Eq. (16) reduces to Eq. (14). Then we can use the following equations to estimate the hyperparameters, where the old values are used on the right-hand side of the equations to obtain the new ones on the left-hand side; for i = 1, ..., 4,

1/α_{i,new} = (1/N_i) [ ||U_i C f̂_old||^2 + tr( U_i C Σ_old C^T U_i^T ) ], (17)

1/β_new = (1/N) [ ||g − H f̂_old||^2 + tr( H Σ_old H^T ) ]. (18)

This method is again a gradient descent one.

| SNR (dB) | Method | Iterations | ISNR (dB) | Estimated noise variance |
|---|---|---|---|---|
| 10 | MLE | 40 | 7.6038 | 184.98 |
| 10 | 1 param. | 30 | 7.2863 | 214.03 |
| 10 | 2 params. | 50 | 7.2844 | 214.09 |
| 10 | 4 params. | 60 | 7.2241 | 213.95 |
| 20 | MLE | 35 | 7.9889 | 49.84 |
| 20 | 1 param. | 35 | 8.8162 | 63.64 |
| 20 | 2 params. | 50 | 8.8086 | 63.68 |
| 20 | 4 params. | 70 | 8.2787 | 64.73 |
| 30 | MLE | 30 | 6.2049 | 3.91 |
| 30 | 1 param. | 35 | 8.8181 | 6.37 |
| 30 | 2 params. | 41 | 8.8006 | 6.42 |
| 30 | 4 params. | 90 | 9.1402 | 6.57 |

Table 1: Iterations required, ISNR, and noise variance estimates for the "Cameraman" image and different SNRs.

We have used it in our experiments and have not observed any
convergence problems; however, it would always be possible to use smaller steps so as to guarantee convergence.

4. EXPERIMENTAL RESULTS

In order to show the behavior of the proposed algorithm, we have used the original "Cameraman" image, blurred by a motion blur over 9 pixels. It was also degraded by additive Gaussian noise to achieve 10, 20, and 30 dB SNR. For comparison, we have also applied the maximum likelihood restoration method to these degraded images.

For the purpose of objectively testing the performance of the image restoration algorithms, the Improvement in Signal-to-Noise Ratio (ISNR) will be used. This metric is given by

ISNR = 10 log10( ||f − g||^2 / ||f − f̂||^2 ).

The values of ISNR, the number of iterations needed to achieve convergence in parameter estimation, and the corresponding values of the estimated noise variance are shown in Table 1. We have included the results obtained by maximum likelihood; by the proposed algorithm using only one parameter for all the bands of the image; and by the proposed algorithm using two parameters, one for the LL band and a different one for the LH, HL, and HH bands. The last row of every sub-table contains the result obtained using a different parameter for each subband. The set of wavelet coefficients used is DAUB4.

From this table we can see that the proposed method results in better estimates of the noise variance, very close to the real value, giving less noisy images than the maximum likelihood method. We can see that the ISNR is in general better for the proposed method. The 10 dB case is the only one where maximum likelihood gives better results in terms of ISNR, but the images obtained using the proposed algorithm appear to be better from a subjective point of view (visual inspection).

Figure 2: (a) Original "Cameraman" image. (b) Noisy-blurred image for 9-point motion blur at 30 dB. (c) Maximum likelihood restoration. (d) Restoration obtained with the proposed method.

Fig. 2 shows the original "Cameraman" image, the degraded image at 30 dB, the restoration obtained by maximum likelihood, and the
restoration obtained with the proposed method using four prior-model parameters, a different one for each band. We can see that the proposed solution gives smoother results, with the noise much better removed.

5. CONCLUSIONS

In this paper we have proposed the subband decomposition of the single-channel image restoration problem in order to take into account the frequency activity in each subband of the decomposed image. The Bayesian framework has been used to estimate both the parameters and the restored image.

The results obtained using the proposed method have been compared to those obtained by the maximum likelihood restoration method. The proposed method results in better estimates of the parameters involved in the problem, giving less noisy results. We have also used objective metrics to measure the quality of the resulting restorations. In general, better solutions are obtained with the proposed approach than with the maximum likelihood method.

6. REFERENCES

[1] H.C. Andrews and B.R. Hunt, Digital Image Restoration, Prentice Hall, New York, 1977.
[2] M.R. Banham, N.P. Galatsanos, H.L. Gonzalez and A.K. Katsaggelos, "Multichannel Restoration of Single Channel Images Using a Wavelet-Based Subband Decomposition", IEEE Trans. on Image Processing, vol. 3, pp. 1-13, 1994.
[3] M.R. Banham and A.K. Katsaggelos, "Spatially Adaptive Wavelet-Based Multiscale Image Restoration", IEEE Trans. on Image Processing, vol. 5, pp. 619-634, 1994.
[4] I. Daubechies, "Orthonormal Bases of Compactly Supported Wavelets", Commun. Pure Appl. Math., vol. 41, pp. 909-996, 1988.
[5] A.P. Dempster, N.M. Laird and D.B. Rubin, "Maximum Likelihood from Incomplete Data", J. Royal Statist. Soc. B, vol. 39, pp. 1-38, 1977.
[6] N.P. Galatsanos and A.K. Katsaggelos, "Methods for Choosing the Regularization Parameter and Estimating the Noise Variance in Image Restoration and Their Relation", IEEE Trans. on Image Processing, vol. 1, pp. 322-336, 1992.
[7] A.K. Katsaggelos, ed., Digital Image Restoration, Springer Series in Information Sciences, vol. 23, Springer-Verlag, 1991.
[8] A.K. Katsaggelos, K.T. Lay,
and N.P. Galatsanos, "A General Framework for Frequency Domain Multichannel Signal Processing", IEEE Trans. on Image Processing, vol. 2, pp. 417-420, 1993.
[9] R. Molina, A.K. Katsaggelos and J. Mateos, "Bayesian and Regularization Methods for Hyperparameter Estimation in Image Restoration", IEEE Trans. on Image Processing, accepted for publication.
[10] B.D. Ripley, Spatial Statistics, John Wiley, New York, pp. 88-90, 1981.
[11] P.P. Vaidyanathan, Multirate Systems and Filter Banks, Englewood Cliffs, NJ, Prentice Hall, 1993.
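For circulant H and C (periodic boundary conditions), the quadratic MAP restoration of Eq. (3) has a closed form in the Fourier domain. The following is a minimal sketch of that single-hyperparameter baseline, with a hand-picked α/β ratio rather than the estimates of Section 3; the image, blur kernel, and parameter values are illustrative only:

```python
import numpy as np

def map_restore(g, psf, alpha_over_beta):
    """Closed-form MAP restoration for g = Hf + n with a Laplacian
    smoothness prior, assuming periodic (circulant) blur so that H and C
    diagonalize in the Fourier domain."""
    H = np.fft.fft2(psf)                    # blur transfer function
    lap = np.zeros_like(g)                  # discrete Laplacian at the origin
    lap[0, 0] = 4.0
    lap[0, 1] = lap[1, 0] = lap[0, -1] = lap[-1, 0] = -1.0
    C = np.fft.fft2(lap)
    G = np.fft.fft2(g)
    # per-frequency minimiser of beta*||g - Hf||^2 + alpha*||Cf||^2
    F = np.conj(H) * G / (np.abs(H) ** 2 + alpha_over_beta * np.abs(C) ** 2)
    return np.real(np.fft.ifft2(F))

# demo: 9-point horizontal motion blur plus mild noise on a smooth image
rng = np.random.default_rng(0)
n = 64
x = np.linspace(0, 2 * np.pi, n, endpoint=False)
f = np.outer(np.sin(x), np.cos(2 * x))      # smooth test image
psf = np.zeros((n, n))
psf[0, :9] = 1.0 / 9.0                      # blur kernel, same shape as f
g = np.real(np.fft.ifft2(np.fft.fft2(f) * np.fft.fft2(psf)))
g += 0.001 * rng.normal(size=g.shape)
f_hat = map_restore(g, psf, 1e-3)
```

The α/β ratio plays the role of the regularization parameter: too small amplifies noise at frequencies where |H| is near zero, too large over-smooths.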
Patent: ENVIRONMENT-ADAPTIVE IMAGE DISPLAY SYSTEM, PROJECTOR, PROGRAM, INFORMATION STORAGE MEDIUM, AND IMAGE PROCESSING METHOD
- Inventor: MATSUDA Hideki (松田 秀樹)
- Application no.: JP 特願2001-206474 (P2001-206474), filed 2001-07-06
- Publication no.: JP 特開2002-344761 (P2002-344761A), published 2002-11-29
- Applicant: SEIKO EPSON CORP (セイコーエプソン株式会社), 2-4-1 Nishi-Shinjuku, Shinjuku-ku, Tokyo, JP
- Agent: 井上 一 (and 2 others)

Abstract: PROBLEM TO BE SOLVED: To provide an environment-adaptive image display system capable of accurately reproducing the appearance of the colors of an image, together with a projector, a program, an information storage medium, and an image processing method. SOLUTION: From visual-environment information supplied by a color sensor 60 that senses the visual environment, device color-range information, and color information corresponding to the RGB standards, an information creating unit 160 creates a 3D-LUT (lookup table) in a LUT storage unit 122 by an operation method based on an appearance model, and a projector color conversion unit 120 converts the image signal according to the 3D-LUT.
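The 3D-LUT color conversion described in the abstract can be illustrated with a plain trilinear lookup; everything below (table size, identity table, sample color) is my own toy setup, not taken from the patent:

```python
import numpy as np

def apply_3d_lut(rgb, lut):
    """Map one RGB triple (components in [0, 1]) through a 3D lookup table
    of shape (n, n, n, 3) using trilinear interpolation."""
    n = lut.shape[0]
    pos = np.clip(np.asarray(rgb, dtype=float), 0.0, 1.0) * (n - 1)
    i0 = np.floor(pos).astype(int)
    i1 = np.minimum(i0 + 1, n - 1)
    t = pos - i0
    out = np.zeros(3)
    # accumulate the 8 surrounding lattice points with trilinear weights
    for dr in (0, 1):
        for dg in (0, 1):
            for db in (0, 1):
                w = (t[0] if dr else 1 - t[0]) \
                    * (t[1] if dg else 1 - t[1]) \
                    * (t[2] if db else 1 - t[2])
                idx = (i1[0] if dr else i0[0],
                       i1[1] if dg else i0[1],
                       i1[2] if db else i0[2])
                out += w * lut[idx]
    return out

# demo: an identity LUT should leave any color unchanged
n = 9
ax = np.arange(n) / (n - 1)
r, g, b = np.meshgrid(ax, ax, ax, indexing="ij")
identity_lut = np.stack([r, g, b], axis=-1)       # shape (9, 9, 9, 3)
out = apply_3d_lut((0.3, 0.55, 0.71), identity_lut)
```

In an environment-adaptive system, the lattice entries would instead hold the appearance-model-corrected output colors for each input color.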
Semantic understanding of remote sensing imagery

High-spatial-resolution remote sensing scene classification based on adaptive deep sparse semantic modeling: to mine more discriminative semantic information from high-resolution remote sensing scenes, an Adaptive Deep Sparse Semantic Modeling (ADSSM) framework that adaptively fuses sparse topics with deep features is proposed.
First, to discover the essential underlying characteristics of an image, the ADSSM framework integrates the mid-level fully sparse topic model (FSTM) with a high-level convolutional neural network (CNN).
Exploiting the complementarity of the visual information in sparse topics and deep features, three heterogeneous sparse-topic and deep scene features are designed to describe the complex geometric structures and spatial patterns of high-resolution remote sensing imagery.
Among them, FSTM captures local and salient information from the image, whereas the CNN focuses more on global and detailed information.
The integration of sparse topics and deep features provides a multi-level feature description of high-resolution remote sensing scenes.
Second, to improve the fusion of sparse topics and deep features, an adaptive feature-standardization strategy is proposed to handle the differences between the two feature types.
In ADSSM, the mined sparse topics and deep features are each standardized adaptively, strengthening the importance of representative features.
Based on this adaptively fused feature representation, the ADSSM framework reduces confusion among complex scenes.
Results on four datasets (UCM, Google, NWPU-RESISC45, and OSRSI20) show that the proposed method improves considerably on recognized high-accuracy scene classification methods.
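The standardization-and-fusion step can be illustrated with plain per-block z-scoring; the real ADSSM strategy is adaptive per feature type, and the shapes and data here are invented:

```python
import numpy as np

def standardize_and_fuse(topic_feats, deep_feats, eps=1e-8):
    """Z-score each feature block independently, then concatenate.
    A minimal stand-in for ADSSM's adaptive feature standardization:
    without it, the large-scale deep features would dominate the fusion."""
    blocks = []
    for F in (topic_feats, deep_feats):
        mu = F.mean(axis=0, keepdims=True)
        sd = F.std(axis=0, keepdims=True) + eps
        blocks.append((F - mu) / sd)
    return np.concatenate(blocks, axis=1)

rng = np.random.default_rng(0)
topics = rng.uniform(0, 1, size=(100, 30))     # sparse-topic features (FSTM-like)
deep = 50 + 10 * rng.normal(size=(100, 512))   # CNN features on a different scale
fused = standardize_and_fuse(topics, deep)
```

After standardization both blocks contribute on the same scale, so a downstream classifier is not biased toward whichever feature type happens to have larger magnitudes.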
Resources
1. Public datasets
(1) The Google image dataset of SIRI-WHU (updated 2019-12-10). The dataset contains 12 classes and is intended mainly for research use. Each of the following classes contains 200 images: agriculture, commercial, harbor, idle land, industrial, meadow, overpass, parking lot, pond, residential, river, and water. Each image is 200×200 pixels with a spatial resolution of 2 m. The dataset was acquired from Google Earth and collected by the RS-IDEA research group of Wuhan University (SIRI-WHU), mainly covering urban areas of China. If you use this dataset in published results, please cite:
[1] Q. Zhu, Y. Zhong, L. Zhang, and D. Li, "Adaptive Deep Sparse Semantic Modeling Framework for High Spatial Resolution Image Scene Classification," IEEE Transactions on Geoscience and Remote Sensing, 2018, 56(10): 6180-6195. DOI: 10.1109/TGRS.2018.2833293.
[2] Q. Zhu, Y. Zhong, B. Zhao, G.-S. Xia, and L. Zhang, "Bag-of-Visual-Words Scene Classifier with Local and Global Features for High Spatial Resolution Remote Sensing Imagery," IEEE Geoscience and Remote Sensing Letters, 2016, 13(6): 747-751. DOI: 10.1109/LGRS.2015.2513443.
(2) The USGS image dataset of SIRI-WHU (updated 2019-12-10). The dataset contains 4 scene classes: agriculture, forest, residential, and parking lot, and is intended mainly for research use.
Cross-domain image transfer based on decoupled learning (基于解耦学习的图像跨域迁移算法)
## 1. Algorithm Overview

Cross-Domain Image Transfer (CIT) is the process of transferring images between a source domain and a target domain so that images can be reasoned about better in the target domain. Decoupled Learning (DL) is a recent cross-domain image transfer algorithm. Its core idea is to factorize the feature representations of the source and target domains into spatial features and attribute features, thereby decoupling the feature representations of the two domains, and to train separate network models for the spatial features and the attribute features.
The spatial-feature model learns the spatial-feature mapping between the source and target domains, and the attribute-feature model learns the attribute-feature mapping between them; finally, combining the spatial-feature and attribute-feature mappings realizes the cross-domain image transfer.
## 2. Algorithm Workflow

The decoupled learning algorithm proceeds as follows:
(1) Collect image datasets for the source and target domains;
(2) Extract the spatial features and attribute features from both datasets;
(3) Train the spatial-feature model and the attribute-feature model;
(4) Use the spatial-feature and attribute-feature models to transfer images from the source domain to the target domain.
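Steps (1)-(4) above can be sketched end to end with linear maps standing in for the spatial- and attribute-feature networks; everything here (feature dimensions, data, least-squares fitting) is an invented toy, not the algorithm's actual deep models:

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_linear_map(src, tgt):
    """Least-squares linear map src -> tgt, a stand-in for training a
    spatial- or attribute-feature network on paired domain features."""
    W, *_ = np.linalg.lstsq(src, tgt, rcond=None)
    return W

# step (1)-(2): toy paired features, one row per image
n = 200
spatial_src = rng.normal(size=(n, 8))     # spatial features, source domain
attr_src = rng.normal(size=(n, 4))        # attribute features, source domain
A_true = rng.normal(size=(8, 8))          # unknown ground-truth mappings
B_true = rng.normal(size=(4, 4))
spatial_tgt = spatial_src @ A_true        # corresponding target-domain features
attr_tgt = attr_src @ B_true

# step (3): train the two decoupled models separately
A = fit_linear_map(spatial_src, spatial_tgt)
B = fit_linear_map(attr_src, attr_tgt)

# step (4): transfer a new source image by mapping each factor, then recombining
def transfer(spatial, attr):
    return np.concatenate([spatial @ A, attr @ B])

out = transfer(rng.normal(size=8), rng.normal(size=4))
```

The point of the decoupling is visible in step (3): the two mappings are estimated independently, so each model only has to explain one factor of variation.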
Journal of Anhui Polytechnic University (安徽工程大学学报), Vol. 38, No. 6, Dec. 2023, pp. 84-91. Article ID: 1672-2477(2023)06-0084-08. Received: 2022-12-16. Funding: Natural Science Research Project of Anhui Provincial Universities (KJ2020A0367). Authors: WANG Kun (b. 1999, Chizhou, Anhui; master's student); corresponding author ZHANG Yue (b. 1972, Wuhu, Anhui; professor, Ph.D.).

A DI-StyleGANv2 Model for Digital Image Restoration (数字图像修复的DI-StyleGANv2模型)
WANG Kun, ZHANG Yue* (School of Mathematics and Finance, Anhui Polytechnic University, Wuhu 241000, China)

Abstract: When a digital image has large missing regions or has lost part of its texture structure, traditional image enhancement techniques cannot recover the required information from the limited information remaining in the image, so their enhancement performance is limited. This paper proposes DI-StyleGANv2, a deep-learning image restoration method that fuses decoding information with StyleGANv2. A U-net module first produces a latent code carrying the image's primary information and decoding signals carrying its secondary information; a StyleGANv2 module then introduces a generative image prior. During generation the model not only uses the primary information in the latent code but also integrates the decoding signals that contain the image's secondary information, thereby achieving semantic enhancement of the restoration results. Experiments on the FFHQ and CelebA databases verify the effectiveness of the proposed method.
Keywords: image restoration; decoding signal; generative adversarial network; U-net; StyleGANv2
CLC number: O29. Document code: A.

Digital images are the most common format for modern image signals, but they often degrade during acquisition, transmission, and storage owing to noise interference, lossy compression, and similar factors. Digital image restoration reconstructs a high-quality image from the information remaining in a degraded one, with applications in high-definition biomedical imaging, restoration of historical image archives, and military science, among other fields; studying digital image restoration models therefore has substantial practical and commercial value.

Traditional digital image restoration techniques mostly infer the missing content from inter-pixel correlation and content similarity. In texture synthesis, Criminisi et al. [1] proposed an exemplar-based copy-and-paste method that fills the missing region's texture structure with patch-based sampling. On structural similarity, Telea [2] proposed the interpolation-based Fast Marching Method (FMM), and Bertalmio et al. [3] proposed an image and video inpainting model based on the Navier-Stokes equations and fluid dynamics. These methods lack an understanding of image content and structure: once the texture information of a real degraded image is limited, the result easily loses contextual semantics; in addition, they require a hand-made mask image of the same height and width marking the region to be repaired.

In recent years, with the rapid development of deep learning, restoration methods based on convolutional neural networks have been widely applied. Yu et al. [4] used spatial transformer networks to recover high-quality images with realistic texture from low-quality face images; restoration algorithms based on 3D facial priors [5] and adaptive spatial feature fusion [6] are also strong. These methods perform well on artificially degraded image sets, but their results on real, complexly degraded images still leave room for improvement, while methods such as Pix2Pix [7] and Pix2PixHD [8] tend to over-smooth images and lose detail.

Generative adversarial networks (GAN) [9] were first applied to image generation, and U-net is used for encoding and decoding. The DI-StyleGANv2 model proposed here exploits the strengths of StyleGANv2 [10] and U-net [11]: StyleGANv2 introduces a prior from high-quality image generation, and U-net produces guidance information for the restored image. By injecting the latent code carrying deep features into a pretrained StyleGANv2 while also fusing in decoding signals that carry local image detail, the model can reconstruct semantically rich, realistic images from severely degraded ones.

1. Theoretical Basis
1.1 U-net
U-net is an auto-encoder
, consisting of an encoding path that captures image context and a symmetric decoding path that enables image reconstruction. The first half, the contracting path, performs downsampling, and the second half, the expanding path, performs upsampling; they correspond to encoding and decoding and are used to capture and recover spatial information. The encoding structure repeatedly applies convolutions, activation functions, and max pooling for downsampling. Throughout the downsampling, as the feature-map scale decreases, the number of feature channels increases; the decoding process of the expanding path is symmetric, and each stage of the expanding path is connected to the corresponding feature maps of the contracting path.

1.2 Generative adversarial networks
A generative adversarial network consists of two modules, a generator G and a discriminator D. The generator uses the captured data distribution to generate images; the discriminator receives either dataset images or images produced by the generator, and its output expresses the probability that the input comes from the real data. To let G learn the distribution p_g of data x, a noise variable z with prior p_z(z) is defined, and the mapping from z to data space is written G(z; θ_g), where G is a differentiable function realized by a neural network with parameters θ_g. A second network D(x; θ_d) outputs a single scalar D(x) expressing the probability that x came from the data rather than from p_g. The discriminator is trained to distinguish real samples from generated ones as well as possible, while the generator is trained to minimize log(1 − D(G(z))). This mutual game is defined by

min_G max_D V(D, G) = E_{x∼p_data(x)}[log D(x)] + E_{z∼p_z(z)}[log(1 − D(G(z)))]. (1)

Through the adversarial game between generator and discriminator, the network eventually reaches a Nash equilibrium. At that point the generator is considered to have learned the intrinsic distribution of the real data; the images it synthesizes exhibit essentially the same characteristics as real data and are visually hard to distinguish.

1.3 StyleGANv2
The high-resolution image generation method StyleGANv2 [10] is chosen as the prior network embedded in DI-StyleGANv2. It consists of two main parts. The first is the mapping network (Fig. 1a): the mapping network f transforms a received latent code Z through 8 fully connected layers into an intermediate latent code W. Different subspaces of the feature space correspond to different class information or overall styles of the data; because these subspaces are correlated, highly coupled, and entangled, the mapping network f is used to effectively decouple the latent space, producing an intermediate latent code that need not follow the training-data distribution. The second part is the synthesis network (Fig. 1b), built from convolution and upsampling layers, which generates the required image from the latent code produced by the mapping network. The synthesis network uses fully connected layers to convert the intermediate latent code W into style parameters A that influence the backbone features of the generated image at different scales, and uses noise to influence the details so that the generated texture looks more natural. Compared with the previous version, StyleGANv2 [10] redesigns the generator architecture and uses weight demodulation to remove the effect of feature-map scale directly from the statistics of the convolution output feature maps, fixing the characteristic feature artifacts and further improving result quality.

Figure 1: Basic structure of StyleGANv2.

Given the incoming style parameters, weight modulation adjusts the input feature maps indirectly by scaling the convolution weights:

w'_{ijk} = s_i · w_{ijk}, (2)

where w and w' are the original and modulated weights, and s_i is the scale corresponding to the i-th input feature map. StyleGANv2 then removes the effect of s_i from the statistics of the convolution output feature maps by scaling the weights rather than the feature maps. After modulation and convolution, the standard deviation of the weights is

σ_j = sqrt( Σ_{i,k} (w'_{ijk})^2 ). (3)

Weight demodulation scales the weights by the reciprocal of the corresponding L2 norm, 1/σ_j, where η is a small constant that keeps the denominator away from zero:

w''_{ijk} = w'_{ijk} / sqrt( Σ_{i,k} (w'_{ijk})^2 + η ). (4)

The demodulation operation fuses the modulation, convolution, and normalization of the first-generation StyleGAN and is gentler than the first generation's operations, because it modulates the weight signal rather than the actual content of the feature maps.

2. The DI-StyleGANv2 Model
The structure of the DI-StyleGANv2 model is shown in Fig. 2. Its main parts are a decoding-extraction module, U-net, and a pretrained StyleGANv2, where the StyleGANv2 comprises the synthesis network and the discriminator. Before being fed into the whole DI-StyleGANv2 network, the input image to be restored is resized by bilinear interpolation to a fixed resolution of 1024×1024. The input image then passes through the U-net on the left of Fig. 2, which produces the latent code Z and the decoding signals (Decoding Information, DI). As shown in Fig. 2, the latent code Z is decoupled by the mapping network's fully connected layers into an intermediate latent code W. The style parameters produced from W by affine transformations serve as the primary style information, and the synthesis network of the StyleGANv2 module, already trained on the FFHQ dataset, restores the image under the guidance of this primary style information. The StyleGAN block structure of the synthesis network is shown at the bottom of Fig. 2; its Mod and Demod operations are given by Eqs. (2) and (4). Meanwhile, the decoding information of every U-net decoder layer is injected as secondary style information into the StyleGAN block's feature maps by tensor concatenation, helping the DI-StyleGANv2 model generate detail. Finally, the image produced by the synthesis network is passed to the discriminator, which judges whether it is a real or generated image.

Figure 2: Architecture of the digital image restoration network.

The model's discriminator combines three loss functions into the total loss: an adversarial loss L_A, a content loss L_C, and a feature-matching loss L_F. L_A is the adversarial loss of the original GAN, defined as Eq. (5), where X and X̂ denote the real high-definition image and the low-quality image to be restored, G is the generator during training, and D is the discriminator:

L_A = min_G max_D E_X log(1 + exp(−D(G(X̂)))). (5)

L_C is defined as the L1 distance between the final restored image and the corresponding real image. L_F is built on the discriminator's feature layers and is defined as Eq. (6), where T is the total number of intermediate feature-extraction layers and D_i(X) is the feature extracted by the i-th layer of discriminator D:

L_F = min_G E_X ( Σ_{i=0}^{T} ||D_i(X) − D_i(G(X̂))||_2 ). (6)

The final total loss function is

L = L_A + α L_C + β L_F. (7)

In Eq. (7), the content loss L_C measures the difference in fine features and color information between the restoration result and the real high-definition image; the feature-matching loss L_F obtained from the discriminator's intermediate feature layers balances the adversarial loss L_A and helps restore high-definition digital images. α and β are balancing parameters, set empirically to α = 1 and β = 0.02 in our experiments.

3. Analysis of Experimental Results

Figure 3: Sample images from the FFHQ dataset.

3.1 Dataset selection and preprocessing
Considering the diversity and resolution of image datasets, the digital image restoration model is trained on the FFHQ (Flickr Faces High Quality) dataset, which contains 70,000 high-definition PNG images at a resolution of 1024×1024. FFHQ covers highly varied images, differing in ethnicity, gender, expression, face shape, and background; these rich attributes provide the model with a large amount of prior information during training. Fig. 3 shows 31 photos selected from FFHQ.

The simulated degradation during training, that is, generating the low-quality counterparts of the dataset images, is implemented as follows: using the OpenCV library, images are randomly flipped horizontally, color-jittered (random changes of exposure, saturation, and hue), and converted to grayscale, and a mixture of Gaussian blurs is applied, including both isotropic and anisotropic Gaussian kernels. The blur kernels are of size 41×41; for the anisotropic Gaussian kernels, the rotation angle is sampled uniformly from [−π, π]. Downsampling, Gaussian noise, and lossy compression are mixed in as well. The overall simulated degradation is illustrated in Fig. 4.

Figure 4: Simulated degradation.

In model back-testing, the CelebA dataset is used to generate low-quality images, which are restored and compared against the originals, quantitatively comparing this model's restoration quality with other recently proposed methods.

3.2 Evaluation metrics
To quantify the visual quality of different algorithms fairly, the two most widely used image-quality metrics, Peak Signal-to-Noise Ratio (PSNR) and the Structural Similarity Index (SSIM), are used to quantify the similarity between the restored and real images.

PSNR is the ratio between the maximum possible signal power and the power of the corrupting noise that affects its fidelity; larger values mean less distortion. PSNR is defined from the pixel-wise mean squared error. Let I be the high-quality reference image and I' the restored image, both of size m × n; their mean squared error is

MSE = (1/(mn)) Σ_{i=1}^{m} Σ_{j=1}^{n} (I[i, j] − I'[i, j])^2. (8)

PSNR is defined as Eq. (9), in which Peak
denotes the maximum possible pixel intensity:

PSNR = 10 · log10( Peak^2 / MSE ) = 20 · log10( Peak / sqrt(MSE) ). (9)

SSIM is another widely used image-similarity metric. Whereas PSNR evaluates pixel-wise differences between images, SSIM models its criterion on the human visual system: it emphasizes the structural information of the image and is closer to human judgments of image quality. SSIM estimates luminance similarity with means, contrast similarity with variances, and structural similarity with the covariance. Its range is 0 to 1, with larger values meaning more similar images; when two images are identical, SSIM equals 1. Given two image signals x and y, SSIM is defined as

SSIM(x, y) = [l(x, y)]^α [c(x, y)]^β [s(x, y)]^γ, (10)

whose luminance term l(x, y), contrast term c(x, y), and structure term s(x, y) are defined as

l(x, y) = (2 μ_x μ_y + C_1) / (μ_x^2 + μ_y^2 + C_1),
c(x, y) = (2 σ_x σ_y + C_2) / (σ_x^2 + σ_y^2 + C_2),
s(x, y) = (σ_xy + C_3) / (σ_x σ_y + C_3), (11)

where α > 0, β > 0, γ > 0 adjust the relative importance of luminance, contrast, and structure; μ_x, μ_y and σ_x, σ_y denote the means and standard deviations of x and y; σ_xy is the covariance of x and y; and C_1, C_2, C_3 are constants that keep the result stable. In practice, for simplicity, one sets α = β = γ = 1 and C_3 = C_2/2, giving

SSIM(x, y) = ( (2 μ_x μ_y + C_1)(2 σ_xy + C_2) ) / ( (μ_x^2 + μ_y^2 + C_1)(σ_x^2 + σ_y^2 + C_2) ). (12)

When actually computing the structural similarity of two images, localized windows are specified and the index is computed within each window; the window is then moved pixel by pixel until the local SSIM has been computed at every position of the whole image, and the values are averaged.

3.3 Experimental results
(1) DI-StyleGANv2 restoration results. Fig. 5 shows the model's restoration results on degraded images, where Figs. 5b and 5d are the restorations of Figs. 5a and 5c. The comparison shows that images restored by DI-StyleGANv2 are realistic and faithful: details of hair, eyebrows, eyes, and teeth are clearly visible in Figs. 5b and 5d, and even the image background is partially restored; to the human eye, the overall quality of the restored images is excellent.

Figure 5: Restoration results.

(2) Comparison with other methods. A set of low-quality images was synthesized from the CelebA-HQ dataset, on which our digital image restoration model is compared with the latest deep-learning restoration algorithms GPEN [12], PSFRGAN [13], and HiFaceGAN [14]; these recent algorithms were run with the original authors' trained model structures and pretrained parameters. The PSNR and SSIM results of each model are shown in Table 1; larger PSNR and SSIM values indicate higher similarity between the restored image and the real high-definition image, and thus better restoration. As Table 1 shows, our model attains a PSNR comparable to the other state-of-the-art restoration algorithms, its SSIM is 12.47% higher than HiFaceGAN's, and in our experimental environment restoring a single 512×512 image takes 1.12 s on average.

Table 1: Restoration quality of our algorithm and related algorithms.

| Method | PSNR | SSIM |
|---|---|---|
| GPEN | 20.4 | 0.6291 |
| PSFRGAN | 21.6 | 0.6557 |
| HiFaceGAN | 21.3 | 0.5495 |
| ours | 20.7 | 0.6180 |

It is worth noting that PSNR and SSIM values are only references and cannot absolutely reflect the quality of a digital image restoration algorithm. Fig. 6 shows the restoration results of DI-StyleGANv2, GPEN, PSFRGAN, and HiFaceGAN. As Fig. 6 shows, in both global consistency and local detail, DI-StyleGANv2 restores images well and is in no way inferior to the other algorithms.

Figure 6: Comparison of restoration results.

On natural scenes other than faces, DI-StyleGANv2 also performs well. In Fig. 7 the left image is before restoration and the right image after. From the details marked by red boxes in Fig. 7, the noise in the restored image on the right is clearly reduced compared with the left, which is especially evident in the enlarged detail comparison below the figure. In the signboard text regions, the increased contrast and sharpening make the internal edges of the image more distinct and the whole image clearer; much of the noise on the road surface is removed, leaving it cleaner. Overall, the restored image has richer color and stronger layering than the original degraded image, giving a better visual impression.

Figure 7: Natural-scene restoration results (left: image to be restored; right: restored image).

4. Conclusion
Deep-learning-based image restoration has in recent years received wide attention and application in super-resolution imaging, medical imaging, and related fields. Addressing the problem that traditional restoration techniques easily lose image semantics when handling images with large missing areas or missing texture structure, and building on domestic and international image restoration theory and methods, this paper combines the convolutional neural network U-net with the recently highly effective generative adversarial network StyleGANv2 and proposes guiding a deep neural network's restoration with three kinds of information: image decoding information, the latent code, and a generative image prior. The DI-StyleGANv2 network model is trained by simulating random image degradation on the FFHQ dataset, and the two metrics PSNR and SSIM measure the similarity between restored samples and high-definition samples.

Experiments show that the DI-StyleGANv2 network model can recover clear facial details, and its restorations have good global consistency and fine local texture. It has clear advantages over existing techniques, and it produces satisfying restorations given only the image to be repaired, without a mask of the missing region. This is mainly because the DI-StyleGANv2 model learns a rich generative image prior from large-scale training data, while the latent code and decoding signals produced from the degraded image guide the neural network to learn more of the image's structural and textural information.

References
[1] A. Criminisi, P. Pérez, K. Toyama. Region filling and object removal by exemplar-based image inpainting. IEEE Transactions on Image Processing, 2004, 13(9): 1200-1212.
[2] A. Telea. An image inpainting technique based on the fast marching method. Journal of Graphics Tools, 2004, 9(1): 23-34.
[3] M. Bertalmio, A.L. Bertozzi, G. Sapiro. Navier-Stokes, fluid dynamics, and image and video inpainting. IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Kauai, 2001.
[4] X. Yu, F. Porikli. Hallucinating very low-resolution unaligned and noisy face images by transformative discriminative autoencoders. CVPR, Honolulu, 2017: 3760-3768.
[5] X.B. Hu, W.Q. Ren, J. LaMaster, et al. Face super-resolution guided by 3D facial priors. ECCV 2020, Glasgow: Springer, 2020: 763-780.
[6] X.M. Li, W.Y. Li, D.W. Ren, et al. Enhanced blind face restoration with multi-exemplar images and adaptive spatial feature fusion. CVPR, 2020: 2706-2715.
[7] P. Isola, J.Y. Zhu, T.H. Zhou, et al. Image-to-image translation with conditional adversarial networks. CVPR, Honolulu, 2017: 1125-1134.
[8] T.C. Wang, M.Y. Liu, J.Y. Zhu, et al. High-resolution image synthesis and semantic manipulation with conditional GANs. CVPR, Salt Lake City, 2018: 8798-8807.
[9] I. Goodfellow, J. Pouget-Abadie, M. Mirza, et al. Generative adversarial nets. Advances in Neural Information Processing Systems, Montreal, 2014: 2672-2680.
[10] T. Karras, S. Laine, M. Aittala, et al. Analyzing and improving the image quality of StyleGAN. CVPR, 2020: 8110-8119.
[11] O. Ronneberger, P. Fischer, T. Brox. U-net: convolutional networks for biomedical image segmentation. MICCAI 2015, Munich: Springer, 2015: 234-241.
[12] T. Yang, P. Ren, X. Xie, et al. GAN prior embedded network for blind face restoration in the wild. CVPR, 2021: 672-681.
[13] C.F. Chen, X.M. Li, L.B. Yang, et al. Progressive semantic-aware style transformation for blind face restoration. CVPR, 2021: 11896-11905.
[14] L.B. Yang, C. Liu, P. Wang, et al. HiFaceGAN: face renovation via collaborative suppression and replenishment. Proceedings of the 28th ACM International Conference on Multimedia, Seattle, 2020: 1551-1560.
r a c t :A i m i n g a t t h e p r o b l e mt h a t t h e l i m i ts u r f a c e p r o d u c e db y t h ea p p r o x i m a t eL o o p s u b d i v i s i o n m e t h o d t e n d s t o s a g a n d s h r i n k ,a p r o g r e s s i v e i n t e r p o l a t i o nL o o p s u b d i v i s i o nm e t h o dw i t hd u a l a d j u s t a -b l e f a c t o r s i s p r o p o s e d .T h i sm e t h o d i n t r o d u c e s d i f f e r e n t a d j u s t a b l e f a c t o r s i n t h e t w o -p h a s eL o o p s u b d i -v i s i o nm e t h o d a n d t h e p r o g r e s s i v e i t e r a t i o n p r o c e s s ,s o t h a t t h e g e n e r a t e d l i m i t s u r f a c e i s i n t e r p o l a t e d t o a l l t h e v e r t i c e s o f t h e i n i t i a l c o n t r o lm e s h .M e a n w h i l e ,i t h a s a s t r i n g e n c y ,l o c a l i t y a n d g l o b a l i t y .I t c a nn o t o n l y f l e x i b l y c o n t r o l t h e s h a p e o f l i m i t s u r f a c e ,b u t a l s o e x p a n d t h e c o n t r o l l a b l e r a n g e o f s h a p e t o a c e r -t a i ne x t e n t .F r o mt h e n u m e r i c a l e x p e r i m e n t s ,i t c a nb e s e e n t h a t t h em e t h o d c a n r e t a i n t h e c h a r a c t e r i s t i c s o f t h e i n i t i a l t r i a n g u l a rm e s hb e t t e rb y c h a n g i n g t h ev a l u eo f d u a l a d j u s t a b l e f a c t o r s ,a n d t h e g e n e r a t e d l i m i t s u r f a c eh a s a s m a l l d e g r e e o f s h r i n k a g e ,w h i c h p r o v e s t h em e t h o d t ob e f e a s i b l e a n d e f f e c t i v e .K e y w o r d s :L o o p s u b d i v i s i o n ;p r o g r e s s i v e i n t e r p o l a t i o n ;d u a l a d j u s t a b l e f a c t o r s ;a s t r i n g e n c y ㊃19㊃第6期王 坤,等:数字图像修复的D I -S t y l e G A N v 2模型。
Inpainting

Marcelo Bertalmío, Vicent Caselles, Simon Masnou, Guillermo Sapiro

Synonyms
– Disocclusion
– Completion
– Filling-in
– Error concealment

Related Concepts
– Texture synthesis

Definition

Given an image and a region Ω inside it, the inpainting problem consists in modifying the image values of the pixels in Ω so that this region does not stand out with respect to its surroundings. The purpose of inpainting might be to restore damaged portions of an image (e.g. an old photograph where folds and scratches have left image gaps) or to remove unwanted elements present in the image (e.g. a microphone appearing in a film frame). See Figure 1. The region Ω is always given by the user, so the localization of Ω is not part of the inpainting problem. Almost all inpainting algorithms treat Ω as a hard constraint, whereas some methods allow some relaxing of the boundaries of Ω. This definition, given for a single-image problem, extends naturally to the multi-image case, therefore this entry covers both image and video inpainting. What is not however considered in this text is surface inpainting (e.g. how to fill holes in 3D scans), although this problem has been addressed in the literature.

Fig. 1. The inpainting problem. Left: original image. Middle: inpainting mask Ω, in black. Right: an inpainting result. Figure taken from [20].

Background

The term inpainting comes from art restoration, where it is also called retouching. Medieval artwork started to be restored as early as the Renaissance, the motives being often as much to bring medieval pictures "up to date" as to fill in any gaps. The need to retouch the image in an unobtrusive way extended naturally from paintings to photography and film. The purposes remained the same: to revert deterioration (e.g. scratches and dust spots in film), or to add or remove elements (e.g. the infamous "airbrushing" of political enemies in Stalin-era U.S.S.R.). In the digital domain, the inpainting problem first appeared under the name "error concealment" in telecommunications, where the need was to fill in image blocks that had been lost during data transmission. One of the first works to address automatic inpainting in a general setting dubbed it "image disocclusion," since it treated the image gap as an occluding object that had to be removed, and the image underneath would be the restoration result. Popular terms used to denote inpainting algorithms are also "image completion" and "image fill-in".

Application

The extensive literature on digital image inpainting may be roughly grouped into three categories: patch-based, sparse, and PDE/variational methods.

From texture synthesis to patch-based inpainting

Efros and Leung [14] proposed a method that, although initially intended for texture synthesis, has proven most effective for the inpainting problem. The image gap is filled in recursively, inwards from the gap boundary: each "empty" pixel P at the boundary is filled with the value of the pixel Q (lying outside the image gap, i.e. Q is a pixel with valid information) such that the neighborhood Ψ(Q) of Q (a square patch centered in Q) is most similar to the (available) neighborhood Ψ(P) of P. Formally, this can be expressed as an optimization problem:

Output(P) = Value(Q), P ∈ Ω, Q ∉ Ω, Q = arg min d(Ψ(P), Ψ(Q)),   (1)

where d(Ψ(P), Ψ(Q)) is the Sum of Squared Differences (SSD) among the patches Ψ(P) and Ψ(Q) (considering only available pixels):

d(Ψ1, Ψ2) = Σ_{i,j} |Ψ1(i,j) − Ψ2(i,j)|²,   (2)

and the indices i, j span the extent of the patches (e.g. if Ψ is an 11×11 patch then 0 ≤ i, j ≤ 10). Once P is filled in, the algorithm marches on to the next pixel at the boundary of the gap, never going back to P (whose value is, therefore, not altered again). See Figure 2 for an overview of the algorithm and Figure 3 for an example of the outputs it can achieve. The results are really impressive for a wide range of images. The main shortcomings of this algorithm are its computational cost, the selection of the neighborhood size (which in the original paper is a global user-selected parameter, but which should change locally depending on image content), the filling order (which may create unconnected boundaries for some objects) and the fact that it cannot deal well with image perspective (it was intended to synthesize frontal textures, hence neighborhoods are compared always with the same size and orientation). Also, results are poor if the image gap is very large and disperse (e.g. an image where 80% of the pixels have been lost due to random salt and pepper noise).

Fig. 2. Efros and Leung's algorithm overview (figure taken from [14]). Given a sample texture image (left), a new image is being synthesized one pixel at a time (right). To synthesize a pixel, the algorithm first finds all neighborhoods in the sample image (boxes on the left) that are similar to the pixel's neighborhood (box on the right) and then randomly chooses one neighborhood and takes its center to be the newly synthesized pixel.

Criminisi et al. [12] improved on this work in two aspects. Firstly, they changed the filling order from the original "onion-peel" fashion to a priority scheme where empty pixels at the edge of an image object have higher priority than empty pixels on flat regions. Thus, they are able to correctly inpaint straight object boundaries which could have otherwise ended up disconnected with the original formulation. See Figure 4. Secondly, they copy entire patches instead of single pixels, so this method is considerably faster. Several shortcomings remain, though, like the inability to deal with perspective and the need to manually select the neighborhood size (here there are two sizes to set, one for the patch to compare with and another for the patch to copy from). Also, objects with curved boundaries may not be inpainted correctly.

Ashikhmin [2] contributed as well to improve on the original method of Efros and Leung [14]. With the idea of reducing the computational cost of the procedure, he proposed to look for the best candidate Q to copy its value to the empty pixel P not by searching the whole image but only among the candidates generated by the neighbors of P which have already been inpainted. See Figure 5. The speed-up achieved with this simple technique is considerable, and there is also a very positive effect regarding the visual quality of the output. Other methods reduce the search space and computational cost involved in the candidate patch search by organizing image patches in tree structures, reducing the dimensionality of the patches with techniques like Principal Component Analysis (PCA), or using randomized approaches.

Fig. 3. Left: original image, inpainting mask Ω in black. Right: inpainting result obtained with Efros and Leung's algorithm, images taken from their paper [14].

While most image inpainting methods attempt to be fully automatic (aside from the manual setting of some parameters), there are user-assisted methods that provide remarkable results with just a little input from the user. In the work by Sun et al. [27] the user must specify curves in the unknown region, curves corresponding to relevant object boundaries. Patch synthesis is performed along these curves inside the image gap, by copying from patches that lie on the segments of these curves which are outside the gap, in the "known" region.
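The greedy boundary-inward filling of Eqs. (1)-(2) is short to sketch in code. The following is an illustrative, unoptimized sketch (exhaustive search, fixed square window, no priority ordering), not the authors' implementation; the function name and the tiny test sizes are hypothetical:

```python
import numpy as np

def efros_leung_fill(image, known, half=1):
    """Greedy Efros-Leung style filling: repeatedly pick a gap-boundary pixel
    and copy the center of the best SSD-matching fully-known patch (Eq. 1-2).
    image: 2D float array; known: 2D bool mask, True = valid pixel."""
    img = image.astype(float).copy()
    known = known.copy()
    H, W = img.shape

    def patch(arr, y, x):
        return arr[y - half:y + half + 1, x - half:x + half + 1]

    while not known.all():
        # pick an unknown pixel on the gap boundary (has at least one known neighbor)
        target = None
        for y, x in zip(*np.where(~known)):
            if half <= y < H - half and half <= x < W - half \
                    and patch(known, y, x).sum() > 0:
                target = (y, x)
                break
        if target is None:          # remaining unknowns touch the border: give up
            break
        ty, tx = target
        tpatch, tmask = patch(img, ty, tx), patch(known, ty, tx)
        best, best_d = None, np.inf
        # exhaustive search over fully-known candidate patches
        for y in range(half, H - half):
            for x in range(half, W - half):
                if not patch(known, y, x).all():
                    continue
                d = (((tpatch - patch(img, y, x)) ** 2) * tmask).sum()  # SSD, Eq. (2)
                if d < best_d:
                    best_d, best = d, (y, x)
        img[ty, tx] = img[best]     # copy the center value, Eq. (1)
        known[ty, tx] = True
    return img
```

The double loop over all candidate positions is exactly the computational-cost shortcoming discussed above; tree structures, PCA, or randomized search replace it in practice.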
Once these curves are completed, in a process which the authors call structure propagation, the remaining empty pixels are inpainted using a technique like the one by Ashikhmin [2] with priorities as in Criminisi et al. [12]. Barnes et al. [5] accelerate this method and make it interactive, by employing randomized searches and combining into one step the structure propagation and texture synthesis processes of Sun et al. [27].

The role of sparsity

After the introduction of patch-based methods for texture synthesis by Efros and Leung [14], and image inpainting by Criminisi et al. [12], it became clear that the patches of an image provide a good dictionary to express other parts of the image. This idea has been successfully applied to other areas of image processing, e.g. denoising and segmentation.

More general sparse image representations using dictionaries have proven their efficiency in the context of inpainting. For instance, using overcomplete dictionaries adapted to the representation of image geometry and texture, Elad et al. [15] proposed an image decomposition model with sparse coefficients for the geometry and texture components of the image, and showed that the model can be easily adapted for image inpainting. A further description of this model follows.

Fig. 4. Left: original image. Right: inpainting result obtained with the algorithm of Criminisi et al. [12], images taken from their paper.

Let u be an image represented as a vector in R^N. Let the matrices D_g, D_t of sizes N×k_g and N×k_t represent two dictionaries adapted to geometry and texture, respectively. If α_g ∈ R^{k_g} and α_t ∈ R^{k_t} represent the geometry and texture coefficients, then u = D_g α_g + D_t α_t represents the image decomposition using the dictionaries collected in D_g and D_t. A sparse image representation is obtained by minimizing

min_{(α_g,α_t): u = D_g α_g + D_t α_t} ‖α_g‖_p + ‖α_t‖_p,   (3)

where p = 0, 1. Although the case p = 0 represents the sparseness measure (i.e., the number of non-zero coordinates) it leads to a non-convex optimization problem whose minimization is more complex. The case p = 1 yields a convex and tractable optimization problem leading also to sparseness. Introducing the constraint by penalization (thus, in practice, relaxing it) and regularizing the geometric part of the decomposition with a total variation semi-norm penalization, Elad et al. [15] propose the variational model:

min_{(α_g,α_t)} ‖α_g‖_1 + ‖α_t‖_1 + λ‖u − D_g α_g − D_t α_t‖²_2 + γ TV(D_g α_g),   (4)

where TV denotes the total variation, λ, γ > 0. This model can be easily adapted to a model for image inpainting. Observe that u − D_g α_g − D_t α_t can be interpreted as the noise component of the image and λ is a penalization parameter that depends inversely on the noise power. Then the inpainting mask can be interpreted as a region where the noise is very large (infinite). Thus, if M = 0 and M = 1 identify the inpainting mask and the known part of the image, respectively, then the extension of (4) to inpainting can be written as

min_{(α_g,α_t)} ‖α_g‖_1 + ‖α_t‖_1 + λ‖M(u − D_g α_g − D_t α_t)‖²_2 + γ TV(D_g α_g).   (5)

Fig. 5. Ashikhmin's texture synthesis method (figure taken from [2]). Each pixel in the current L-shaped neighborhood generates a shifted candidate pixel (black) according to its original position (hatched) in the input texture. The best pixel is chosen among these candidates only. Several different pixels in the current neighborhood can generate the same candidate.

Writing the energy in (5) using u_g := D_g α_g, u_t := D_t α_t as unknown variables, it can be observed that α_g = D_g^+ u_g + r_g, α_t = D_t^+ u_t + r_t, where D_g^+, D_t^+ denote the corresponding pseudoinverse matrices and r_g, r_t are in the null spaces of D_g and D_t, respectively. Assuming for simplicity, as in Elad et al. [15], that r_g = 0, r_t = 0, the model (5) can be written as

min_{(u_g,u_t)} ‖D_g^+ u_g‖_1 + ‖D_t^+ u_t‖_1 + λ‖M(u − u_g − u_t)‖²_2 + γ TV(u_g).   (6)

This simplified model is justified in Elad et al. [15] by several reasons: it is an upper bound for (5), it is easier to solve, it provides good results, it has a Bayesian interpretation, and it is equivalent to (5) if D_g and D_t are non-singular, or when using the ℓ2 norm in place of the ℓ1 norm. The model has nice features since it permits to use adapted dictionaries for geometry and texture, treats the inpainting as missing samples, and the sparsity model is included with ℓ1 norms that are easy to solve.

This framework has been adapted to the use of dictionaries of patches and has been extended in several directions like image denoising, filling in missing pixels (Aharon et al. [1]), color image denoising, demosaicing and inpainting of small holes (Mairal et al. [21]), and further extended to deal with multiscale dictionaries and to cover the case of video sequences in Mairal et al. [22]. To give a brief review of this model some notation is required. Image patches are squares of size n = √n × √n. Let D be a dictionary of patches represented by a matrix of size n × k, where the elements of the dictionary are the columns of D. If α ∈ R^k is a vector of coefficients, then Dα represents the patch obtained by linear combination of the columns of D. Given an image v(i,j), i,j ∈ {1,...,N}, the purpose is to find a dictionary D̂, an image û and coefficients α̂ = {α̂_{i,j} ∈ R^k : i,j ∈ {1,...,N}} which minimize the energy

min_{(α,D,u)} λ‖v − u‖² + Σ_{i,j=1}^N μ_{i,j}‖α_{i,j}‖_0 + Σ_{i,j=1}^N ‖Dα_{i,j} − R_{i,j}u‖²,   (7)

where R_{i,j}u denotes the patch of u centered at (i,j) (dismissing boundary effects), and μ_{i,j} are positive weights. The solution of the nonconvex problem (7) is obtained using an alternate minimization: a sparse coding step where one computes α_{i,j} knowing the dictionary D for all i,j; a dictionary update using a sequence of rank-one approximation problems to update each column of D (Aharon, Elad, and Bruckstein [1]); and a final reconstruction step given by the solution of

min_u λ‖v − u‖² + Σ_{i,j=1}^N ‖D̂α̂_{i,j} − R_{i,j}u‖².   (8)

Again, the inpainting problem can be considered as a case of non-homogeneous noise. Defining for each pixel (i,j) a coefficient β_{i,j} inversely proportional to the noise variance, a value of β_{i,j} = 0 may be taken for each pixel in the inpainting mask. Then the inpainting problem can be formulated as

min_{(α,D,u)} λ‖β ⊗ (v − u)‖² + Σ_{i,j=1}^N μ_{i,j}‖α_{i,j}‖_0 + Σ_{i,j=1}^N ‖(R_{i,j}β) ⊗ (Dα_{i,j} − R_{i,j}u)‖²,   (9)

where β = (β_{i,j})_{i,j=1}^N, and ⊗ denotes the elementwise multiplication between two vectors. With suitable adaptations, this model has been applied to inpainting (of relatively small holes), to interpolation from sparse irregular samples and super-resolution, to image denoising, demosaicing of color images, and video denoising and inpainting, obtaining excellent results; see Mairal et al. [22].

PDEs and variational approaches

All the methods mentioned so far are based on the same principle: a missing/corrupted part of an image can be well synthesized by suitably sampling and copying uncorrupted patches (taken either from the image itself or built from a dictionary). A very different point of view underlies many contributions involving either a variational principle, through a minimization process, or a (not necessarily variational) partial differential equation (PDE).

An early interpolation method that applies for inpainting is due to Ogden, Adelson, Bergen, and Burt [24]. Starting from an initial image, a Gaussian filtering is built by iterated convolution and subsampling. Then, a given inpainting domain can be filled in by successive linear interpolations, downsampling and upsampling at different levels of the Gaussian pyramid. The efficiency of such approach is illustrated in Figure 6.

Fig. 6. An inpainting experiment taken from Ogden et al. [24]. The method uses a Gaussian pyramid and a series of linear interpolations, downsampling, and upsampling.

Masnou and Morel proposed in [23] to interpolate a gray-valued image by extending its isophotes (the lines of constant intensity) in the inpainting domain. This approach is very much in the spirit of early works by Kanizsa, Ullman, Horn, Mumford and Nitzberg to model the ability of the visual system to complete edges in an occlusion or visual illusion context. This is illustrated in Figure 7. The general completion process involves complicated phenomena that cannot be easily and univocally modeled. However, experimental results show that, in simple occlusion situations, it is reasonable to argue that the brain extrapolates broken edges using elastica-type curves, i.e., curves that join two given points with prescribed tangents at these points, a total length lower than a given L, and minimize the Euler elastica energy ∫|κ(s)|² ds, with s the curve arc-length and κ the curvature.

Fig. 7. Amodal completion: the visual system automatically completes the broken edge in the left figure. The middle figure illustrates that, here, no global symmetry process is involved: in both figures, the same edge is synthesized. In such simple situation, the interpolated curve can be modeled as a Euler's elastica, i.e. a curve with clamped points and tangents at its extremities, and with minimal oscillations.

The model by Masnou and Morel [23] generalizes this principle to the isophotes of a gray-valued image. More precisely, denoting Ω̃ a domain slightly larger than Ω, it is proposed in [23] to extrapolate the isophotes of an image u, known outside Ω and valued in [m,M], by a collection of curves {γ_t}_{t∈[m,M]} with no mutual crossings, that coincide with the isophotes of u on Ω̃\Ω and that minimize the energy

∫_m^M ∫_{γ_t} (α + β|κ_{γ_t}|^p) ds dt.   (10)

Here α, β are two context-dependent parameters. This energy penalizes a generalized Euler's elastica energy, with curvature to the power p > 1 instead of 2, of all extrapolation curves γ_t, t ∈ [m,M]. An inpainting algorithm, based on the minimization of (10) in the case p = 1, is proposed by Masnou and Morel in [23]. A globally minimal solution is computed using a dynamic programming approach that reduces the algorithmic complexity. The algorithm handles only simply connected domains, i.e., those with no holes. In order to deal with color images, RGB images are turned into a luma/chrominance representation, e.g. YCrCb, or Lab, and each channel is processed independently. The reconstruction process is illustrated in Figure 8.

The word inpainting, in the image processing context, has been coined first by Bertalmío, Sapiro, Caselles, and Ballester in [7], where a PDE model is proposed in the very spirit of real paintings restoration. More precisely, being u a gray-valued image to be inpainted in Ω, a time stepping method for the transport-like equation

u_t = ∇⊥u · ∇Δu in Ω,
u given in Ω^c,   (11)

is combined with anisotropic diffusion steps that are interleaved for stabilization, using the following diffusion model

u_t = φ_ε(x) |∇u| ∇·(∇u/|∇u|),   (12)

where φ_ε is a smooth cut-off function that forces the equation to act only in Ω, and ∇·(∇u/|∇u|) is the curvature along isophotes.

Fig. 8. 8(a) is the original image and 8(b) the image with occlusions in white. The luminance channel is shown in Figure 8(c). A few isophotes are drawn in Figure 8(d) and their reconstruction by the algorithm of Masnou and Morel [23] is given in Figure 8(e). Applying the same method to the luminance, hue, and saturation channels yields the final result of Figure 8(f).

This diffusion equation, that has been widely used for denoising an image while preserving its edges, compensates any shock possibly created by the transport-like equation. What is the meaning of Equation (11)? Following Bertalmío et al. [7], Δu is a measure of image smoothness, and stationary points for the equation are images for which Δu is constant along the isophotes induced by the vector field ∇⊥u. Equation (11) is not explicitly a transport equation for Δu, but, in the equivalent form

u_t = −∇⊥Δu · ∇u,   (13)

it is a transport equation for u being convected by the field ∇⊥Δu. Following Bornemann and März [9], this field is in the direction of the level lines of Δu, which are related to the Marr-Hildreth edges. Indeed, the zero crossings of (a convoluted version of) Δu are the classical characterization of edges in the celebrated model of Marr and Hildreth. In other words, as in the real paintings restoration, the approach of Bertalmío et al. [7] consists in conveying the image intensities along the direction of the edges, from the boundary of the inpainting domain Ω towards the interior. The efficiency of such approach is illustrated in Figure 9. From a numerical viewpoint, the transport equation and the anisotropic diffusion can be implemented with classical finite difference schemes. For color images, the coupled system can be applied independently to each channel of any classical luma/chrominance representation. There is no restriction on the topology of the inpainting domain.

Fig. 9. An experiment taken from Bertalmío et al. [7]. Left: original image. Middle: a user-defined mask. Right: the result with the algorithm of [7].

Another perspective on this model is provided by Bertalmío, Bertozzi, and Sapiro in [6], where connections with the classical Navier-Stokes equation of fluid dynamics are shown. Indeed, the steady state equation of Bertalmío et al. [7],

∇⊥u · ∇Δu = 0,

is exactly the equation satisfied by steady state inviscid flows in the two-dimensional incompressible Navier-Stokes model. Although the anisotropic diffusion equation (12) is not the exact counterpart of the viscous diffusion term used in the Navier-Stokes model for incompressible and Newtonian flows, yet a lot of the numerical knowledge on fluid mechanics seems to be adaptable to design stable and efficient schemes for inpainting. Results in this direction are shown in Bertalmío, Bertozzi, and Sapiro [6].

Chan and Shen propose in [11] a denoising/inpainting first-order model based on the joint minimization of a quadratic fidelity term outside Ω and a total variation criterion in Ω, i.e., the joint energy

∫_A |∇u| dx + λ ∫_{A\Ω} |u − u_0|² dx,

with A ⊃⊃ Ω the image domain and λ a Lagrange multiplier. The existence of solutions to this problem follows easily from the properties of functions of bounded variation. As for the implementation, Chan and Shen look for critical points of the energy using a Gauss-Jacobi iteration scheme for the linear system associated to an approximation of the Euler-Lagrange equation by finite differences. More recent approaches to the minimization of total variation with subpixel accuracy should nowadays be preferred. From the phenomenological point of view, the model of Chan and Shen [11] yields inpainting candidates with the smallest possible isophotes. It is therefore more suitable for thin or sparse domains. An illustration of the model's performances is given in Figure 10.

Fig. 10. An experiment taken from Chan and Shen [11]. Left: original image. Right: after denoising and removal of text.

Turning back to the criterion (10), a similar penalization on Ω̃ of both the length and the curvature of all isophotes of an image u yields two equivalent forms, in the case where u is smooth enough (see Masnou and Morel [23]):

∫_{−∞}^{+∞} ∫_{{u=t}∩Ω̃} (α + β|κ|^p) ds dt = ∫_{Ω̃} |∇u| (α + β|∇·(∇u/|∇u|)|^p) dx.   (14)

There have been various contributions to the numerical approximation of critical points for this criterion. A fourth-order time-stepping method is proposed by Chan, Kang, and Shen in [10] based on the approximation of the Euler-Lagrange equation, for the case p = 2, using upwind finite differences and a min-mod formula for estimating the curvature. Such high-order evolution method suffers from well-known stability and convergence issues that are difficult to handle.

A model, slightly different from (14), is tackled by Ballester, Bertalmío, Caselles, Sapiro, and Verdera in [4] using a relaxation approach. The key idea is to replace the second-order term ∇·(∇u/|∇u|) with a first-order term depending on an auxiliary variable. More precisely, Ballester et al. study in [4] the minimization of

∫_{Ω̃} |∇·θ|^p (a + b|∇k∗u|) dx + α ∫_{Ω̃} (|∇u| − θ·∇u) dx,

under the constraint that θ is a vector field with subunit modulus and prescribed normal component on the boundary of Ω̃, and u takes values in the same range as in Ω^c. Clearly, θ plays the role of ∇u/|∇u| but the new criterion is much less singular. As for k, it is a regularizing kernel introduced for technical reasons in order to ensure the existence of a minimizing couple (u,θ). The main difference between the new relaxed criterion and (14), besides singularity, is the term ∫_{Ω̃} |∇·θ|^p, which is more restrictive, despite the relaxation, than ∫_{Ω̃} |∇u| |∇·(∇u/|∇u|)|^p dx. However, the new model has a nice property: a gradient descent with respect to (u,θ) can be easily computed and yields two coupled second-order equations whose numerical approximation is standard. Results obtained with this model are shown in Figure 11.

Fig. 11. Two inpainting results obtained with the model proposed by Ballester et al. [4]. Observe in particular how curved edges are restored.

The Mumford-Shah-Euler model by Esedoglu and Shen [17] is also variational. It combines the celebrated Mumford-Shah segmentation model for images and the Euler's elastica model for curves, i.e., denoting u a piecewise weakly smooth function, that is a function with integrable squared gradient out of a discontinuity set K ⊂ Ω̃, the proposed criterion reads

∫_{Ω̃\K} |∇u|² dx + ∫_K (α + βκ²) ds.

Two numerical approaches to the minimization of such criterion are discussed in Esedoglu and Shen [17]: first, a level set approach based on the representation of K as the zero-level set of a sequence of smooth functions that concentrate, and the explicit derivation, using finite differences, of the Euler-Lagrange equations associated with the criterion; the second method addressed by Esedoglu and Shen is a Γ-convergence approach based on a result originally conjectured by De Giorgi and recently proved by Schätzle. In both cases, the final system of discrete equations is of order four, facing again difficult issues of convergence and stability. More recently, following the work of Grzibovskis and Heintz on the Willmore flow, Esedoglu, Ruuth, and Tsai [16] have addressed the numerical flow associated with the Mumford-Shah-Euler model using a promising convolution/thresholding method that is much easier to handle than the previous approaches.

Tschumperlé proposes in [28] an efficient second-order anisotropic diffusion model for multi-valued image regularization and inpainting. Given a R^N-valued image u known outside Ω, and starting from an initial rough inpainting obtained by straightforward advection of boundary values, the pixels in the inpainting domain are iteratively updated according to a finite difference approximation to the equations

∂u_i/∂t = trace(T ∇²u_i), i ∈ {1,...,N}.

Here, T is the tensor field defined as

T = (1 + λ_min + λ_max)^{−α₁} v_min ⊗ v_min + (1 + λ_min + λ_max)^{−α₂} v_max ⊗ v_max,

with 0 < α₁ << α₂, and λ_min, λ_max, v_min, v_max are the eigenvalues and eigenvectors, respectively, of G_σ ∗ Σ_{i=1}^N ∇u_i ⊗ ∇u_i, being G_σ a smoothing kernel and Σ_{i=1}^N ∇u_i ⊗ ∇u_i the classical structure tensor, that is known for representing well the local geometry of u. Figure 12 reproduces an experiment taken from Tschumperlé [28].

Fig. 12. An inpainting experiment (the middle image is the mask defined by the user) taken from Tschumperlé [28].

The approach of Auroux and Masmoudi in [3] uses the PDE techniques that have been developed for the inverse conductivity problem in the context of crack detection. The link with inpainting is the following: missing edges are modeled as cracks and the image is assumed to be smooth out of these cracks. Given a crack, two inpainting candidates can be obtained as the solutions of the Laplace equation with Neumann condition along the crack and either a Dirichlet, or a Neumann condition on the domain's boundary. The optimal cracks are those for which the two candidates are the most similar in quadratic norm, and they can be found through topological analysis, i.e. they correspond to the set of points where putting a crack mostly decreases the quadratic difference. Both the localization of the cracks and the associated piecewise smooth inpainting solutions can be found using fast and simple finite difference schemes.

Finally, Bornemann and März propose in [9] a first-order model to advect the image information along the integral curves of a coherence vector field that extends in Ω the dominant directions of the image gradient. This coherence field is explicitly defined, at every point, as the normalized eigenvector to the minimal eigenvalue of a smoothed structure tensor whose computation carefully avoids boundary biases in the vicinity of ∂Ω. Denoting c the coherence field, Bornemann and März show that the equation c·∇u = 0 with Dirichlet boundary constraint can be obtained as the vanishing viscosity limit of an efficient fast-marching scheme: the pixels in Ω are synthesized one at a time, according to their distance to the boundary. The new value at a pixel p is a linear combination of both known and previously generated values in a neighborhood of p. The key ingredient of the method is the explicit definition of the linear weights according to the coherence field c. Although the Bornemann-März model requires a careful tune of four parameters, it is much faster than the PDE approaches mentioned so far, and performs very well, as illustrated in Figure 13.

Fig. 13. An inpainting experiment taken from Bornemann and März [9], with a reported computation time of 0.4 sec.

Combining and extending PDEs and patch models

In general, most PDE/variational methods that have been presented so far perform well for inpainting either thin or sparsely distributed domains. However, there is a common drawback to all these methods: they are unable to restore texture properly, and this is particularly visible on large inpainting domains, like for instance in the inpainting result of Figure 12 where the diffusion method is not able to recover the parrot's texture. On the other hand, patch-based methods are not able to handle sparse inpainting domains like in Figure 14, where no valid squared patch can be found that does not reduce to a point. On the contrary, most PDE/variational methods remain applicable in such situation, like in Figure 14 where the model proposed by Masnou and Morel [23] yields the
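The simplest member of this PDE family, stripped of the transport term of Eq. (11) and of any anisotropy, is plain harmonic (Laplace) interpolation of the hole from its boundary values. The toy sketch below shows the generic structure of finite-difference inpainting iterations (fixed known pixels, repeated neighbor averaging on the hole); it blurs edges and does not continue isophotes, so it is an illustration of the numerical machinery only, not of any of the models above:

```python
import numpy as np

def harmonic_inpaint(image, mask, iters=2000):
    """Fill the True pixels of `mask` by iterating u <- average of the 4
    neighbors there, keeping known pixels fixed (Jacobi iteration for the
    Laplace equation with Dirichlet data on the hole boundary)."""
    u = image.astype(float).copy()
    u[mask] = u[~mask].mean()            # rough initialization inside the hole
    for _ in range(iters):
        avg = 0.25 * (np.roll(u, 1, 0) + np.roll(u, -1, 0)
                      + np.roll(u, 1, 1) + np.roll(u, -1, 1))
        u[mask] = avg[mask]              # update only the unknown pixels
    return u
```

Because a linear ramp is harmonic, a hole cut out of a ramp image is recovered exactly (up to iteration tolerance), which makes the scheme easy to sanity-check; `np.roll` wraps at the borders, so the hole is assumed not to touch the image boundary.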
Blind image denoising based on non-local similarity and a low-rank model
YU Jing, YANG Xiaomei (School of Electrical Engineering and Information, Sichuan University, Chengdu 610065, China)
Journal: Computer Engineering and Design, 2016, 37(4): 959-963. CLC number: TN911.72
Keywords: blind image denoising; adaptive; non-local similarity; low rank; singular value thresholding

Abstract: The performance of most current image denoising algorithms depends on a noise-level parameter supplied with the input. To remove this dependency and further improve denoising, an improved blind denoising method based on non-local similarity and a low-rank model is proposed. The global noise variance of the image is estimated in advance; then, within the non-local-similarity and low-rank framework, the local noise variance of each image patch is estimated adaptively, the local threshold parameter for singular value thresholding (SVT) of each patch is determined, and denoising is completed with an iterative rule. To verify the method's effectiveness, it is compared in simulation with three mature denoising algorithms. The results show that, for images with unknown noise variance, the method is superior both visually and in terms of peak signal-to-noise ratio (PSNR) and structural similarity (SSIM), has better adaptivity, and is better suited to practical image denoising.

In recent years, sparse and non-local models have been widely used in signal and image processing.
A sparse model represents a signal as a linear combination of a few atoms from a basis or dictionary [1]; the K-SVD (K-singular value decomposition) method [1], based on sparse representation and dictionary-learning filtering, emphasizes the sparse representation of samples over a dictionary but ignores non-local neighborhood information. A non-local model exploits the self-similarity of image structures to recover the original image [2]; examples are non-local means (NLM) filtering [2] and block-matching and 3D filtering (BM3D) [3]. Joint sparse and low-rank models then emerged, capturing the correlation among multiple samples. Low rank can be interpreted as sparsity of a matrix's singular values; it first appeared in matrix completion and was developed through the work of Candès et al. [4]. The spatially adaptive iterative singular value thresholding (SAIST) method [5], which combines non-local image information with a low-rank model, has been applied to image denoising with good results.
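The singular value thresholding step at the core of such low-rank methods can be sketched as follows. This is a generic soft-thresholding of singular values, not the exact adaptive iterative rule of SAIST or of the paper above; the rank-1 patch matrix and the threshold value are toy assumptions:

```python
import numpy as np

def svt(M, tau):
    """Singular value soft-thresholding: the proximal map of tau*||X||_*,
    i.e. shrink every singular value of M by tau (clipping at zero)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

# toy use: a rank-1 "stack of similar patches" plus Gaussian noise
rng = np.random.default_rng(0)
L = np.outer(rng.standard_normal(32), rng.standard_normal(16))  # clean, rank 1
noisy = L + 0.1 * rng.standard_normal(L.shape)
den = svt(noisy, tau=1.0)   # threshold fixed by hand; the paper adapts it per patch
```

Because the noise spreads its energy over many small singular values while the clean patch stack concentrates in a few large ones, shrinking all singular values by tau removes mostly noise.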
Research proposal: iterative methods for non-expansive imaging and pseudo-compressive imaging

Title: Research on iterative methods for non-expansive imaging and pseudo-compressive imaging

Background: In the field of image processing, image compression and recovery are very important research directions. Traditional compression algorithms are mainly based on techniques such as the discrete cosine transform (DCT) and the wavelet transform; these algorithms have high computational complexity and long running times. To improve the efficiency of image processing, researchers have in recent years proposed iterative algorithms such as non-expansive imaging and pseudo-compressive imaging, which reduce the amount of computation and speed up image processing.
The non-expansive imaging algorithm is an iterative algorithm based on projection operators: it shrinks the original image through a nonlinear mapping while keeping its features in pixel space unchanged. Since it requires no conversion of pixel values, it preserves image quality and sharpness. The pseudo-compressive imaging algorithm is a new type of compression algorithm: it projects the original image into a lower-dimensional space through a random mapping, reducing the required storage space and computation time. Through an acquisition-and-reconstruction algorithm, the compressed image can be restored to its original quality.
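The random-projection compression and recovery idea described above can be sketched as follows. This is a hedged illustration, not the proposal's algorithm: a Gaussian random map stands in for the random mapping, and a minimum-norm pseudo-inverse stands in for the reconstruction step (a practical scheme would add a sparsity prior to recover the signal itself, not just fit the measurements):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(64)         # flattened "image": 64 pixel values
A = rng.standard_normal((32, 64))   # random projection into a 32-dim space

y = A @ x                           # compressed representation: half the storage
x_hat = np.linalg.pinv(A) @ y       # minimum-norm estimate consistent with y
```

Since A has full row rank with overwhelming probability, the estimate reproduces the measurements exactly (A @ x_hat == y), even though recovering x itself would require additional structure such as sparsity.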
Objective: This work studies the non-expansive imaging and pseudo-compressive imaging algorithms, explores their applications in image compression and recovery, and proposes corresponding iterative methods so that compressed images retain sharpness and accuracy after recovery.

Contents: 1. Review the principles and applications of the non-expansive imaging and pseudo-compressive imaging algorithms; 2. Propose corresponding iterative methods based on the two algorithms; 3. Build a sample dataset, run experiments with the proposed iterative methods, and evaluate their performance; 4. Analyze the experimental results, summarize the conclusions, and explore further application scenarios.

Significance: Non-expansive imaging and pseudo-compressive imaging are advanced techniques in image processing that can be applied in many areas, such as image storage and image transmission. By studying these two algorithms in depth, proposing iterative methods and conducting experiments, this work helps improve the speed and quality of image processing and provides guidance and reference for research on related applications.

Expected outcomes: 1. Iterative methods based on the non-expansive imaging and pseudo-compressive imaging algorithms; 2. A sample dataset and an experimental evaluation of the proposed iterative methods; 3. Publication of journal and conference papers, and promotion of the results to industrial applications.
Vol. 41 No. 4, Journal of Jilin University (Information Science Edition), July 2023
Article ID: 1671-5896(2023)04-0621-10

Point cloud segmentation of surface damage based on a feature-updating dynamic graph convolution method

Received: 2022-09-21. Funded by the National Natural Science Foundation of China (61573185).
Authors: ZHANG Wenrui (1998-), from Yangzhou, Jiangsu; M.S. candidate at Nanjing University of Aeronautics and Astronautics, working on point cloud segmentation. WANG Congqing (1960-), from Nanjing; professor and doctoral supervisor at Nanjing University of Aeronautics and Astronautics, working on pattern recognition and intelligent systems.

ZHANG Wenrui, WANG Congqing (School of Automation, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China)

CLC number: TP391.41; document code: A

Cloud Segmentation Method of Surface Damage Point Based on Feature Adaptive Shifting-DGCNN
ZHANG Wenrui, WANG Congqing (School of Automation, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China)

Abstract: The cloud data of metal part surface damage point requires high local feature analysis ability of the segmentation network, and the traditional algorithm with weak local feature analysis ability can not achieve the ideal segmentation effect for the data set. The relative damage volume and other features are selected to classify the metal surface damage, and the damage is divided into six categories. This paper proposes a method to extract the attention feature of 3D map containing spatial scale area information. The obtained spatial scale area feature is used in the design of feature update network module. Based on the feature update module, a feature updated dynamic graph convolution network is constructed for point cloud semantic segmentation. The experimental results show that the proposed method is helpful for more effective point cloud segmentation to extract the local features of point cloud. In metal surface damage
segmentation, the accuracy of this method is better than PointNet++, DGCNN (Dynamic Graph Convolutional Neural Networks) and other methods, which improves the accuracy and effectiveness of segmentation results.
Key words: point cloud segmentation; dynamic graph convolution; feature adaptive shifting; damage classification

0 Introduction

Deep-learning-based image segmentation is already mature in face recognition, license-plate recognition and satellite-image analysis; to obtain more complete three-dimensional information about objects, semantic segmentation must be extended to 3D point cloud data. Point clouds are sparse and unordered, and their particular geometric feature distribution and 3D attributes make point cloud semantic segmentation difficult in many application fields: in robotics and computer vision, point clouds serve object detection, tracking and reconstruction; in architecture, the extraction and recognition of 3D geometric information of buildings and land; in autonomous driving, the acquisition, detection and segmentation of road traffic objects, road surfaces and maps.

In 2017, Lawin et al. [1] projected point clouds onto multiple views, segmented the projections, and mapped the segmentation results back onto the original point cloud. The earliest voxel-based deep network dates to 2015, when Maturana et al. [2] created the VoxNet architecture, built on a volumetric representation of the point cloud that learns the distribution of points from 3D voxel shapes. Combined with the point cloud gridding proposed by Le et al. [3], new networks such as PointGrid appeared, efficiently integrating points and grids; but voxelized point clouds perform poorly on files with very large numbers of points. Converting irregular point clouds into regular intermediate forms such as projections and voxels loses much spatial information. To exploit the intrinsic characteristics of point cloud data, networks that consume raw point clouds directly were gradually proposed. In 2017, Qi et al. [4] developed PointNet, which learns features directly from raw point clouds, and then proposed PointNet++ [5], improving PointNet's representation of point-to-point relations. Hu et al. [6] proposed SENet (Squeeze-and-Excitation Networks), introducing channel attention to 3D point cloud deep learning by recalibrating channel responses. In 2018, Li et al. [7] proposed PointCNN, whose X-Conv module couples information over larger distances without significantly increasing the number of parameters.

A graph convolutional network [8] is a deep neural network that passes information between the nodes of a graph to capture the relations among graphs. A graph can be viewed as a set of vertices and edges; making every point a vertex would incur prohibitive computation, so an edge convolution layer (EdgeConv) built on k-nearest-neighbor computation [9] is used, taking the center point and its neighborhood points as edge features and extracting them. As a new framework for point cloud deep learning, graph convolutional networks remedy some shortcomings of PointNet-style networks [10].

For the segmentation of feature-deficient point clouds such as irregular surface damage, 2D image data and convolutional neural networks have been used to detect damage on fan blades, buildings, vehicles and the like [11], mainly cracks and peeled paint. But 2D image segmentation covers too few damage types and is affected by surface contamination and lighting: dents and bulges may be overlooked, or uneven illumination misjudged as peeled paint. This paper proposes a feature-updating dynamic graph convolution network aimed at 3D point cloud segmentation, with a newly designed feature update module. Exploiting the particular spatial structure of 3D point clouds, neighborhood points that would receive similar weights in a traditional k-neighborhood are distinguished by spatial scale, and the method is applied to the problem of useful and useless information being mixed in metal surface damage segmentation. Neighborhood points are partitioned by spatial scale, attention weights are grouped, and features are updated within each group. While effectively identifying the errors caused by interfering features from outer neighborhoods, this enlarges the feature extraction area and improves the usefulness of local region features.

1 Deep Convolutional Network Computation

1.1 3D graph attention feature extraction with spatial scale region information

The iterative farthest point sampling algorithm splits the whole point cloud into n point sets {M_1, M_2, M_3, …, M_n}, each containing k points: {P_1, P_2, P_3, …, P
k}. Based on the spatial-scale relations inside each point set, the local region is divided into different spatial regions. Within each region, local features are combined with spatial scale features to obtain more discriminative feature information. Following the attention mechanism, different weights are assigned to the points in a k-neighborhood; the feature information includes the distribution of points within the spatial region and the region's characteristics. Weighting and summing this feature information gives the convolution result of the point set.

Using this spatial-scale-region 3D graph attention feature extraction requires suitable values of the neighborhood parameter K and the number of spatial division layers R. If K is too small, segmentation is weak: local features cannot be fully exploited and accuracy suffers; if K is too large, computation time and data volume grow. Figure 1 shows segmentation results of a defect damage under different K. At K = 30 or 50 the results are good, with less computation at K = 30, so K = 30 is chosen as the experimental parameter.

Fig. 1. Segmentation results of defect damage under different parameters K

Before fixing the number of spatial division layers R, consider briefly the problem the division addresses. The sparsity and disorder of 3D point clouds, together with the intrinsic noise and the many corner points of damage point clouds, lead to a common defect of point cloud processing: outlier points are selected as in-neighborhood sampling points. Since a damaged surface is mostly a single face, the segmented damage points should lie on that face, whereas noise points are distributed on both sides of the face and even partly inside the damage. Because of this volumetric distribution of point cloud noise, outliers end up selected as neighborhood samples. Sampling experiments with the DGCNN (Dynamic Graph Convolutional Neural Networks) segmentation network show that outlier points near the section plane and inside the damage affect the segmentation result most and are most likely to be wrongly segmented as feature points, so such noise points must be treated with priority in the subsequent preprocessing.

Based on these results, with K = 30, the number of spatial division layers R is selected. Figure 2 shows segmentation of a defect damage under different R; Figure 2b is closest to the test-set label segmentation, best reflects the damage features, and masks most of the noise, so R = 4 is chosen as the experimental parameter.

Fig. 2. Segmentation results of defect damage under different parameters R

Within a k-neighborhood, the spatial relation and feature difference between a neighborhood point and the center point best express the neighborhood point's weight. The spatial feature coefficient expresses the importance of a neighborhood point to the point set of its center. To better distinguish the weights of neighborhood points in the graph, the whole neighborhood must be subdivided, and subdividing by spatial scale is a suitable classification. The k-neighborhood of a center point is viewed as a local space and divided into r different scale regions; a spatial attention mechanism then assigns weight coefficients to these r regions. Multi-level division by spatial scale loses none of the core neighborhood-point features while effectively suppressing meaningless and interfering ones, improving the network's ability to learn the local spatial features of the point cloud and reducing the mutual influence of adjacent neighborhoods. The spatial attention mechanism is shown in Fig. 3 and computed in the following steps.

Fig. 3. Schematic diagram of the attention feature extraction method with spatial scale region information

Step 1: compute the feature coefficient e_mk, the weight of the k-th neighborhood point of each center point m. Let Δp_mk and Δf_mk denote the 3D spatial relation and the local feature difference, M the MLP (Multi-Layer Perceptrons) operation and C the concat function, with Δp_mk = p_mk − p_m and Δf_mk = M(f_mk) − M(f_m). The two are concatenated and fed into a multi-layer perceptron:

e_mk = M[C(Δp_mk ‖ Δf_mk)].  (1)

Step 2: compute the graph weight coefficient a_mk, the share of the k-th neighborhood point in the weight of its center point m, where k ∈ {1, 2, 3, …, K} and K is the number of points per neighborhood. The feature coefficients e_mk are normalized with the softmax function S, giving the multi-class weights

a_mk = S(e_mk) = exp(e_mk) / Σ_{g=1}^{K} exp(e_mg).  (2)

Step 3: let the spatial scale region feature s_mr denote the feature of the r-th spatial scale region of center point m, where k_r ∈ {1, 2, 3, …, K_r} and K_r is the number of neighborhood points in region r. A feature bias term b_r is added to avoid a one-sided accumulated error direction of the weighted features in the dynamic graph:

s_mr = Σ_{k_r=1}^{K_r} [a_mk_r M(f_mk_r)] + b_r.  (3)

Computing over the r spatial scale regions yields, for point m
, the complete set of spatial scale region features over the whole local region, s_m = {s_m1, s_m2, s_m3, …, s_mr}, with r ∈ {1, 2, 3, …, R}.

1.2 Feature-updating dynamic graph convolution network

The dynamic graph convolution network is a deep learning network that directly processes raw 3D point cloud input. Its hallmark is replacing the feature transform module of PointNet with an edge convolution layer (EdgeConv) composed of k-nearest-neighbor computation and multi-layer perceptrons [12]. EdgeConv is powerful: its features contain not only global features but also local features formed by the spatial relations between center points and their neighborhood points. In a dynamic graph convolution network, every neighborhood is treated as a point set; strengthening the feature learning of its center point strengthens the network as a whole [13]. In a neighborhood point set, the edge points contributing the fewest valid local features to the center can be regarded as abnormal noise points or low-weight points and may bring edge overflow to the overall segmentation. Compared with 2D images, a point cloud is a sparser and noisier information carrier. Simply deleting or simply accepting the noise points in a region degrades feature extraction; here they are assigned to a low-weight group and their features are updated within the region, strengthening noise robustness while avoiding loss of point cloud information.

In a spatial scale region T, let s points x be assigned to the low-weight-coefficient group. The spatial information set of this point set is P ∈ R^{N_s×3} and its local feature set is F ∈ R^{N_s×D_f} [14], where D_f is the dimension of the feature space and N_s the set of the s in-region points. Let p_i and f_i be the spatial and feature information of point x_i. Within the point set, a small-range N-neighborhood search around x_i gives its neighborhood points {x_{i,1}, x_{i,2}, …, x_{i,N}} ∈ N(x_i), with feature set {f_{i,1}, f_{i,2}, …, f_{i,N}} ∈ F. After the spatial scale division, regions with a low spatial scale region feature s_mt undergo in-region feature updating: an aggregation function rewrites the local graph features of the lowest-weight neighborhood points. Given a center point m, the feature f_mx_i of point x_i and the spatial scale region feature s_mt, the goal is f′_mx_i, the new feature of the low-weight neighborhood point x_i of center point m after neighborhood feature updating. For point x_i in region T and ∀x_{i,j} ∈ H(x_i), the feature-similarity field between x_i and the neighborhood points of its neighborhood H is

R(x_i, x_{i,j}) = S[C(f_{i,j})^T C(f_{i,j}) / D_o],  (4)

where C is a 1D convolution from the input to the output dimension, D_o the output dimension value and T the transpose; this yields the updated feature of x_i. R(x_i, x_{i,j}) is aggregated and the feature f_mx_i is transformed to the output dimension:

f′_mx_i = Σ [R(x_i, x_{i,j}) S(s_mt f_mx_i)].  (5)

Fig. 4. Schematic diagram of the feature update network module, illustrating the computation above
Fig. 5. Schematic diagram of the feature-updating dynamic graph convolution network

The dynamic graph convolution network (DGCNN) performs edge convolution layer by layer with its own EdgeConv module [15]; the output of each layer dynamically produces a new feature space and new local regions, and each new layer learns features from the previous one (see Fig. 5). In each layer's edge convolution module, the spatial scale region attention feature is added after edge convolution and pooling, capturing the neighborhood points in a specific spatial region T for feature updating. Feature updating reduces the contamination of local features by regional outliers. Compared with traditional graph convolutional networks, the network obtains more feature information and resists interference better on point cloud data with more noise [16]; on unstable, non-smooth point clouds containing salient centers to be collected and segmented, the anti-interference effect is better. Compared with traditional preprocessing, it is more stable and does not wrongly segment or miss the salient parts [17].

2 Experimental Results and Analysis

The accuracy of point cloud segmentation is evaluated with two indices [18]: the mean intersection over union and the overall accuracy. The mean intersection over union U (MIoU: Mean Intersection over Union) is the mean ratio of intersection to union between ground truth and prediction:

U = (1/(T+1)) Σ_{a=0}^{T} p_aa / (Σ_{b=0}^{T} p_ab + Σ_{b=0}^{T} p_ba − p_aa),  (6)

where T is the number of classes, a the ground truth, b the prediction, and p_ab the number of points of class a predicted as class b. The overall accuracy A (OA: Overall Accuracy) is the ratio of all correctly predicted points P_c to the total number of points P_all in the point cloud model:

A = P_c / P_all
,  (7)

where larger U and A indicate a more accurate segmentation network, and U ≤ A.

2.1 Experimental preparation and data preprocessing

Depth maps of damaged metal part surfaces were captured with a Kinect V2 using the Depth Basics-WPF module and converted with the SDK (Software Development Kit) into point cloud data in pcd format. The depth image resolution of the Kinect V2 is fixed at 512×424 pixels; to obtain clearer data images, acquisition should be as close as possible. A distance range of 0.6-1.2 m was chosen, increasing from 0.6 m in steps of 0.2 m to obtain several groups of acquired data.

Noise is distributed throughout the point cloud, and leaving it unfiltered would harm subsequent processing. Following statistical principles, the neighborhood of each point is analyzed and a dedicated standard deviation is established; the actual point distribution is compared with an assumed Gaussian distribution, and points whose error exceeds the standard deviation are considered noise [19]. Because the point cloud data volume is large, the following accelerated variant is adopted: for each point, compute the spatial distance L_1 to its first neighborhood point and L_k to its k-th neighborhood point; the 1/K of points with the largest difference between L_1 and L_k are treated as candidate noise points [20]. For each candidate, the mean distance to its K neighborhood points is computed; candidates whose mean exceeds the standard deviation are marked as noise, and removing these outliers completes the filtering.

2.2 Extraction and segmentation of key information in metal surface damage point clouds

When building the training set for damage segmentation, uniformly labeling all damage as a single class is not only inconvenient for result analysis and application but also degrades feature segmentation. To ease analysis and control of the segmentation, ArcGIS is used to convert the point cloud model into a triangulated irregular network (TIN). Using the surface-contour properties of the TIN, the inner (outer) damage volume and the damage surface contour area of the training damage point clouds are obtained, as shown in Fig. 6.

Fig. 6. Schematic diagram of a triangulated irregular network

Two damage volume indices are chosen: the relative damage volume V (RDV: Relative Damage Volume) and the neighborhood relative damage volume ratio N (NRDVR: Neighborhood Relative Damage Volume Ratio). The RDV is obtained from the part between the relative mean depth plane and the gridded depth plane of the point cloud. The TIN neighborhood grid gives the relative depth share of a damage within its neighborhood, which effectively resolves the problem, when making the test set, of mistaking relative depth caused by curvature or shape for damage. The two indices are

V = (Σ_{k=1}^{P_d} h_k / P_d − Σ_{k=1}^{P} h_k / P) S_d,  (8)

N = [P_n S_d Σ_{k=1}^{P_d} h_k / (P_d S_n Σ_{k=1}^{P_n} h_k) − 1] × 100%,  (9)

where P is the total number of points, P_d the number of points labeled as damage, P_n the number of points within the damage neighborhood, h_k the depth value of point k, S_d the damage plane area and S_n the damage neighborhood plane area.

After the standard TIN envelope view is obtained, the damage can be described more clearly and its severity quantified. The damage is divided into six types and classified with the computed TIN indices; in addition, from the relation between damaged and undamaged volume, a standard damage volume (SDV) threshold is defined to separate damage classes. Five test groups of 50 images in total were randomly sampled. Among the non-penetrating damage, the 30% with the largest absolute RDV were labeled as dents or bulges and the rest as surface damage, and the boundary value of this sample classification was set as the SDV. With these criteria established, six types of metal surface damage are classified: dent, bulge, hole, surface damage, breakage and defect; Fig. 7 illustrates them.

Fig. 7. Schematic diagram of metal surface damage

First, damage is divided into two broad classes by whether it penetrates. Non-penetrating damage comprises dents, bulges and surface damage; penetrating damage comprises holes, breakage and defects. Among non-penetrating damage, dents and bulges use SDV values of opposite sign as criteria, and everything in between is classified as surface damage. Among penetrating damage, the damaged plane area is the reference: smaller areas are classified as holes, larger ones as breakage, and missing corners or internal loss at edges caused by corrosion, collision and the like are classified as defects. The criteria are summarized in Table 1.

Table 1. Damage classification
Damage type:            dent | bulge | hole | surface damage | breakage | defect
Penetrating:            no   | no    | yes  | no             | yes      | yes
|RDV| reaches SDV:      yes  | yes   | -    | no             | -        | -
S_d reaches threshold:  -    | -     | no   | -              | yes      | -

2.3 Analysis of experimental results

To verify the effectiveness of the improved graph convolution deep neural network for point cloud semantic segmentation, the model is tested with the TensorFlow
neural network framework. To verify the deep network's recognition accuracy for damage segmentation, point clouds of damaged metal part surfaces with damage features were collected and preprocessed. The point cloud data of several sample metal faces on a number of metal parts were screened: data with a damage share below 5% or above 60% were deleted, and the remainder was split and packed into a point cloud dataset. The damaged parts of the sample metal were labeled with the CloudCompare software into the six damage classes described above. The dataset for part damage follows ModelNet40part, a public dataset widely used in point cloud deep learning. The segmentation dataset contains many types of metal part damage, shown in 510 total point cloud images of rich variety, consisting of various damaged metal surfaces such as metal doors, metal skins, and the outer surfaces of mechanical components. ArcGIS tools split each full image by random points; following the ModelNet40part format, each independent point cloud unit contains 1024 points, so all full images were split into 510×128 unit clouds. The samples were divided into 400 training sets and 110 test sets, and cross-validation was adopted to ensure sufficient testing [20]. Several methods were evaluated; the experimental results are reassembled from the unit clouds at their original positions, carrying the segmentation labels applied to the split units. Figure 8 compares the segmentation results.

Fig. 8. Comparison of segmentation results

In the part damage segmentation experiments, different networks were compared with the proposed network (FAS-DGCNN: Feature Adaptive Shifting-Dynamic Graph Convolutional Neural Networks); apart from the segmentation network, all runs used the same experimental settings as the improved graph convolution method. The results are evaluated by the per-class damage IoU (Intersection over Union), the mean damage IoU (MIoU), the per-class accuracy and the overall accuracy (OA), as shown in Tables 2-4.

Comparing the Accuracy and IoU of the six damage classes leads to the following conclusions: relative to the baseline network PointNet++, the method gains roughly 10% and 20% in OA and MIoU on penetrating and non-penetrating damage respectively, and on the overall segmentation index the OA reaches 90.8%. For non-penetrating damage, supported by more points and containing more point cloud features, all the tested segmentation networks reach an overall performance of about 90%; PointNet, which lacks local feature recognition, performs comparatively poorly on penetrating damage and cannot discriminate it effectively.

Table 2. Per-class segmentation accuracy of damaged parts (%)
Method      | dent-1 | bulge-2 | hole-3 | surface-4 | breakage-5 | defect-6
PointNet    | 82.7   | 85.0    | 73.8   | 80.9      | 71.6       | 70.1
PointNet++  | 88.7   | 86.9    | 82.7   | 83.4      | 86.3       | 82.9
DGCNN       | 90.4   | 88.8    | 91.7   | 88.7      | 88.6       | 87.1
FAS-DGCNN   | 92.5   | 88.8    | 92.1   | 91.4      | 90.1       | 88.6

Table 3. Per-class segmentation IoU of damaged parts (%)
Method      | dent-1 | bulge-2 | hole-3 | surface-4 | breakage-5 | defect-6
PointNet    | 80.5   | 82.7    | 70.8   | 76.6      | 67.3       | 66.9
PointNet++  | 86.3   | 84.5    | 80.4   | 81.1      | 84.2       | 80.9
DGCNN       | 88.7   | 86.5    | 89.9   | 86.4      | 86.2       | 84.7
FAS-DGCNN   | 89.9   | 86.5    | 90.3   | 88.1      | 87.3       | 85.7

Table 4 compares the overall performance of damage segmentation. It can be seen that dynamic graph convolution features, together with effective neighborhood feature updating and multi-scale attention, give the segmentation network superior local-neighborhood segmentation ability, better matching the requirements of the surface damage segmentation task.

3 Conclusion

Exploiting the particular spatial structure of 3D point clouds, neighborhood points that would receive similar weights in a traditional k-neighborhood are distinguished by spatial scale, the spatial scale division is applied to the assignment of in-neighborhood weights, and a feature update module that down-weights and screens out in-neighborhood noise points is proposed. A dynamic graph convolution network adopting this module performs segmentation well. The feature-updating dynamic graph convolution network (FAS-DGCNN) effectively segments metal surface damage. Compared with other networks, the method shows higher reliability in point cloud semantic segmentation; with attention containing spatial scale region information and local point cloud feature updating, the proposed network plays a stronger role and, compared with segmentation networks lacking local feature extraction, handles sparse, feature-poor non-penetrating damage better.

References:
[1] LAWIN F J, DANELLJAN
M, TOSTEBERG P, et al. Deep Projective 3D Semantic Segmentation [C] // International Conference on Computer Analysis of Images and Patterns. Ystad, Sweden: Springer, 2017: 95-107.
[2] MATURANA D, SCHERER S. VoxNet: A 3D Convolutional Neural Network for Real-Time Object Recognition [C] // Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems. Hamburg, Germany: IEEE, 2015: 922-928.
[3] LE T, DUAN Y. PointGrid: A Deep Network for 3D Shape Understanding [C] // 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Salt Lake City, USA: IEEE, 2018: 9204-9214.
[4] QI C R, SU H, MO K, et al. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation [C] // IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Hawaii, USA: IEEE, 2017: 652-660.
[5] QI C R, SU H, MO K, et al. PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space [C] // Advances in Neural Information Processing Systems. California, USA: SpringerLink, 2017: 5099-5108.
[6] HU J, SHEN L, SUN G. Squeeze-and-Excitation Networks [C] // IEEE Conference on Computer Vision and Pattern Recognition. Vancouver, Canada: IEEE, 2018: 7132-7141.
[7] LI Y, BU R, SUN M, et al. PointCNN: Convolution on X-Transformed Points [C] // Advances in Neural Information Processing Systems. Montreal, Canada: NeurIPS, 2018: 820-830.
[8] PHAN A V, NGUYEN M L, NGUYEN Y L H, et al. DGCNN: A Convolutional Neural Network over Large-Scale Labeled Graphs [J]. Neural Networks, 2018, 108(10): 533-543.
[9] REN W J, GAO M Y, GAO M Z, et al. Research on Point Cloud Registration Method Based on Hybrid Algorithm [J]. Journal of Jilin University (Information Science Edition), 2019, 37(4): 408-416.
[10] ZHANG K, HAO M, WANG J, et al. Linked Dynamic Graph CNN: Learning on Point Cloud via Linking Hierarchical Features [EB/OL]. [2022-03-15]. https:∥/stamp/stamp.jsp?tp=&arnumber=9665104.
[11] LIN S D, FENG C, CHEN Z D, et al. An Efficient Segmentation Algorithm for Vehicle Body Surface Damage Detection [J]. Journal of Data Acquisition and Processing, 2021, 36(2): 260-269.
[12] ZHANG L P, ZHANG Y, CHEN Z Z, et al. Splitting and Merging Based Multi-Model Fitting for Point Cloud Segmentation [J]. Journal of Geodesy and Geoinformation Science, 2019, 2(2): 78-79.
[13] XING Z Z, ZHAO S F, GUO W, et al. Processing Laser Point Cloud in Fully Mechanized Mining Face Based on DGCNN [J]. ISPRS International Journal of Geo-Information, 2021, 10(7): 482.
[14] YANG J, DANG J S. Semantic Segmentation of 3D Point Cloud Based on Contextual Attention CNN [J]. Journal on Communications, 2020, 41(7): 195-203.
[15] CHEN L, WANG H Y, XIAO H H, et al. Estimation of External Phenotypic Parameters of Bunting Leaves Using FL-DGCNN Model [J]. Transactions of the Chinese Society of Agricultural Engineering, 2021, 37(13): 172-179.
[16] CHAI Y J, MA J, LIU H. Deep Graph Attention Convolution Network for Point Cloud Semantic Segmentation [J]. Laser and Optoelectronics Progress, 2021, 58(12): 35-60.
[17] ZHANG X D, FANG H. BTDGCNN: BallTree Dynamic Graph Convolution Neural Network for 3D Point Cloud Topology [J]. Journal of Chinese Computer Systems, 2021, 42(11): 32-40.
[18]
ZHANG J Y, ZHAO X L, CHEN Z. A Survey of Point Cloud Semantic Segmentation Based on Deep Learning [J]. Lasers and Photonics, 2020, 57(4): 28-46.
[19] SUN Y, ZHANG S H, WANG T Q, et al. An Improved Spatial Point Cloud Simplification Algorithm [J]. Neural Computing and Applications, 2021, 34(15): 12345-12359.
[20] GAO F S, ZHANG D L, LIANG X Z. A Region Growing Algorithm for Triangular Network Surface Generation from Point Cloud Data [J]. Journal of Jilin University (Science Edition), 2008, 46(3): 413-417.
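For reference, the two evaluation indices used in the experiments above, MIoU (Eq. (6)) and OA (Eq. (7)), can be computed from a confusion matrix; the following NumPy sketch uses toy numbers, not the paper's data:

```python
import numpy as np

def miou_oa(conf):
    """MIoU and OA from a confusion matrix conf[a, b] = number of points of
    true class a predicted as class b (Eqs. (6) and (7) in the text)."""
    conf = conf.astype(float)
    tp = np.diag(conf)                              # p_aa: correct points per class
    union = conf.sum(axis=1) + conf.sum(axis=0) - tp
    miou = np.mean(tp / union)                      # Eq. (6)
    oa = tp.sum() / conf.sum()                      # Eq. (7)
    return miou, oa

# toy 3-class confusion matrix
conf = np.array([[50, 2, 0],
                 [3, 40, 5],
                 [0, 4, 30]])
m, a = miou_oa(conf)
```

On this toy matrix the relation U ≤ A stated in the text holds, since every class's union is at least as large as its diagonal count.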
Spacecraft Recovery & Remote Sensing, Vol. 44, No. 6, December 2023, p. 130

A remote sensing image resolution enhancement model based on single-pixel imaging

CHEN Ruilin, ZHANG Bo, DUAN Xikai, SUN Mingjie* (School of Instrument Science and Optoelectronics Engineering, Beijing University of Aeronautics and Astronautics, Beijing 100191, China)

Abstract: One of the main routes to Earth remote sensing today is obtaining target information through a remote sensing camera, yet the camera's resolution directly affects imaging quality.
Combining this with the push-broom imaging technique of remote sensing cameras, this paper proposes a super-resolution enhancement model based on single-pixel imaging that simplifies the reconstruction process; its design goal is to enhance the image resolution of a space remote sensing camera by a factor of four using single-pixel super-resolution techniques. To verify the design idea and its reconstruction performance, a super-resolution enhancement simulation was set up. The final simulation results show that the single-pixel super-resolution model raises the image signal-to-noise ratio by a factor of 1.1, and the reconstructed images clearly suppress noise, providing a good denoising function; the method outperforms other traditional image resolution enhancement methods such as bicubic interpolation and very-deep super-resolution neural networks. It can provide strong support for image processing and applications in many fields, including geographic remote sensing, land resource survey and management, meteorological observation and forecasting, and real-time assessment of target damage.
Keywords: single-pixel super-resolution; resolution enhancement; push-broom imaging; noise reduction effect; remote sensing application
CLC number: TP751.2; document code: A; article ID: 1009-8518(2023)06-0130-10; DOI: 10.3969/j.issn.1009-8518.2023.06.012

Received: 2023-06-30. Funded by the National Natural Science Foundation of China (U21B2034).
Citation: CHEN Ruilin, ZHANG Bo, DUAN Xikai, et al. Remote Sensing Image Resolution Enhancement Technology Based on Single-Pixel Imaging [J]. Spacecraft Recovery & Remote Sensing, 2023, 44(6): 130-139. (in Chinese)

Remote Sensing Image Resolution Enhancement Technology Based on Single-Pixel Imaging
CHEN Ruilin, ZHANG Bo, DUAN Xikai, SUN Mingjie* (School of Instrument Science and Optoelectronics Engineering, Beijing University of Aeronautics and Astronautics, Beijing 100191, China)

Abstract: At present, one of the most important ways of earth remote sensing is to obtain target information through remote sensing cameras, but the resolution of remote sensing cameras directly affects the imaging quality. Combined with the pushbroom imaging technology of remote sensing camera, this paper proposes a super-resolution enhancement technology model based on single-pixel imaging, which can simplify the reconstruction process, and its design goal is to enhance the image resolution of aerospace remote sensing camera by 4 times based on single-pixel super-resolution technology. In order to verify the design idea and its reconstruction effect, the super-resolution enhancement simulation experiment is set up, and the final simulation results show that the single-pixel super-resolution model can improve the signal-to-noise ratio of the image by 1.1 times, and the reconstructed image has the obvious effect of suppressing noise, which plays a good noise reduction function, and has higher superiority than other
traditional image resolution enhancement methods (such as bicubic interpolation and ultra-deep super-resolution neural network). This method can provide strong support for image processing and application in many fields, such as geographic remote sensing detection, land resources exploration and management, meteorological observation and prediction, and real-time assessment of target damage.
Keywords: single-pixel super-resolution; resolution enhancement; push-broom imaging; noise reduction effect; remote sensing application

0 Introduction

The space remote sensing camera is one of the main routes to Earth remote sensing imaging; with its wide coverage, fast imaging speed and low risk, it plays a pivotal role in land resource management, weather forecasting, geographic surveying and mapping, and other fields.
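The single-pixel principle the paper builds on, one bucket measurement per structured illumination pattern followed by a transform-based reconstruction, can be sketched as follows. Hadamard patterns are a common choice in single-pixel imaging, but the paper's actual measurement scheme, pattern count, and super-resolution step are not reproduced here:

```python
import numpy as np

def hadamard(n):
    """Sylvester-construction Hadamard matrix of order n (n a power of two)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

# 8x8 scene, flattened to 64 pixels
rng = np.random.default_rng(3)
scene = rng.random((8, 8))

H = hadamard(64)               # one +/-1 illumination pattern per row
y = H @ scene.ravel()          # 64 single-pixel (bucket) measurements
recon = (H.T @ y) / 64         # exact inverse, since H @ H.T = 64 * I
recon = recon.reshape(8, 8)
```

Because the Hadamard rows are mutually orthogonal, a full set of 64 measurements inverts exactly; sub-sampling the patterns is where compressive reconstruction (and the paper's enhancement model) would come in.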
Methods of Remote Sensing Image Processing with ERDAS IMAGINE

Exercise 1: Spatial Enhancement

1. Convolution enhancement (Convolution)
Function: averages the whole image block by block, pixel-wise, using a coefficient matrix, changing the spatial frequency characteristics of the image.
Effect: the outlines and lines of ground objects become clearer.
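A minimal sketch of the kernel convolution this operation performs (a NumPy stand-in, not ERDAS code; the 3×3 sharpening kernel below is one common choice of coefficient matrix):

```python
import numpy as np

def convolve3x3(img, kernel):
    """Apply a 3x3 coefficient matrix at every pixel (edge-padded borders)."""
    p = np.pad(img.astype(float), 1, mode='edge')
    out = np.zeros(img.shape, dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out

# coefficients sum to 1, so flat areas are preserved while edges are boosted
sharpen = np.array([[ 0, -1,  0],
                    [-1,  5, -1],
                    [ 0, -1,  0]])
```

Kernels whose coefficients sum to 1 preserve the overall brightness; a high-pass kernel like the one above amplifies local differences, which is why outlines and lines come out crisper.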
2. Non-directional edge enhancement (Non-directional Edge)
Function: applies two very common filters (the Sobel and Prewitt filters); two orthogonal convolution operators (a horizontal operator and a vertical operator) first detect edges in the remote sensing image, and the two orthogonal results are then averaged.
Effect: pronounced and strong; adjacent regions of different character are clearly distinguished.

3. Focal analysis (Focal Analysis)
Function: using a method similar to convolution filtering, selects a window function and applies various transformations to the values of the input image file, computing the value of the window's center pixel from the pixels within the window to enhance the image.
Effect: dark areas become blurred; the imagery of bright ground objects is enhanced but loses some sharpness.
4. Texture analysis (Texture Analysis)
Function: makes the texture structure of the image clearer through analyses such as second-order variance.
Effect: texture edges are very clear.

5. Adaptive filtering (Adaptive Filter)
Function: applies an adaptive filter to contrast-stretch a region of interest in the image.
Effect: colors become lighter.
6. Resolution merge (Resolution Merge)
Function: fuses remote sensing images of different spatial resolutions so that the processed image has both good spatial resolution and multispectral characteristics, enhancing the image.
Effect: the processed image has both high resolution and multispectral (color) characteristics.

7. Crisp enhancement (Crisp Enhancement)
Function: convolution-filters the image so that the brightness of the whole scene is enhanced without changing its thematic content.
Effect: little visible difference; brightness is slightly enhanced.
Exercise 2: Radiometric Enhancement

1. Lookup-table stretch (LUT Stretch)
Function: changes the output image values by modifying the image lookup table.
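A hedged sketch of what such a lookup-table stretch does: a simple linear percentile stretch built as an explicit 256-entry table, so that every output value is read from the modified table. ERDAS's actual LUT options differ; the percentile choices here are illustrative:

```python
import numpy as np

def lut_stretch(img, low_pct=2, high_pct=98):
    """Linear contrast stretch of an 8-bit image via an explicit 256-entry LUT."""
    lo, hi = np.percentile(img, [low_pct, high_pct])
    levels = np.arange(256, dtype=float)
    lut = np.clip((levels - lo) / max(hi - lo, 1e-9) * 255.0, 0, 255)
    lut = lut.astype(np.uint8)
    return lut[img]            # every output value comes from the lookup table

# toy 8-bit image using only the middle of the brightness range
img = np.tile(np.arange(64, 192, 2, dtype=np.uint8), (4, 1))
out = lut_stretch(img)
```

Because the mapping lives entirely in the 256-entry table, changing the stretch only means rewriting the table; the image itself is just re-indexed through it, which is what makes LUT operations cheap.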