MULTI-SCALE STRUCTURAL SIMILARITY FOR IMAGE QUALITY ASSESSMENT

Zhou Wang1, Eero P. Simoncelli1 and Alan C. Bovik2 (Invited Paper)

1 Center for Neural Sci. and Courant Inst. of Math. Sci., New York Univ., New York, NY 10003
2 Dept. of Electrical and Computer Engineering, Univ. of Texas at Austin, Austin, TX 78712
Email: zhouwang@, eero.simoncelli@, bovik@

ABSTRACT

The structural similarity image quality paradigm is based on the assumption that the human visual system is highly adapted for extracting structural information from the scene, and therefore a measure of structural similarity can provide a good approximation to perceived image quality. This paper proposes a multi-scale structural similarity method, which supplies more flexibility than previous single-scale methods in incorporating the variations of viewing conditions. We develop an image synthesis method to calibrate the parameters that define the relative importance of different scales. Experimental comparisons demonstrate the effectiveness of the proposed method.

1. INTRODUCTION

Objective image quality assessment research aims to design quality measures that can automatically predict perceived image quality. These quality measures play important roles in a broad range of applications such as image acquisition, compression, communication, restoration, enhancement, analysis, display, printing and watermarking. The most widely used full-reference image quality and distortion assessment algorithms are peak signal-to-noise ratio (PSNR) and mean squared error (MSE), which do not correlate well with perceived quality (e.g., [1]–[6]).

Traditional perceptual image quality assessment methods are based on a bottom-up approach which attempts to simulate the functionality of the relevant early human visual system (HVS) components. These methods usually involve 1) a preprocessing process that may include image alignment, point-wise nonlinear transform, low-pass filtering that simulates eye optics, and color space transformation, 2) a channel decomposition process that transforms the image signals into different spatial frequency as well as orientation selective subbands, 3) an error normalization process that weights the error signal in each subband by incorporating the variation of visual sensitivity in different subbands, and the variation of visual error sensitivity caused by intra- or inter-channel neighboring transform coefficients, and 4) an error pooling process that combines the error signals in different subbands into a single quality/distortion value. While these bottom-up approaches can conveniently make use of many known psychophysical features of the HVS, it is important to recognize their limitations. In particular, the HVS is a complex and highly non-linear system and the complexity of natural images is also very significant, but most models of early vision are based on linear or quasi-linear operators that have been characterized using restricted and simplistic stimuli. Thus, these approaches must rely on a number of strong assumptions and generalizations [4], [5]. Furthermore, as the number of HVS features has increased, the resulting quality assessment systems have become too complicated to work with in real-world applications, especially for algorithm optimization purposes.

Structural similarity provides an alternative and complementary approach to the problem of image quality assessment [3]–[6]. It is based on a top-down assumption that the HVS is highly adapted for extracting structural information from the scene, and therefore a measure of structural similarity should be a good approximation of perceived image quality.
It has been shown that a simple implementation of this methodology, namely the structural similarity (SSIM) index [5], can outperform state-of-the-art perceptual image quality metrics. However, the SSIM index algorithm introduced in [5] is a single-scale approach. We consider this a drawback of the method because the right scale depends on viewing conditions (e.g., display resolution and viewing distance). In this paper, we propose a multi-scale structural similarity method and introduce a novel image synthesis-based approach to calibrate the parameters that weight the relative importance between different scales.

2. SINGLE-SCALE STRUCTURAL SIMILARITY

Let x = {x_i | i = 1, 2, ..., N} and y = {y_i | i = 1, 2, ..., N} be two discrete non-negative signals that have been aligned with each other (e.g., two image patches extracted from the same spatial location of two images being compared), and let μ_x, σ_x² and σ_xy be the mean of x, the variance of x, and the covariance of x and y, respectively. Approximately, μ_x and σ_x can be viewed as estimates of the luminance and contrast of x, and σ_xy measures the tendency of x and y to vary together, and is thus an indication of structural similarity. In [5], the luminance, contrast and structure comparison measures were given as follows:

$$ l(\mathbf{x},\mathbf{y}) = \frac{2\mu_x\mu_y + C_1}{\mu_x^2 + \mu_y^2 + C_1}, \qquad (1) $$
$$ c(\mathbf{x},\mathbf{y}) = \frac{2\sigma_x\sigma_y + C_2}{\sigma_x^2 + \sigma_y^2 + C_2}, \qquad (2) $$
$$ s(\mathbf{x},\mathbf{y}) = \frac{\sigma_{xy} + C_3}{\sigma_x\sigma_y + C_3}, \qquad (3) $$

where C_1, C_2 and C_3 are small constants given by

$$ C_1 = (K_1 L)^2, \quad C_2 = (K_2 L)^2 \quad \text{and} \quad C_3 = C_2/2, \qquad (4) $$

respectively. L is the dynamic range of the pixel values (L = 255 for 8 bits/pixel gray scale images), and K_1 ≪ 1 and K_2 ≪ 1 are two scalar constants.

Fig. 1. Multi-scale structural similarity measurement system. L: low-pass filtering; 2↓: downsampling by 2.

The general form of the Structural SIMilarity (SSIM) index between signals x and y is defined as:

$$ \mathrm{SSIM}(\mathbf{x},\mathbf{y}) = [l(\mathbf{x},\mathbf{y})]^{\alpha}\,[c(\mathbf{x},\mathbf{y})]^{\beta}\,[s(\mathbf{x},\mathbf{y})]^{\gamma}, \qquad (5) $$

where α, β and γ are parameters that define the relative importance of the three components. Specifically, we set α = β = γ = 1, and the resulting SSIM index is given by

$$ \mathrm{SSIM}(\mathbf{x},\mathbf{y}) = \frac{(2\mu_x\mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}, \qquad (6) $$

which satisfies the following conditions:
1. symmetry: SSIM(x, y) = SSIM(y, x);
2. boundedness: SSIM(x, y) ≤ 1;
3. unique maximum: SSIM(x, y) = 1 if and only if x = y.

The universal image quality index proposed in [3] corresponds to the case of C_1 = C_2 = 0, and is therefore a special case of (6). The drawback of such a parameter setting is that when the denominator of Eq. (6) is close to 0, the resulting measurement becomes unstable. This problem has been solved successfully in [5] by adding the two small constants C_1 and C_2 (calculated by setting K_1 = 0.01 and K_2 = 0.03, respectively, in Eq. (4)).

We apply the SSIM indexing algorithm for image quality assessment using a sliding window approach. The window moves pixel-by-pixel across the whole image space. At each step, the SSIM index is calculated within the local window. If one of the images being compared is considered to have perfect quality, then the resulting SSIM index map can be viewed as the quality map of the other (distorted) image. Instead of using an 8×8 square window as in [3], a smooth windowing approach is used for local statistics to avoid "blocking artifacts" in the quality map [5]. Finally, the mean SSIM index of the quality map is used to evaluate the overall image quality.
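For reference, a minimal sketch of Eq. (6) with Gaussian-weighted local statistics is given below. It assumes NumPy/SciPy and uses a full-image Gaussian filter as the smooth window; the published SSIM implementation in [5] uses an 11×11 circular-symmetric Gaussian window with σ = 1.5, so the windowing details here are illustrative rather than a reproduction of the authors' code.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def ssim_index(x, y, K1=0.01, K2=0.03, L=255, sigma=1.5):
    """Mean SSIM of Eq. (6), using Gaussian-weighted local statistics."""
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    C1, C2 = (K1 * L) ** 2, (K2 * L) ** 2
    mu_x = gaussian_filter(x, sigma)
    mu_y = gaussian_filter(y, sigma)
    var_x = gaussian_filter(x * x, sigma) - mu_x ** 2
    var_y = gaussian_filter(y * y, sigma) - mu_y ** 2
    cov_xy = gaussian_filter(x * y, sigma) - mu_x * mu_y
    ssim_map = ((2 * mu_x * mu_y + C1) * (2 * cov_xy + C2)) / \
               ((mu_x ** 2 + mu_y ** 2 + C1) * (var_x + var_y + C2))
    return ssim_map.mean(), ssim_map
```

The per-pixel `ssim_map` corresponds to the quality map described above, and its mean gives the overall score.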
3. MULTI-SCALE STRUCTURAL SIMILARITY

3.1. Multi-scale SSIM index

The perceivability of image details depends on the sampling density of the image signal, the distance from the image plane to the observer, and the perceptual capability of the observer's visual system. In practice, the subjective evaluation of a given image varies when these factors vary. A single-scale method as described in the previous section may therefore be appropriate only for specific settings. A multi-scale method is a convenient way to incorporate image details at different resolutions.

We propose a multi-scale SSIM method for image quality assessment whose system diagram is illustrated in Fig. 1. Taking the reference and distorted image signals as the input, the system iteratively applies a low-pass filter and downsamples the filtered image by a factor of 2. We index the original image as Scale 1, and the highest scale as Scale M, which is obtained after M−1 iterations. At the j-th scale, the contrast comparison (2) and the structure comparison (3) are calculated and denoted as c_j(x, y) and s_j(x, y), respectively. The luminance comparison (1) is computed only at Scale M and is denoted as l_M(x, y). The overall SSIM evaluation is obtained by combining the measurements at different scales using

$$ \mathrm{SSIM}(\mathbf{x},\mathbf{y}) = [l_M(\mathbf{x},\mathbf{y})]^{\alpha_M}\cdot\prod_{j=1}^{M}[c_j(\mathbf{x},\mathbf{y})]^{\beta_j}\,[s_j(\mathbf{x},\mathbf{y})]^{\gamma_j}. \qquad (7) $$

Similar to (5), the exponents α_M, β_j and γ_j are used to adjust the relative importance of the different components. This multi-scale SSIM index definition satisfies the three conditions given in the last section. It also includes the single-scale method as a special case. In particular, a single-scale implementation for Scale M applies the iterative filtering and downsampling procedure up to Scale M, and only the exponents α_M, β_M and γ_M are given non-zero values.

To simplify parameter selection, we let α_j = β_j = γ_j for all j. In addition, we normalize the cross-scale settings such that \sum_{j=1}^{M} γ_j = 1. This makes different parameter settings (including all single-scale and multi-scale settings) comparable. The remaining job is to determine the relative values across the different scales. Conceptually, this should be related to the contrast sensitivity function (CSF) of the HVS [7], which states that human visual sensitivity peaks at middle frequencies (around 4 cycles per degree of visual angle) and decreases along both the high- and low-frequency directions. However, the CSF cannot be directly used to derive the parameters in our system because it is typically measured at the visibility threshold level using simplified stimuli (sinusoids), whereas our purpose is to compare the quality of complex structured images at visible distortion levels.

3.2. Cross-scale calibration

We use an image synthesis approach to calibrate the relative importance of different scales. In previous work, the idea of synthesizing images for subjective testing has been employed by the "synthesis-by-analysis" methods of assessing statistical texture models, in which the model is used to generate a texture with statistics matching an original texture, and a human subject then judges the similarity of the two textures [8]–[11]. A similar approach has also been used qualitatively in demonstrating quality metrics in [5], [12], though quantitative subjective tests were not conducted. These synthesis methods provide a powerful and efficient means of testing a model, and have the added benefit that the resulting images suggest improvements that might be made to the model [11].

Fig. 2. Demonstration of the image synthesis approach for cross-scale calibration. Images in the same row have the same MSE. Images in the same column have distortions only in one specific scale. Each subject was asked to select a set of images (one from each scale) having equal quality. As an example, one subject chose the marked images.
For a given original 8 bits/pixel gray scale test image, we synthesize a table of distorted images (as exemplified by Fig. 2), where each entry in the table is an image that is associated with a specific distortion level (defined by MSE) and a specific scale. Each of the distorted images is created using an iterative procedure, where the initial image is generated by randomly adding white Gaussian noise to the original image, and the iterative process employs a constrained gradient descent algorithm to search for the worst images in terms of the SSIM measure while constraining the MSE to be fixed and restricting the distortions to occur only in the specified scale. We use 5 scales and 12 distortion levels (ranging from 2^3 to 2^14) in our experiment, resulting in a total of 60 images, as demonstrated in Fig. 2. Although the images in each row have the same MSE with respect to the original image, their visual quality is significantly different. Thus the distortions at different scales are of very different importance in terms of perceived image quality. We employ 10 original 64×64 images with different types of content (human faces, natural scenes, plants, man-made objects, etc.) in our experiment to create 10 sets of distorted images (a total of 600 distorted images).

We gathered data for 8 subjects, including one of the authors. The other subjects have general knowledge of human vision but did not know the detailed purpose of the study. Each subject was shown the 10 sets of test images, one set at a time. The viewing distance was fixed to 32 pixels per degree of visual angle. The subject was asked to compare the quality of the images across scales and to select one image from each of the five scales (shown as columns in Fig. 2) that the subject believes to have the same quality. For example, one subject chose the images marked in Fig. 2 to have equal quality. The positions of the selected images in each scale were recorded and averaged over all test images and all subjects. In general, the subjects agreed with each other on each image more than they agreed with themselves across different images. These test results were normalized (to sum to one) and used to calculate the exponents in Eq. (7). The resulting parameters we obtained are β_1 = γ_1 = 0.0448, β_2 = γ_2 = 0.2856, β_3 = γ_3 = 0.3001, β_4 = γ_4 = 0.2363, and α_5 = β_5 = γ_5 = 0.1333, respectively.
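A minimal sketch of Eq. (7) with the calibrated weights above is given below (NumPy/SciPy, reusing the same Gaussian local statistics as the single-scale sketch). The choice of low-pass filter, the downsampling interpolation, and the pooling order (averaging each c_j·s_j map before exponentiation) are illustrative assumptions; the authors' implementation may differ in these details.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def _local_stats(x, y, sigma=1.5):
    mu_x, mu_y = gaussian_filter(x, sigma), gaussian_filter(y, sigma)
    var_x = gaussian_filter(x * x, sigma) - mu_x ** 2
    var_y = gaussian_filter(y * y, sigma) - mu_y ** 2
    cov = gaussian_filter(x * y, sigma) - mu_x * mu_y
    return mu_x, mu_y, var_x, var_y, cov

def ms_ssim(x, y, weights=(0.0448, 0.2856, 0.3001, 0.2363, 0.1333),
            K1=0.01, K2=0.03, L=255):
    """Multi-scale SSIM of Eq. (7); `weights` are the calibrated beta_j = gamma_j."""
    x, y = x.astype(np.float64), y.astype(np.float64)
    C1, C2 = (K1 * L) ** 2, (K2 * L) ** 2
    C3 = C2 / 2
    M = len(weights)
    score = 1.0
    for j, w in enumerate(weights, start=1):
        mu_x, mu_y, var_x, var_y, cov = _local_stats(x, y)
        sd_x = np.sqrt(np.maximum(var_x, 0))
        sd_y = np.sqrt(np.maximum(var_y, 0))
        c = (2 * sd_x * sd_y + C2) / (var_x + var_y + C2)   # Eq. (2)
        s = (cov + C3) / (sd_x * sd_y + C3)                 # Eq. (3)
        score *= np.mean(c * s) ** w    # assumes the pooled c*s term is positive
        if j == M:
            # luminance term of Eq. (1), applied only at the coarsest scale
            l = (2 * mu_x * mu_y + C1) / (mu_x ** 2 + mu_y ** 2 + C1)
            score *= np.mean(l) ** w
        else:
            # low-pass filter and downsample by 2 before moving to the next scale
            x = zoom(gaussian_filter(x, 1.0), 0.5, order=1)
            y = zoom(gaussian_filter(y, 1.0), 0.5, order=1)
    return score
```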
4. TEST RESULTS

We test a number of image quality assessment algorithms using the LIVE database (available at [13]), which includes 344 JPEG and JPEG2000 compressed images (typically of size 768×512 or similar). The bit rate ranges from 0.028 to 3.150 bits/pixel, which allows the test images to cover a wide quality range, from indistinguishable from the original image to highly distorted. The mean opinion score (MOS) of each image is obtained by averaging 13∼25 subjective scores given by a group of human observers. Eight image quality assessment models are compared, including PSNR, the Sarnoff model (JNDmetrix 8.0 [14]), the single-scale SSIM index with M equal to 1 through 5, and the proposed multi-scale SSIM index approach.

The scatter plots of MOS versus model predictions are shown in Fig. 3, where each point represents one test image, with its vertical and horizontal axes representing its MOS and the given objective quality score, respectively. To provide a quantitative performance evaluation, we use the logistic function adopted in the video quality experts group (VQEG) Phase I FR-TV test [15] to provide a non-linear mapping between the objective and subjective scores. After the non-linear mapping, the linear correlation coefficient (CC), the mean absolute error (MAE), and the root mean squared error (RMS) between the subjective and objective scores are calculated as measures of prediction accuracy. The prediction consistency is quantified using the outlier ratio (OR), which is defined as the percentage of predictions outside the range of ±2 times the standard deviation. Finally, the prediction monotonicity is measured using the Spearman rank-order correlation coefficient (ROCC). Readers can refer to [15] for more detailed descriptions of these measures. The evaluation results for all the models being compared are given in Table 1.

Table 1. Performance comparison of image quality assessment models on the LIVE JPEG/JPEG2000 database [13]. SS-SSIM: single-scale SSIM; MS-SSIM: multi-scale SSIM; CC: non-linear regression correlation coefficient; ROCC: Spearman rank-order correlation coefficient; MAE: mean absolute error; RMS: root mean squared error; OR: outlier ratio.

Model            CC     ROCC   MAE    RMS    OR (%)
PSNR             0.905  0.901  6.53   8.45   15.7
Sarnoff          0.956  0.947  4.66   5.81   3.20
SS-SSIM (M=1)    0.949  0.945  4.96   6.25   6.98
SS-SSIM (M=2)    0.963  0.959  4.21   5.38   2.62
SS-SSIM (M=3)    0.958  0.956  4.53   5.67   2.91
SS-SSIM (M=4)    0.948  0.946  4.99   6.31   5.81
SS-SSIM (M=5)    0.938  0.936  5.55   6.88   7.85
MS-SSIM          0.969  0.966  3.86   4.91   1.16

From both the scatter plots and the quantitative evaluation results, we see that the performance of the single-scale SSIM model varies with scale, and the best performance is given by the case of M = 2. It can also be observed that the single-scale model tends to supply higher scores as the scale increases. This is not surprising because image coding techniques such as JPEG and JPEG2000 usually compress fine-scale details to a much higher degree than coarse-scale structures, and thus the distorted image "looks" more similar to the original image if evaluated at larger scales. Finally, for every one of the objective evaluation criteria, the multi-scale SSIM model outperforms all the other models, including the best single-scale SSIM model, suggesting a meaningful balance between scales.

5. DISCUSSIONS

We propose a multi-scale structural similarity approach for image quality assessment, which provides more flexibility than the single-scale approach in incorporating the variations of image resolution and viewing conditions. Experiments show that with appropriate parameter settings, the multi-scale method outperforms the best single-scale SSIM model as well as state-of-the-art image quality metrics.

In the development of top-down image quality models (such as structural similarity based algorithms), one of the most challenging problems is to calibrate the model parameters, which are rather "abstract" and cannot be directly derived from simple-stimulus subjective experiments as in the bottom-up models. In this paper, we used an image synthesis approach to calibrate the parameters that define the relative importance between scales. The improvement from single-scale to multi-scale methods observed in our tests suggests the usefulness of this novel approach. However, this approach is still rather crude. We are working on developing it into a more systematic approach that can potentially be employed in a much broader range of applications.

6. REFERENCES

[1] A. M. Eskicioglu and P. S. Fisher, "Image quality measures and their performance," IEEE Trans. Communications, vol. 43, pp. 2959–2965, Dec. 1995.
[2] T. N. Pappas and R. J. Safranek, "Perceptual criteria for image quality evaluation," in Handbook of Image and Video Proc. (A. Bovik, ed.), Academic Press, 2000.
[3] Z. Wang and A. C. Bovik, "A universal image quality index," IEEE Signal Processing Letters, vol. 9, pp. 81–84, Mar. 2002.
[4] Z. Wang, H. R. Sheikh, and A. C. Bovik, "Objective video quality assessment," in The Handbook of Video Databases: Design and Applications (B. Furht and O. Marques, eds.), pp. 1041–1078, CRC Press, Sept. 2003.
[5] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: From error measurement to structural similarity," IEEE Trans. Image Processing, vol. 13, Jan. 2004.
[6] Z. Wang, L. Lu, and A. C. Bovik, "Video quality assessment based on structural distortion measurement," Signal Processing: Image Communication, special issue on objective video quality metrics, vol. 19, Jan. 2004.
[7] B. A. Wandell, Foundations of Vision. Sinauer Associates, Inc., 1995.
[8] O. D. Faugeras and W. K. Pratt, "Decorrelation methods of texture feature extraction," IEEE Trans. Pattern Anal. Mach. Intell., vol. 2, no. 4, pp. 323–332, 1980.
[9] A. Gagalowicz, "A new method for texture fields synthesis: Some applications to the study of human vision," IEEE Trans. Pattern Anal. Mach. Intell., vol. 3, no. 5, pp. 520–533, 1981.
[10] D. Heeger and J. Bergen, "Pyramid-based texture analysis/synthesis," in Proc. ACM SIGGRAPH, pp. 229–238, Association for Computing Machinery, August 1995.
[11] J. Portilla and E. P. Simoncelli, "A parametric texture model based on joint statistics of complex wavelet coefficients," Int'l J. Computer Vision, vol. 40, pp. 49–71, Dec. 2000.
[12] P. C. Teo and D. J. Heeger, "Perceptual image distortion," in Proc. SPIE, vol. 2179, pp. 127–141, 1994.
[13] H. R. Sheikh, Z. Wang, A. C. Bovik, and L. K. Cormack, "Image and video quality assessment research at LIVE," /research/quality/.
[14] Sarnoff Corporation, "JNDmetrix Technology," http:///products_services/video_vision/jndmetrix/.
[15] VQEG, "Final report from the video quality experts group on the validation of objective models of video quality assessment," Mar. 2000.

Fig. 3. Scatter plots of MOS versus model predictions. Each sample point represents one test image in the LIVE JPEG/JPEG2000 image database [13]. (a) PSNR; (b) Sarnoff model; (c)-(g) single-scale SSIM method for M = 1, 2, 3, 4 and 5, respectively; (h) multi-scale SSIM method.
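The evaluation protocol of Section 4 — a non-linear mapping of objective scores to MOS, followed by CC, ROCC, MAE, RMS and OR — can be sketched as below. The 3-parameter logistic and the use of the residual standard deviation in the outlier ratio are simplifying assumptions for illustration; the VQEG Phase I report [15] specifies the exact fitting function and outlier definition.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import pearsonr, spearmanr

def logistic(q, b1, b2, b3):
    # assumed 3-parameter logistic; VQEG Phase I uses a related form
    return b1 / (1.0 + np.exp(-b2 * (q - b3)))

def evaluate(objective, mos):
    """Fit objective scores to MOS, then report CC, ROCC, MAE, RMS and OR."""
    p0 = [mos.max(), 1.0, np.median(objective)]
    params, _ = curve_fit(logistic, objective, mos, p0=p0, maxfev=10000)
    pred = logistic(objective, *params)
    resid = mos - pred
    cc = pearsonr(pred, mos)[0]
    rocc = spearmanr(objective, mos)[0]
    mae = np.mean(np.abs(resid))
    rms = np.sqrt(np.mean(resid ** 2))
    # outlier ratio: here approximated with the residual standard deviation,
    # since per-image MOS standard deviations are not available in this sketch
    outlier_ratio = np.mean(np.abs(resid) > 2 * np.std(resid)) * 100
    return dict(CC=cc, ROCC=rocc, MAE=mae, RMS=rms, OR=outlier_ratio)
```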
Scoliosis is a three-dimensional structural deformity of the spine that affects 1%–4% of adolescents worldwide [1]. Its diagnosis relies mainly on the patient's scoliosis angle, and X-ray imaging is currently the first choice for diagnosing scoliosis; segmenting the spine in X-ray images is the basis for subsequent measurement, registration and three-dimensional reconstruction. A number of spine X-ray image segmentation methods have appeared recently. Anitha et al. [2-3] proposed methods that automatically extract vertebral endplates with custom filters and automatically obtain contours with morphological operators, but these methods suffer from a degree of inter-observer error. Sardjono et al. [4] proposed a physics-based method using a charged-particle model to extract the spine contour, which is complex to implement and of limited practicality. Ye et al. [5] proposed a segmentation algorithm based on fuzzy C-means clustering, whose procedure is cumbersome and whose practicality is limited. All of these methods segment only the vertebral bodies and cannot produce a segmentation of the overall spine contour.

Deep learning has many applications in image segmentation. Long et al. proposed the Fully Convolutional Network (FCN) [6], which replaces the last fully connected layer of a convolutional neural network with a convolutional layer and applies deconvolution to the resulting feature maps to obtain pixel-level classification results. Improving on the FCN structure, Ronneberger et al. proposed U-Net [7], an encoder-decoder network for image segmentation. Wu et al. proposed BoostNet [8] for object detection in spine X-ray images, and a multi-view correlation network [9] to localize the spinal framework. These methods do not segment the spine image directly; they only extract key-point features and derive the overall spine contour from the localized features. Fang et al. [10] used an FCN to segment spinal CT slices and perform three-dimensional reconstruction, but the segmentation accuracy is relatively low. Horng et al. [11] cut the spine X-ray image into patches, segmented individual vertebrae with a residual U-Net, and then reassembled the complete spine image, which makes the segmentation process overly cumbersome. Tan et al. [12] and Grigorieva et al. [13] used U-Net to segment spine X-ray images and then measured the Cobb angle or performed three-dimensional reconstruction, but the segmentation accuracy is limited.

Although the above methods accomplish spine segmentation to some extent, two problems remain: (1) they only address vertebra localization and scoliosis angle computation, without performing a complete spine segmentation of the image.
An Anisotropic Diffusion Denoising Method Based on Image Features

KE Dan-dan, CAI Guang-cheng*, CAO Qian-qian (Faculty of Science, Kunming University of Science and Technology, Kunming, Yunnan 650500, China)

Abstract: Among image denoising filters, the J. Weickert model does not consider the distinction between smooth regions and other image features: diffusion in smooth regions also follows the eigenvalues of the local structure, which inevitably produces false edges there. An improved anisotropic diffusion method is therefore proposed. The method first uses a Wiener filter to weaken the influence of noise on the image; coherence is then used to correctly identify image features such as edge regions, smooth regions and T-shaped corners, and the eigenvalues of the diffusion tensor in the corresponding regions are set according to these features. Experimental results show that the improved method achieves good results in removing noise and protecting edges, effectively eliminates false edges in smooth regions, and yields a higher peak signal-to-noise ratio.

Key words: image feature; anisotropic diffusion; coherence; diffusion tensor; eigenvalue

0 Introduction

Because of internal and external interference during imaging, noise in images is unavoidable; to a large extent it obscures the true details of the image and degrades image quality.
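A minimal sketch of the feature-detection step described in the abstract — Wiener pre-filtering followed by a coherence map computed from the smoothed structure tensor — is shown below. The filter sizes and the coherence formula ((λ1−λ2)/(λ1+λ2))² are common choices rather than this paper's exact parameters, and the region-dependent eigenvalue assignment of the diffusion tensor is not reproduced.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel
from scipy.signal import wiener

def coherence_map(img, sigma_grad=1.0, sigma_int=2.0):
    """Coherence of the smoothed structure tensor of a 2-D image."""
    img = wiener(img.astype(np.float64), mysize=3)   # Wiener pre-filtering step
    smoothed = gaussian_filter(img, sigma_grad)
    Ix = sobel(smoothed, axis=1)
    Iy = sobel(smoothed, axis=0)
    # structure tensor entries, integrated over a Gaussian neighbourhood
    Jxx = gaussian_filter(Ix * Ix, sigma_int)
    Jxy = gaussian_filter(Ix * Iy, sigma_int)
    Jyy = gaussian_filter(Iy * Iy, sigma_int)
    # eigenvalues of the 2x2 symmetric tensor at every pixel
    root = np.sqrt((Jxx - Jyy) ** 2 + 4.0 * Jxy ** 2)
    lam1 = 0.5 * (Jxx + Jyy + root)
    lam2 = 0.5 * (Jxx + Jyy - root)
    # coherence is near 0 in smooth/isotropic regions and near 1 along edges
    return ((lam1 - lam2) / (lam1 + lam2 + 1e-12)) ** 2
```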
Chinese Optics, Vol. 17, No. 1, January 2024

Article ID: 2097-1842(2024)01-0160-07

Non-uniform illumination correction algorithm for cytoendoscopy images based on an illumination model

ZOU Hong-bo1, ZHANG Biao1, WANG Zi-chuan1, CHEN Ke2, WANG Li-qiang2, YUAN Bo1*
(1. College of Optical Science and Engineering, Zhejiang University, Hangzhou 310027, China; 2. Research Center for Humanoid Sensing, Zhejiang Lab, Hangzhou 311100, China)
*Corresponding author, E-mail: **************.cn

Abstract: Cytoendoscopy requires continuous magnified imaging up to a maximum magnification of about 500×. Because of optical-fiber illumination and stray light, its images suffer from non-uniform illumination, and the illumination distribution changes with the magnification, which affects the physician's observation and judgement of lesions. This paper therefore proposes an image non-uniform illumination correction algorithm based on an illumination model of the cytoendoscope. Building on the principle that image information is composed of an illumination component and a reflectance component, the algorithm learns the illumination component of the image with a convolutional neural network and performs non-uniform illumination correction with a two-dimensional Gamma function. Experiments show that, after correction by the proposed method, the average gradient of the illumination component and the discrete entropy of the image are 0.22 and 7.89 respectively, outperforming traditional methods such as adaptive histogram equalization, homomorphic filtering and single-scale Retinex, as well as the deep-learning-based WSI-FCN algorithm.

Key words: cytoendoscopy; non-uniform illumination; illumination model; convolutional neural network

CLC number: TN29; TP391.4   Document code: A   doi: 10.37188/CO.2023-0059

Received: 2023-04-04; Revised: 2023-05-15
Supported by the National Key Research and Development Program of China (No. 2021YFC2400103); Key Research Project of Zhejiang Lab (No. 2019MC0AD02, No. 2022MG0AL01)

1 Introduction

The cytoendoscope is an endoscope with an ultra-high magnification rate [1-4] that enables continuous magnified observation from conventional magnification up to the cellular level.
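The correction step described in the abstract can be illustrated with a commonly used adaptive two-dimensional Gamma formulation, shown below. The exponent rule (a base of 0.5 driven by the deviation of the illumination map from its mean) and the assumption that the illumination map comes from the CNN mentioned above are illustrative; the paper's exact parameterization may differ.

```python
import numpy as np

def gamma_correct(img, illumination):
    """Adaptive 2-D Gamma correction driven by a per-pixel illumination map.

    `img` and `illumination` are float arrays in [0, 255]; the illumination
    map is assumed to come from the learned illumination model (not shown).
    """
    m = illumination.mean()
    # per-pixel exponent: brighten where illumination is below the mean,
    # darken where it is above (a common 2-D Gamma formulation)
    gamma = np.power(0.5, (m - illumination) / m)
    corrected = 255.0 * np.power(np.clip(img, 0, 255) / 255.0, gamma)
    return corrected.astype(np.uint8)
```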
Anisotropic diffusion
From Wikipedia, the free encyclopedia

In image processing and computer vision, anisotropic diffusion, also called Perona–Malik diffusion, is a technique aiming at reducing image noise without removing significant parts of the image content, typically edges, lines or other details that are important for the interpretation of the image.[1][2][3] Anisotropic diffusion resembles the process that creates a scale-space, where an image generates a parameterized family of successively more and more blurred images based on a diffusion process. Each of the resulting images in this family is given as a convolution between the image and a 2D isotropic Gaussian filter, where the width of the filter increases with the parameter. This diffusion process is a linear and space-invariant transformation of the original image. Anisotropic diffusion is a generalization of this diffusion process: it produces a family of parameterized images, but each resulting image is a combination between the original image and a filter that depends on the local content of the original image. As a consequence, anisotropic diffusion is a non-linear and space-variant transformation of the original image.

In its original formulation, presented by Perona and Malik in 1987[1], the space-variant filter is in fact isotropic but depends on the image content such that it approximates an impulse function close to edges and other structures that should be preserved in the image over the different levels of the resulting scale-space. This formulation was referred to as anisotropic diffusion by Perona and Malik even though the locally adapted filter is isotropic, but it has also been referred to as inhomogeneous and nonlinear diffusion[4] or Perona-Malik diffusion[5] by other authors. A more general formulation allows the locally adapted filter to be truly anisotropic close to linear structures such as edges or lines: it has an orientation given by the structure such that it is elongated along the structure and narrow across. Such methods are referred to as shape-adapted smoothing[6][7] or coherence enhancing diffusion[8]. As a consequence, the resulting images preserve linear structures while at the same time smoothing is made along these structures. Both these cases can be described by a generalization of the usual diffusion equation where the diffusion coefficient, instead of being a constant scalar, is a function of image position and assumes a matrix (or tensor) value (see structure tensor).

Although the resulting family of images can be described as a combination between the original image and space-variant filters, the locally adapted filter and its combination with the image do not have to be realized in practice. Anisotropic diffusion is normally implemented by means of an approximation of the generalized diffusion equation: each new image in the family is computed by applying this equation to the previous image. Consequently, anisotropic diffusion is an iterative process where a relatively simple set of computations is used to compute each successive image in the family, and this process is continued until a sufficient degree of smoothing is obtained.

Contents
1 Formal definition
2 Motivation
3 Applications
4 See also
5 External links
6 References

Formal definition

Formally, let $\Omega \subset \mathbb{R}^2$ denote a subset of the plane and $I(\cdot, t) : \Omega \to \mathbb{R}$ be a family of gray scale images; then anisotropic diffusion is defined as

$$ \frac{\partial I}{\partial t} = \operatorname{div}\bigl(c(x, y, t)\,\nabla I\bigr) = \nabla c \cdot \nabla I + c(x, y, t)\,\Delta I, $$

where $\Delta$ denotes the Laplacian, $\nabla$ denotes the gradient, $\operatorname{div}(\cdot)$ is the divergence operator and $c(x, y, t)$ is the diffusion coefficient.
$c(x, y, t)$ controls the rate of diffusion and is usually chosen as a function of the image gradient so as to preserve edges in the image. Pietro Perona and Jitendra Malik pioneered the idea of anisotropic diffusion in 1990 and proposed two functions for the diffusion coefficient:

$$ c\bigl(\lVert \nabla I \rVert\bigr) = e^{-\left(\lVert \nabla I \rVert / K\right)^2} $$

and

$$ c\bigl(\lVert \nabla I \rVert\bigr) = \frac{1}{1 + \left(\lVert \nabla I \rVert / K\right)^2}; $$

the constant K controls the sensitivity to edges and is usually chosen experimentally or as a function of the noise in the image.

Motivation

Let M denote the manifold of smooth images; then the diffusion equations presented above can be interpreted as the gradient descent equations for the minimization of the energy functional defined by

$$ E[I] = \frac{1}{2} \int_{\Omega} g\bigl(\lVert \nabla I(x) \rVert^2\bigr)\, dx, $$

where $g$ is a real-valued function which, as we will see, is intimately related to the diffusion coefficient. Then for any compactly supported infinitely differentiable test function $h$, we have

$$ \left.\frac{d}{ds}\right|_{s=0} E[I + s h] = \int_{\Omega} g'\bigl(\lVert \nabla I \rVert^2\bigr)\, \nabla I \cdot \nabla h \, dx = -\int_{\Omega} \operatorname{div}\bigl(g'(\lVert \nabla I \rVert^2)\, \nabla I\bigr)\, h \, dx, $$

where the last step follows from multidimensional integration by parts. Letting $\nabla_I E$ denote the gradient of $E$ with respect to the $L^2$ inner product evaluated at $I$, this gives

$$ \nabla_I E = -\operatorname{div}\bigl(g'(\lVert \nabla I \rVert^2)\, \nabla I\bigr). $$

Therefore, the gradient descent equations on the functional $E$ are given by

$$ \frac{\partial I}{\partial t} = -\nabla_I E = \operatorname{div}\bigl(g'(\lVert \nabla I \rVert^2)\, \nabla I\bigr). $$

Thus by letting $c = g'$ we obtain the anisotropic diffusion equations.

Applications

Anisotropic diffusion can be used to remove noise from digital images without blurring edges. With a constant diffusion coefficient, the anisotropic diffusion equations reduce to the heat equation, which is equivalent to Gaussian blurring. This is ideal for removing noise but also indiscriminately blurs edges. When the diffusion coefficient is chosen as an edge-seeking function, such as in Perona and Malik, the resulting equations encourage diffusion (hence smoothing) within regions and prohibit it across strong edges. Hence the edges can be preserved while removing noise from the image.

Along the same lines as noise removal, anisotropic diffusion can be used in edge detection algorithms. By running the diffusion with an edge-seeking diffusion coefficient for a certain number of iterations, the image can be evolved towards a piecewise constant image, with the boundaries between the constant components being detected as edges.

See also
- Bilateral filter
- Edge detection
- Edge-preserving smoothing
- Heat equation
- Image noise
- Noise reduction
- Scale space
- Total variation denoising

External links
- Mathematica PeronaMalikFilter function (/mathematica/ref/PeronaMalikFilter.html).
- IDL nonlinear anisotropic diffusion package (edge enhancing and coherence enhancing): (/fac/sci/physics/research/cfsa/people/yuan/studytracking/computation/idllib/)

References
1. Pietro Perona and Jitendra Malik (November 1987). "Scale-space and edge detection using anisotropic diffusion". Proceedings of IEEE Computer Society Workshop on Computer Vision, pp. 16–22.
2. Pietro Perona and Jitendra Malik (July 1990). "Scale-space and edge detection using anisotropic diffusion". IEEE Transactions on Pattern Analysis and Machine Intelligence, 12 (7): 629–639. doi:10.1109/34.56205.
3. Guillermo Sapiro (2001). Geometric Partial Differential Equations and Image Analysis. Cambridge University Press. p. 223. ISBN 9780521790758.
4. Joachim Weickert (July 1997). "A Review of Nonlinear Diffusion Filtering". Scale-Space Theory in Computer Vision. Springer, LNCS 1252. pp. 1–28. doi:10.1007/3-540-63167-4.
5. Bernd Jähne and Horst Haußecker (2000). Computer Vision and Applications, A Guide for Students and Practitioners. Academic Press. ISBN 0-13-085198-1.
6. Lindeberg, T., Scale-Space Theory in Computer Vision, Kluwer Academic Publishers, 1994 (http://www.csc.kth.se/~tony/book.html), ISBN 0-7923-9418-6 (chapter 15).
7. Andres Almansa and Tony Lindeberg (2000). "Fingerprint Enhancement by Shape Adaptation of Scale-Space Operators with Automatic Scale-Selection". IEEE Transactions on Image Processing 9 (12): 2027–2042. doi:10.1109/83.887971. http://www.csc.kth.se/cvap/abstracts/cvap226.html.
8. Weickert, J., Anisotropic Diffusion in Image Processing, Teubner Verlag, Stuttgart, 1998. (http://www.mia.uni-saarland.de/weickert/book.html)
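To make the update rule above concrete, here is a minimal sketch of the classic explicit Perona–Malik scheme in Python/NumPy. It is written for illustration only, with both diffusion-coefficient choices listed above and simple periodic boundaries via np.roll (a replicated border would be more faithful to typical implementations); it is not tied to any particular library.

```python
import numpy as np

def perona_malik(img, n_iter=20, K=15.0, dt=0.2, option=2):
    """Perona-Malik diffusion on a 2-D float image (explicit 4-neighbour scheme)."""
    I = img.astype(np.float64).copy()
    for _ in range(n_iter):
        # finite differences to the four neighbours (periodic boundary via np.roll)
        dN = np.roll(I, 1, axis=0) - I
        dS = np.roll(I, -1, axis=0) - I
        dE = np.roll(I, -1, axis=1) - I
        dW = np.roll(I, 1, axis=1) - I
        if option == 1:
            # exponential diffusion coefficient
            cN, cS = np.exp(-(dN / K) ** 2), np.exp(-(dS / K) ** 2)
            cE, cW = np.exp(-(dE / K) ** 2), np.exp(-(dW / K) ** 2)
        else:
            # rational diffusion coefficient
            cN, cS = 1 / (1 + (dN / K) ** 2), 1 / (1 + (dS / K) ** 2)
            cE, cW = 1 / (1 + (dE / K) ** 2), 1 / (1 + (dW / K) ** 2)
        # explicit update; dt <= 0.25 keeps the scheme stable
        I += dt * (cN * dN + cS * dS + cE * dE + cW * dW)
    return I
```

With a constant coefficient (c = 1) the same loop reduces to repeated discrete Laplacian smoothing, i.e. the heat-equation/Gaussian-blur case mentioned in the Applications section.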
Image retrieval - 13 - Smooth-AP: Smoothing the Path Towards Large-Scale Image Retrieval

Smooth-AP: Smoothing the Path Towards Large-Scale Image Retrieval

Abstract

Optimizing a rank-based metric such as Average Precision (AP) is notoriously challenging because it is non-differentiable and therefore cannot be optimized directly with gradient-descent methods. To this end, we introduce an objective that optimizes a smoothed approximation of AP, called Smooth-AP. Smooth-AP is a plug-and-play objective function that allows end-to-end training of deep networks, and its implementation is simple and elegant. We also analyze why directly optimizing the AP-based ranking metric is more beneficial than other deep metric learning losses. We apply Smooth-AP to the standard retrieval benchmarks Stanford Online Products and VehicleID, and also evaluate on larger-scale datasets: INaturalist for fine-grained category retrieval, and VGGFace2 and IJB-C for face retrieval. In all cases we improve over the state of the art, particularly on the larger-scale datasets, demonstrating the effectiveness and scalability of Smooth-AP in real-world scenarios.

1 Introduction

The goal of this paper is to improve the performance of "instance search", whose task is: given a query image, rank all instances in the retrieval set according to their relevance to the query. For example, suppose you have a photo of a friend or family member and want to search your large smartphone photo collection for every picture of that person; or, on a stock-photo website, you want to find all photos of a particular building or object starting from a single photo. In these use cases, high recall is very important, unlike a "Google Lens"-style application that identifies an object from an image, where a single "hit" (match) is enough. The benchmark measure of retrieval quality is Average Precision (AP) (or its generalized variant, Normalized Discounted Cumulative Gain, which accommodates non-binary relevance judgements). With the rise of deep neural networks, end-to-end training has become the de facto choice for solving specific vision tasks.
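The core idea — replacing the non-differentiable step function inside the rank computation with a temperature-controlled sigmoid — can be sketched as follows. The sketch is written with NumPy for clarity and only computes the smoothed AP of a single query; in training, the same expression would be written in an autodiff framework so gradients flow through the sigmoid, and the exact batching and normalization follow the Smooth-AP paper rather than this simplified form.

```python
import numpy as np

def sigmoid(x, tau=0.01):
    # temperature-controlled sigmoid that replaces the Heaviside step
    return 1.0 / (1.0 + np.exp(np.clip(-x / tau, -60.0, 60.0)))

def smooth_ap(scores, labels, tau=0.01):
    """Smoothed Average Precision for one query.

    scores: similarity of each gallery item to the query, shape (N,)
    labels: 1 for relevant (positive) items, 0 otherwise, shape (N,)
    """
    pos = np.where(labels == 1)[0]
    ap = 0.0
    for i in pos:
        d = scores - scores[i]           # score differences D_ij
        d = np.delete(d, i)
        lab = np.delete(labels, i)
        rank_all = 1.0 + np.sum(sigmoid(d, tau))            # smoothed rank over all items
        rank_pos = 1.0 + np.sum(sigmoid(d[lab == 1], tau))  # smoothed rank over positives
        ap += rank_pos / rank_all
    return ap / len(pos)
```

As tau shrinks toward 0 the sigmoid approaches the exact indicator and the value approaches true AP; larger tau trades fidelity for smoother, better-behaved gradients.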
Journal of Graphics, Vol. 43, No. 1, February 2022

Received: 23 June 2021; Finalized: 6 August 2021
Foundation item: Key R&D Program of the Ministry of Science and Technology (2020YFA0714100)
First author: TANG Xiao-tian (1997–), male, master student. His main research interest is video super-resolution reconstruction. E-mail: ****************
Corresponding author: LI Feng (1975–), male, researcher, Ph.D. His main research interests include image reconstruction and compressed sensing. E-mail: ****************

Video Super-Resolution Reconstruction Based on Multi-Scale Temporal 3D Convolution

TANG Xiao-tian1,2, MA Jun2, LI Feng1, YANG Xue1, LIANG Liang3
(1. Qian Xuesen Laboratory of Space Technology, Beijing 100086, China; 2. School of Software, Henan University, Kaifeng, Henan 475004, China; 3. Department of Electronic Engineering, Tsinghua University, Beijing 100084, China)
1. vtkActorCollection: List of actors. This can be useful for storing groups of actors to show/hide together.
2. vtkAppendPolyData: Appends one or more polygonal datasets together. This is extremely useful for forming complex objects consisting of many polygonal pieces. It is often preferable to combine the objects at this level as opposed to having multiple actors.
3. vtkApproximatingSubdivisionFilter: Generate a subdivision surface using an Approximating Scheme.
4. vtkAssembly: Create hierarchies of vtkProp3Ds (transformable props). This can be used to combine multiple actors into a single one.
5. vtkCallbackCommand: Supports function callbacks. This is exceedingly useful in large programs.
6. vtkCamera: Virtual camera for 3D rendering. This is used by all vtkRenderer classes.
7. vtkCell: Abstract class to specify cell behavior – this is worth looking at if you are interested in finite element meshes and arbitrary topologies.
8. vtkCellArray: Object to represent cell connectivity
9. vtkCleanPolyData: Merge duplicate points, and/or remove unused points and/or remove degenerate cells. This is very useful for downsampling a surface to generate a set of points to use as an input to a point-based registration method. It does "destroy" the surface structure.
10. vtkCollection: Create and manipulate unsorted lists of objects. This is very useful in large programs, for grouping a set of objects together.
11. vtkConnectivityFilter: Extract data based on geometric connectivity.
12. vtkContourFilter: Generate isosurfaces/isolines from scalar values. This can be used to extract, for example, zero-levelsets from a levelset propagation algorithm.
13. vtkCurvatures: Compute curvatures (Gauss and mean) of a Polydata object.
14. vtkDataArray: Abstract superclass for arrays. This is the parent class for all data arrays and it is worth being familiar with its structure.
15. vtkDataObject: General representation of visualization data
16. vtkDataSet: Abstract class to specify dataset behavior
17. vtkDecimate: Reduce the number of triangles in a mesh
18. vtkDICOMImageReader: Reads DICOM images. This is not a complete implementation but useful nonetheless.
19. vtkExtractVOI: Select piece (e.g., volume of interest) and/or subsample structured points dataset
20. vtkGaussianSplatter: Splat points into a volume with an elliptical, Gaussian distribution
21. vtkGeneralTransform: Allows operations on any transforms. This is very useful for concatenating transformations of different types into a single transformation. Note that for proper operation the "PostMultiply" flag needs to be set.
22. vtkGeometryFilter: Extract geometry from data (or convert data to polygonal type)
23. vtkGridTransform: Nonlinear warp transformation. This stores a transformation as a displacement field (with either linear or cubic interpolation).
24. vtkHeap: Replacement for malloc/free and new/delete
25. vtkHull: Produce an n-sided convex hull
26. vtkIdentityTransform: Transform that doesn't do anything. This may sound useless, but if you have to have a transformation somewhere that is identity, this is the fastest way to do it!
27. vtkIdList: List of point or cell ids
28. vtkIdListCollection: Maintain an unordered list of dataarray objects. This can be useful for getting a list of unique indices for example.
29. vtkImageAccumulate: Generalized histograms up to 4 dimensions.
30. vtkImageAnisotropicDiffusion3D: Edge preserving smoothing. This is worth looking at.
31. vtkImageAppend: Collects data from multiple inputs into one image. The AppendAxis is used to define the direction of "stitching".
32. vtkImageAppendComponents: Collects components from two inputs into one output.
33. vtkImageBlend: Blend images together using alpha or opacity.
34. vtkImageCast: Image Data type Casting Filter
35. vtkImageConvolve: Convolution of an image with a kernel
36. vtkImageCorrelation: Correlation image of the two inputs
37. vtkImageData: Topologically and geometrically regular array of data
38. vtkImageExport: Export VTK images to third-party systems. This basically gets you a raw pointer that you can use to access the data.
39. vtkImageExtractComponents: Outputs a single component
40. vtkImageFFT: Fast Fourier Transform
41. vtkImageFlip: This flips an axis of an image. Right becomes left, etc.
42. vtkImageGaussianSmooth: Performs a gaussian convolution
43. vtkImageGradient: Computes the gradient vector
44. vtkImageGradientMagnitude: Computes magnitude of the gradient
45. vtkImageImport: Import data from a C array. This is useful for integrating with legacy code.
46. vtkImageLaplacian: Computes divergence of gradient
47. vtkImageMagnitude: Collapses components with magnitude function.
48. vtkImageMarchingCubes: Generate isosurface(s) from volume/images
49. vtkImageMask: Combines a mask and an image
50. vtkImageMathematics: Add, subtract, multiply, divide, invert, sin, cos, exp, log
51. vtkImageMedian3D: Median Filter
52. vtkImageNonMaximumSuppression: Performs non-maximum suppression. This is part of the implementation of the Canny edge detection filter.
53. vtkImageRFFT: Reverse Fast Fourier Transform
54. vtkImageSeedConnectivity: SeedConnectivity with user defined seeds
55. vtkImageSeparableConvolution: 3 1D convolutions on an image
56. vtkImageShiftScale: Shift and scale an input image. This also allows for changing the image type from e.g. short to float.
57. vtkLabeledDataMapper: Draw text labels at dataset points. This is useful for automatically numbering points.
58. vtkLandmarkTransform: Linear transform specified by two corresponding point sets
59. vtkLookupTable: Map scalar values into colors via a lookup table
60. vtkMarchingContourFilter: Generate isosurfaces/isolines from scalar values
61. vtkMarchingCubes: Generate isosurface(s) from volume
62. vtkMath: Performs common math operations. This is a good example of integrating procedural code into VTK. All member methods of this class are static. Pi is also defined here in a cross-platform manner.
63. vtkMatrix4x4: Represent and manipulate 4x4 transformation matrices
64. vtkOutlineFilter: Create wireframe outline for arbitrary data set
65. vtkPCAAnalysisFilter: Performs principal component analysis of a set of aligned pointsets
66. vtkPlane: Perform various plane computations
67. vtkPlaneSource: Create an array of quadrilaterals located in a plane
68. vtkPNMReader: Read pnm (i.e., portable anymap) files
69. vtkPNMWriter: Writes PNM (portable any map) files
70. vtkPoints: Represent and manipulate 3D points
71. vtkPolyData: Concrete dataset represents vertices, lines, polygons, and triangle strips
72. vtkPolyDataConnectivityFilter: Extract polygonal data based on geometric connectivity
73. vtkPolyDataMapper: Map vtkPolyData to graphics primitives
74. vtkPolyDataNormals: Compute normals for polygonal mesh
75. vtkPolyDataReader: Read vtk polygonal data file
76. vtkPolyDataWriter: Write vtk polygonal data
77. vtkPostScriptWriter: Writes an image as a PostScript file
78. vtkPriorityQueue: List of ids arranged in priority order
79. vtkProbeFilter: Sample data values at specified point locations. This is extremely useful for sampling images at arbitrary locations, e.g. computing line integrals using surfaces and/or curves for the implementation of deformable model segmentation.
80. vtkProcrustesAlignmentFilter: Aligns a set of pointsets together
81. vtkRenderer: Abstract specification for renderers
82. vtkRenderWindow: Create a window for renderers to draw into
83. vtkScalarBarActor: Create a scalar bar with labels
84. vtkTextActor: An actor that displays text. Scaled or unscaled
85. vtkTextMapper: 2D text annotation
86. vtkTexture: Handles properties associated with a texture map
87. vtkThinPlateSplineTransform: Nonlinear warp transformation. This is used in many non-rigid registration applications.
88. vtkTkImageViewerWidget: Tk Widget for viewing vtk images
89. vtkTkRenderWidget: Tk Widget for vtk rendering
90. vtkTransform: Describes linear transformations via a 4x4 matrix
91. vtkTransformFilter: Transform points and associated normals and vectors
92. vtkTransformPolyDataFilter: Transform points and associated normals and vectors for polygonal dataset
93. vtkTriangleFilter: Create triangle polygons from input polygons and triangle strips
94. vtkUnstructuredGrid: Dataset represents arbitrary combinations of all possible cell types. This is useful for representing meshes.
95. vtkVolume: Volume (data & properties) in a rendered scene
96. vtkVolumeRayCastCompositeFunction: Ray function for compositing
97. vtkVolumeRayCastMapper: A slow but accurate mapper for rendering volumes
98. vtkVolumeTextureMapper2D: Abstract class for a volume mapper
99. vtkVRMLExporter: Export a scene into VRML 2.0 format
100. vtkWindowLevelLookupTable: Map scalar values into colors or colors to scalars; generate color table
Packaging Engineering, Vol. 43, No. 9, May 2022

Received: 2021-09-06
Foundation items: National Natural Science Foundation of China (62172418); Scientific Research Program of Tianjin Education Commission (2018KJ246); Fundamental Research Funds for the Central Universities, Civil Aviation University of China (3122018S008)

Stereo Image Zero-Watermarking Algorithm Based on FFST and Hessenberg Decomposition

HAN Shao-cheng1a, ZHANG Peng1b, LI Peng-cheng2
(1. a. Engineering Techniques Training Center, b. College of Electronic Information and Automation, Civil Aviation University of China, Tianjin 300300, China; 2. Chengdu Yufei Information Engineering Co., Ltd., Chengdu 610041, China)

Abstract: To improve the robustness of stereo-image zero-watermarking schemes against geometric attacks, a stereo image zero-watermarking algorithm based on the Fast Finite Shearlet Transform (FFST) and Hessenberg decomposition is proposed. First, the FFST is applied to the luminance components of the left and right viewpoints of the original stereo image in the YCbCr color space to obtain two low-frequency subbands. A robust feature matrix is then constructed from these subbands using a random sub-block selection strategy and joint Hessenberg decomposition. Finally, the certified zero watermark is generated by an exclusive-OR of the feature matrix with the preprocessed binary watermark. In addition, the Fourier–Mellin transform is used to geometrically correct the stereo image to be authenticated before zero-watermark detection. Compared with related algorithms, the proposed algorithm achieves an average NC value of 0.984 1 for watermarks extracted under common non-geometric and geometric attacks, showing better anti-attack capability. The proposed algorithm is lossless and highly secure, and provides a feasible solution for stereo-image copyright protection.

Key words: stereo image; zero watermarking; fast finite shearlet transform (FFST); Hessenberg decomposition; geometric attack

CLC number: TP391   Document code: A   Article ID: 1001-3563(2022)09-0197-10   DOI: 10.19554/ki.1001-3563.2022.09.027

In recent years, piracy and copyright infringement of multimedia data have become commonplace.
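A minimal sketch of the zero-watermark generation step described in the abstract — key-driven random block selection, a Hessenberg-based feature per block, and an XOR with the binary watermark — is shown below. The FFST step is not reproduced (`subband` is assumed to be a precomputed low-frequency subband), and the feature rule (largest Hessenberg entry thresholded at the median) is an illustrative choice rather than the paper's exact construction.

```python
import numpy as np
from scipy.linalg import hessenberg

def generate_zero_watermark(subband, watermark_bits, key=2022, block=8):
    """Zero-watermark generation: robust feature bits XOR binary watermark."""
    rng = np.random.default_rng(key)      # the key drives the random block selection
    h, w = subband.shape
    n = watermark_bits.size
    feats = np.empty(n)
    for k in range(n):
        r = rng.integers(0, h - block)
        c = rng.integers(0, w - block)
        H = hessenberg(subband[r:r + block, c:c + block])
        feats[k] = np.abs(H).max()        # one robust scalar per selected block
    feature_bits = (feats > np.median(feats)).astype(np.uint8)
    return np.bitwise_xor(feature_bits, watermark_bits.astype(np.uint8))
```

Because the watermark is never embedded into the image, detection would regenerate the feature bits from the (geometrically corrected) test image with the same key and XOR them with the stored zero watermark to recover the binary mark.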