Digital Image Processing: Foreign Literature Translation (English Originals with Chinese Translation)
Chinese–English Parallel Foreign Literature Translation

I. English Original

A NEW CONTENT BASED MEDIAN FILTER

ABSTRACT

In this paper the hardware implementation of a content based median filter suitable for real-time impulse noise suppression is presented. The function of the proposed circuitry is adaptive; it detects the existence of impulse noise in an image neighborhood and applies the median filter operator only when necessary. In this way, the blurring of the image in process is avoided and the integrity of edge and detail information is preserved. The proposed digital hardware structure is capable of processing gray-scale images of 8-bit resolution and is fully pipelined, whereas parallel processing is used to minimize computational time. The architecture presented was implemented in FPGA and it can be used in industrial imaging applications, where fast processing is of the utmost importance. The typical system clock frequency is 55 MHz.

1. INTRODUCTION

Two applications of great importance in the area of image processing are noise filtering and image enhancement [1]. These tasks are an essential part of any image processor, whether the final image is utilized for visual interpretation or for automatic analysis. The aim of noise filtering is to eliminate noise and its effects on the original image, while corrupting the image as little as possible. To this end, nonlinear techniques (like the median and, in general, order statistics filters) have been found to provide more satisfactory results in comparison to linear methods. Impulse noise exists in many practical applications and can be generated by various sources, including a number of man-made phenomena, such as unprotected switches, industrial machines and car ignition systems. Images are often corrupted by impulse noise due to a noisy sensor or channel transmission errors. The most common method used for impulse noise suppression for gray-scale and color images is the median filter (MF) [2]. The basic drawback of the application of the MF is the blurring of the image in process. In the general case, the filter is applied uniformly across an image, modifying pixels that are not contaminated by noise. In this way, the effective elimination of impulse noise is often at the expense of an overall degradation of the image and blurred or distorted features [3].

In this paper an intelligent hardware structure of a content based median filter (CBMF) suitable for impulse noise suppression is presented. The function of the proposed circuit is to detect the existence of noise in the image window and apply the corresponding MF only when necessary. The noise detection procedure is based on the content of the image and computes the differences between the central pixel and the surrounding pixels of a neighborhood. The main advantage of this adaptive approach is that image blurring is avoided and the integrity of edge and detail information is preserved [4, 5]. The proposed digital hardware structure is capable of processing gray-scale images of 8-bit resolution and performs both positive and negative impulse noise removal. The architecture chosen is based on a sequence of four basic functional pipelined stages, and parallel processing is used within each stage. A moving window of a 3×3 or 5×5-pixel image neighborhood can be selected. However, the system can be easily expanded to accommodate windows of larger sizes. The proposed structure was implemented using field programmable gate arrays (FPGA).
The digital circuit was designed, compiled and successfully simulated using the MAX+PLUS II Programmable Logic Development System by Altera Corporation. The EPF10K200SFC484-1 FPGA device of the FLEX10KE device family was utilized for the realization of the system. The typical clock frequency is 55 MHz and the system can be used for real-time imaging applications where fast processing is required [6]. As an example, the time required to perform filtering of a gray-scale image of 260×244 pixels is approximately 10.6 msec.

2. ADAPTIVE FILTERING PROCEDURE

The output of a median filter at a point x of an image f depends on the values of the image points in the neighborhood of x. This neighborhood is determined by a window W that is located at point x of f, including n points x1, x2, …, xn of f, with n = 2k+1. The proposed adaptive content based median filter can be utilized for impulse noise suppression in gray-scale images. A block diagram of the adaptive filtering procedure is depicted in Fig. 1. The noise detection procedure for both positive and negative noise is as follows:

(i) We consider a neighborhood window W that is located at point x of the image f. The differences between the central pixel at point x and the pixel values of the n−1 surrounding points of the neighborhood (excluding the value of the central pixel) are computed.

(ii) The sum of the absolute values of these differences is computed, denoted as fabs(x). This value provides a measure of closeness between the central pixel and its surrounding pixels.

(iii) The value fabs(x) is compared to fthreshold(x), which is an appropriately selected positive integer threshold value and can be modified. The central pixel is considered to be noise when the value fabs(x) is greater than the threshold value fthreshold(x).

(iv) When the central pixel is considered to be noise it is substituted by the median value of the image neighborhood, denoted as fk+1, which is the normal operation of the median filter. In the opposite case, the value of the central pixel is not altered and the procedure is repeated for the next neighborhood window.

Regarding the noise detection scheme described, it should be mentioned that the noise detection level can be controlled, so that a range of pixel values (and not only the fixed values of 0 and 255, i.e. salt-and-pepper noise) is considered as impulse noise.
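As an illustration of steps (i)–(iv), the following Python sketch applies the same decision rule in software with NumPy. It is a behavioral model for clarity, not the paper's pipelined hardware; the default threshold value is an illustrative assumption, since the paper leaves it as a tunable parameter.

```python
import numpy as np

def cbmf(image, window=3, threshold=512):
    """Content-based median filter: replace a pixel by the window median
    only when it is classified as impulse noise (steps i-iv of Sec. 2)."""
    pad = window // 2
    padded = np.pad(image.astype(np.int32), pad, mode='edge')
    out = image.copy()
    rows, cols = image.shape
    for r in range(rows):
        for c in range(cols):
            w = padded[r:r + window, c:c + window]
            center = padded[r + pad, c + pad]
            # (i)-(ii): sum of absolute differences to the n-1 neighbours
            # (the center contributes 0, so including it changes nothing)
            f_abs = np.abs(w - center).sum()
            # (iii): compare with the (user-selected) threshold
            if f_abs > threshold:
                # (iv): substitute the median value f_{k+1} of the window
                out[r, c] = int(np.median(w))
    return out
```

For example, `restored = cbmf(noisy, window=3, threshold=512)`; raising the threshold makes the detector more conservative, leaving more pixels untouched.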
In Fig. 2 the results of the application of the median filter and the CBMF to the gray-scale image "Peppers" are depicted. More specifically, in Fig. 2(a) the original, uncorrupted image "Peppers" is depicted. In Fig. 2(b) the original image degraded by 5% both positive and negative impulse noise is illustrated. In Figs 2(c) and 2(d) the resultant images of the application of the median filter and the CBMF for a 3×3-pixel window are shown, respectively. Finally, the resultant images of the application of the median filter and the CBMF for a 5×5-pixel window are presented in Figs 2(e) and 2(f). It can be noticed that the application of the CBMF preserves the edges and details of the images much better, in comparison to the median filter. A number of different objective measures can be utilized for the evaluation of these results. The most widely used measures are the Mean Square Error (MSE) and the Normalized Mean Square Error (NMSE) [1]. The results of the estimation of these measures for the two filters are depicted in Table I. For the estimation of these measures, the resultant images of the filters are compared to the original, uncorrupted image. From Table I it can be noticed that the MSE and NMSE estimated for the application of the CBMF are considerably smaller than those estimated for the median filter, in all the cases.

Table I. Similarity measures (5% impulse noise).

Filter    MSE (3×3)    MSE (5×5)    NMSE×10⁻² (3×3)    NMSE×10⁻² (5×5)
Median    57.554       130.496      0.317              0.718
CBMF      35.287       84.788       0.194              0.467

3. HARDWARE ARCHITECTURE

The structure of the adaptive filter comprises four basic functional units: the moving window unit, the median computation unit, the arithmetic operations unit, and the output selection unit. The input data of the system are the gray-scale values of the pixels of the image neighborhood and the noise threshold value. For the computation of the filter output a 3×3 or 5×5-pixel image neighborhood can be selected. Image input data is serially imported into the first stage. In this way, the total number of input pins is 24 (21 inputs for the input data and 3 inputs for the clock and the control signals required). The output data of the system are the resultant gray-scale values computed for the operation selected (8 pins).

The moving window unit is the internal memory of the system, used for storing the input values of the pixels and for realizing the moving window operation. The pixel values of the input image, denoted as "IMAGE_INPUT[7..0]", are imported into this unit serially. For the representation of the threshold value used for the detection of a noise pixel, 13 bits are required. For the moving window operation a 3×3 (5×5)-pixel serpentine type memory is used, consisting of 9 (25) registers. In this way, when the window is moved into the next image neighborhood, only 3 or 5 pixel values stored in the memory are altered. The "en5×5" control signal is used for the selection of the size of the image window; when "en5×5" is equal to "0" ("1"), a 3×3 (5×5)-pixel neighborhood is selected. It should be mentioned that the modules of the circuit used for the 3×3-pixel window are utilized for the 5×5-pixel window as well. For these modules, 2-to-1 multiplexers are utilized to select the appropriate pixel values, where necessary. The modules that are utilized only in the case of the 5×5-pixel neighborhood are enabled by the "en5×5" control signal. The outputs of this unit are rows of pixel values (3 or 5, respectively), which are the inputs to the median computation unit.

The task of the median computation unit is to compute the median value of the image neighborhood in order to substitute the central pixel value, if necessary. For this purpose a 25-input sorter is utilized. The structure of the sorter has been proposed by Batcher and is based on the use of CS blocks. A CS block is a max/min module; its first output is the maximum of the inputs and its second output the minimum. The implementation of a CS block includes a comparator and two 2-to-1 multiplexers. The output values of the sorter, denoted as "OUT_0[7..0]" … "OUT_24[7..0]", produce a "sorted list" of the 25 initial pixel values. A 2-to-1 multiplexer is used for the selection of the median value for a 3×3 or 5×5-pixel neighborhood.
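The compare-swap (CS) primitive is easy to model in software. The sketch below is a simplified illustration rather than the paper's 25-input Batcher sorter: it uses an odd-even transposition network, which is functionally equivalent (a network of CS blocks whose middle output is the median) though not Batcher's minimal construction.

```python
def cs(a, b):
    """Compare-swap (CS) block: returns (max, min), as in the hardware module."""
    return (a, b) if a >= b else (b, a)

def median_network(values):
    """Median via repeated compare-swap passes (odd-even transposition
    network). Works for any odd number of inputs, e.g. 9 or 25."""
    v = list(values)
    n = len(v)
    for phase in range(n):
        for i in range(phase % 2, n - 1, 2):
            v[i], v[i + 1] = cs(v[i], v[i + 1])  # v[i] gets max, v[i+1] min
    # v is now sorted in descending order; the middle element is the median
    return v[n // 2]
```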
The function of the arithmetic operations unit is to compute the value fabs(x), which is compared to the noise threshold value in the final stage of the adaptive filter. The inputs of this unit are the surrounding pixel values and the central pixel of the neighborhood. For the implementation of the mathematical expression of fabs(x), the circuit of this unit contains a number of adder modules. Note that registers have been used to achieve a pipelined operation. An additional 2-to-1 multiplexer is utilized for the selection of the appropriate output value, depending on the "en5×5" control signal. From the implementation point of view, the use of arithmetic blocks makes this stage hardware demanding.

The output selection unit is used for the selection of the appropriate output value of the performed noise suppression operation. For this selection, the corresponding noise threshold value calculated for the image neighborhood, "NOISE_THRESHOLD[12..0]", is employed. This value is compared to fabs(x) and the result of the comparison classifies the central pixel either as impulse noise or not. If the value fabs(x) is greater than the threshold value fthreshold(x), the central pixel is positive or negative impulse noise and has to be eliminated. For this reason, the output of the comparison is used as the selection signal of a 2-to-1 multiplexer whose inputs are the central pixel and the corresponding median value for the image neighborhood. The output of the multiplexer is the output of this stage and the final output of the circuit of the adaptive filter. The structure of the CBMF, the computation procedure and the design of the four aforementioned units are illustrated in Fig. 3.

Figure 1: Block diagram of the filtering method.
Figure 2: Results of the application of the CBMF: (a) Original image, (b) noise corrupted image, (c) Restored image by a 3×3 MF, (d) Restored image by a 3×3 CBMF, (e) Restored image by a 5×5 MF and (f) Restored image by a 5×5 CBMF.

4. IMPLEMENTATION ISSUES

The proposed structure was implemented in FPGA, which offers an attractive combination of low cost, high performance and apparent flexibility, using the MAX+PLUS II software package of Altera Corporation. The FPGA used is the EPF10K200SFC484-1 device of the FLEX10KE device family, a device family suitable for designs that require high densities and high I/O count. 99% of the logic cells (9965/9984 logic cells) of the device were utilized to implement the circuit. The typical operating clock frequency of the system is 55 MHz. As a comparison, the time required to perform filtering of a gray-scale image of 260×244 pixels using Matlab® software on a Pentium 4/2.4 GHz computer system is approximately 7.2 sec, whereas the corresponding time using hardware is approximately 10.6 msec.

The modification of the system to accommodate windows of larger sizes can be done in a straightforward way, requiring only a small number of changes. More specifically, in the first unit the size of the serpentine memory and the corresponding number of multiplexers increase following a square law. In the second unit, the sorter module should be modified, and in the third unit the number of the adder devices increases following a square law. In the last unit no changes are required.
5. CONCLUSIONS

This paper presents a new hardware structure of a content based median filter, capable of performing adaptive impulse noise removal for gray-scale images. The noise detection procedure takes into account the differences between the central pixel and the surrounding pixels of a neighborhood. The proposed digital circuit is capable of processing gray-scale images of 8-bit resolution, with 3×3 or 5×5-pixel neighborhoods as options for the computation of the filter output. However, the design of the circuit is directly expandable to accommodate larger size image windows. The adaptive filter was designed and implemented in FPGA. The typical clock frequency is 55 MHz and the system is suitable for real-time imaging applications.

REFERENCES

[1] W. K. Pratt, Digital Image Processing. New York: Wiley, 1991.
[2] G. R. Arce, N. C. Gallagher and T. Nodes, "Median filters: Theory and applications," in Advances in Computer Vision and Image Processing, Greenwich, CT: JAI, 1986.
[3] T. A. Nodes and N. C. Gallagher, Jr., "The output distribution of median type filters," IEEE Transactions on Communications, vol. COM-32, pp. 532-541, May 1984.
[4] T. Sun and Y. Neuvo, "Detail-preserving median based filters in image processing," Pattern Recognition Letters, vol. 15, pp. 341-347, Apr. 1994.
[5] E. Abreau, M. Lightstone, S. K. Mitra, and K. Arakawa, "A new efficient approach for the removal of impulse noise from highly corrupted images," IEEE Transactions on Image Processing, vol. 5, pp. 1012-1025, June 1996.
[6] E. R. Dougherty and P. Laplante, Introduction to Real-Time Imaging, Bellingham: SPIE/IEEE Press, 1995.

II. Chinese Translation (rendered here in English)

A New Content-Based Median Filter

Abstract: This design presents a hardware implementation based on median filtering, used to suppress impulse noise interference.
Journal of VLSI Signal Processing 39, 295–311, 2005. © 2005 Springer Science+Business Media, Inc. Manufactured in The Netherlands.

Parallel-Beam Backprojection: An FPGA Implementation Optimized for Medical Imaging

MIRIAM LEESER, SRDJAN CORIC, ERIC MILLER AND HAIQIAN YU
Department of Electrical and Computer Engineering, Northeastern University, Boston, MA 02115, USA

MARC TREPANIER
Mercury Computer Systems, Inc., Chelmsford, MA 01824, USA

Received September 2, 2003; Revised March 23, 2004; Accepted May 7, 2004

Abstract. Medical image processing in general and computerized tomography (CT) in particular can benefit greatly from hardware acceleration. This application domain is marked by computationally intensive algorithms requiring the rapid processing of large amounts of data. To date, reconfigurable hardware has not been applied to the important area of image reconstruction. For efficient implementation and maximum speedup, fixed-point implementations are required. The associated quantization errors must be carefully balanced against the requirements of the medical community. Specifically, care must be taken so that very little error is introduced compared to floating-point implementations and the visual quality of the images is not compromised. In this paper, we present an FPGA implementation of the parallel-beam backprojection algorithm used in CT for which all of these requirements are met. We explore a number of quantization issues arising in backprojection and concentrate on minimizing error while maximizing efficiency. Our implementation shows approximately 100 times speedup over software versions of the same algorithm running on a 1 GHz Pentium, and is more flexible than an ASIC implementation. Our FPGA implementation can easily be adapted to both medical sensors with different dynamic ranges as well as tomographic scanners employed in a wider range of application areas including nondestructive evaluation and baggage inspection in airport terminals.

Keywords: backprojection, medical imaging, tomography, FPGA, fixed point arithmetic

1. Introduction

Reconfigurable hardware offers significant potential for the efficient implementation of a wide range of computationally intensive signal and image processing algorithms. The advantages of utilizing Field Programmable Gate Arrays (FPGAs) instead of DSPs include reductions in the size, weight, performance and power required to implement the computational platform. FPGA implementations are also preferred over ASIC implementations because FPGAs have more flexibility and lower cost. To date, the full utility of this class of hardware has gone largely unexplored and unexploited for many mainstream applications. In this paper, we consider a detailed implementation and comprehensive analysis of one of the most fundamental tomographic image reconstruction steps, backprojection, on reconfigurable hardware. While we concentrate our analysis on issues arising in the use of backprojection for medical imaging applications, both the implementation and the analysis we provide can be applied directly or easily extended to a wide range of other fields where this task needs to be performed. This includes remote sensing and surveillance using synthetic aperture radar and non-destructive evaluation.

Tomography refers to the process that generates a cross-sectional or volumetric image of an object from a series of projections collected by scanning the object from many different directions [1]. Projection data acquisition can utilize X-rays, magnetic resonance, radioisotopes, or ultrasound. The discussion presented here pertains to the case of two-dimensional X-ray absorption tomography.
In this type of tomography, projections are obtained by a number of sensors that measure the intensity of X-rays travelling through a slice of the scanned object. The radiation source and the sensor array rotate around the object in small increments. One projection is taken for each rotational angle. The image reconstruction process uses these projections to calculate the average X-ray attenuation coefficient in cross-sections of a scanned slice. If different structures inside the object induce different levels of X-ray attenuation, they are discernible in the reconstructed image. The most commonly used approach for image reconstruction from dense projection data (many projections, many samples per projection) is filtered backprojection (FBP). Depending on the type of X-ray source, FBP comes in parallel-beam and fan-beam variations [1]. In this paper, we focus on parallel-beam backprojection, but methods and results presented here can be extended to the fan-beam case with modifications.

FBP is a computationally intensive process. For an image of size n×n being reconstructed with n projections, the complexity of the backprojection algorithm is O(n³). Image reconstruction through backprojection is a highly parallelizable process. Such applications are good candidates for implementation in Field Programmable Gate Array (FPGA) devices since they provide fine-grained parallelism and the ability to be customized to the needs of a particular implementation. We have implemented backprojection by making use of these principles and shown approximately 100 times speedup over a software implementation on a 1 GHz Pentium. Our architecture can easily be expanded to newer and larger FPGA devices, further accelerating image generation by extracting more data parallelism.

A difficulty of implementing FBP is that producing high-resolution images with good resemblance to internal characteristics of the scanned object requires that both the density of each projection and their total number be large. This represents a considerable challenge for hardware implementations, which attempt to maximize the parallelism in the implementation. Therefore, it can be beneficial to use fixed-point implementations and to optimize the bit-width of a projection sample to the specific needs of the targeted application domain.
We show this for medical imaging, which exhibits distinctive properties in terms of required fixed-point precision. In addition, medical imaging requires high precision reconstructions since the visual quality of images must not be compromised. We have paid special attention to this requirement by carefully analyzing the effects of quantization on the quality of reconstructed images. We have found that a fixed-point implementation with properly chosen bit-widths can give high quality reconstructions and, at the same time, make the hardware implementation fast and area efficient. Our quantization analysis investigates algorithm-specific and also general data quantization issues that pertain to input data. Algorithm-specific quantization deals with the precision of spatial address generation including the interpolation factor, and also investigates bit reduction of intermediate results for different rounding schemes.

In this paper, we focus on both FPGA implementation performance and medical image quality. In previous work in the area of hardware implementations of tomographic processing algorithms, Wu [2] gives a brief overview of all major subsystems in a computed tomography (CT) scanner and proposes locations where ASICs and FPGAs can be utilized. According to the author, semi-custom digital ASICs were the most appropriate due to the level of sophistication that FPGA technology had in 1991. Agi et al. [3] present the first description of a hardware solution for computerized tomography of which we are aware. It is a unified architecture that implements forward Radon transform, parallel- and fan-beam backprojection in an ASIC based multi-processor system. Our FPGA implementation focuses on backprojection. Agi et al. [4] present a similar investigation of quantization effects; however their results do not demonstrate the suitability of their implementation for medical applications. Although their filtered sinogram data are quantized with 12-bit precision, extensive bit truncation on functional unit outputs and low accuracy of the interpolation factor (absolute error of up to 2) render this implementation significantly less accurate than ours, which is based on 9-bit projections and a maximal interpolation factor absolute error of 2⁻⁴. An alternative to using specially designed processors for the implementation of filtered backprojection (FBP) is presented in [5]. In this work, a fast and direct FBP algorithm is implemented using texture-mapping hardware. It can perform parallel-beam backprojection of a 512-by-512-pixel image from 804 projections in 2.1 sec, while our implementation takes 0.25 sec for 1024 projections. Luiz et al. [6] investigated residue number systems (RNS) for the implementation of convolution based backprojection to speed up the processing. Unfortunately, extra binary-to-RNS and RNS-to-binary conversions are introduced. Other approaches to accelerating the backprojection algorithm have been investigated [7, 8]. One approach [7] presents an order O(n² log n) algorithm and merits further study. The suitability to medical image quality and hardware implementation of these approaches [7, 8] needs to be demonstrated. There is also a lot of interest in the area of fan-beam and cone-beam reconstruction using hardware implementation. An FPGA-based fan-beam reconstruction module [9] is proposed and simulated using MAX+PLUS2, version 9.1, but no actual FPGA implementation is mentioned. Moreover, the authors did not explore the potential parallelism for different projections as we do, which is essential for speed-up.
More data and computation is needed for 3D cone-beam FBP. Yu's PC based system [10] can reconstruct 512³ data from 288×512² projections in 15.03 min, which is not suitable for real-time. The embedded system described in [11] can do 3D reconstruction in 38.7 sec, the fastest time reported in the literature. However, it is based on a Mercury RACE++ AdapDev 1120 development workstation and needs many modifications for a different platform. Bins et al. [12] have investigated precision vs. error in JPEG compression. The goals of this research are very similar to ours: to implement designs in fixed-point in order to maximize parallelism and area utilization. However, JPEG compression is an application that can tolerate a great deal more error than medical imaging.

In the next section, we present the backprojection algorithm in more detail. In Section 3 we present our quantization studies and analysis of the error introduced. Section 4 presents the hardware implementation in detail. Finally we present results and discuss future directions. An earlier version of this research was presented [13]. This paper provides a fuller discussion of the project and updated results.

2. Parallel-Beam Filtered Backprojection

Figure 1. (a) Illustration of the coordinate system used in parallel-beam backprojection, and (b) geometric explanation of the incremental spatial address calculation.

A parallel-beam CT scanning system uses an array of equally spaced unidirectional sources of focused X-ray beams. Generated radiation not absorbed by the object's internal structure reaches a collinear array of detectors (Fig. 1(a)). Spatial variation of the absorbed energy in the two-dimensional plane through the object is expressed by the attenuation coefficient µ(x, y). The logarithm of the measured radiation intensity is proportional to the integral of the attenuation coefficient along the straight line traversed by the X-ray beam. A set of values given by all detectors in the array comprises a one-dimensional projection of the attenuation coefficient, P(t, θ), where t is the detector distance from the origin of the array, and θ is the angle at which the measurement is taken. A collection of projections for different angles over 180° can be visualized in the form of an image in which one axis is position t and the other is angle θ. This is called a sinogram or Radon transform of the two-dimensional function µ, and it contains the information needed for the reconstruction of an image µ(x, y). The Radon transform can be formulated as

$$\log_e \frac{I_0}{I_d} = \iint \mu(x,y)\,\delta(x\cos\theta + y\sin\theta - t)\,dx\,dy \equiv P(t,\theta) \qquad (1)$$

where I₀ is the source intensity, I_d is the detected intensity, and δ(·) is the Dirac delta function. Equation (1) is actually a line integral along the path of the X-ray beam, which is perpendicular to the t axis (see Fig. 1(a)) at location t = x cos θ + y sin θ. The Radon transform represents an operator that maps an image µ(x, y) to a sinogram P(t, θ). Its inverse mapping, the inverse Radon transform, when applied to a sinogram results in an image. The filtered backprojection (FBP) algorithm performs this mapping [1].

FBP begins by high-pass filtering all projections before they are fed to hardware, using the Ram-Lak or ramp filter, whose frequency response is |f|. The discrete formulation of backprojection is

$$\mu(x,y) = \frac{\pi}{K} \sum_{i=1}^{K} \tilde{P}_{\theta_i}(x\cos\theta_i + y\sin\theta_i) \qquad (2)$$

where $\tilde{P}_{\theta}(t)$ is a filtered projection at angle θ, and K is the number of projections taken during CT scanning at angles θi over a 180° range.
The number of values in $\tilde{P}_{\theta}(t)$ depends on the image size. In the case of n×n pixel images, N = √2·n·D detectors are required. The ratio D = d/τ, where d is the distance between adjacent pixels and τ is the detector spacing, is a critical factor for the quality of the reconstructed image and it obviously should satisfy D > 1. In our implementation, we utilize values of D ≈ 1.4 and N = 1024, which are typical for real systems. Higher values do not significantly increase the image quality.

Algorithmically, Eq. (2) is implemented as a triple nested "for" loop. The outermost loop is over projection angle, θ. For each θ, we update every pixel in the image in raster-scan order: starting in the upper left corner and looping first over columns, c, and next over rows, r. Thus, from (2), the pixel at location (r, c) is incremented by the value of $\tilde{P}_{\theta}(t)$, where t is a function of r and c. The issue here is that the X-ray going through the currently reconstructed pixel, in general, intersects the detector array between detectors. This is solved by linear interpolation. The point of intersection is calculated as an address corresponding to detectors numbered from 0 to 1023. The fractional part of this address is the interpolation factor. The equation that performs linear interpolation is given by

$$\tilde{P}_{\theta}^{\,int}(i) = \left[\tilde{P}_{\theta}(i+1) - \tilde{P}_{\theta}(i)\right]\cdot IF + \tilde{P}_{\theta}(i) \qquad (3)$$

where IF denotes the interpolation factor, $\tilde{P}_{\theta}(t)$ is the 1024-element array containing filtered projection data at angle θ, and i is the integer part of the calculated address. The interpolation can be performed beforehand in software, or it can be a part of the backprojection hardware itself. We implement interpolation in hardware because it substantially reduces the amount of data that must be transmitted to the reconfigurable hardware board.

The key to an efficient implementation of Eq. (2) is shown in Fig. 1(b). It shows how a distance d between square areas that correspond to adjacent pixels can be converted to a distance Δt between locations where X-ray beams that go through the centers of these areas hit the detector array. This is also derived from the equation t = x cos θ + y sin θ. Assuming that pixels are processed in raster-scan fashion, then Δt = d cos θ for two adjacent pixels in the same row (x₂ = x₁ + d) and similarly Δt = d sin θ for two adjacent pixels in the same column (y₂ = y₁ − d). Our implementation is based on pre-computing and storing these deltas in look-up tables (LUTs). Three LUTs are used, corresponding to the nested "for" loop structure of the backprojection algorithm. LUT 1 stores the initial address along the detector axis (i.e. along t) for a given θ required to update the pixel at row 1, column 1. LUT 2 stores the increment in t required as we increment across a row. LUT 3 stores the increment for columns.
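The triple loop with incremental address generation can be prototyped in a few lines of Python. The sketch below is a floating-point software model of Eqs. (2)–(3) using the three-LUT incremental scheme; it is not the fixed-point FPGA datapath, and the pixel-grid centering (detector offset N/2, image centered at the origin) is an illustrative assumption.

```python
import numpy as np

def backproject(sinogram, thetas, n, D=1.4):
    """Software model of parallel-beam backprojection (Eqs. (2)-(3)) with
    the incremental address scheme of Fig. 1(b): LUT1 = start address per
    angle, LUT2/LUT3 = per-column/per-row address increments."""
    K, N = sinogram.shape
    image = np.zeros((n, n))
    for k in range(K):
        cos_t, sin_t = np.cos(thetas[k]), np.sin(thetas[k])
        # LUT1: detector address for the upper-left pixel (offset N/2)
        addr_row = N / 2 + D * (-(n - 1) / 2 * cos_t + (n - 1) / 2 * sin_t)
        d_col = D * cos_t    # LUT2: increment across a row (x -> x + d)
        d_row = -D * sin_t   # LUT3: increment down a column (y -> y - d)
        proj = sinogram[k]
        for r in range(n):
            addr = addr_row
            for c in range(n):
                i = int(np.floor(addr))   # integer part: detector index
                IF = addr - i             # fractional part: interp. factor
                if 0 <= i < N - 1:
                    # Eq. (3): linear interpolation between detectors i, i+1
                    image[r, c] += (proj[i + 1] - proj[i]) * IF + proj[i]
                addr += d_col
            addr_row += d_row
    return image * np.pi / K  # scale factor pi/K from Eq. (2)
```

Because each angle's contribution is independent, the outer loop parallelizes naturally, which is exactly the data parallelism the FPGA implementation exploits.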
3. Quantization

Figure 2. Major simulation steps.

Mapping the algorithm directly to hardware will not produce an efficient implementation. Several modifications must be made to obtain a good hardware realization. The most significant modification is using fixed-point arithmetic. For hardware implementation, narrow bit widths are preferred for more parallelism, which translates to higher overall processing speed. However, medical imaging requires high precision, which may require wider bit widths. We did extensive analysis to optimize this tradeoff. We quantize all data and all calculations to increase the speed and decrease the resources required for implementation. Determining allowable quantization is based on a software simulation of the tomographic process. Figure 2 shows the major blocks of the simulation.

An input image is first fed to the software implementation of the Radon transform, also known as reprojection [14], which generates the sinogram of 1024 projections and 1024 samples per projection. The filtering block convolves sinogram data with the impulse response of the ramp filter, generating a filtered sinogram, which is then backprojected to give a reconstructed image.

All values in the backprojection algorithm are real numbers. These can be implemented as either floating-point or fixed-point values. Floating-point representation gives increased dynamic range, but is significantly more expensive to implement in reconfigurable hardware, both in terms of area and speed. For these reasons we have chosen to use fixed-point arithmetic. An important issue, especially in medical imaging, is how much numerical accuracy is sacrificed when fixed-point values are used. Here, we present the methods used to find appropriate bit-widths for maintaining sufficient numerical accuracy. In addition, we investigate possibilities for bit reduction on the outputs of certain functional units in the datapath for different rounding schemes, and what influence that has on the error introduced in reconstructed images. Our analysis shows that medical images display distinctive properties with respect to how different quantization choices affect their reconstruction. We exploit this and customize quantization to best fit medical images. We compute the quantization error by comparing a fixed-point image reconstruction with a floating-point one.

Fixed-point variables in our design use a general slope/bias-encoding, meaning that they are represented as

$$V \approx V_a = SQ + B \qquad (4)$$

where V is an arbitrary real number, V_a is its fixed-point approximation, Q is an integer that encodes V, S is the slope, and B is the bias. Fixed-point versions of the sinogram and the filtered sinogram use slope/bias scaling where the slope and bias are calculated to give maximal precision. The quantization of these two variables is calculated as:

$$S = \frac{\max(V)-\min(V)}{\max(Q)-\min(Q)} = \frac{\max(V)-\min(V)}{2^{ws}-1} \qquad (5)$$

$$B = \max(V) - S\cdot\max(Q) \quad\text{or}\quad B = \min(V) - S\cdot\min(Q) \qquad (6)$$

$$Q = \mathrm{round}\!\left(\frac{V-B}{S}\right) \qquad (7)$$

where ws is the word size in bits of the integer Q. Here, max(V) and min(V) are the maximum and minimum values that V will take, respectively. max(V) was determined based on analysis of data. Since sinogram data are unsigned numbers, in this case min(V) = min(Q) = B = 0. The interpolation factor is an unsigned fractional number and uses radix point-only scaling. Thus, the quantized interpolation factor is calculated as in Eq. (7), with saturation on overflow, with S = 2⁻ᴱ where E is the number of fractional bits, and with B = 0.

For a given sinogram, S and B are constants and they do not show up in the hardware—only the quantized value Q is part of the hardware implementation. Note that in Eq. (3), two data samples are subtracted from each other before multiplication with the interpolation factor takes place. Thus, in general, the bias B is eliminated from the multiplication, which makes quantization of filtered sinogram data with maximal precision scaling easily implementable in hardware.
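A software model of this slope/bias quantization (Eqs. (4)–(7)) is straightforward; the sketch below assumes unsigned data with min(Q) = 0, as in the sinogram case described above.

```python
import numpy as np

def quantize_slope_bias(V, ws):
    """Max-precision slope/bias quantization, Eqs. (4)-(7), for unsigned
    data: returns integer codes Q plus (S, B) so that V ~= S*Q + B."""
    vmax, vmin = float(V.max()), float(V.min())
    S = (vmax - vmin) / (2**ws - 1)   # Eq. (5): max(Q) - min(Q) = 2^ws - 1
    B = vmin                           # Eq. (6) with min(Q) = 0
    Q = np.round((V - B) / S).astype(np.int64)  # Eq. (7)
    return Q, S, B

def quantize_radix_point(V, E):
    """Radix point-only scaling for the interpolation factor:
    S = 2**-E, B = 0, with saturation on overflow."""
    Q = np.clip(np.round(V * 2**E), 0, 2**E - 1)  # saturate to [0, 1)
    return Q.astype(np.int64), 2.0**-E
```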
The next important issue is the metric used for evaluating the error introduced by quantization. Our goal was to find a metric that would accurately describe visual differences between compared images regardless of their dynamic range. If 8-bit and 16-bit versions of a single image are reconstructed so that there is no visible difference between the original and reconstructed images, the proper metric should give a comparable estimate of the error for both bit-widths. The proper metric should also be insensitive to the shift of pixel value range that can emerge for different quantization and rounding schemes. Absolute values of single pixels do not affect visual image quality as long as their relative value is preserved, because pixel values are mapped to a set of grayscale values. The error metric we use that meets these criteria is the Relative Error (RE):

$$RE = \sqrt{\frac{\sum_{i=1}^{M}\left[(x_i-\bar{x})-\left(y_i^{FP}-\bar{y}^{FP}\right)\right]^2}{\sum_{i=1}^{M}\left(y_i^{FP}-\bar{y}^{FP}\right)^2}} \qquad (8)$$

Here, M is the total number of pixels, x_i and y_i^FP are the values of the i-th pixel in the quantized and floating-point reconstructions respectively, and x̄, ȳ^FP are their means. The mean value is subtracted because we only care about the relative pixel values.
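In code, the RE metric reads as follows; this short NumPy sketch assumes Eq. (8) is the mean-removed relative L2 norm as reconstructed above.

```python
import numpy as np

def relative_error(x, y_fp):
    """Relative Error (Eq. (8)) between a quantized reconstruction x and
    the floating-point reference y_fp; means are removed so that a
    constant intensity shift does not count as error."""
    dx = x.astype(np.float64) - x.mean()
    dy = y_fp.astype(np.float64) - y_fp.mean()
    return float(np.sqrt(np.sum((dx - dy) ** 2) / np.sum(dy ** 2)))
```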
Figure 3. Some of the images used as inputs to the simulation process.

Figure 3 shows some characteristic images from a larger set of 512-by-512-pixel images used as inputs to the simulation process. All images are monochrome 8-bit images, but 16-bit versions are also used in simulations. Each image was chosen for a certain reason. For example, the Shepp-Logan phantom is well known and widely used in testing the ability of algorithms to accurately reconstruct cross sections of the human head. It is believed that cross-sectional images of the human head are the most sensitive to numerical inaccuracies and the presence of artifacts induced by a reconstruction algorithm [1]. Other medical images were Female, Head, and Heart, obtained from the visible human web site [15]. The Random image (a white noise image) should result in the upper bound on bit-widths required for a precise reconstruction. The Artificial image is unique because it contains all values in the 8-bit grayscale range. This image also contains straight edges of rectangles, which induce more artifacts in the reconstructed image. This is also characteristic of the Head image, which contains a rectangular border around the head slice.

Figure 4. Detailed flowchart of the simulation process.

Figure 4 shows the detailed flowchart of the simulated CT process. In addition to the major blocks designated as Reproject, Filter and Backproject, Fig. 4 also includes the different quantization steps that we have investigated. Each path in this flowchart represents a separate simulation cycle. Cycle 1 gives a floating-point (FP) reconstruction of an input image. All other cycles perform one or more types of quantization and their resulting images are compared to the corresponding FP reconstruction by computing the Relative Error. The first quantization step converts FP projection data obtained by the reprojection step to a fixed-point representation. Simulation cycle 2 is used to determine how different bit-widths for quantized sinogram data affect the quality of a reconstructed image. Our research was based on a prototype system that used 12-bit accurate detectors for the acquisition of sinogram data. Simulations showed that this bit-width is a good choice since the worst case introduced error amounts to 0.001%. The second quantization step performs the conversion of filtered sinogram data from FP to fixed-point representation. Simulation cycle 3 is used to find the appropriate bit-width of the words representing a filtered sinogram. Figure 5 shows the results for this cycle. Since we use linear interpolation of projection values corresponding to adjacent detectors, the interpolation factor in Eq. (3) also has to be quantized. Figure 6 summarizes results obtained from simulation cycle 4, which is used to evaluate the error induced by this quantization.

Figure 5. Simulation results for the quantization of filtered sinogram data.

Figures 5 and 6 show the Relative Error metric for different word length values and for different simulation cycles for a number of input images. Some input images were used in both 8-bit and 16-bit versions.
medi-cal images.It should be noted that sinogram data are quantized to 12bits,filtered sinogram to 9bits,and the interpolation factor is quantized to 3bits (2−4pre-cision).Similar studies were done for the subtraction and addition operations and on a broader set of im-ages.It was determined that medical images suffer the least amount of error introduced by combining quanti-zations and bit reduction.For medical images,in case of rounding to nearest,there is very little difference inthe Figure 8.Bit reduction on the output of the interpolation multiplier.introduced error between 1and 3discarded bits after multiplication and addition.This difference is higher in the case of bit reduction after addition because the multiplication that follows magnifies the error.For all three FUs,when only medical images are considered,there is a fixed relationship between rounding to near-est and truncation.Two least-significant bits discarded with rounding to nearest introduce an error that is lower than or close to the error of 1bit discarded with trun-cation.Although rounding to nearest requires logic re-sources,even when only one LSB is discarded with rounding to nearest after each of three FUs,the overall resource consumption is reduced because of savings provided by smaller FUs and pipeline registers (see Figs.11and 12).Figure 9shows that discarding LSBs introduces additional error on medical images for this combination of quantizations.In our case there was no need for using bit reduction to achieve smaller resource consumption because the targeted FPGA chip (Xilinx Virtex1000)provided sufficient logic resources.There is one more quantization issue we considered.It pertains to data needed for the generation of the ad-dress into a projection array (spatial address addr )and to the interpolation factor.As described in the intro-duction,there are three different sets of data stored in look-up tables (LUTs)that can be quantized.Since pixels are being processed in raster-scan order,the spa-tial address addr is generated by accumulating entries from LUTs 2and 3to the corresponding entry in LUT 1.The 10-bit integer part of the address addr is the index into the projection array θ(·),while its fractional part is the interpolation factor.By using radix point-only。
Digital Image Processing, 2nd ed (数字图像处理(第2版))

Abstract: DIGITAL IMAGE PROCESSING has been the world-wide leading textbook in its field for more than 30 years. As with the 1977 and 1987 editions by Gonzalez and Wintz, and the 1992 edition by Gonzalez and Woods, the present edition was prepared with students and instructors in mind. The material is timely, highly readable, and illustrated with numerous examples of practical significance. All mainstream areas of image processing are covered, including a totally revised introduction and discussion of image fundamentals, image enhancement in the spatial and frequency domains, restoration, color image processing, wavelets, image compression, morphology, segmentation, and image description. Coverage concludes with a discussion on the fundamentals of object recognition. Although the book is completely self-contained, this companion web site provides additional support in the form of review material, answers to selected problems, laboratory project suggestions, and a score of other features. A supplementary instructor's manual is available to instructors who have adopted the book for classroom use.

Keywords: digital image processing, image fundamentals, image enhancement in the spatial and frequency domains, image compression, image description
Data format: IMAGE
Data use: digital image processing

Data details:

Digital Image Processing, 2nd edition

About the Book — Basic Information
ISBN number 020*******.
Publisher: Prentice Hall. 12 chapters. 793 pages. © 2002.

Partial list of institutions that use the book.

NEW FEATURES
- New chapters on wavelets, image morphology, and color image processing.
- A revision and update of all chapters, including topics such as segmentation by watersheds.
- More than 500 new images and over 200 new line drawings and tables.
- A reorganization that allows the reader to get to the material on actual image processing much sooner than before.
- A more intuitive development of traditional topics such as image transforms and image restoration.
- Numerous new examples with processed images of higher resolution.
- Updated image compression standards and a new section on compression using wavelets.
- Updated bibliography.

Differences Between the DIP and DIPUM Books

Digital Image Processing is a book on fundamentals. Digital Image Processing Using MATLAB is a book on the software implementation of those fundamentals. The key difference between the books is that Digital Image Processing (DIP) deals primarily with the theoretical foundation of digital image processing, while Digital Image Processing Using MATLAB (DIPUM) is a book whose main focus is the use of MATLAB for image processing. The DIPUM book covers essentially the same topics as DIP, but the theoretical treatment is not as detailed. Some instructors prefer to fill in the theoretical details in class in favor of having available a book with a strong emphasis on implementation.

© 2002 by Prentice-Hall, Inc., Upper Saddle River, New Jersey 07458. All rights reserved. No part of this book may be reproduced, in any form or by any means, without permission in writing from the publisher. The author and publisher of this book have used their best efforts in preparing this book. These efforts include the development, research, and testing of the theories and programs to determine their effectiveness. The author and publisher make no warranty of any kind, expressed or implied, with regard to these programs or the documentation contained in this book. The author and publisher shall not be liable in any event for incidental or consequential damages in connection with, or arising out of, the furnishing, performance, or use of these programs.
Overview of Digital Image Processing Technology

Abstract: With the popularization of computers, digital image processing technology has also developed rapidly, gradually entering every aspect of social production and daily life. This paper is a general overview of digital image processing technology, including its connotation, advantages, main methods and applications; finally, a brief summary of its development is given.

Keywords: digital image, image processing technology, processing methods, application fields

Introduction: Image processing techniques are divided into two broad categories: analog image processing and digital image processing.
Digital image processing is generally performed by computer or by real-time hardware, and is therefore also called computer image processing [1]. Today, with the rapid spread of computers, digital image processing technology is developing at full speed; owing to its versatility, it can be widely applied in fields such as medicine, transportation and chemistry.

1. The Concept of Digital Image Processing Technology

Digital image processing technology refers to the process of converting an image signal into a binary digital signal, processing it by computer through image transformation, coding and compression, enhancement and restoration, as well as segmentation and feature extraction, and finally rendering it back to a display with high fidelity [2].
Appendix A: English Original

Scene recognition for mine rescue robot localization based on vision

CUI Yi-an(崔益安), CAI Zi-xing(蔡自兴), WANG Lu(王璐)

Abstract: A new scene recognition system is presented based on fuzzy logic and the hidden Markov model (HMM), which can be applied to mine rescue robot localization during emergencies. The system uses a monocular camera to acquire omni-directional images of the mine environment where the robot is located. By adopting the center-surround difference method, salient local image regions are extracted from the images as natural landmarks. These landmarks are organized using an HMM to represent the scene where the robot is, and a fuzzy logic strategy is used to match the scene and landmark. In this way, the localization problem, which is the scene recognition problem in the system, can be converted into the evaluation problem of the HMM. The contributions of these techniques give the system the ability to deal with changes in scale, 2D rotation and viewpoint. The results of experiments also prove that the system has a higher ratio of recognition and localization in both static and dynamic mine environments.

Key words: robot localization; scene recognition; salient image; matching strategy; fuzzy logic; hidden Markov model

1 Introduction

Search and rescue in disaster areas is a burgeoning and challenging subject in the domain of robotics [1]. Mine rescue robots were developed to enter mines during emergencies to locate possible escape routes for those trapped inside and to determine whether it is safe for humans to enter or not. Localization is a fundamental problem in this field. Localization methods based on cameras can be mainly classified into geometric, topological or hybrid ones [2]. With its feasibility and effectiveness, scene recognition has become one of the important technologies of topological localization.

Currently most scene recognition methods are based on global image features and have two distinct stages: training offline and matching online.

During the training stage, the robot collects images of the environment where it works and processes the images to extract global features that represent the scene. Some approaches were used to analyze the image data set directly and some primary features were found, such as the PCA method [3]. However, the PCA method is not effective in distinguishing the classes of features. Another type of approach uses appearance features including color, texture and edge density to represent the image. For example, ZHOU et al [4] used multidimensional histograms to describe global appearance features. This method is simple but sensitive to scale and illumination changes. In fact, all kinds of global image features suffer from changes of environment.

LOWE [5] presented the SIFT method, which uses similarity invariant descriptors formed by characteristic scale and orientation at interest points to obtain the features. The features are invariant to image scaling, translation, rotation and partially invariant to illumination changes. But SIFT may generate 1 000 or more interest points, which may slow down the processor dramatically.

During the matching stage, the nearest neighbor strategy (NN) is widely adopted for its facility and intelligibility [6]. But it cannot capture the contribution of an individual feature to scene recognition. In experiments, the NN is not good enough to express the similarity between two patterns.
Furthermore, the selected features cannot represent the scene thoroughly according to state-of-the-art pattern recognition, which makes recognition unreliable [7].

So in this work a new recognition system is presented, which is more reliable and effective when used in a complex mine environment. In this system, we improve the invariance by extracting salient local image regions as landmarks, replacing the whole image, to deal with large changes in scale, 2D rotation and viewpoint. The number of interest points is also reduced effectively, which makes the processing easier. A fuzzy recognition strategy is designed to recognize the landmarks in place of NN, which can strengthen the contribution of individual features to scene recognition. Because of its partial information resuming ability, the hidden Markov model is adopted to organize those landmarks, which can capture the structure or relationship among them. So scene recognition can be transformed into the evaluation problem of the HMM, which makes recognition robust.

2 Salient local image region detection

Research on biological vision systems indicates that organisms (like drosophila) often pay attention to certain special regions in the scene for their behavioral relevance or local image cues while observing their surroundings [8]. These regions can be taken as natural landmarks to effectively represent and distinguish different environments. Inspired by this, we use the center-surround difference method to detect salient regions in multi-scale image spaces. The opponencies of color and texture are computed to create the saliency map. Subsequently, the sub-image centered at the salient position in the saliency map S is taken as the landmark region. The size of the landmark region can be decided adaptively according to the changes of gradient orientation of the local image [11].

Mobile robot navigation requires that natural landmarks be detected stably when environments change to some extent. To validate the repeatability of landmark detection of our approach, we have done some experiments on the cases of scale, 2D rotation and viewpoint changes etc. Fig. 1 shows that the door is detected for its saliency when the viewpoint changes. More detailed analysis and results about scale and rotation can be found in our previous works [12].

Figure 1. Experiment on viewpoint changes.

3 Scene recognition and localization

Different from other scene recognition systems, our system doesn't need offline training. In other words, our scenes are not classified in advance. When the robot wanders, scenes captured at intervals of fixed time are used to build the vertices of a topological map, which represent the places where the robot has been. Although the map's geometric layout is ignored by the localization system, it is useful for visualization and debugging [13] and beneficial to path planning. So localization means searching for the best match of the current scene on the map. In this paper the hidden Markov model is used to organize the landmarks extracted from the current scene and to create the vertices of the topological map, for its partial information resuming ability.

Resembling a panoramic vision system, the robot looks around to get omni-images. From each image, salient local regions are detected and formed into a sequence, named the landmark sequence, whose order is the same as that of the image sequence. Then a hidden Markov model is created based on the landmark sequence involving k salient local image regions, which is taken as the description of the place where the robot is located. In our system the EVI-D70 camera has a view field of ±170°. Considering the overlap effect, we sample the environment every 45° to get 8 images. Taking the 8 images as hidden states Si (1≤i≤8), the created HMM can be illustrated by Fig. 2. The parameters of the HMM, aij and bjk, are obtained by learning, using the Baum-Welch algorithm [14]. The threshold of convergence is set as 0.001.

Figure 2. HMM of environment.

As for the edges of the topological map, we assign them distance information between two vertices. The distances can be computed according to odometry readings. To locate itself on the topological map, the robot must run its 'eye' over the environment and extract a landmark sequence L1′–Lk′, then search the map for the best matched vertex (scene). Different from traditional probabilistic localization [15], in our system the localization problem can be converted to the evaluation problem of the HMM. The vertex with the greatest evaluation value, which must also be greater than a threshold, is taken as the best matched vertex, which indicates the most possible place where the robot is.
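The evaluation step can be sketched with the standard forward algorithm. The code below is a generic illustration of scoring an observation sequence against each vertex's HMM; the parameter shapes, observation coding and acceptance threshold are assumptions for illustration, not the paper's exact implementation.

```python
import numpy as np

def forward_score(pi, A, B, obs):
    """Standard HMM forward algorithm: likelihood P(obs | model).
    pi: (S,) initial probabilities; A: (S, S) transitions a_ij;
    B: (S, M) emissions b_jk; obs: sequence of observation symbol ids."""
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
    return float(alpha.sum())

def localize(vertex_models, obs, threshold=1e-12):
    """Pick the map vertex whose HMM scores the landmark sequence best;
    return None when no score exceeds the acceptance threshold."""
    scores = [forward_score(pi, A, B, obs) for (pi, A, B) in vertex_models]
    best = int(np.argmax(scores))
    return best if scores[best] > threshold else None
```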
4 Match strategy based on fuzzy logic

One of the key issues in the image matching problem is to choose the most effective features or descriptors to represent the original image. Due to robot movement, the extracted landmark regions will change at the pixel level. So the descriptors or features chosen should be invariant to some extent to changes of scale, rotation and viewpoint etc. In this paper, we use 4 features commonly adopted in the community, briefly described as follows.

GO: Gradient orientation. It has been proved that illumination and rotation changes are likely to have less influence on it [5].
ASM and ENT: Angular second moment and entropy, which are two texture descriptors.
H: Hue, which is used to describe the fundamental information of the image.

Another key issue in the matching problem is to choose a good match strategy or algorithm. Usually the nearest neighbor strategy (NN) is used to measure the similarity between two patterns. But we have found in experiments that NN cannot adequately exhibit the contribution of an individual descriptor or feature to the similarity measurement. As indicated in Fig. 4, the input image Fig. 4(a) comes from a different view of Fig. 4(b). But the distance between Figs. 4(a) and (b) computed by Jeffrey divergence is larger than that to Fig. 4(c). To solve this problem, we design a new match algorithm based on fuzzy logic to exhibit the subtle changes of each feature. The algorithm is described below. The landmark in the database whose fused similarity degree is higher than any other is taken as the best match. The match results of Figs. 2(b) and (c) are demonstrated in Fig. 3. As indicated, this method can measure the similarity between two patterns effectively.

Figure 3. Similarity computed using fuzzy strategy.
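The fused-similarity idea can be illustrated in code. Since the paper's exact membership functions are not reproduced in this excerpt, the sketch below assumes simple triangular memberships and weighted fusion over the four features (GO, ASM, ENT, H); both choices are stated assumptions.

```python
import numpy as np

def membership(diff, tol):
    """Triangular fuzzy membership: 1 for identical feature values,
    falling linearly to 0 when the difference reaches the tolerance."""
    return max(0.0, 1.0 - abs(diff) / tol)

def fused_similarity(f_query, f_landmark, tolerances, weights=None):
    """Fuse per-feature fuzzy similarities (e.g. GO, ASM, ENT, H) into one
    degree, keeping each feature's contribution visible and weightable."""
    sims = np.array([membership(q - l, t)
                     for q, l, t in zip(f_query, f_landmark, tolerances)])
    w = np.full(len(sims), 1.0 / len(sims)) if weights is None else np.asarray(weights)
    return float(np.dot(w, sims))

def best_match(f_query, database, tolerances):
    """Return the index of the landmark with the highest fused similarity."""
    scores = [fused_similarity(f_query, f, tolerances) for f in database]
    return int(np.argmax(scores))
```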
5 Experiments and analysis

The localization system has been implemented on a mobile robot built by our laboratory. The vision system consists of a CCD camera and an IVC-4200 frame grabber; the image resolution is set to 400×320 and the sampling rate to 10 frames/s. The on-board computer has a 1 GHz processor and 512 MB of memory. At present the robot works in indoor environments.

Because an HMM is adopted to represent and recognize scenes, our system can capture the distribution of salient local image regions and thus distinguish similar scenes effectively. Table 1 shows the recognition results for static environments comprising 5 laneways and a silo. From each environment 10 scenes are selected and an HMM is created for each scene; then 20 scenes are collected as the robot subsequently re-enters each environment and are matched against the 60 HMMs above. In the table, "truth" means that the scene to be localized matches the correct scene (the evaluation value of its HMM exceeds the second-highest evaluation by more than 30%). "Uncertainty" means that the evaluation value of the HMM exceeds the second-highest evaluation by less than 10%. "Error match" means that the scene to be localized matches a wrong scene. In the table the ratio of error matches is 0, but it is possible that the scene to be localized matches no scene at all, in which case a new vertex is created. The "ratio of truth" for the silo is lower because salient cues are fewer in this kind of environment.

During automatic exploration, similar scenes can be combined. The process can be summarized as follows: when localization succeeds, the current landmark sequence is added, without repetition, to the accompanying observation sequence of the matched vertex according to its orientation (including the angle of the image from which the salient local region comes and the heading of the robot), and the parameters of the HMM are learned again.

Compared with approaches using appearance features of the whole image (Method 2, M2), our system (M1) uses local salient regions for localization and mapping, which gives it more tolerance of the scale and viewpoint changes caused by the robot's movement, a higher recognition ratio, and fewer vertices on the topological map. Our system therefore performs better in dynamic environments, as shown in Table 2. Laneways 1, 2, 4 and 5 are in operation, with miners working in them, which confuses the robot.

6 Conclusions

1) Salient local image features are extracted, instead of the whole image, to take part in recognition, which improves tolerance of changes in the scale, 2D rotation and viewpoint of environment images.
2) Fuzzy logic is used to recognize the local images and to emphasize each individual feature's contribution to recognition, which improves the reliability of the landmarks.
3) An HMM is used to capture the structure and relationships of those local images, which converts the scene recognition problem into the evaluation problem of an HMM.
4) The results of the above experiments demonstrate that the mine rescue robot scene recognition system achieves high ratios of recognition and localization.

Future work will focus on using the HMM to deal with the uncertainty of localization.

Appendix B: Chinese translation
Scene recognition for mine rescue robots based on vision
CUI Yi-an(崔益安), CAI Zi-xing(蔡自兴), WANG Lu(王璐)
Abstract: Based on fuzzy logic and hidden Markov models (HMM), this paper presents a new scene recognition system that can be applied to the localization of mine rescue robots in emergencies.
Image Segmentation and Image Preprocessing: Translated Foreign Literature

Design of an Online Image-Code Recognition System

Abstract: This paper describes the design and implementation of an online image-code character recognition system, analyzes and studies its key stages, and gives solutions to the main problems arising in those stages. For the recognition algorithm, template matching is combined with feature-based recognition, and a feature-weighted template matching algorithm is proposed; this algorithm proves effective in improving the character recognition rate.
Keywords: image processing; pattern recognition; feature weighting; software design

0 Introduction

The recognition of image-coded characters is still a key research topic at home and abroad, with a wide range of applications such as automatic license-plate recognition, automatic postal-code recognition, automatic test-paper scoring and automatic form processing. Since online image-code character recognition tasks share a number of common characteristics, this paper describes a general image-code character recognition system in the context of the design of an online tire-code character recognition system and studies and analyzes the key stages; the approach offers useful guidance for the development of other online image-code character systems.
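Although the algorithm details lie outside the translated excerpt, the feature-weighted template matching proposed in the abstract can be sketched roughly as follows; the per-pixel weight map emphasizing discriminative character regions is our own illustrative assumption, not a detail given in the paper.

static double weightedMatchScore(int[][] image, int[][] template, double[][] weight) {
    // Normalized, weighted agreement between a binarized character image
    // and an equally sized template; a higher score means a better match.
    double score = 0, wSum = 0;
    for (int y = 0; y < template.length; y++)
        for (int x = 0; x < template[0].length; x++) {
            double agree = (image[y][x] == template[y][x]) ? 1.0 : 0.0;
            score += weight[y][x] * agree;  // weights stress discriminative pixels
            wSum += weight[y][x];
        }
    return score / wSum;
}

The template class that maximizes weightedMatchScore over all character templates would then be reported as the recognized character.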
1 Workflow of the online image-code recognition system

The online image-code character recognition system mainly comprises digital image acquisition, storage, image preprocessing, code-image extraction, code-feature extraction, code recognition and subsequent processing; its flow chart is shown in Fig. 1.

Fig. 1 Flow chart of the online image-code character recognition system

The online tire-code character recognition system must capture an image containing the tire code for every tire passing along the production line, then process the image, extract the tire-code features, and recognize each code character with a suitable recognition algorithm. Because the code characters are somewhat deformed on the tire and the camera angle varies, the captured code images differ greatly and show little regularity, so the preprocessing of the code images and the choice of the recognition algorithm are especially important.
2 Image acquisition and storage

Online code images are usually captured with devices such as digital video cameras, digital still cameras or digital webcams and fed into a computer for processing. This system uses a QuickCam Pro 4000 webcam to capture the tire-code images, which are stored directly in JPG format. The code images are generally converted to the BMP image format first, because BMP has become the de facto standard in the PC world: almost all image-processing software designed for the Windows operating system supports images in this format. BMP is the native Windows bitmap format; it can store bitmap data of any type and supports all screen resolutions and all color combinations supported by Windows.
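As an aside, the JPG-to-BMP conversion described here can be done in a few lines with the standard javax.imageio API. This sketch is ours, not part of the original system, and the file names are hypothetical.

import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

public class JpgToBmp {
    public static void main(String[] args) throws IOException {
        // Read the captured JPG code image (hypothetical file name).
        BufferedImage img = ImageIO.read(new File("tire_code.jpg"));
        // Write it out as BMP for the subsequent processing stages.
        ImageIO.write(img, "bmp", new File("tire_code.bmp"));
    }
}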
3 Histograms

Histograms are used to depict image statistics in an easily interpreted visual format. With a histogram, it is easy to determine certain types of problems in an image; for example, it is simple to conclude whether an image is properly exposed by visual inspection of its histogram. In fact, histograms are so useful that modern digital cameras often provide a real-time histogram overlay on the viewfinder (Fig. 3.1) to help prevent taking poorly exposed pictures. It is important to catch errors like this at the image capture stage because poor exposure results in a permanent loss of information which cannot be recovered later using image-processing techniques. In addition to their usefulness during image capture, histograms are also used later to improve the visual appearance of an image and as a "forensic" tool for determining what type of processing has previously been applied to an image.

Figure 3.1 Digital camera back display showing a histogram overlay.

3.1 What Is a Histogram?

Histograms in general are frequency distributions, and histograms of images describe the frequency of the intensity values that occur in an image. This concept can be easily explained by considering an old-fashioned grayscale image like the one shown in Fig. 3.2. A histogram h for a grayscale image I with intensity values in the range I(u,v) ∈ [0, K−1] contains exactly K entries, where for a typical 8-bit grayscale image K = 2^8 = 256. Each individual histogram entry is defined as

h(i) = the number of pixels in I with the intensity value i,

for all 0 ≤ i < K. More formally stated,

h(i) = card{(u,v) | I(u,v) = i},   (3.1)

where card{...} denotes the number of elements ("cardinality") in a set. Therefore h(0) is the number of pixels with the value 0, h(1) the number of pixels with the value 1, and so forth. Finally h(255) is the number of all white pixels with the maximum intensity value 255 = K−1. The result of the histogram computation is a one-dimensional vector h of length K. Figure 3.3 gives an example for an image with K = 16 possible intensity values.

Figure 3.2 An 8-bit grayscale image and a histogram depicting the frequency distribution of its 256 intensity values.

Figure 3.3 Histogram vector for an image with K = 16 possible intensity values. The indices of the vector elements i = 0...15 represent intensity values. The value of 10 at index 2 means that the image contains 10 pixels of intensity value 2.

Since a histogram encodes no information about where each of its individual entries originated in the image, histograms contain no information about the spatial arrangement of pixels in the image. This is intentional, since the main function of a histogram is to provide statistical information (e.g., the distribution of intensity values) in a compact form. Is it possible to reconstruct an image using only its histogram? That is, can a histogram be somehow "inverted"? Given the loss of spatial information, in all but the most trivial cases the answer is no. As an example, consider the wide variety of images you could construct using the same number of pixels of a specific value. These images would appear different but have exactly the same histogram (Fig. 3.4).

Figure 3.4 Three very different images with identical histograms.

3.2 Interpreting Histograms
A histogram depicts problems that originate during image acquisition, such as those involving contrast and dynamic range, as well as artifacts resulting from image-processing steps that were applied to the image. Histograms are often used to determine whether an image is making effective use of its intensity range (Fig. 3.5) by examining the size and uniformity of the histogram's distribution.

Figure 3.5 The effective intensity range. The graph depicts how often a pixel value occurs linearly (black bars) and logarithmically (gray bars). The logarithmic form makes even relatively low occurrences, which can be very important in the image, readily apparent.

3.2.1 Image Acquisition

Exposure. Histograms make classic exposure problems readily apparent. As an example, a histogram where a large span of the intensity range at one end is largely unused while the other end is crowded with high-value peaks (Fig. 3.6) is representative of an improperly exposed image.

Figure 3.6 Exposure errors are readily apparent in histograms. Underexposed (a), properly exposed (b), and overexposed (c) photographs.

Contrast. Contrast is understood as the range of intensity values effectively used within a given image, that is, the difference between the image's maximum and minimum pixel values. A full-contrast image makes effective use of the entire range of available intensity values from a = a_min...a_max = 0...K−1 (black to white). Using this definition, image contrast can be easily read directly from the histogram. Figure 3.7 illustrates how varying the contrast of an image affects its histogram.

Figure 3.7 How changes in contrast affect a histogram: low contrast (a), normal contrast (b), high contrast (c).

Dynamic range. The dynamic range of an image is, in principle, understood as the number of distinct pixel values in an image. In the ideal case, the dynamic range encompasses all K usable pixel values, in which case the value range is completely utilized. When an image has an available contrast range a = a_low...a_high, with a_min < a_low and a_high < a_max, then the maximum possible dynamic range is achieved when all the intensity values lying in this range are utilized (i.e., appear in the image; Fig. 3.8).

Figure 3.8 How changes in dynamic range affect a histogram: high dynamic range (a), low dynamic range with 64 intensity values (b), extremely low dynamic range with only 6 intensity values (c).

While the contrast of an image can be increased by transforming its existing values so that they utilize more of the underlying value range available, the dynamic range of an image can only be increased by introducing artificial (that is, not originating with the image sensor) values using methods such as interpolation (see Vol. 2 [6, Sec. 10.3]). An image with a high dynamic range is desirable because it will suffer less image-quality degradation during image processing and compression. Since it is not possible to increase dynamic range after image acquisition in a practical way, professional cameras and scanners work at depths of more than 8 bits, often 12–14 bits per channel, in order to provide high dynamic range at the acquisition stage. While most output devices, such as monitors and printers, are unable to actually reproduce more than 256 different shades, a high dynamic range is always beneficial for subsequent image processing or archiving.
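The definitions above translate directly into code. As a minimal Java sketch (ours, not from the book): given a histogram array h, the effectively used intensity range is bounded by the lowest and highest non-empty bins, and the dynamic range is the number of non-empty bins.

static int[] effectiveRange(int[] h) {
    // Returns {aLow, aHigh}, the lowest and highest intensity values that
    // actually occur in the image (assumes at least one non-empty bin).
    int lo = 0, hi = h.length - 1;
    while (h[lo] == 0) lo++;
    while (h[hi] == 0) hi--;
    return new int[] { lo, hi };
}

static int dynamicRange(int[] h) {
    // Number of distinct intensity values used in the image.
    int n = 0;
    for (int count : h)
        if (count > 0) n++;
    return n;
}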
3.2.2 Image Defects

Histograms can be used to detect a wide range of image defects that originate either during image acquisition or as the result of later image processing. Since histograms always depend on the visual characteristics of the scene captured in the image, no single "ideal" histogram exists. While a given histogram may be optimal for a specific scene, it may be entirely unacceptable for another. As an example, the ideal histogram for an astronomical image would likely be very different from that of a good landscape or portrait photo. Nevertheless, there are some general rules; for example, when taking a landscape image with a digital camera, you can expect the histogram to have evenly distributed intensity values and no isolated spikes.

Saturation. Ideally the contrast range of a sensor, such as that used in a camera, should be greater than the range of the intensity of the light that it receives from a scene. In such a case, the resulting histogram will be smooth at both ends because the light received from the very bright and the very dark parts of the scene will be less than the light received from the other parts of the scene. Unfortunately, this ideal is often not the case in reality, and illumination outside of the sensor's contrast range, arising for example from glossy highlights and especially dark parts of the scene, cannot be captured and is lost. The result is a histogram that is saturated at one or both ends of its range. The illumination values lying outside of the sensor's range are mapped to its minimum or maximum values and appear on the histogram as significant spikes at the tail ends. This typically occurs in an under- or overexposed image and is generally not avoidable when the inherent contrast range of the scene exceeds the range of the system's sensor (Fig. 3.9(a)).

Figure 3.9 Effect of image capture errors on histograms: saturation of high intensities (a), histogram gaps caused by a slight increase in contrast (b), and histogram spikes resulting from a reduction in contrast (c).

Spikes and gaps. As discussed above, the intensity value distribution for an unprocessed image is generally smooth; that is, it is unlikely that isolated spikes (except for possible saturation effects at the tails) or gaps will appear in its histogram. It is also unlikely that the count of any given intensity value will differ greatly from that of its neighbors (i.e., it is locally smooth). While artifacts like these are observed very rarely in original images, they will often be present after an image has been manipulated, for instance, by changing its contrast. Increasing the contrast (see Ch. 4) causes the histogram lines to separate from each other and, due to the discrete values, gaps are created in the histogram (Fig. 3.9(b)). Decreasing the contrast leads, again because of the discrete values, to the merging of values that were previously distinct. This results in increases in the corresponding histogram entries and ultimately leads to highly visible spikes in the histogram (Fig. 3.9(c)). (Unfortunately, these types of errors are also caused by the internal contrast "optimization" routines of some image-capture devices, especially consumer-type scanners.)

Impacts of image compression. Image compression also changes an image in ways that are immediately evident in its histogram. As an example, during GIF compression, an image's dynamic range is reduced to only a few intensities or colors, resulting in an obvious line structure in the histogram that cannot be removed by subsequent processing (Fig. 3.10). Generally, a histogram can quickly reveal whether an image has ever been subjected to color quantization, such as occurs during conversion to a GIF image, even if the image has subsequently been converted to a full-color format such as TIFF or JPEG.
Figure 3.11 illustrates what occurs when a simple line graphic with only two gray values (128, 255) is subjected to a compression method such as JPEG, which is designed not for line graphics but for natural photographs. The histogram of the resulting image clearly shows that it now contains a large number of gray values that were not present in the original image, resulting in a poor-quality image that appears dirty, fuzzy, and blurred. (Using JPEG compression on images like this, for which it was not designed, is one of the most egregious of imaging errors. JPEG is designed for photographs of natural scenes with smooth color transitions, and using it to compress iconic images with large areas of the same color results in strong visual artifacts; see, for example, Fig. 1.9 on p. 19.)

Figure 3.10 Color quantization effects resulting from GIF conversion. The original image converted to a 256-color GIF image (left). Original histogram (a) and the histogram after GIF conversion (b). When the RGB image is scaled by 50%, some of the lost colors are recreated by interpolation, but the results of the GIF conversion remain clearly visible in the histogram (c).

Figure 3.11 Effects of JPEG compression. The original image (a) contained only two different gray values, as its histogram (b) makes readily apparent. JPEG compression, a poor choice for this type of image, results in numerous additional gray values, which are visible in both the resulting image (c) and its histogram (d). In both histograms, the linear frequency (black bars) and the logarithmic frequency (gray bars) are shown.

3.3 Computing Histograms

Computing the histogram of an 8-bit grayscale image containing intensity values between 0 and 255 is a simple task. All we need is a set of 256 counters, one for each possible intensity value. First, all counters are initialized to zero. Then we iterate through the image I, determining the pixel value p at each location (u,v) and incrementing the corresponding counter by one. At the end, each counter will contain the number of pixels in the image that have the corresponding intensity value.

An image with K possible intensity values requires exactly K counter variables; for example, since an 8-bit grayscale image can contain at most 256 different intensity values, we require 256 counters. While individual counters make sense conceptually, an actual implementation would not use K individual variables to represent the counters but instead would use an array with K entries (int[256] in Java). In this example, the actual implementation as an array is straightforward. Since the intensity values begin at zero (like arrays in Java) and are all positive, they can be used directly as the indices i ∈ [0, K−1] of the histogram array. Program 3.1 contains the complete Java source code for computing a histogram within the run() method of an ImageJ plugin.

1  public class Compute_Histogram implements PlugInFilter {
2
3    public int setup(String arg, ImagePlus img) {
4      return DOES_8G + NO_CHANGES;
5    }
6
7    public void run(ImageProcessor ip) {
8      int[] H = new int[256];  // histogram array
9      int w = ip.getWidth();
10     int h = ip.getHeight();
11
12     for (int v = 0; v < h; v++) {
13       for (int u = 0; u < w; u++) {
14         int i = ip.getPixel(u, v);
15         H[i] = H[i] + 1;
16       }
17     }
18     ...  // histogram H[] can now be used
19   }
20
21 }  // end of class Compute_Histogram

Program 3.1 ImageJ plugin for computing the histogram of an 8-bit grayscale image. The setup() method returns DOES_8G + NO_CHANGES, which indicates that this plugin requires an 8-bit grayscale image and will not alter it (line 4). In Java, all elements of a newly instantiated array (line 8) are automatically initialized, in this case to zero.
At the start of Prog. 3.1, the array H of type int[] is created (line 8) and its elements are automatically initialized to 0. (In Java, arrays of primitives such as int or double are initialized at creation to 0 for integer types or 0.0 for floating-point types, while arrays of objects are initialized to null.) It makes no difference, at least in terms of the final result, whether the array is traversed in row or column order, as long as all pixels in the image are visited exactly once. In contrast to Prog. 2.1, in this example we traverse the array in the standard row-first order, such that the outer for loop iterates over the vertical coordinates v and the inner loop over the horizontal coordinates u. In this way, image elements are traversed in exactly the same way that they are laid out in computer memory, resulting in more efficient memory access and with it the possibility of increased performance, especially when dealing with larger images. Once the histogram has been calculated, it is available for further processing steps or for being displayed.

Of course, histogram computation is already implemented in ImageJ and is available via the method getHistogram() for objects of the class ImageProcessor. If we use this built-in method, the run() method of Prog. 3.1 can be simplified to

public void run(ImageProcessor ip) {
  int[] H = ip.getHistogram();  // built-in ImageJ method
  ...  // histogram H[] can now be used
}

3.4 Histograms of Images with More than 8 Bits

Normally histograms are computed in order to visualize the image's distribution on the screen. This presents no problem when dealing with images having 2^8 = 256 entries, but when an image uses a larger range of values, for instance 16- and 32-bit or floating-point images (see Table 1.1), the growing number of necessary histogram entries makes this no longer practical.

3.4.1 Binning

Since it is not possible to represent each intensity value with its own entry in the histogram, we will instead let a given entry in the histogram represent a range of intensity values. This technique is often referred to as "binning" since you can visualize it as collecting a range of pixel values in a container such as a bin or bucket. In a binned histogram of size B, each bin h(j) contains the number of image elements having values within the interval a_j ≤ a < a_(j+1), and therefore (analogous to Eqn. (3.1))

h(j) = card{(u,v) | a_j ≤ I(u,v) < a_(j+1)},  for 0 ≤ j < B.   (3.2)

Typically the range of possible values is divided into B bins of equal size k_B = K/B, such that the starting value of interval j is

a_j = j · (K/B) = j · k_B.

3.4.2 Example

In order to create a typical histogram containing B = 256 entries from a 14-bit image, we divide the available value range 0...2^14−1 into 256 equal intervals, each of length k_B = 2^14/256 = 64, so that a_0 = 0, a_1 = 64, a_2 = 128, ..., a_255 = 16,320, and a_256 = a_B = 2^14 = 16,384 = K. This results in the following mapping from the pixel values to the histogram bins h(0)...h(255):

h(0)   ← 0     ≤ I(u,v) < 64
h(1)   ← 64    ≤ I(u,v) < 128
h(2)   ← 128   ≤ I(u,v) < 192
...
h(j)   ← a_j   ≤ I(u,v) < a_(j+1)
...
h(255) ← 16320 ≤ I(u,v) < 16384

3.4.3 Implementation

If, as in the above example, the value range 0...K−1 is divided into intervals of equal length k_B = K/B, there is naturally no need to use a mapping table to find a_j, since for a given pixel value a = I(u,v) the correct histogram element j is easily computed. In this case, it is enough to simply divide the pixel value I(u,v) by the interval length k_B; that is,
I(u,v) / k_B = I(u,v) / (K/B) = I(u,v) · B / K.   (3.3)

As an index to the appropriate histogram bin h(j), we require an integer value

j = ⌊I(u,v) · B / K⌋,   (3.4)

where ⌊·⌋ denotes the floor function (⌊x⌋ rounds x down to the next whole number). A Java method for computing histograms by "linear binning" is given in Prog. 3.2. Note that all the computations in Eqn. (3.4) are done with integer numbers, without using any floating-point operations. Also, there is no need to explicitly call the floor function, because the expression

a * B / K

in line 11 uses integer division, and in Java the fractional result of such an operation is truncated, which is equivalent to applying the floor function (assuming positive arguments). The binning method can also be applied, in a similar way, to floating-point images.

1  int[] binnedHistogram(ImageProcessor ip) {
2    int K = 256;  // number of intensity values
3    int B = 32;   // size of histogram, must be defined
4    int[] H = new int[B];  // histogram array
5    int w = ip.getWidth();
6    int h = ip.getHeight();
7
8    for (int v = 0; v < h; v++) {
9      for (int u = 0; u < w; u++) {
10       int a = ip.getPixel(u, v);
11       int i = a * B / K;  // integer operations only!
12       H[i] = H[i] + 1;
13     }
14   }
15   // return binned histogram
16   return H;
17 }

Program 3.2 Histogram computation using "binning" (Java method). Example of computing a histogram with B = 32 bins for an 8-bit grayscale image with K = 256 intensity levels. The method binnedHistogram() returns the histogram of the image object ip passed to it as an int array of size B.

3.5 Color Image Histograms

When referring to histograms of color images, typically what is meant is a histogram of the image intensity (luminance) or of the individual color channels. Both of these variants are supported by practically every image-processing application and are used to objectively appraise the image quality, especially directly after image acquisition.

3.5.1 Intensity Histograms

The intensity or luminance histogram h_Lum of a color image is nothing more than the histogram of the corresponding grayscale image, so naturally all aspects of the preceding discussion also apply to this type of histogram. The grayscale image is obtained by computing the luminance of the individual channels of the color image. When computing the luminance, it is not sufficient to simply average the values of each color channel; instead, a weighted sum that takes into account color perception theory should be computed. This process is explained in detail in Chapter 8 (p. 202).
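Since the Chapter 8 details are not part of this excerpt, the following Java sketch simply assumes the widely used ITU-R BT.601 luma weights (0.299, 0.587, 0.114) as one plausible perceptual weighting for building h_Lum; the book's exact weighting may differ.

static int[] luminanceHistogram(int[][] r, int[][] g, int[][] b) {
    // Builds h_Lum by mapping each RGB pixel to a weighted luminance in [0, 255].
    int[] h = new int[256];
    for (int v = 0; v < r.length; v++)
        for (int u = 0; u < r[0].length; u++) {
            int lum = (int) Math.round(0.299 * r[v][u]
                                     + 0.587 * g[v][u]
                                     + 0.114 * b[v][u]);  // assumed BT.601 weights
            h[lum]++;
        }
    return h;
}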
3.5.2 Individual Color Channel Histograms

Even though the luminance histogram takes into account all color channels, image errors appearing in single channels can remain undiscovered. For example, the luminance histogram may appear clean even when one of the color channels is oversaturated. In RGB images, the blue channel contributes only a small amount to the total brightness and so is especially sensitive to this problem. Component histograms supply additional information about the intensity distribution within the individual color channels. When computing component histograms, each color channel is considered a separate intensity image and each histogram is computed independently of the other channels. Figure 3.12 shows the luminance histogram h_Lum and the three component histograms h_R, h_G, and h_B of a typical RGB color image. Notice that saturation problems in all three channels (red in the upper intensity region, green and blue in the lower regions) are obvious in the component histograms but not in the luminance histogram. In this case it is striking, and not at all atypical, that the three component histograms appear completely different from the corresponding luminance histogram h_Lum (Fig. 3.12(b)).

Figure 3.12 Histograms of an RGB color image: original image (a), luminance histogram h_Lum (b), RGB color components as intensity images (c–e), and the associated component histograms h_R, h_G, h_B (f–h). The fact that all three color channels have saturation problems is only apparent in the individual component histograms. The spike in the distribution resulting from this is found in the middle of the luminance histogram (b).

3.5.3 Combined Color Histograms

Luminance histograms and component histograms both provide useful information about the lighting, contrast, dynamic range, and saturation effects relative to the individual color components. It is important to remember that they provide no information about the distribution of the actual colors in the image, because they are based on the individual color channels and not on the combination of the individual channels that forms the color of an individual pixel. Consider, for example, the case where h_R, the component histogram for the red channel, contains the entry

h_R(200) = 24.

Then it is only known that the image has 24 pixels with a red intensity value of 200. The entry does not tell us anything about the green and blue values of those pixels, which could be any valid value (∗); that is, (r,g,b) = (200,∗,∗). Suppose further that the three component histograms included the following entries:

h_R(50) = 100, h_G(50) = 100, h_B(50) = 100.

Could we conclude from this that the image contains 100 pixels with the color combination (r,g,b) = (50,50,50), or that this color occurs at all? In general, no, because there is no way of ascertaining from these data whether there exists a pixel in the image in which all three components have the value 50. The only thing we could really say is that the color value (50,50,50) can occur at most 100 times in this image.

So, although conventional (intensity or component) histograms of color images depict important properties, they do not really provide any useful information about the composition of the actual colors in an image. In fact, a collection of color images can have very similar component histograms and still contain entirely different colors. This leads to the interesting topic of the combined histogram, which uses statistical information about the combined color components in an attempt to determine whether two images are roughly similar in their color composition. Features computed from this type of histogram often form the foundation of color-based image retrieval methods. We will return to this topic in Chapter 8, where we will explore color images in greater detail.

3.6 Cumulative Histogram

The cumulative histogram, which is derived from the ordinary histogram, is useful when performing certain image operations involving histograms, for instance histogram equalization (see Sec. 4.5). The cumulative histogram H is defined as

H(i) = Σ_{j=0..i} h(j),  for 0 ≤ i < K.   (3.5)

A particular value H(i) is thus the sum of all the values h(j), with j ≤ i, in the original histogram. Alternatively, we can define H recursively (as implemented in Prog. 4.2 on p. 66):

H(i) = h(0)            for i = 0,
H(i) = H(i−1) + h(i)   for 0 < i < K.   (3.6)

The cumulative histogram H(i) is a monotonically increasing function with the maximum value

H(K−1) = Σ_{j=0..K−1} h(j) = M·N;   (3.7)

that is, the total number of pixels in an image of width M and height N. Figure 3.13 shows a concrete example of a cumulative histogram.
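The recursive definition in Eqn. (3.6) maps directly onto a single loop. A minimal Java sketch in the spirit of the book's examples (ours; the book's own version appears in Prog. 4.2):

static int[] cumulativeHistogram(int[] h) {
    // H(0) = h(0);  H(i) = H(i-1) + h(i) for 0 < i < K   (Eqn. (3.6))
    int[] H = new int[h.length];
    H[0] = h[0];
    for (int i = 1; i < h.length; i++)
        H[i] = H[i - 1] + h[i];
    return H;  // H[K-1] equals the total number of pixels M*N (Eqn. (3.7))
}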
The cumulative histogram is useful not primarily for viewing, but as a simple and powerful tool for capturing statistical information from an image. In particular, we will use it in the next chapter to compute the parameters for several common point operations (see Sections 4.4–4.6).

Figure 3.13 The ordinary histogram h(i) and its associated cumulative histogram H(i).

3.7 Exercises

Exercise 3.1. In Prog. 3.2, B and K are constants. Consider whether there would be an advantage to computing the value of B/K outside of the loop, and explain your reasoning.

Exercise 3.2. Develop an ImageJ plugin that computes the cumulative histogram of an 8-bit grayscale image and displays it as a new image, similar to H(i) in Fig. 3.13. Hint: Use the ImageProcessor method int[] getHistogram() to retrieve the original image's histogram values and then compute the cumulative histogram "in place" according to Eqn. (3.6). Create a new (blank) image of appropriate size (e.g., 256×150) and draw the scaled histogram data as black vertical bars such that the maximum entry spans the full height of the image. Program 3.3 shows how this plugin could be set up and how a new image is created and displayed.

Exercise 3.3. Develop a technique for nonlinear binning that uses a table of interval limits a_j (Eqn. (3.2)).

Exercise 3.4. Develop an ImageJ plugin that uses the Java methods Math.random() or Random.nextInt(int n) to create an image with random pixel values that are uniformly distributed in the range [0, 255]. Analyze the image's histogram to determine how equally distributed the pixel values truly are.

Exercise 3.5. Develop an ImageJ plugin that creates a random image with a Gaussian (normal) distribution with mean value μ = 128 and standard deviation σ = 50. Use the standard Java method double Random.nextGaussian() to produce normally distributed random numbers (with μ = 0 and σ = 1) and scale them appropriately to pixel values. Analyze the resulting image histogram to see if it shows a Gaussian distribution too.

1  public class Create_New_Image implements PlugInFilter {
2    String title = null;
3
4    public int setup(String arg, ImagePlus im) {
5      title = im.getTitle();
6      return DOES_8G + NO_CHANGES;
7    }
8
9    public void run(ImageProcessor ip) {
10     int w = 256;
11     int h = 100;
12     int[] hist = ip.getHistogram();
13
14     // create the histogram image:
15     ImageProcessor histIp = new ByteProcessor(w, h);
16     histIp.setValue(255);  // white = 255
17     histIp.fill();         // clear this image
18
19     // draw the histogram values as black bars in histIp here,
20     // for example, using histIp.putPixel(u, v, 0)
21     // ...
22
23     // display the histogram image:
24     String hTitle = "Histogram of " + title;
25     ImagePlus histIm = new ImagePlus(hTitle, histIp);
26     histIm.show();
27     // histIm.updateAndDraw();
28   }
29
30 }  // end of class Create_New_Image

Program 3.3 Creating and displaying a new image (ImageJ plugin). First, we create a ByteProcessor object (histIp, line 15) that is subsequently filled. At this point, histIp has no screen representation and is thus not visible. Then, an associated ImagePlus object is created (line 25) and displayed by applying the show() method (line 26). Notice how the title (String) is retrieved from the original image inside the setup() method (line 5) and used to compose the new image's title (lines 24 and 25). If histIp is changed after calling show(), the method updateAndDraw() could be used to redisplay the associated image (line 27).
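One possible way (ours, not the book's) to fill in the elided drawing step of Program 3.3, using the variables hist, w, h, and histIp defined there: scale each histogram entry to the image height and draw one black vertical bar per intensity value.

// Candidate replacement for lines 19-21 of Program 3.3:
int hMax = 0;
for (int i = 0; i < hist.length; i++)
    hMax = Math.max(hMax, hist[i]);            // largest histogram entry
for (int i = 0; i < hist.length && i < w; i++) {
    int barHeight = (int) Math.round((double) hist[i] / hMax * h);
    for (int v = h - 1; v >= h - barHeight; v--)
        histIp.putPixel(i, v, 0);              // black bar, drawn bottom-up
}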
Digital Image Processing

1 Introduction

Many operators have been proposed for representing a connected component in a digital image by a reduced amount of data or a simplified shape. In general, the development, choice and modification of such algorithms in practical applications are domain and task dependent, and there is no "best method". However, it is interesting to note that there are several equivalences between published methods and notions, and characterizing such equivalences or differences should be useful for categorizing the broad diversity of published methods for skeletonization. Discussing equivalences is a main intention of this report.

1.1 Categories of Methods

One class of shape reduction operators is based on distance transforms. A distance skeleton is a subset of points of a given component such that every point of this subset represents the center of a maximal disc (labeled with the radius of this disc) contained in the given component. As an example in this first class of operators, this report discusses one method for calculating a distance skeleton using the d4 distance function, which is appropriate for digitized pictures. A second class of operators produces median or center lines of the digital object in a non-iterative way. Normally such operators locate critical points first and calculate a specified path through the object by connecting these points.

The third class of operators is characterized by iterative thinning. Historically, Listing [10] used as early as 1862 the term linear skeleton for the result of a continuous deformation of the frontier of a connected subset of a Euclidean space, without changing the connectivity of the original set, until only a set of lines and points remains. Many algorithms in image analysis are based on this general concept of thinning. The goal is the calculation of characteristic properties of digital objects which are not related to size or quantity. Methods should be independent of the position of a set in the plane or space, of the grid resolution (for digitizing this set), and of the shape complexity of the given set. In the literature the term "thinning" is not used in a unique interpretation, besides that it always denotes a connectivity-preserving reduction operation applied to digital images, involving iterations of transformations of specified contour points into background points. A subset Q ⊆ I of object points is reduced by a defined set D in one iteration, and the result Q′ = Q \ D becomes Q for the next iteration.

Topology-preserving skeletonization is a special case of thinning that results in a connected set of digital arcs or curves. A digital curve is a path p = p0, p1, p2, ..., pn = q such that pi is a neighbor of pi−1, 1 ≤ i ≤ n, and p = q. A digital curve is called simple if each point pi has exactly two neighbors in this curve. A digital arc is a subset of a digital curve such that p ≠ q. A point of a digital arc which has exactly one neighbor is called an end point of this arc.

Within this third class of operators (thinning algorithms) we may classify with respect to algorithmic strategy: individual pixels are removed either in a sequential order or in parallel. For example, the often cited algorithm by Hilditch [5] is an iterative process of testing and deleting contour pixels sequentially in standard raster scan order. Another sequential algorithm by Pavlidis [12] uses the definition of multiple points and proceeds by contour following.
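For the first class of operators, the d4 (city-block) distance transform mentioned in Section 1.1 is commonly computed with two raster passes over the image. The following Java sketch shows that standard two-pass technique in generic form; it is an illustration, not the report's specific implementation, and it assumes a binary input with object pixels set to 1.

static int[][] d4DistanceTransform(int[][] bin) {
    int h = bin.length, w = bin[0].length;
    final int INF = Integer.MAX_VALUE / 2;  // avoids overflow when adding 1
    int[][] d = new int[h][w];
    // Forward pass: propagate distances from the top and left neighbors.
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            if (bin[y][x] == 0) { d[y][x] = 0; continue; }  // background
            int best = INF;
            if (y > 0) best = Math.min(best, d[y - 1][x] + 1);
            if (x > 0) best = Math.min(best, d[y][x - 1] + 1);
            d[y][x] = best;
        }
    // Backward pass: propagate distances from the bottom and right neighbors.
    for (int y = h - 1; y >= 0; y--)
        for (int x = w - 1; x >= 0; x--) {
            if (x < w - 1) d[y][x] = Math.min(d[y][x], d[y][x + 1] + 1);
            if (y < h - 1) d[y][x] = Math.min(d[y][x], d[y + 1][x] + 1);
        }
    return d;
}

The distance skeleton described above can then be read off as the set of local maxima of d, each labeled with its distance value (the radius of a maximal d4 disc).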
Examples of parallel algorithms in this third class are reduction operators which transform contour points into background points. Differences between these parallel algorithms are typically defined by the tests implemented to ensure connectedness in a local neighborhood. The notion of a simple point is of basic importance for thinning, and it will be shown in this report that different definitions of simple points are actually equivalent. Several publications characterize properties of a set D of points (to be turned from object points into background points) that ensure that the connectivity of object and background remains unchanged. The report discusses some of these properties in order to justify parallel thinning algorithms.

1.2 Basics

The notation used follows [17]. A digital image I is a function defined on a discrete set C, which is called the carrier of the image.
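As an illustration of the local connectivity tests discussed above, the following Java sketch implements one widely used deletability criterion, the one from the Zhang-Suen thinning algorithm (named explicitly here, since it is not necessarily the report's choice): a contour point is a candidate for deletion only if the circular sequence of its 8 neighbors contains exactly one 0-to-1 transition, so that its object neighbors form a single connected run.

static boolean isDeletionCandidate(int[][] img, int x, int y) {
    // 8-neighborhood in circular order: N, NE, E, SE, S, SW, W, NW.
    // Assumes a 1-pixel background margin around the image.
    int[] dx = { 0, 1, 1, 1, 0, -1, -1, -1 };
    int[] dy = { -1, -1, 0, 1, 1, 1, 0, -1 };
    int[] n = new int[8];
    for (int k = 0; k < 8; k++)
        n[k] = img[y + dy[k]][x + dx[k]];
    int transitions = 0, neighbors = 0;
    for (int k = 0; k < 8; k++) {
        if (n[k] == 0 && n[(k + 1) % 8] == 1) transitions++;  // 0 -> 1 step
        neighbors += n[k];
    }
    // Exactly one transition keeps the local object connected after deletion;
    // the neighbor-count bounds exclude isolated points and interior points.
    return transitions == 1 && neighbors >= 2 && neighbors <= 6;
}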