当前位置：文档之家› Correspondence VQ-Adaptive Block Transform Coding of Images

Correspondence VQ-Adaptive Block Transform Coding of Images

Correspondence

VQ-Adaptive Block Transform Coding of Images Hakan?Caglar,Sinan G¨u nt¨u rk,B¨u lent Sankur,and Emin Anar?m

Abstract—Two new design techniques for adaptive orthogonal block transforms based on vector quantization(VQ)codebooks are presented. Both techniques start from reference vectors that are adapted to the characteristics of the signal to be coded,while using different methods to create orthogonal bases.The resulting transforms represent a signal coding tool that stands between a pure VQ scheme on one extreme and signal-independent,?xed block transformation-like discrete cosine transform(DCT)on the other.The proposed technique has superior compaction performance as compared to DCT both in the rendition of details of the image and in the peak signal-to-noise ratio(PSNR)?gures. Index Terms—Coding gain,orthogonal block transform,permutation, reference vector,vector quantization.

I.I NTRODUCTION

Block transforms(BT’s)are attractive tools for the analysis and compression of images.The Karhunen–Loeve transform(KLT),while optimal among all unitary transforms,has the shortcoming of lacking of a fast transform algorithm.The discrete cosine transform(DCT), on the other hand,possesses the fast transform character and its performance is very close to that of KLT,at least for highly correlated ?rst-order Markov processes[1],[2].These advantages have made, in fact,the DCT the industry standard for the?rst generation image and video codecs[3].

Despite the preponderance of DCT algorithms in practical applica-tions,the quest for new block transforms continues especially in the direction of transform codebook designs where each block transform is adapted to the local statistics of nonstationary?elds[4],[5]. This study has provided further insight into the two well-known techniques,BT’s and vector quantization(VQ).Our proposed cod-ing technique can be interpreted as a bridge between these two techniques.

Since image signals are nonstationary,it is conjectured that a BT for which the basis functions adapt to the local statistics of the image will have superior compression performance.The adaptation of the transform bases is instrumented by constructing a transform book, which is similar in concept to a VQ codebook.In the transform book, each BT corresponds to some typical image structure or waveform.In each segment of the image an appropriate BT should be chosen based on some simple feature of the regional image,or simply using the mean square error criterion.However,such transform books contain many fewer(typically four to eight)transforms as compared to the number of vectors in a conventional VQ codebook.Notice that if some bit assignment rule were to single out the?rst row only in each block,then the proposed coder would reduce to a scheme similar to Manuscript received September23,1995;revised January13,1997.This work was supported by T¨UB˙I TAK under Contract EEEAG-139.This work was presented in part at the IEEE International Conference on Nonlinear Signal and Image Processing,Neos Marmaros,Greece,June,1995.The associate editor coordinating the review of this manuscript and approving it for publication was Prof.Nasser M.Nasrabadi.

The authors are with the Department of Electrical and Electronics Engineering,Bo?g azi?c i University,Bebek80815,Istanbul,Turkey(e-mail: anarim@https://www.doczj.com/doc/13369459.html,.tr).

Publisher Item Identi?er S

1057-7149(98)00311-X.

Fig.1.Block diagram of the proposed VQ-adaptive BT technique.

the VQ coder.On the other hand,in our VQ-based BT,the rows other

than the?rst one can be interpreted as encoding the residual error of

the VQ.Furthermore,in our scheme,the search effort is much less as

compared to a conventional VQ coding,since our VQ size is minute

(e.g.,four to eight),i.e.,which in turn implies that the cluster sizes

are quite large.The loss in resolution due to much reduced VQ size

is compensated by the use of the other(N01)orthogonal vectors

in the BT.In this sense,our adaptive VQ-based BT aims to combine

the best of the two worlds of BT’s and VQ.

In this correspondence,we intend to design“VQ-adaptive BT’s,”

such that the basis vectors of the transform are expected to match the

statistics of the input process.This can be effected by having the rows

of the transform block to resemble waveform portions most typically

encountered in the image class to be coded.Such an adaptive block

transform can be constructed?rst by obtaining a reference vector

that also forms,let us say,the?rst row,while the other rows of the

transform block are generated by operations on this reference vector,

as described in Sections III and IV.These two new methods that

obtain an orthogonal BT from a given reference vector will be called

vector quantization-permutation based block transform(VQ-PBT)and

vector quantization-optimization based block transform(VQ-OBT),

while such techniques in general are referred to as VQ-adaptive

block transforms from here on.Fig.1shows the block diagram of

the proposed VQ-adaptive BT techniques.

The organization of the paper is as follows:Section II discusses a

VQ-based selection of the reference vectors that will be the starting

point for both the VQ-PBT and VQ-OBT types of orthogonal trans-

form design.In Section III,we describe the new VQ-PBT type block

transform design technique,while the VQ-OBT type block transform

is given in Section IV.Some design examples and comparative results

for both the VQ-PBT and VQ-OBT techniques are presented in

Section V.Section VI gives conclusions and suggestions for future

research.

II.S ELECTION OF R EFERENCE V ECTORS

The reference vectors in our scheme act as the seeds of the adaptive

BT.Therefore,these reference vectors should be a collection of

reproduction vectors and,hence,VQ techniques can be invoked to

obtain them.To generate a collection of M transform blocks,M

reference vectors are obtained using a VQ technique known as the

Linde–Buzo–Gray(LBG)algorithm[9].Our proposed scheme is 1057–7149/98$10.00?1998IEEE

(a)

(b)

(c)

Fig.2.(a)Sample line from Lena image.(b)Error signal for adaptive 828VQ-PBT with transform codebook of size eight and reconstructed from two coef?cients in each block.(c)Error signal for DCT with the same compression rate as above.

applicable to both one-dimensional (1-D)and two-dimensional (2-D)cases,albeit with slight differences.1-D Case

Using a clustering technique like the LBG algorithm on the N -sized row or column segments of the images,the feature vectors h m ;m =1;111;M are obtained as the centroids of the partition.These ?nal M reproduction vectors will be called reference vectors from here on.Then for each reference vector,h m ,in this codebook,an orthogonal BT B m of size N 2N is generated either via signed permutation operations of this vector or an optimization in the null space of it as described in Sections III and IV.In either case these vectors remain as the ?rst basis functions of the block transforms.Nonoverlapping segments of the signal x k =[x (Nk )111x (Nk +N 01)]T ;k =0;1;111are transformed to produce

y k =A m

is the most appropriate transform for that segment,and

the transform coef?cients are then quantized and coded.The overhead to index the transforms in the book is moderate since the transform book size is limited to six to seven,which necessitates three bits.However,since not all the bases are selected with equal frequency,with entropy coding the overhead reduces to approximately two bits.Thus,for example,at 1b/pixel rate,the overhead constitutes only 3%of the bit

budget.

(a)

(b)

Fig.3.(a)Reference codebook of size 3.(b)Nine image patterns formed with the outer product of the vectors in Fig.3(a).

The performance of the proposed adaptive BT technique vis-a-vis DCT in the case of 1-D signals is illustrated in Fig.2,where a sample line from the Lena image is employed.The block size was eight for both techniques,and the reconstructed signal in each segment consisted of the two highest energy (unquantized)transform coef?cients.The transform book contained M =8BT’s,and in each segment of the signal,the best reference function yielding the least distortion was selected.As expected,DCT works adequately in the highly correlated regions,but performs worse especially in the regions with abrupt changes.Thus,the adaptive BT technique outperforms DCT both visually as well as in the mean square error sense.Three of the reference vectors,as generated via the LBG algorithm using various training signals,are displayed in Fig.3(a).2-D Case

The idea of adaptive BT can be easily extended to images.However,in this case the VQ algorithm is not run on the raw image data,but ?rst a repertoire of image patterns generated by the outer product of 1-D vectors is obtained.The VQ algorithm is run on these 1-D vectors as detailed below.

?The training images are organized in normalized blocks of N 2N;denoted as I l (i;j );l =1;111;L;i;j =1;111;N where N is typically 8.

?Image patterns are formed with the least square ?tting of a square base Y l to each image block.Here Y l is obtained as the outer

product of two row vectors,i.e.,Y l =r T

c l ;l =1;111;L;where r l ;c l 2R N

:Hence

k I l 0Y l k =k I l 0r T

l c l k

(2)

(a)

(b)

(c)

Fig.4.(a)Adaptive blocks have been superior to DCT on the indicated blocks at rate =0.63b/pixel.(b)Reconstructed image with DCT based coding.PSNR =27.8dB,0.63b/pixel.(c)Reconstructed image with hybrid VQ-OBT based coding.PSNR =28.7dB,0.63b/pixel.

is minimized where k 1k denotes the Euclidean norm.These image patterns (Y l )l =1;111;L ;form,in fact,a new training set for the VQ algorithm.

?As expected,this large set of image patterns is highly redundant.At this stage the VQ algorithm is applied to vectors f r l j l =1;111;L g [f c l j l =1;111;L g to obtain a few cluster centers f h m j m =1;111;M g that would represent as distinct image structures as possible.

An image block I is transformed as A i I A j where A i and A j are the transforms constructed from the selected h i and h j ’s.The selection is conducted by minimizing

k I 0h T p h q k ;

p;q =1;111;M:

(3)

Several test images are used to evaluate the coding performance of

the proposed technique.Simulation results indicate that the codebook sizes four to eight work quite well from both compaction performance and implementation complexity points of view.Fig.3(b)demon-strates the representative M 2image patterns formed from a codebook size of M =3:Notice that these gray-level landscapes due to the f h T i h j g templates correspond to various gray-level facets.

In the ?at regions of the image,the basis functions tend to be DC-like.For a DC reference vector,the BT design technique,PBT,leads to the Walsh–Hadamard transform (WHT)[10]–[12].It is well known that the WHT does not work as well as DCT in image coding applications.Therefore,it should be argued that whenever an image block corresponds to a DC-like ?at area,it should be coded with

TABLE I 828OBT T YPE T RANSFORM H A VING

DC F UNCTION

AS A

F IRST B ASE

FOR

AR(1)I NPUT S

OURCE

DCT,while the other image blocks containing strong sloping patterns should be coded with the appropriate transforms from the book.This hybrid technique improves the performance of the compression in both detailed and ?at areas.Fig.4illustrates the blocks within which VQ-based coder is chosen,while in the rest of the image blocks DCT coding is being used.In the sequel of this work,only the hybrid scheme will be employed.

III.D ESIGN T ECHNIQUE I:P ERMUTATION -B ASED O RTHOGONAL B LOCK T RANSFORM (VQ-PBT)

One method to generate an orthogonal BT matrix from a given reference vector is to use systematic permutations and sign changes on the elements of the reference vector [10]–[12].Thus,while the ?rst row of the transform matrix is constituted of the reference vector itself,the other rows are simply given by permutation and sign change operations on the elements of the reference vector.

Consider now the ?rst row of a transform block,denoted as h 0:The row vector h 0has N components,h 0=[h 0(0)h 0(1)111h 0(N 01)];while the other rows are denoted as h i ;i =1;111;N 01:Furthermore,rows of the transform matrices form orthonormal sets,that is

h i h T j = i 0j :

(4)

The construction of an orthogonal matrix given a reference vector h 0;based on the permutations and sign changes of its components is outlined below.

De?ne permutation functions

0(j ); 1(j );111; N 01(j );

j =0;1;111;N 01and sign change functions

0(j ); 1(j );111; N 01(j );j =0;1;111;N 01

i (j )2f01;1g

where the subscript denotes the row number,and the argument

denotes the component position in a row.Then h i is de?ned as

h i [j ]= i (j )h 0[ i (j )]

j =1;111;N 01:

TABLE II

G T C V ALUES FOR B OTH VQ-OBT AND DCT ON THE C AMERAMAN I MAGE .E ACH C LASS V ALUE C ORRESPONDS TO A P ATTERN AND THE C OMPACTION P ERFORMANCE I S C OMPUTED OVER THE R EGION M ATCHING THAT P

ATTERN

This de?nition says that the j th element of the i th row of the transform matrix is the same as the element of h 0on the i (j )th position except with a sign change given by i (j ):

Such permutation and sign change operations result in pairwise orthogonality in (4)if only if they satisfy the relationship given [11]–[12]as

i (k ) j (k )+ i (t ) j (t )=2 i 0j

0 i;j;k N 01

i ( j (k ))

= 01

( i (k ))0 i;j;k N 01

(5)

where t = 01

( j (k )):As an example,the 828transform matrix formed from the vector [h (0)111h (N 01)]is shown at the bottom of the page.

A =

Fig.5.PSNR versus rate curves for DCT-based and proposed adaptive coding method on the cameraman image.

This technique,a variation of Hadamard modulation of a reference vector,is in fact a generalization of the Walsh–Hadamard transform, since the lowpass function h will have in general components different than one’s.The only restrictions on the reference vectors are their size,which must be an integer power of two.

In a permutation-based orthogonal block coder designed using the algorithm described above,all basis functions use the same set of coef?cients(in different shift and sign positions).One can obtain linear phase BT’s simply by imposing linear phase condition on h i’s.In other words,one can start with symmetric pattern vectors r l=[~r l~r l J];c l=[~c l~c l J];where~r l;~c l are N=2sized vectors and

J is the counter identity matrix.We have found,however,that the resulting image pattern repertoire f Y l g becomes quite restricted due to their horizontal and vertical double symmetry,and consequently the coding gains of VQ-PBT remain unimpressive with respect to that of DCT[13].Thus,when the linear phase property is sacri?ced, local structures are more frequently matched by these richer and more realistic image patterns.Notice that these patterns enable to code,e.g., diagonal edges,with a few coef?cients while this would not be the case with symmetric bases.

IV.D ESIGN T ECHNIQUE II:O PTIMIZATION-B ASED

B LOCK T RANSFORM(VQ-OBT)

In the permutation-based transform design,as described above, only the?rst row(the so-called reference vector)was obtained from VQ so that it resembles the various image structures.Alternatively,it is possible to generate all the rows of the transform matrix out of an optimization scheme,thereby achieving an even higher compaction. This process leads to an orthogonal BT that no longer shares the same coef?cient values.Again starting from h0’s as obtained from the VQ classi?cation algorithm,one proceeds to select any D matrix of dimensions(N012N);in the null space of h0that is

D h0=0

Since D D T>0is a positive de?nite matrix,one can introduce a Q matrix such that

Q=D T(D D T)0(1=2)

(6)

(a)

(b)

(c)

Fig.6.(a)Original detail image.(b)Reconstructed detail image with DCT-based coding[detail of4(b)].(c)Reconstructed detail image with hybrid VQ-OBT based coding[detail of4(c)].

and,thus,one can construct a transform matrix Q such that

A=[h0j D T(D D T)0(1=2)]T:(7) The resulting matrix A is an orthonormal matrix satisfying,A A T= A T A=I N2N for any given reference vector h0:There are several ways to span the null space of h0;i.e.,several possibilities for D: Since our objective is to design an orthonormal transform having good compaction performance,we set an optimization procedure as follows.

The transform coef?cient variances for a given input covariance matrix are found as the diagonal terms of the output autocorrelation matrix R yy given by

R yy=A R xx A T

where R xx is the input autocorrelation matrix and A any orthogonal BT.Optimum compaction performance is obtained when geometric

mean of the transform coef?cient variances is minimum.Thus,we set the optimization problem as

英汉翻译中的显化和隐化

英汉翻译中的显化和隐化 [摘要]:正确认识和了解翻译中显化与隐化的因素,有助于译文的正确表达。本文通过实例分析英汉互译中显化和隐化现象,认为它由语言、译者、社会文化等多种因素造成的,并提出了应用策略。 [关键词]:显化隐化语言特性显化(explicitness / explicitation),又译外显化、明晰化,指的是“目标文本以更明显的形式表述源文本的信息,是译者在翻译过程中增添解释性短语或添加连接词等来增强译本的逻辑性和易解性,从而使原文艰涩晦暗之处在译语中变得清晰、明白、浅显”。而隐化(implicitness)现象是指目标文本通过概括信息或删除连接词等有意忽略或省译源语文本中的某些部分。由于汉语在指称形式、连接词、造句结构等方面都不同于英语,两种语言互译过程中的确存在隐化现象的发生。贺显斌曾选取欧•亨利的短篇小说The Last Leaf及其汉译本进行显化现象的比较。他的结果是:英文小说全篇134句,译成汉语后明晰化程度提高的有79句,占58.96%,也就是说,在英译汉过程中确有显化现象发生。一、影响显化、隐化的因素导致翻译中的显化和隐化的因素主要可归纳为三方面:一是语言特性因素。语言本身的特性,包括结构特性和习惯用法,决定了语言表达中显、隐化程度的区别。分析发现,汉译英时会较多地运用形式上的显化,而英译汉时意义上的隐化是普遍存在的。二是语用因素。基于翻译的目的在于使读者理解原文的观点,翻译者有责任将不同文化的承载内涵和文化缺失部分通过加词、阐释等方法显化表达。三是译者因素。过多仿译会造成过多“显化”。 1.英汉语言特性因素从语言学角度来说,汉、英语言之间最重要的区别特征莫过于形合与意合的区分,邵志洪对汉语和英语的语义对比表明,汉语属语义形语言,而印欧语言属语法形语言;前者力主考究“字”与语义及其相互关系,后者则重点研究主谓序列及其相关词类。贾玉新将英汉语言在句法上的差异精炼的概括如下:“英语高度形式化、逻辑化,句法结构严谨完备,并以动词为核心,重分析轻意合;而汉语则不重形式,

翻译中的显化与隐化现象——以云南历史与少数民族文化外译为例

翻译中的显化与隐化现象——以云南历史与少数民族文化外译为例摘要：显化与隐化是翻译中常见的现象，也是译者常用的翻译策略。本文以云南历史与少数民族文化外译为例，分析了显化和隐化策略在翻译实践中的运用，讨论了两种策略运用的效果，从而了解译者的策略选择对翻译效果的影响。关键词：显化；隐化；少数民族文化；翻译策略引言文本代表着作者的世界观，是作者价值取向的一种表达方式，而当译者介入这一行为后，原作者的观念基础就可能发生改变。如何尽可能地保留原文的所呈现出的价值观，这就成为了译者面临的一大难题。正确认识显隐化策略在翻译中的应用有助于译者对原文的准确理解和表达。本文以云南历史与少数民族文化文本为例，试图分析翻译实践中的显化和隐化现象。 1.显化与隐化策略 1.1 显化与隐化策略显化策略(explicitation) 指译者从上下文或相关语境中推断出源语中的隐含信息，并在目标语中加以说明(Vinay & Darbelnet, 1995)。在翻译实践过程中，译者会将连接词或解释性短语等进行适当添加，从而促进译本易解性和逻辑性的强化，确保译语的浅显化和清晰化。显化在文本层面上表现为两种形式：一是增加，即添加一些新的成分；二是具体化，即提供更多细节信息(Perego, 2003)。 Klaudy(2001a)将显化细分为四类：(1)强制性显化(Obligatory Explicitation), 指两种语言在语义、形态和句法方面存在差异时，译者不得不将某些信息具体化；(2) 选择性显化(Optional Explicitaion), 指两种语言在文本构成策略和文本特征方面存在差异时，译者需要增加连接成分、关系从句、强化词等；(3)语用学显化(Pragmatic Explicitation)，指目标语读者无法理解源语文化中的一些普遍性知识时，译者需要添加解释；(4)翻译本身固有显化(Translation-inherent Explicition)，指由于翻译过程本身的性质，译者在加工源语思想时会影响译文的长度。与之相对应的隐化策略(Implicitation)，指译者依据上下文或语境的要求，隐藏了部分源语中的显在信息(Vinay & Darbelnet, 1995)。它与译文读者的阅读期待密切相关，是指专业译者出于语言、文化、意识形态等方面的考虑，删除了一些源语中的单词、短语、句子甚至段落(Dimitriu, 2004)。 1.2 显化与隐化相关研究 Blum-Kulka(2001)提出的显化假说认为，译者对原作的阐释很可能会造成译文在篇幅上比原文要长。在翻译中一味采用显化策略效果并不一定就好，因为过多的信息显化会造成译文冗长，也剥夺了读者的理解和阐释权(Gutt, 1991)。相比之下，隐化则常常被认为破坏了原作的完整性，有悖于翻译的忠实原则。柯飞(2005)指出，显化将原文中隐含或文化上不言自明的信息显示出来，使译作更清晰易懂，而一定的隐化则会让译文更加简洁地道。任何翻译中都包含着译者的妥协，英汉两种语言无论是在语法句法结构还是在文化背景上都有非常大的差异，因此在双语转换过程中译者要能灵活运用显化策略补出文化背景等，同时运用隐化策略平衡译文，尽可能地忠实于原文，传达出原文的价值观。 2.隐化和显化产生的因素