Chapter 10

VISION AND VIDEO:

MODELS AND APPLICATIONS

Stefan Winkler

Swiss Federal Institute of Technology–EPFL

Signal Processing Laboratory

1015 Lausanne, Switzerland

Stefan.Winkler@epfl.ch

Christian J.van den Branden Lambrecht

EMC Media Group

80 South Street

Hopkinton, MA 01748, USA

vdb@

Murat Kunt

Swiss Federal Institute of Technology–EPFL

Signal Processing Laboratory

1015 Lausanne, Switzerland

Murat.Kunt@epfl.ch

1. INTRODUCTION

While traditional analog systems still form the vast majority of television sets today, production studios, broadcasters and network providers have been installing digital video equipment at an ever-increasing rate. The borderline between analog and digital video is moving closer and closer to the consumer. Digital satellite and cable services have been available for a while, and recently terrestrial digital television broadcast has been introduced in a number of locations around the world.



Analog video systems, which have been around for more than half a century now, are among the most successful technical inventions measured by their market penetration (more than 1 billion TV receivers in the world) and the time span of their widespread use. However, because of the closed-system approach inherent to analog technology, any new functionality or processing is utterly difficult to incorporate in the existing systems. The introduction of digital video systems has given engineers additional degrees of freedom due to the flexibility of digital information processing and the ever-decreasing cost of computing power. Reducing the bandwidth and storage requirements while maintaining a quality superior to that of analog video has been the priority in the design of these new systems.

Many optimizations and improvements of video processing methods have relied on purely mathematical measures of optimality, such as mean squared error (MSE) or signal-to-noise ratio (SNR). However, these simple measures operate solely on a pixel-by-pixel basis and neglect the important influence of image content and viewing conditions on the actual visibility of artifacts. Therefore, their predictions often do not agree well with visual perception.

In the attempt to increase compression ratios for video coding even further, engineers have turned to vision science in order to better exploit the limitations of the human visual system. As a matter of fact, there is a wide range of applications for vision models in the domain of digital video, some of which we outline in this chapter. However, the human visual system is extremely complex, and many of its properties are still not well understood. While certain aspects have already found their way into video systems design, and while even ad-hoc solutions based on educated guesses can provide satisfying results to a certain extent, significant advancements of the current state of the art will require an in-depth understanding of human vision.

Since a detailed treatment of spatial vision can be found in other chapters of this book, our emphasis here is on temporal aspects of vision and modeling, which is the topic of Section 2. Then we take a look at the basic concepts of video coding in Section 3. An overview of spatio-temporal vision modeling, including a perceptual distortion metric developed by the authors, is given in Section 4. We conclude the chapter by applying vision models to a number of typical video test and measurement tasks in Section 5.

2. MOTION PERCEPTION

Motion perception is a fundamental aspect of vision and aids us in many essential visual tasks: it facilitates depth perception, object discrimination, gaze direction, and the estimation of object displacement. Motion, particularly in the peripheral visual field, attracts our attention.

There are many controversial opinions about motion perception. Motion has often been closely linked to the notion of optical flow, particularly in the work on motion prediction for video coding. Sometimes, however, motion can be perceived in stimuli that do not contain any actual movement, which is referred to as apparent motion. In light of these concepts, motion is better defined as a psychological sensation, a visual inference, similar to color perception. The images on the retina are just time-varying patterns of light; the evolution of these light distributions over time is then interpreted by the visual system to create a perception of objects moving in a three-dimensional world.

Extending spatial models for still images to handle moving pictures calls for a close examination of the way temporally varying visual information is processed in the human brain [73]. The design of spatio-temporal vision models (cf. Section 4) is complicated by the fact that much less attention in vision research has been devoted to temporal aspects than to spatial aspects. In this section, we take a closer look at the perception of motion and the temporal mechanisms of the human visual system, in particular the temporal and spatio-temporal contrast sensitivity functions, temporal masking, and pattern adaptation.

2.1 TEMPORAL MECHANISMS

Early models of spatial vision were based on the single-channel assumption, i.e. the entire input is processed together and in the same way. Due to their inability to model signal interactions, however, single-channel models are unable to cope with more complex patterns and cannot explain data from experiments on masking and pattern adaptation. This led to the development of multi-channel models, which employ a bank of filters tuned to different frequencies and orientations. Studies of the visual cortex have shown that many of its neurons actually exhibit receptive fields with such tuning characteristics [14]; serving as an oriented band-pass filter, the neuron responds to a certain range of spatial frequencies and orientations.

Temporal mechanisms have been studied by vision researchers for many years, but there is less agreement about their characteristics than those of spatial mechanisms. It is believed that there are one temporal low-pass and one, possibly two, temporal band-pass mechanisms [19,27,39,64], which are generally referred to as sustained and transient channels, respectively. Physiological experiments confirm these results to the extent that low-pass and band-pass mechanisms have been found [17]. However, neurons with band-pass properties exhibit a wide range of peak frequencies. Recent results also indicate that the peak frequency and bandwidth of the mechanisms change considerably with stimulus energy [18]. The existence of an actual third mechanism is questionable, though [19,24].


In a recent study [19], for example, temporal mechanisms are modeled with a two-parameter function and its derivatives. It is possible to achieve a very good fit to a large set of psychophysical data using only this function and its second derivative, corresponding to one sustained and one transient mechanism, respectively. The frequency responses of the corresponding filters for a typical choice of parameters are used and shown later in Section 4.2.2.

2.2 CONTRAST SENSITIVITY

The response of the human visual system to a stimulus depends much less on the absolute luminance than on the relation of its local variations to the surrounding luminance. This property is known as Weber’s law, and contrast is a measure of this relative variation of luminance. While Weber’s law is only an approximation of the actual sensory perception, contrast measures based on this concept are widely used in vision science. Unfortunately, a common definition of contrast suitable for all situations does not exist, not even for simple stimuli.

Mathematically, Weber contrast can be expressed as C = ΔL/L. In vision experiments, this definition is used mainly for patterns consisting of an increment or decrement ΔL to an otherwise uniform background luminance L.
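As a small numerical illustration of this definition (the luminance values are arbitrary and chosen only for the example), an increment of 2 cd/m² on a uniform background of 100 cd/m² corresponds to a Weber contrast of

```latex
C = \frac{\Delta L}{L} = \frac{2\,\mathrm{cd/m^2}}{100\,\mathrm{cd/m^2}} = 0.02
```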

However, such a simple definition is inappropriate for measuring contrast in complex images, because a few very bright or very dark points would determine the contrast of the entire image. Furthermore, human contrast sensitivity varies with the adaptation level associated with the local average luminance. Local band-limited contrast measures have been introduced to address these issues [41,42,76] and have been used successfully in a number of vision models [12,37].

Our sensitivity to contrast depends on the color as well as the spatial and temporal frequency of the stimuli. Contrast sensitivity functions (CSF’s) are generally used to quantify these dependencies. Contrast sensitivity is defined as the inverse of the contrast threshold, i.e. the minimum contrast necessary for an observer to detect a stimulus.

Spatio-temporal CSF approximations are shown in Figure 10.1. Achromatic contrast sensitivity is generally higher than chromatic, especially for high spatio-temporal frequencies. The full range of colors is perceived only at low frequencies. As spatio-temporal frequencies increase, sensitivity to blue-yellow stimuli declines first. At even higher frequencies, sensitivity to red-green stimuli diminishes as well, and perception becomes achromatic. On the other hand, achromatic sensitivity decreases slightly at low spatio-temporal frequencies, whereas chromatic sensitivity does not (see Figure 10.1). However, this apparent attenuation of sensitivity towards low frequencies may be attributed to implicit masking, i.e. masking by the spectrum of the window within which the test gratings are presented [78].


Figure 10.1 Approximations of achromatic (left) and chromatic (right) spatio-temporal contrast sensitivity functions [6,32,33].

There has been some debate about the space-time separability of the spatio-temporal CSF. This property is of interest in vision modeling because a CSF that could be expressed as a product of spatial and temporal components would simplify modeling. Early studies concluded that the spatio-temporal CSF was not space-time separable at lower frequencies [34,47]. Kelly [31] measured contrast sensitivity under stabilized conditions (i.e. the stimuli were stabilized on the retina by compensating for the observers’ eye movements) and fit an analytic function to these measurements [32], which yields a very close approximation of the spatio-temporal CSF for counter-phase flicker. It was found that this CSF and its chromatic counterparts can also be approximated by linear combinations of two space-time separable components termed excitatory and inhibitory CSF’s [6,33].

Measurements of the spatio-temporal CSF for both in-phase and conventional counter-phase modulation suggest that the underlying filters are indeed spatio-temporally separable and have the shape of low-pass exponentials [77]. The spatio-temporal interactions observed for counter-phase modulation can be explained as a product of masking by the zero-frequency component of the gratings.

The important issue of unconstrained eye movements in CSF models is addressed in Chapter ??. Natural drift, smooth pursuit and saccadic eye movements can be included in Kelly’s formulation of the stabilized spatio-temporal CSF using a model for eye velocity [13]. In a similar manner, motion compensation of the CSF can be achieved by estimating smooth-pursuit eye movements under the worst-case assumption that the observer is capable of tracking all objects in the scene [70].


2.3 TEMPORAL MASKING

Masking is a very important phenomenon in perception as it describes interactions between stimuli (cf. Chapter ??). Masking occurs when a stimulus that is visible by itself cannot be detected due to the presence of another. Sometimes the opposite effect, facilitation, occurs: a stimulus that is not visible by itself can be detected due to the presence of another. Within the framework of imaging and video applications it is helpful to think of the distortion or coding noise being masked (or facilitated) by the original image or sequence acting as background. Masking explains why similar coding artifacts are disturbing in certain regions of an image while they are hardly noticeable elsewhere.

Masking is strongest between stimuli located in the same perceptual channel, and many vision models are limited to this intra-channel masking. However, psychophysical experiments show that masking also occurs between channels of different orientations [16], between channels of different spatial frequency, and between chrominance and luminance channels [8,36,56], albeit to a lesser extent.

Temporal masking is an elevation of visibility thresholds due to temporal discontinuities in intensity, e.g. scene cuts. Within the framework of television, it was first studied by Seyler and Budrikis [52,53], who concluded that threshold elevation may last up to a few hundred milliseconds after a transition from dark to bright or from bright to dark. In a more recent study on the visibility of MPEG-2 coding artifacts after a scene cut, significant visual masking effects were found only in the first subsequent frame [57]. A strong dependence on stimulus polarity has also been noticed [7]: the masking effect is much more pronounced when target and masker match in polarity, and it is greatest for local spatial configurations. Similar to the case of spatial stimulus interactions, the opposite of temporal masking, temporal facilitation, has been observed at low-contrast discontinuities.

Interestingly, temporal masking can occur not only after a discontinuity (“forward masking”), but also before. This “backward masking” may be explained as the result of the variation in the latency of the neural signals in the visual system as a function of their intensity [1].

So far, the above-mentioned temporal masking effects have received much less attention in the video coding community than their spatial counterparts. In principle, temporal masking can be taken into account with a contrast gain control model (cf. Section 4.2.3), as demonstrated in [21]. A video quality metric that incorporates forward masking effects by means of a low-pass filtered masking sequence is described in [66].

2.4 ADAPTATION

Pattern adaptation in the human visual system is the adjustment of contrast sensitivity in response to the prevailing stimulation patterns. For example, adaptation to patterns of a certain frequency can lead to a noticeable decrease of contrast sensitivity around this frequency [22,55,71]. Together with masking, adaptation was one of the major incentives for developing a multi-channel theory of vision. However, pattern adaptation has a distinct temporal component to it and is not automatically taken into account by a multi-channel representation of the input; it needs to be incorporated explicitly by adapting the pertinent model parameters. A single-mechanism model that accounts for both pattern adaptation and masking effects of simple stimuli was presented in [49], for example.

An interesting study in this respect used natural images of outdoor scenes (both distant views and close-ups) as adapting stimuli [68]. It was found that exposure to such stimuli induces pronounced changes in contrast sensitivity. The effects can be characterized by selective losses in sensitivity at lower to medium spatial frequencies. This is consistent with the characteristic amplitude spectra of natural images, which decrease with frequency roughly as 1/f.

Likewise, an examination of how color sensitivity and appearance might be influenced by adaptation to the color distributions of images [69] revealed that natural scenes exhibit a limited range of chromatic distributions, hence the range of adaptation states is normally limited as well. However, the variability is large enough so that different adaptation effects may occur for individual scenes and for different viewing conditions.

3. VIDEO CONCEPTS

3.1 STANDARDS

The Moving Picture Experts Group (MPEG, see http://drogo.cselt.stet.it/mpeg/ for an overview of its activities) is a working group of ISO/IEC in charge of the development of international standards for compression, decompression, processing, and coded representation of moving pictures, audio and their combination. MPEG comprises some of the most popular and widespread standards for video coding. The group was established in January 1988, and since then it has produced:

MPEG-1, a standard for storage and retrieval of moving pictures and audio, which was approved in November 1992. MPEG-1 is intended to be generic, i.e. only the coding syntax is defined and therefore mainly the decoding scheme is standardized. MPEG-1 defines a block-based hybrid DCT/DPCM coding scheme with prediction and motion compensation. It also provides functionality for random access in digital storage media.

MPEG-2, a standard for digital television, which was approved in November 1994. The video coding scheme used in MPEG-2 is again generic; it is a refinement of the one in MPEG-1. Special consideration is given to interlaced sources. Furthermore, many functionalities such as scalability were introduced. In order to keep implementation complexity low for products not requiring all video formats supported by the standard, so-called “Profiles”, describing functionalities, and “Levels”, describing resolutions, were defined to provide separate MPEG-2 conformance levels.

MPEG-4, a standard for multimedia applications, whose first version was approved in October 1998. MPEG-4 addresses the need for robustness in error-prone environments, interactive functionality for content-based access and manipulation, and a high compression efficiency at very low bitrates. MPEG-4 achieves these goals by means of an object-oriented coding scheme using so-called “audio-visual objects”, for example a fixed background, the picture of a person in front of that background, the voice associated with that person etc. The basic video coding structure supports shape coding, motion compensation, DCT-based texture coding as well as a zerotree wavelet algorithm.

MPEG-7, a standard for content representation in the context of audio-visual information indexing, search and retrieval, which is scheduled for approval in late 2001.

The standards being used commercially today are mainly MPEG-1 (in older compact discs), MPEG-2 (for digital TV and DVDs), and H.261/H.263 (which use related compression methods for low-bitrate communications). Some broadcasting companies in the US and in Europe have already started broadcasting television programs that are MPEG-2 compressed, and DVDs are rapidly gaining in popularity in the home video sector. For further information on these and other compression standards, the interested reader is referred to [4].

3.2 COLOR CODING

Many standards, such as PAL, NTSC, MPEG, or JPEG, are already based on human vision in the way color information is processed. In particular, they take into account the nonlinear perception of lightness, the organization of color channels, and the low chromatic acuity of the human visual system.

Conventional television cathode ray tube (CRT) displays have a nonlinear, roughly exponential relationship between frame buffer RGB values or signal voltage and displayed intensity. In order to compensate for this, gamma correction is applied to the intensity values before coding. It so happens that the human visual system has an approximately logarithmic response to intensity, which is very nearly the inverse of the CRT nonlinearity [45]. Therefore, coding visual information in the gamma-corrected domain not only compensates for CRT behavior, but is also more meaningful perceptually.
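A rough sketch of this relationship is given below. The exponent of 2.2 is a typical assumed value, not one specified in the chapter, and real transfer functions such as that of ITU-R BT.709 add a short linear segment near black; the function names are purely illustrative.

```python
import numpy as np

def gamma_encode(linear_rgb, gamma=2.2):
    """Map linear intensities in [0, 1] to gamma-corrected (R'G'B') values.

    A simple power law with gamma ~ 2.2 is used here as an approximation.
    """
    linear_rgb = np.clip(linear_rgb, 0.0, 1.0)
    return linear_rgb ** (1.0 / gamma)

def gamma_decode(encoded_rgb, gamma=2.2):
    """Approximate CRT nonlinearity: displayed intensity ~ signal ** gamma."""
    return np.clip(encoded_rgb, 0.0, 1.0) ** gamma

# Round trip: gamma correction before coding, then the CRT's power law on display.
linear = np.array([0.0, 0.05, 0.18, 0.5, 1.0])
print(gamma_decode(gamma_encode(linear)))   # approximately the original values
```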

Furthermore, it has been long known that some pairs of hues can coexist in a single color sensation, while others cannot. This led to the conclusion that the sensations of red and green as well as blue and yellow are encoded in separate visual pathways, which is commonly referred to as the theory of opponent colors (cf. Chapter ??). It states that the human visual system decorrelates its input into black-white, red-green and blue-yellow difference signals.

As pointed out before in Section 2.2, chromatic visual acuity is significantly lower than achromatic acuity. In order to take advantage of this behavior, the color primaries red, green, and blue are rarely used for coding directly. Instead, color difference (chroma) signals similar to the ones just mentioned are computed. In component digital video, for example, the resulting color space is referred to as Y′C′BC′R, where Y′ encodes luminance, C′B the difference between blue primary and luminance, and C′R the difference between red primary and luminance (the primes are used here to emphasize the nonlinear nature of these quantities due to the above-mentioned gamma correction).
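A minimal sketch of such a conversion is shown below, assuming the familiar ITU-R BT.601 luma weights and the analog (zero-centred, unscaled) form of the chroma signals; digital implementations additionally add offsets and scale the results to integer ranges.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Convert gamma-corrected R'G'B' values in [0, 1] to Y', C_B, C_R.

    Y' is a weighted sum of the primaries (luma); C_B and C_R are scaled
    blue- and red-difference signals (BT.601 weights, analog form).
    """
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 0.564 * (b - y)          # = (B' - Y') / 1.772
    cr = 0.713 * (r - y)          # = (R' - Y') / 1.402
    return np.stack([y, cb, cr], axis=-1)

pixel = np.array([1.0, 0.5, 0.25])   # an arbitrary R'G'B' pixel
print(rgb_to_ycbcr(pixel))
```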

The low chromatic acuity now permits a significant data reduction of the color difference signals, which is referred to as chroma subsampling. The notation commonly used is as follows:

4:4:4 denotes no chroma subsampling.

4:2:2 denotes chroma subsampling by a factor of 2 horizontally; this sampling format is used in the standard for studio-quality component digital video as defined by ITU-R Rec. 601 [29], for example.

4:2:0 denotes chroma subsampling by a factor of 2 both horizontally and vertically; this sampling format is often used in JPEG or MPEG and is probably the closest approximation of actual visual color acuity achievable by chroma subsampling alone.

4:1:1 denotes chroma subsampling by a factor of 4 horizontally.
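The following sketch illustrates 4:2:0 subsampling by simply averaging 2×2 blocks of each chroma plane; actual codecs differ in the filter taps used and in where the chroma samples are sited relative to the luma grid, so this is only one possible realization.

```python
import numpy as np

def subsample_chroma_420(cb, cr):
    """4:2:0 subsampling: average each 2x2 block of the chroma planes."""
    def down2x2(plane):
        h, w = plane.shape
        assert h % 2 == 0 and w % 2 == 0, "even dimensions assumed for brevity"
        return plane.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return down2x2(cb), down2x2(cr)

cb = np.random.rand(8, 8)
cr = np.random.rand(8, 8)
cb_sub, cr_sub = subsample_chroma_420(cb, cr)
print(cb_sub.shape)   # (4, 4): half resolution both horizontally and vertically
```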

3.3 INTERLACING

As analog television was developed, it was noted that flicker could be perceived at certain frame rates, and that the magnitude of the flicker was a function of screen brightness and surrounding lighting conditions. In a movie theater at relatively low light levels, a motion picture can be displayed at a frame rate of 24 Hz, whereas a bright CRT display requires a refresh rate of more than 50 Hz for flicker to disappear. The drawback of such a high frame rate is the high bandwidth of the signal. On the other hand, the spatial resolution of the visual system decreases significantly at such temporal frequencies (cf. Figure 10.1). These two properties combined gave rise to a technique referred to as interlacing.

The concept of interlacing is illustrated in Figure 10.2. Interlacing trades off vertical resolution with temporal resolution. Instead of sampling the video signal at 25 or 30 frames per second, the sequence is shot at a frequency of 50 or 60 interleaved fields per second. A field corresponds to either the odd or the even lines of a frame, which are sampled at different time instants and displayed alternately (the field containing the even lines is referred to as the top field, and the field containing the odd lines as the bottom field). Thus the required bandwidth of the signal can be reduced by a factor of 2, while the full horizontal and vertical resolution is maintained for stationary image regions, and the refresh rate for objects larger than one scanline is still sufficiently high.

Figure 10.2 Illustration of interlacing. The top sequence is progressive; all lines of each frame are transmitted at the frame rate f. The bottom sequence is interlaced; each frame is split in two fields containing the odd and the even lines (shown in bold), respectively. These fields are transmitted alternately at twice the original frame rate.

MPEG-1 handles only progressive video, which is better adapted to computer displays. MPEG-2 on the other hand was designed as the new standard to transmit television signals. Therefore it was decided that MPEG-2 would support both interlaced and progressive video. An MPEG-2 bitstream can contain a progressive sequence encoded as a succession of frames, an interlaced sequence encoded as a succession of fields, or an interlaced sequence encoded as a succession of frames. In the latter case, each frame contains a top and a bottom field, which do not belong to the same time instant. Based on this, a variety of modes and combinations of motion prediction algorithms were defined in MPEG-2.

Interlacing poses quite a problem in terms of vision modeling, especially from the point of view of temporal filtering. It is not only an implementation problem, but also a modeling problem, because identifying the signal that is actually perceived is not obvious. Vision models have often overlooked this issue and have taken simplistic approaches; most of them have restricted themselves to progressive input. Newer models incorporate de-interlacing approaches, which aim at creating a progressive video signal that has the spatial resolution of a frame and the temporal frequency of a field. A simple solution, which is still very close to the actual signal perceived by the human eye, consists in merging consecutive fields together into a full-resolution 50 or 60 Hz signal. This is a valid approach as each field is actually displayed for two field periods due to the properties of the CRT phosphors. Other solutions interpolate both spatially and temporally by upsampling the fields. Although the latter might seem more elegant, it feeds into the vision model a signal which is not the one that is being displayed. Reviews of various de-interlacing techniques can be found in [15,59].
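A minimal sketch of the field-merging solution mentioned above (often called "weave" de-interlacing) is given here, assuming the top field carries the even lines and the bottom field the odd lines, as in the convention used in this chapter.

```python
import numpy as np

def weave_fields(top_field, bottom_field):
    """Merge two consecutive fields into one full-resolution frame.

    top_field holds the even lines, bottom_field the odd lines;
    both are assumed to have shape (H/2, W).
    """
    h2, w = top_field.shape
    frame = np.empty((2 * h2, w), dtype=top_field.dtype)
    frame[0::2, :] = top_field      # even lines
    frame[1::2, :] = bottom_field   # odd lines
    return frame

top = np.zeros((4, 6))
bottom = np.ones((4, 6))
print(weave_fields(top, bottom))    # alternating rows of 0s and 1s
```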

3.4 ARTIFACTS

The fidelity of compressed and transmitted video sequences is affected by the following factors:

any pre- or post-processing of the sequence outside of the compression module. This can include chroma subsampling and de-interlacing, which were discussed briefly above, or frame rate conversion. One particular example is 3:2 pulldown, which is the standard way to convert progressive film sequences shot at 24 frames per second to interlaced video at 30 frames per second.

the compression operation itself.

the transmission of the bitstream over a noisy channel.

3.4.1 Compression Artifacts. The compression algorithms used in various video coding standards today are very similar to each other. Most of them rely on block-based DCT with motion compensation and subsequent quantization of the DCT coefficients. In such coding schemes, compression distortions are caused by only one operation, namely the quantization of the DCT coefficients. Although other factors affect the visual quality of the stream, such as motion prediction or decoding buffer, these do not introduce any distortion per se, but affect the encoding process indirectly by influencing the quantization scale factor.
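As a toy illustration of the quantization step that causes these distortions, the sketch below transforms an 8×8 block, rounds the coefficients and inverts the transform. A uniform quantization step is assumed for simplicity; real coders use frequency-dependent quantization matrices and rate control.

```python
import numpy as np
from scipy.fft import dctn, idctn

def quantize_block(block, q_step=16.0):
    """Illustrative 8x8 block DCT quantization with a single uniform step."""
    coeffs = dctn(block, norm='ortho')            # 8x8 block DCT
    coeffs_q = np.round(coeffs / q_step) * q_step # coarse coefficient values
    return idctn(coeffs_q, norm='ortho')          # reconstructed block

block = np.random.rand(8, 8) * 255.0
recon = quantize_block(block)
print(np.abs(block - recon).max())   # reconstruction error grows with q_step
```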

A variety of artifacts can be distinguished in a compressed video sequence:

blockiness or blocking effect, which refers to a block pattern of size 8×8 in the compressed sequence. This is due to the 8×8 block DCT quantization of the compression algorithm.


bad edge rendition: edges tend to be fuzzy due to the coarser quantization of high frequencies.

mosquito noise manifests itself as an ambiguity in the edge direction: an edge appears in the direction conjugate to the actual edge. This effect is due to the implementation of the block DCT as a succession of a vertical and a horizontal one-dimensional DCT [9].

jagged motion can be due to poor performance of the motion estimation. When the residual error of motion prediction is too large, it is coarsely quantized by the DCT quantization process.

flickering appears when a scene has a high texture content. Texture blocks are compressed with varying quantization factors over time, which results in a visible flickering effect.

smoothing and loss of detail are typical artifacts of quantization.

aliasing appears when the content of the scene is above the Nyquist rate, either spatially or temporally.

An excellent survey of the various artifacts introduced by typical compression schemes can be found in [79].

3.4.2 Transmission Errors. A very important and often overlooked source of distortions is the transmission of the bitstream over a noisy channel. Digitally compressed video is typically transferred over a packet network. The actual transport can take place over a wire or wireless, but some higher level protocol such as ATM or TCP/IP ensures the transport of the video stream. Most applications require the streaming of video, i.e. the bitstream needs to be transported in such a way that it can be decoded and displayed in real time. The bitstream is transported in packets whose headers contain sequencing and timing information. This process is illustrated in Figure 10.3. Streams can also carry additional signaling information at the session level. A popular transport protocol at the moment is TCP/IP. A variety of protocols are then used to transport the audio-visual information. The Real-time Transport Protocol (RTP) is used to transport, synchronize and signal the actual media and add timing information [51]; RTP packets are transported over UDP. The signaling is taken care of by additional protocols such as the H.323 family from the ITU [30], or the suite of protocols (SIP, SAP, SDP) from the Internet Engineering Task Force [50].

A comparison of these schemes is provided in [11].

Two different types of impairments can occur when transporting media over noisy channels. Packets can be lost due to excessive buffering in intermediate routers or switches, or they can be delayed to the point where they are not received in time for decoding. The latter is due to the queuing algorithm in routers and switches. To the application, both have the same effect: part of the media stream is not available, thus packets are missing when they are needed for decoding.

Figure 10.3 Illustration of a video transmission system. The video sequence is first compressed by the encoder. The resulting bitstream is packetized in the network adaptation layer, where a header containing sequencing and synchronization data is added to each packet. The packets are then sent over the network of choice.

Such losses can affect both the semantics and the syntax of the media stream. When the losses affect syntactic information, not only the data relevant to the lost block are corrupted, but also any data that depend on this syntactic information. For example, a loss of packets containing data pertinent to an MPEG macroblock will corrupt all following macroblocks until an end of slice is encountered. This is due to the fact that the DC coefficient of a macroblock is differentially predicted between macroblocks and resets at the beginning of a slice. Also, for each of these corrupted macroblocks, all blocks that are motion predicted from these will be lost as well. Hence the loss of a single macroblock can affect the stream up to the next intra-coded frame. Figure 10.4 illustrates this phenomenon.
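The following toy sketch illustrates why differential DC prediction makes a single loss propagate to the end of a slice. The numbers are invented, and the decoder is assumed to substitute zero for the missing difference; real decoders use more elaborate concealment.

```python
def decode_dc(dc_diffs, lost_index=None):
    """Reconstruct DC values that are coded differentially within one slice.

    If the difference for one macroblock is lost (replaced by 0 here), every
    DC value after it is wrong until the predictor is reset at the next slice.
    """
    dc, decoded = 0, []
    for i, diff in enumerate(dc_diffs):
        if i == lost_index:
            diff = 0                 # data missing: decoder substitutes something
        dc += diff
        decoded.append(dc)
    return decoded

diffs = [100, 5, -3, 8, 2]                 # one slice worth of DC differences
print(decode_dc(diffs))                    # [100, 105, 102, 110, 112]
print(decode_dc(diffs, lost_index=1))      # all values after macroblock 1 are off by 5
```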

The effect can be even more damaging when global data is corrupted. An example of this is the timing information in an MPEG stream. The system layer specification of MPEG imposes that the decoder clock be synchronized with the encoder clock via periodic refresh of the program clock reference sent in some packet. Too much jitter on packet arrival can corrupt the synchronization of the decoder clock, which can result in highly noticeable impairments.

The visual effects of such losses vary a lot among decoders depending on their ability to deal with corrupted streams. Some decoders never recover from certain errors, while others apply clever concealment methods in order to minimize such effects.


Figure 10.4 Spatial and temporal propagation of losses in an MPEG-compressed video sequence. The loss of a single macroblock causes the inability to decode the data up to the end of the slice. Macroblocks in neighboring frames that are predicted from the damaged area are corrupted as well.

4. VISION MODELS

Modeling the human visual system is a challenging task due to its inherent complexity; many of its properties are not fully understood even today. Its components have been studied in detail, but putting all the pieces together for a comprehensive model of human vision is far from trivial [73]. Quite a few models for still images have been developed in the past; their extension to moving pictures, however, has not received much attention until recently. In this section, we briefly review the development of metrics. We then present a perceptual distortion metric developed by the authors and discuss how the performance of such systems can be evaluated in a meaningful and reliable way.

4.1 MODELS AND METRICS

The objective for any vision model must be good agreement with experimental data. Threshold experiments and preference tests represent some of the most reliable methods available (cf. Chapter ??). Therefore, an application making use of a vision model to measure perceptual differences in some way provides the most direct evaluation possibility. For this reason, we focus on vision models wrapped into distortion metrics here.

Distortion metrics need not necessarily rely on sophisticated models of the human visual system in order to perform well. They can exploit knowledge about the compression algorithm and the pertinent types of artifacts (cf. Section 3.4). Considering the variety of compression algorithms available and the rapid change of technology in this field, however, a distortion metric that is independent of the particular algorithm is preferable in order to avoid early obsolescence. Metrics based on human vision models are a way to achieve this technology independence, because they are the most general and potentially the most accurate ones [73].

Lukas and Budrikis [38] were the first to propose a spatio-temporal model of the human visual system for use in a video distortion metric. Other models and metrics followed now and then, but only in the past few years has there been an increasing interest in this topic, particularly in the engineering community. This is mainly due to the advent of digital video systems, which have exposed the limitations of the techniques traditionally used for video quality measurement.

For conventional analog video systems there are well-established performance standards. They rely on particular test signals and measurement procedures to determine parameters such as differential gain, differential phase or waveform distortion, which can be related to perceived quality with relatively high accuracy [80]. While these parameters are still useful today, their connection with perceived quality has become much more tenuous: because of compression, digital video systems exhibit artifacts fundamentally different from analog video systems (see Section 3.4). The amount and visibility of these distortions strongly depend on the actual scene content. Therefore, traditional signal quality measurements are inadequate for the evaluation of these compression artifacts.

Given these limitations, the designers of compression algorithms have had to resort to subjective viewing tests in order to obtain reliable ratings for the quality of compressed images or video (see Section 4.3.1). While these tests – if executed properly – certainly are the best measure of “true” perceptual quality, they are complex, time-consuming and consequently expensive. Hence, they are often highly impractical or not feasible at all.

Looking for faster alternatives, researchers have turned to simple error measures such as mean squared error (MSE) or signal-to-noise ratio (SNR), suggesting that they would be equally valid. However, these simple error measures operate solely on a pixel-by-pixel basis and neglect the important influence of image content and viewing conditions on the actual visibility of artifacts. Therefore, they often do not correlate well with perceived quality. These problems prompted the development of distortion metrics based on models of the human visual system.

4.2 A PERCEPTUAL DISTORTION METRIC

We now present the perceptual distortion metric (PDM) developed by the authors [60,74]. The underlying vision model – an extension of a model for still images [72] – incorporates color perception, temporal and spatial mechanisms, contrast sensitivity, pattern masking, and the response properties of neurons in the primary visual cortex. The PDM works as follows (see Figure 10.5): After conversion to opponent-colors space, each of the resulting three components is subjected to a spatio-temporal perceptual decomposition, yielding a number of perceptual channels. They are weighted according to contrast sensitivity data and subsequently undergo a contrast gain control stage. Finally, all the sensor differences are combined into a distortion measure.

Figure 10.5 Block diagram of the PDM [74].

4.2.1 Color Space Conversion. The first stage of the PDM performs the color space conversion of the video input, usually coded in Y′C′BC′R. According to the theory of opponent colors, the human visual system decorrelates the input signals from the cones on the retina into black-white (B-W), red-green (R-G) and blue-yellow (B-Y) difference signals (cf. Section 3.2). The PDM relies on a particular opponent-colors space that is pattern-color separable [43,44], i.e. color perception and pattern sensitivity can be decoupled and treated in separate stages.

4.2.2 Perceptual Decomposition. The perceptual decomposition models the multi-channel architecture of the human visual system. It is performed first in the temporal and then in the spatial domain. Decomposing the input into a number of spatio-temporal channels is necessary in order to be able to account for the fact that masking is strongest between stimuli of similar characteristics (e.g. similar frequency and orientation) in subsequent stages.

The temporal filters used in the PDM are based on a recent model of temporal mechanisms [19]. The design objective for these filters in the PDM was to keep the delay to a minimum, because in some applications of distortion metrics such as monitoring and control, a short response time is crucial. A trade-off has to be found between an acceptable delay and the accuracy with which the temporal mechanisms ought to be approximated. Recursive infinite impulse response (IIR) filters fare better in this respect than (non-recursive) finite impulse response (FIR) filters [35].

Therefore, the temporal mechanisms are modeled by two IIR filters in the PDM. They were computed by means of a least-squares fit to the frequency magnitude responses of the respective mechanisms. A filter with 2 poles and 2 zeros was fitted to the sustained mechanism, and a filter with 4 poles and 4 zeros was fitted to the transient mechanism. This has been found to yield the shortest delay while still maintaining a good approximation of the frequency responses, as shown in Figure 10.6. In the present implementation, the low-pass filters are applied to all three color channels, but the band-pass filter is applied only to the luminance channel in order to reduce computing time. This simplification is based on the fact that color contrast sensitivity is rather low for high frequencies (cf. Section 2.2).

Figure 10.6 Frequency responses of sustained (low-pass) and transient (band-pass) mechanisms of vision according to [19] (solid), and the IIR filter approximations used in the PDM for a sampling frequency of 50 Hz (dashed).
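A sketch of how such temporal filtering might be applied is given below, using generic Butterworth filters as stand-ins. The actual PDM coefficients were obtained by fitting to [19] and are not reproduced in this chapter; the cut-off and band edges below are assumed values chosen only for illustration (a 2nd-order band-pass happens to have the same 4-pole/4-zero structure as the transient filter described above).

```python
import numpy as np
from scipy import signal

fs = 50.0   # sampling frequency in Hz (frame rate)

# Stand-in filters, NOT the fitted PDM coefficients:
# a 2nd-order low-pass for the sustained mechanism (2 poles, 2 zeros)
# and a 2nd-order band-pass for the transient mechanism (4 poles, 4 zeros).
b_sus, a_sus = signal.butter(2, 5.0, btype='low', fs=fs)           # ~5 Hz cut-off (assumed)
b_tra, a_tra = signal.butter(2, [5.0, 15.0], btype='band', fs=fs)  # ~5-15 Hz band (assumed)

frames = np.random.rand(200)    # luminance of one pixel over 200 frames
sustained = signal.lfilter(b_sus, a_sus, frames)
transient = signal.lfilter(b_tra, a_tra, frames)
print(sustained.shape, transient.shape)
```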

The decomposition in the spatial domain is carried out by means of the steerable pyramid transform [54] (source code and filter kernels for the transform are available at …/~eero/steerpyr.html). This transform decomposes an image into a number of spatial frequency and orientation bands. Its basis functions are directional derivative operators. For use within a vision model, it has the advantage of being rotation-invariant and self-inverting, and it minimizes the amount of aliasing in the subbands. In the present implementation, the basis filters have octave bandwidth and octave spacing; five subband levels with four orientation bands each plus one low-pass band are computed (see Figure 10.7 for an illustration). The same decomposition is used for all channels.

Figure 10.7 Illustration of the partitioning of the spatial frequency plane by the steerable pyramid transform [54]. Three levels and the isotropic low-pass filter are shown. The bands at each level are tuned to orientations of 0, 45, 90 and 135 degrees. The shaded region indicates the spectral support of a single subband, whose actual frequency response is shown on the right.

4.2.3 Contrast Gain Control Stage. Modeling pattern masking is one of the most critical aspects of video quality assessment, because the visibility of distortions is highly dependent on the local background. Contrast gain control models can explain a wide variety of empirical masking data. These models were inspired by analyses of the responses of neurons in the visual cortex of the cat [2,25,26], where contrast gain control serves as a mechanism to keep neural responses within the permissible dynamic range while at the same time retaining global pattern information.

Contrast gain control can be realized by an excitatory nonlinearity that is inhibited divisively by a pool of responses from other neurons [16,58]. Masking occurs through the inhibitory effect of the normalizing pool. A mathematical generalization of these models facilitates the integration of many kinds of channel interactions and spatial pooling [67]. Introduced for luminance images, this contrast gain control model can be extended to color and to sequences [72,74]. In its most general form, the above-mentioned response pool may combine coefficients from the dimensions of time, color, temporal frequency, spatial frequency, orientation, space, and phase; in the present implementation of the PDM, it is limited to orientation.
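A generic divisive-normalization sketch of this idea is shown below, with illustrative (not fitted) parameter values and a pool restricted to the orientation dimension, as in the present PDM implementation; the function name and the exponents are assumptions for the example only.

```python
import numpy as np

def contrast_gain_control(coeffs, p=2.4, q=2.0, b=0.1):
    """Divisive normalization over one location's orientation coefficients.

    Each excitatory response |c|^p is divided by a pool of responses over
    the orientation dimension plus a constant, which produces the masking
    behaviour described in the text.  Parameter values are illustrative.
    """
    excitation = np.abs(coeffs) ** p
    pool = b ** q + np.sum(np.abs(coeffs) ** q)   # inhibitory normalizing pool
    return np.sign(coeffs) * excitation / pool

# A strong "masker" coefficient suppresses the response to a weak "target".
weak_alone  = contrast_gain_control(np.array([0.05, 0.0, 0.0, 0.0]))
weak_masked = contrast_gain_control(np.array([0.05, 0.8, 0.0, 0.0]))
print(weak_alone[0], weak_masked[0])   # the first response shrinks when masked
```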

4.2.4 Detection and Pooling. The information residing in various channels is integrated in higher-level areas of the brain. This can be simulated by gathering the data from these channels according to rules of probability or vector summation, also known as pooling [46].

The pooling stage of the PDM combines the elementary differences between the sensor outputs over several dimensions by means of vector summation. In principle, any subset of dimensions can be used, depending on what kind of result is desired. For example, pooling may be limited to single frames first to determine the variation of distortions over time, and the total distortion can then be computed from the values for each frame.

4.2.5 Model Fitting. The model contains several parameters that have to be adjusted in order to accurately represent the human visual system. Threshold data from contrast sensitivity and contrast masking experiments are used for this procedure. In the fitting process, the input of the PDM imitates the stimuli used in these experiments, and the free model parameters are adjusted in such a way that the output approximates these threshold curves.

Contrast sensitivity is modeled by setting the gains of the spatial and temporal filters such that the model predictions match empirical threshold data from spatio-temporal contrast sensitivity experiments for both color and luminance stimuli. While this approach may be slightly inferior to pre-filtering the B-W, R-G and B-Y channels with their respective contrast sensitivity functions in terms of approximation accuracy, it is easier to implement and saves computing time. For the B-W channels, the weights are chosen so as to match contrast sensitivity measurements from [32]. For the R-G and B-Y channels, similar data from [33] are used.

The parameters of the contrast gain control stage are determined by fitting the model’s responses to masked gratings. For the B-W channel, empirical data from several intra- and inter-channel contrast masking experiments from [16] are used. For the R-G and B-Y channels, the parameters are adjusted to fit similar data from [56].

In the vector summation of the pooling process, different exponents have been found to yield good results for different experiments and implementations. In the PDM, pooling over channels and over pixels is carried out with an exponent of 2, whereas an exponent of 4 is used for pooling over frames.
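A minimal sketch of this two-stage vector summation (Minkowski pooling) with the exponents quoted above is given here; the per-frame distortion values are random placeholders rather than outputs of an actual metric.

```python
import numpy as np

def minkowski_pool(values, beta):
    """Vector summation (Minkowski pooling): (sum |v|^beta) ** (1/beta)."""
    values = np.asarray(values, dtype=float)
    return np.sum(np.abs(values) ** beta) ** (1.0 / beta)

# Pool hypothetical sensor differences within each frame with beta = 2
# (over channels and pixels), then over frames with beta = 4.
per_frame = [minkowski_pool(np.random.rand(1000), beta=2) for _ in range(25)]
total_distortion = minkowski_pool(per_frame, beta=4)
print(total_distortion)
```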

Our simulation results indicate that the overall quality of the fits to the above-mentioned empirical data is quite good and close to the difference between measurements from different observers. Most of the effects found in the psychophysical experiments are captured by the model. However, one drawback of this modeling approach should be noted: Because of the nonlinear nature of the model, the parameters can only be determined by means of a numerical iterative fitting process, which is computationally expensive.

4.3 EVALUATION

In order to evaluate vision models, subjective experiments are necessary. Subjective ratings form the benchmark for objective metrics. However, different applications may require different testing procedures (cf. Chapter ??) and data analysis methods.

4.3.1 Subjective Testing. Formal subjective testing is defined in ITU-R Rec. 500 [28], which suggests standard viewing conditions, criteria for observer and test scene selection, assessment procedures, and analysis methods. We outline three of the more commonly used procedures here:


Double Stimulus Continuous Quality Scale (DSCQS). Viewers are shown multiple sequence pairs consisting of a “reference” and a “test” sequence, which are rather short (typically 10 seconds). The reference and test sequence are presented twice in alternating fashion, with the order of the two chosen randomly for each trial. Subjects are not informed which is the reference and which is the test sequence. They rate each of the two separately on a continuous quality scale ranging from “bad” to “excellent”. Analysis is based on the difference in rating for each pair, which is often calculated from an equivalent numerical scale from 0 to 100 (a small numerical sketch of this difference computation follows after this list).

Double Stimulus Impairment Scale (DSIS). As opposed to the DSCQS method, the reference is always shown before the test sequence, and neither is repeated. Subjects rate the amount of impairment in the test sequence on a discrete five-level scale ranging from “very annoying” to “imperceptible”.

Single Stimulus Continuous Quality Evaluation (SSCQE) [40]. Instead of seeing separate short sequence pairs, viewers watch a program of typically 20-30 minutes duration which has been processed by the system under test; the reference is not shown. Using a slider whose position is recorded continuously, the subjects rate the instantaneously perceived quality on the DSCQS scale from “bad” to “excellent”.
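As a small numerical sketch of the DSCQS difference ratings mentioned above (all ratings are invented for illustration), the per-pair score can be obtained by averaging the rating differences over observers:

```python
import numpy as np

# Hypothetical DSCQS ratings (0-100 scale) from five observers for one
# reference/test sequence pair; the numbers are invented.
reference_ratings = np.array([82.0, 75.0, 90.0, 68.0, 85.0])
test_ratings      = np.array([60.0, 55.0, 72.0, 50.0, 66.0])

# The analysis is based on the rating difference for each observer.
difference_scores = reference_ratings - test_ratings
print(difference_scores.mean())   # a simple difference mean opinion score
```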

4.3.2 Metric Comparisons. The sequences and subjective ratings used in demonstrations of the performance of a particular metric have been mostly proprietary, as hardly any subjectively rated sequences are publicly available. This has made fair comparisons of different metrics difficult.

In order to alleviate this problem, the Video Quality Experts Group (VQEG, see http://www.crc.ca/vqeg/ for an overview of its activities) was formed in 1997. Its objectives have been to collect reliable subjective ratings for a well-defined set of sequences and to evaluate the performance of different video quality assessment systems with respect to these sequences. The emphasis of the first phase of VQEG was on production- and distribution-class video, i.e. mainly MPEG-2 encoded sequences with different profiles, levels and other parameter variations, including encoder concatenation, conversions between analog and digital video, and transmission errors. A set of 8-second scenes emphasizing different characteristics (e.g. spatial detail, color, motion) was selected by independent labs; the scenes were disclosed to the proponents only after the submission of their metrics. In total, 20 scenes were encoded for 16 test conditions each.

长句的翻译-复杂的长句

长句的翻译-复杂的长句 泛瑞翻译 在法律翻译中,最难处理的莫过于“长句”。“长句”具有两个典型特点:句子字数多,起码要超过50个单词;句子结构复杂,从句和各种修饰性成分较多。在法律英语中,“长句”出现的频率远远多于等其他英语作品。可以说,正确地理解和翻译长句,是法律翻译成败的关键。 对于法律英语中的“长句”,可以通过“拆分一组合”翻译法加以翻译。具体而言,就是要首先按照从宏观到微观的原则,将“长句”拆分成“短句”,进而分析出句子的主体结构;然后再把“短句”翻译出来;最后再把译文组合成通顺的汉语。其中的关键是“拆分”长句和“组合”译文。 在“拆分”长句的过程中,有一些拆分的“标志”:第一,句子中有标点符号的,可以从标点符号的位置,将句子分割开。第二,句子中有“连接词”的,可以将句子从“连接词”处进一步分割;所谓的“连接词”包括表示逻辑关系的词,比如and, or, but, however等;引导从句的词,如that, which, when, where,while等。第三,句子中有介词短语的,比如at, for, in 等介词所引导的介词成分,可以对介词成分进行进一步的分割。 在“组合”译文的过程中,要注意其中的逻辑关系,在英一汉法律翻译处理上,最终的译文要符合汉语的表达习惯,达到准确、通顺、易懂的标准。 在法律英语中长句的形成主要体现为以下三种方式:第一,复杂的从句;第二,多个并列结构;第三,多个分句。本课将通过例句具体分析上述三种类型“长句”的翻译及技巧。 一、复杂的从句 通过多个从句来使句子结构复杂化,是增加句子长度的重要方式,例如: 例1:When the goods have arrived at their destination,the consignee that demands de-livery of the goods under the contract of carriage shall accept delivery of the goods at the time or within the

大学英语四级长句翻译方法及技巧

第19卷第12期 武汉科技学院学报Vol.19 No.12 2006年12月 JOURNAL OF WUHAN UNIVERSITY OF SCIENCE AND ENGINEERING Dec. 2006 英语长句翻译方法及技巧 张艳萍 (湛江师范学院大学外语部, 广东湛江 524048) 摘要:英语长句翻译是英语学习中的一个难点,本文从英汉语言对比的角度,探讨了英汉两种语言的 差异,进一步分析了两种语言长句的特点,概述了英语长句的常用的四种翻译方法,并举例分析了这些 方法在实际中的运用。 关键词:英汉长句;差异;翻译;技巧 中图分类号:H315.9 文献标识码:A 文章编号:1009-5160(2006)-0200-04 我们在英语教学过程中,往往会发现学生在汉译英时出现中国式英语,英译汉时句子却“西化”。究其原因我认为这主要是因为英语和汉语来自两种完全不同的文化语言体系,语序差别甚大,尤其遇到复杂长句,除了需要较强的对比分析理解能力外,还要求我们掌握一定的翻译理论和技巧以及具备较好的语言表达能力。为此,本人结合自己的英语教学实际,拟从英汉语言对比的角度来对英语长句的翻译问题作些探讨。 1 英汉句子结构的差异 人类语言的多样性,使翻译成为人类交流的重要媒介。同时,由于不同语言体系的差异,在英汉翻译里,英语和汉语会在句法结构、内在逻辑关系存在着明显差异: (1)英语句子重形合,汉语句子重意合。汉语注重隐性连贯注重逻辑事理顺序、注重功能、意义,注重以神统形,形合手段比英语少得多,没有英语所常用的那些关系代词、关系副词、连接代词和连接副词。并且汉语介词数量少,句式结构上也无太多的限制,可以利用说话的语气、环境及语言结构内部的相互衬托等条件使语句尽量辞约义丰。所以汉语是一种必须联系交际人主体意识、语言环境、句子表达功能作动态的意念分析的重“意合”的语言,是有别于英语句子重“形合”,试看下面句子: 例1:My idea of a good P.E. class is one where youth are involved in at least 20 minutes of basic movement that gets their heart rates up. 译文:说到一节好的体育课,我的想法是青年在体育课中至少要进行20分钟使他们心跳加快的基本运动。 这是一个典型的重形合的英语句子,全句用两个关系代词将两个定语从句联系起来,在译文中,将关系一层层理清楚,整句语气从容不迫,这就符合了汉语的叙事方法。 例2:不听老人言,吃亏在眼前。 译文1:If you wish good advice, consult an old man. 原文中的假设关系是隐含的,译成英语时用连词if把假设关系给表达出来,从这一例句可以看出英语重形合而汉语重意合的句子特点。 译文2:Who never consults an old man may suffer loss. 此句用名词性从句来翻译,同样体现了英汉两种语言在“形”和“意”上的区别。 (2)汉语通常根据时间顺序逐个翻译,而英语则较注重空间顺序。 汉语句中可常见两个以上的动词,甚至几乎全句皆动词。如:孩子们手里拿着老师给他们的礼物,唱着、 收稿日期:2006-09-17 作者简介:张艳萍(1978- ),女,硕士研究生,研究方向:英语翻译.

日语句子的翻译技巧---复杂长句专题

一、如何分析理解长句和复杂句 (一)掌握日语句子结构特点: 1、谓语位于句子的末尾。 2、谓语是整个句子的重点和核心。 (二)分析日语的句子结构: 日语和汉语一样,有单句和复句。单句主要是由句节和词组构成,而复句是由分句构成, 还可分为联合复句(由表示并列、选择、递进等关系的分句构成)和偏正复句(由表示转折、假设、因果、推论、取舍、让步等关系的分句构成)。分析单句着眼于分析各句节或词组在句中所属的成分;分析复句则应该弄清分句与分句之间的关系。再逐一分析各个分句。 ○約束の時間にたがわずはいってこられたその方はわたしの隣の椅子に腰を下ろすや否や、十年の知己でもあるかのように、日中友好運動の発展ぶりについて質問され、運動の現状 をいろいろ聞かれた。 ●他按约定的时间走了进来,刚在我旁边的椅子上坐下,就像十年的知己似的,问起中日友好 运动的发展情况,并详细询问了运动现状。 ○同様に、西欧的文明圏に普遍的なエチケットで無作法とされている行為の中で、食卓で音をたてて飲み食いすることの与える不快感の度は、今日の日本人の想像しうるよりもはるかに強いものがあるのである。 ●同样,在西欧文明世界所普遍通行的礼节中,餐桌上吃喝时发出响声被认为是不礼貌的行 为。这种行为给人们带来的不愉快的程度是远非日本人所能想象得到的。 ○頭脳全体の働きは非常に複雑なもので、神秘的にさえ見えるが、これを構成する個々の脳細胞に関しては、それが他の脳細胞に刺激を与える興奮状態にあるか、逆に非興奮状態にあるかの二つの状態の差があるに過ぎない。 ●整个大脑的活动非常复杂,甚至令人觉得神秘。然而,构成人脑的每一个细胞却只有两种状态的差别: 或处于刺激其他脑细胞的兴奋状态,或与此相反,处于非兴奋状态。 (三)分析长句和复杂句的方法和步骤: 第一步:抓住句子的重点和核心:句子末尾的谓语 第二步:找出与句末位于相呼应的主语以及与句末谓语有直接关系的其他谓语,并分析其关系 第三步:逐一找出与上述主语、谓语有直接关系的部分,并分析其关系 第四步:分析剩余的部分,并准确理解它们在句中的作用

(完整版)英语中长句的翻译

Chapter 13 英汉长句的翻译 Translation of English Long Sentences 1. 何谓长句 所谓长句,主要指语法结构复杂、修饰成分较多、内容层次在两个或两个以上的复合句,亦可指含义较多的简单句。 2、汉英长句比较 英、汉两种语言在句法上存在差异,英语多为形合句,汉语多为意合句。汉语句子多属于紧缩型,英语的句子多属于扩展型.英语修辞语位置相对灵活,前置后置,比较自如,尤其倾向于后置,十分有利于句子的扩展。英语句子较长,且较多使用关联词和从句。多种从句(主语、状语、定语、表语从句)并存的长句比比皆是。因为英语结构复杂,层次变化多样,容易产生误解,所以英语长句翻译成为难点。 3、英语长句的分析 在分析长句时可以采用下面的方法: (1) 找出全句的主语、谓语和宾语(主干/句), 从整体上把握句子的结构。 (2) 找出句中所有的谓语结构、非谓语动词、介词短语和从句的引导词。 (3) 分析从句和短语的功能, 例如, 是否为主语从句, 宾语从句, 表语从句等,若是状语, 它是表示时间、原因、结果、还是表示条件等等)。 (4) 分析词、短语和从句之间的相互关系, 例如, 定语从句所修饰的先行词是哪一个等。 (5) 注意插入语等其他成分。 例:Behaviorists suggest that the child who is raised in an environment where there are many stimuli which develop his or her capacity for appropriate responses will experience greater intellectual development. Behaviorists suggest that the child who is raised in an environment where there are many stimuli which develop his or her capacity for appropriate responses will experience greater intellectual development. 分析: (1) 该句的主语为behaviorists, 谓语为suggest, 宾语为一个从句, 因此整个句子为Behaviorist suggest that-clause 结构。 (2) 该句共有五个谓语结构, 它们的谓语动词分别为suggest, is raised, are, develop, experience等。 译文:行为主义者认为, 如果儿童的成长环境里有许多刺激因素, 这些因素又有利于其适当反应能力的发展, 那么, 儿童的智力就会发展到较高的水平 4. 翻译方法 4 . 1 顺译法Synchronizing 有些英语长句叙述的一连串动作按发生的时间先后安排,或按逻辑关系安排,与汉语的表达方式比较一致,可按原文顺序译出。 例(1)In international buying and selling of goods, there are a number of risks, which, if they occur, will involve traders in financial losses. (在)国际贸易货物的买卖(中)存在着各种各样的风险,这些风险的发生将会给(有关的)商人们带来经济损失。 例(2) (1)In Africa I met a boy,(2) who was crying as if his heart would break and said,(3)when I spoke

英语句子翻译技巧

英语句子翻译技巧 以下浅谈英译汉的几点技巧。 第一,翻译时注意英文的句型,英文的句型一般来说有相应的中文译法。如It的句型的翻译: (1)It is+名词十从句: It is a fact that…事实是…… It is a question that………是个问题 It is good news that………是好消息 it is common knowledge………是常识 (2) It is+过去分词十从句: It is said that…据说…… I t must be pointed out that…必须指出…… It is asserted that…有人主张…… It is supposed that…据推测…… It is believed that…据信…… It must be admitted that…必须承认…… It is reported that…据报道…… It will be seen from ii that…由此可见…… It has been proved that…已证明…… It is general1y considered that…人们普遍认为…… (3)It is+形容词十从句: It is necessary that…有必要…… It is likely that…很可能…… It is clear that…很清楚…… It is important that…重要的是…… (4) It+不及物动词十从句: It follows that…由此可见…… It happens that…碰巧…… It turne d out that…结果是…… 第二,注意英语被动句的翻译。英文的被动句经常用汉语主动句表达,如:You are requested to give a performance 英文的被动句译成汉语的主动句:请你给我们表演一个节目。英文中被动意义也可以用汉语中含有主动意义的句子来表达。常译成“被”、“由”、“受”、

长句译法

长句的翻译 1.什么是英语长句? 英语长句一般指的是各种复杂句,复杂句里可能有多个从句,从句与从句之间的关系可能包孕、嵌套,也可能并列,平行。所以翻译长句,实际上我们的重点主要放在对各种从句的翻译上。从功能来说,英语有三大复合句,即:①名词性从句,包括主语从句、宾语从句、表语从句和同位语从句;②形容词性从句,即我们平常所说的定语从句;③状语从句。 2.英语长句的特点是什么? 一般说来,英语长句有如下几个特点: 1)结构复杂,逻辑层次多; 2)常须根据上下文作词义的引申; 3)常须根据上下文对指代词的指代关系做出判断; 4)并列成分多; 5)修饰语多,特别是后置定语很长; 6)习惯搭配和成语经常出现。 3.英语长句的分析方法是什么? 1)找出全句的主语、谓语和宾语,即句子的主干结构; 2)找出句中所有的谓语结构、非谓语结构、介词短语和从句的引导词; 3)分析从句和短语的功能,例如,是否为主语从句、宾语从句、表语从句或状语从句等;以及词,短语和从句之间的关系; 4)分析句子中是否有固定词组或固定搭配、插入语等其他成分。 4.长句翻译方法 In Africa I met a boy,who was crying as if his heart would break and said,when I spoke to him,that he was hungry because he had had no food for two days. 分析:

第一,拆分句子:这个长句可以拆分为四段:In Africa I met a boy/who was crying as if his heart would break/when I spoke to him,that he was hungry because/he had had no food for two days. 第二,句子的结构分析:(1)主干结构是主语+过去式+宾语:I met a boy…。(2)crying后面是状语从句“as if his heart would break”。(3)“when I spoke to him”是介于“said”和“that he was hungry because”之间的插入语。 第三,难点部分的处理:“crying as if his heart would break”应译为“哭得伤心极了”。 参考译文:在非洲,我遇到一个小孩,他哭得伤心极了,我问他时,他说他饿了,两天没有吃饭了。 一般来说,长句的翻译有顺序法、逆序法、分译法和综合法四种。现将各种方法举例说明如下: 1)顺序法 有些英语长句叙述的一连串动作按发生的时间先后安排,或按逻辑关系安排,与汉语的表达方式比较一致,可按原文顺序译出。例如: Combined with digital television sets, videodiscs can not only present films but also offer surround sound which provides theatre quality-amazing reality by which the viewers may have an illusion that they were at the scene and witnessed everything happening just around them. 分析:按意群的关系,该句可以拆分为五部分:Combined with digital television sets/videodiscs can not only present films but also offer surround sound/which provides theatre quality-amazing reality/by which the viewers may have an illusion/that they were at the scene and witnessed everything happening just around them.除了必要的增减词,原文各句的逻辑关系,表达次序与汉语基本一致,因此可以按原文译出。 参考译文:与数字式电视机相结合,图像光盘不仅可以演电影,还提供环境声音,产生电影院效果——令人吃惊的真实感,使观看者产生一种错觉,以为他们在现场目睹他们周围发生的一切。 2)逆序法 “逆序法”又称“倒置法”,主要指句子的前后倒置问题。有些英语长句的表达次序与汉语习惯不同,甚至语序完全相反,这就必须从原文的后面译起,逆着原文的顺序翻译。逆序法在长句的翻译中,我们可根据不同的情况按意群进行全部逆序或部分逆序。例如:

(完整版)英语长句子翻译技巧

英语长句子翻译技巧 对于每一个英语句子的翻译, 并不只是使用一种翻译方法, 而是多种翻译方法的综合运用, 这在英语长句的翻译中表现得尤为突出。长句在科技性的文体中的出现极为频繁, 因此也就成为研究生入学考试的重点, 通过对近年来试题的 分析我们可以看出, 所考查的绝大多数划线的部分都是长句。在翻译长句时, 首先,不要因为句子太长而产生畏惧心理,因为,无论是多么复杂的句子,它都是由一些基本的成分组成的。其次要弄清英语原文的句法结构, 找出整个句子的中心内容及其各层意思, 然后分析几层意思之间的相互逻辑关系, 再按照汉语的 特点和表达方式, 正确地译出原文的意思, 不必拘泥于原文的形式。 英语长句的分析 一般来说, 造成长句的原因有三方面: (1) 修饰语过多;(2) 并列成分多; (3) 语言结构层次多。在分析长句时可以采用下面的方法: (1) 找出全句的主语、谓语和宾语, 从整体上把握句子的结构。 (2) 找出句中所有的谓语结构、非谓语动词、介词短语和从句的引导词。 (3) 分析从句和短语的功能, 例如, 是否为主语从句, 宾语从句, 表语从 句等,若是状语, 它是表示时间、原因、结果、还是表示条件等等)。 (4) 分析词、短语和从句之间的相互关系, 例如, 定语从句所修饰的先行词是哪一个等。 (5) 注意插入语等其他成分。 (6) 注意分析句子中是否有固定词组或固定搭配 英语习惯于用长的句子表达比较复杂的概念, 而汉语则不同,常常使用若干短句, 作层次分明的叙述。因此, 在进行英译汉时, 要特别注意英语和汉语之间的差异, 将英语的长句分解, 翻译成汉语的短句。在英语长句的翻译过程中, 我们一般采取下列的方法。 (1) 顺序法。当英语长句的内容的叙述层次与汉语基本一致时, 可以按照英语原文的顺序翻译成汉语。 (2) 逆序法。英语有些长句的表达次序与汉语表达习惯不同, 甚至完全相反, 这时必须从原文后面开始翻译。 (3)分句法。有时英语长句中主语或主句与修饰词的关系并不十分密切, 翻译时可以按照汉语多用短句的习惯, 把长句的从句或短语化成句子, 分开来 叙述,为了使语意连贯, 有时需要适当增加词语。 (4) 综合法。上面我们讲述了英语长句的逆序法、顺序法和分句法, 事实上,在翻译一个英语长句时, 并不只是单纯地使用一种翻译方法, 而是要求我们把各种方法综合使用, 这在我们上面所举的例子中也有所体现。尤其是在一些情况下, 一些英语长句单纯采用上述任何一种方法都不方便, 这就需要我们的 仔细分析, 或按照时间的先后,或按照逻辑顺序, 顺逆结合, 主次分明地对全句进行综合处理,以便把英语原文翻译成通顺忠实的汉语句子。

英语长句的翻译方法

长句的翻译方法

1.顺序翻译法
(1)The problem is that the last generation or so we've come to assume that women should be able, and should want, to do everything that by tradition men have done at the same time as pretty well everything that by tradition women have done.
问题是,在过去二十年时间里,我们已经认定,妇女们应该能够且应该想做男人们传统上所做的一切,而同时也能够做得跟妇女们传统上所做的一切同样好。

(2)Exercise
*Prior to the twentieth century, women in novels were stereotypes of lacking any features that made them unique individuals and were also subject to numerous restrictions imposed by the male-dominated culture.

2.逆序翻译法
(1)It therefore becomes more and more important that, if students are not to waste their opportunities, there will have to be much more detailed information about courses and more advice.
因此,如果要使学生充分利用(上大学)的机会,就得为他们提供关于课程的更为详尽的信息,作更多的指导。这个问题显得越来越重要了。
(2)Exercises
*It is probably easier for teachers than for students to appreciate the reason why learning English seems to become increasingly difficult once the basic structures and

英语长句翻译

1. First put forward by the French mathematician Pierre de Fermat in the seventeenth century, the theorem had baffled and beaten the finest mathematical minds, including a French woman scientist who made a major advance in working out the problem, and who had to dress like a man in order to be able to study at the Ecole Polytechnique.
这个定理先是由十七世纪法国数学家皮埃尔·德·费马提出,曾使一批杰出的数学大师为难,其中包括一位法国女科学家,她在解决这个难题方面取得了重大的进展;为了能在巴黎综合理工学院学习,她曾女扮男装。
简析:夹杂过去分词短语、现在分词短语、动名词及两个定语从句。

2. It is difficult to measure the quantity of paper used as a result of use of Internet-connected computers, although just about anyone who works in an office can tell you that when e-mail is introduced, the printers start working overtime. That is, the growing demand for paper in recent years is largely due to the increased use of the Internet.
由于使用与因特网相连的计算机,要计算所使用的纸张数量是很困难的,然而几乎任何在办公室工作的人都能告诉你,当引进电子邮件后,打印机就开始超时工作。也就是说,近年来人们对于纸张的日益需求主要是由于因特网越来越多的使用。
简析:夹杂较复杂的句型结构,关键词 just about 几乎;overtime 超时地。

3. Perhaps the best sign of how computer and internet use pushes up demand for paper comes from the high-tech industry itself, which sees printing as one of its most promising new markets.
或许,表明电脑及因特网使用促进人们对于纸张的需求的最好迹象源于高科技产业本身,印刷业被认为是高科技产业极有前景的新市场之一。
简析:夹杂较复杂的句型结构,关键词 promising 有前途的。

4. The action group has also found acceptable paper made from materials other than wood, such as agricultural waste.
这个行动组也发现了可以接受的纸,制成这种纸的原料不是木料,而是诸如农业废料之类的其他材料。
简析:关键词 other than 而不是。

5. Mostly borrowed from English and Chinese, these terms are often changed into forms no longer understood by native speakers.
这些术语主要从英语和汉语引入,经常会变成不再被说本族语的人们理解的形式。
简析:关键词 term 术语。

6. It is one of many language books that are now flying off booksellers' shelves.
它是现在很畅销的许多外语书中的一本。
简析:比喻生动形象。

复杂的英语长句翻译详解

复杂的英语长句翻译详解

首先,熟悉一下陌生的单词:
Interpol:国际刑警组织,全称是 International Criminal Police Organization
red notice:也就是我们经常听到的“红通”,和“issue”搭配。“issue”作“发出”、“签发”解释时,后接宾语通常是比较正式的“通知”。
fugitive:逃亡的
heir:继承人
hit-and-run:肇事逃逸

接下来看原文。可以看到原文的主语是 Interpol,但我在处理的时候换掉了主语,并把句子变成了一个被动句。这样做的目的在于方便组织译文,另外还因为这个句子的所有信息基本都指向了 arrest 后面的宾语 heir,所以干脆以他为主,再将主动语态变成被动语态。
“for his role in”被我直接省略为一个“因”字,因为后面的“fatal hit-and-run”就是前面 arrest 的宾语 heir 造成的(查看相关报道或后文可以得知,所以翻译还要会查证),这几个词在原文中只是起连接前后信息作用的结构,表示前后为因果关系而已。
再来看“fatal hit-and-run”。如果直接按照原文的顺序译,那就是“致命的肇事逃逸案件”;但如果我们拆开来看,可以清楚地得出“fatal”是此次“hit-and-run”的后果。所谓“致命”,放在这样的语境,用“致人死亡”更合适,所以把“fatal hit-and-run”译成“肇事逃逸致人死亡”。
“fugitive”被翻译成“(目前在逃)”,用括号把它处理了。这是因为后面的“to the Red Bull billions”实质上也是一个限定性的片段,为了避免宾语 heir 在翻译时前面的限定性词语太多而造成句子不顺畅,所以把它挪到后面用括号处理掉。
第一句最后一个要注意的点是“Thai heir to the Red Bull billions”。从译文可以看出,我把“Thai”和“Red Bull”合在了一起,因为从文中的人名、标题和前后文都可以知道,这里指的是“泰国红牛”。

英语四级翻译辅导:英语长句的译法技巧

英语四级翻译辅导:英语长句的译法技巧

长句在科技性的文体中出现极为频繁,因此也就成为研究生入学考试的重点。通过对近年来试题的分析我们可以看出,所考查的绝大多数划线部分都是长句。
在翻译长句时,首先,不要因为句子太长而产生畏惧心理,因为,无论是多么复杂的句子,它都是由一些基本的成分组成的。其次要弄清英语原文的句法结构,找出整个句子的中心内容及其各层意思,然后分析几层意思之间的相互逻辑关系,再按照汉语的特点和表达方式,正确地译出原文的意思,不必拘泥于原文的形式。

一、英语长句的分析
一般来说,造成长句的原因有三方面:(1)修饰语过多;(2)并列成分多;(3)语言结构层次多。在分析长句时可以采用下面的方法:
(1)找出全句的主语、谓语和宾语,从整体上把握句子的结构。
(2)找出句中所有的谓语结构、非谓语动词、介词短语和从句的引导词。
(3)分析从句和短语的功能,例如,是否为主语从句、宾语从句、表语从句等;若是状语,它是表示时间、原因、结果,还是表示条件等。
(4)分析词、短语和从句之间的相互关系,例如,定语从句所修饰的先行词是哪一个等。
(5)注意插入语等其他成分。
(6)注意分析句子中是否有固定词组或固定搭配。

下面我们结合一些实例来进行分析:
例1. Behaviorists suggest that the child who is raised in an environment where there are many stimuli which develop his or her capacity for appropriate responses will experience greater intellectual development.
分析:(1)该句的主语为 behaviorists,谓语为 suggest,宾语为一个从句,因此整个句子为 Behaviorists suggest that-clause 结构。

考研英语长句翻译基本功

考研英语长句翻译基本功

长句翻译顺序法
长句翻译逆序法
长句翻译分句法
长句翻译综合法

长句翻译顺序法
英语习惯于用长的句子表达比较复杂的概念,而汉语则不同,常常使用若干短句,作层次分明的叙述。因此,在进行英译汉时,要特别注意英语和汉语之间的差异,将英语的长句分解,翻译成汉语的短句。在英语长句的翻译过程中,我们一般采取下列的方法。
(1)顺序法。当英语长句的内容的叙述层次与汉语基本一致时,可以按照英语原文的顺序翻译成汉语。例如:
例1. Even when we turn off the bedside lamp and are fast asleep, electricity is working for us, driving our refrigerators, heating our water, or keeping our rooms air-conditioned.(84年考题)
分析:该句子由一个主句、三个作伴随状语的现在分词以及位于句首的时间状语从句组成,共有五层意思:A. 即使在我们关掉了床头灯深深地进入梦乡时;B. 电仍在为我们工作;C. 帮我们开动电冰箱;D. 加热水;E. 或使室内空调机继续运转。上述五层意思的逻辑关系以及表达的顺序与汉语完全一致,因此,我们可以通过顺序法,把该句翻译成:
即使在我们关掉了床头灯深深地进入梦乡时,电仍在为我们工作:帮我们开动电冰箱,把水加热,或使室内空调机继续运转。
例2. But now it is realized that supplies of some of them are limited, and it is even possible to give a reasonable estimate of their "expectation of life", the time it will take to exhaust all known sources and reserves of these materials.(84年考题)
分析:该句的骨干结构为 "It is realized that…",it 为形式主语,that 引导着主语从句以及并列的 it is even possible to… 结构;其中,不定式作主语,the time… 是 "expectation of life" 的同位语,进一步解释其含义,而 time 后面的句子是它的定语从句。五个谓语结构表达了四个层次的意义:A. 可是现在人们意识到;B. 其中有些矿物质的蕴藏量是有限的;C. 人们甚至还可以比较合理地估计出这些矿物质“可望存在多少年”;D. 将这些已知矿源和储量消耗殆尽的时间。根据同位语从句的翻译方法,把第四层意义的表达作适当的调整,整个句子就翻译为:
可是现在人们意识到,其中有些矿物质的蕴藏量是有限的,人们甚至还可以比较合理地估计出这些矿物质“可望存在多少年”,也就是说,经过若干年后,这些矿物的全部已知矿源和储量将消耗殆尽。
下面我们再列举几个实例:
例3. Prior to the twentieth century, women in novels were stereotypes of lacking any features that made them unique individuals and were also subject to numerous restrictions imposed by the male-dominated culture.
在20世纪以前,小说中的妇女形象都是一个模式。她们没有任何特点,因而无法成为具有个性的人;她们还要屈从于由男性主宰的文化传统强加给她们的种种束缚。
例4. This method of using "controls" can be applied to a variety of situations, and can be used to find the answer to questions as widely different as "Must moisture be present if iron is to rust?" and "Which variety of beans gives the greatest yield in one season?"

笔译基础 长句复杂句

长句、复杂句翻译

认真研读下列长句,确定各句的主干结构(主句结构和从句结构中主语、谓语等)

1. The large-scale migration of unskilled and semi-skilled labour, and of professional manpower, which has taken place in the last two decades has been a reflection of imbalances in the income and employment opportunities and, to some extent, of constraints on the international flow of capital and trade.

2. The men and women throughout the world who think that a living future is preferable to a dead world of rocks and deserts will have to rise and demand, in tones so loud that they cannot be ignored, that common sense, humanity, and the dictates of that moral law which Mr. Dulles believes that he respects, should guide our troubled era into that happiness which only its own folly is preventing.

3. WE THE PEOPLES OF THE UNITED NATIONS DETERMINED to save succeeding generations from the scourge of war, which twice in our lifetime has brought untold sorrow to mankind, and to reaffirm faith in fundamental human rights, in the dignity and worth of the human person, in the equal rights of men and women and of nations large and small, and to establish conditions under which justice and respect for the obligations arising from treaties and other sources of international law can be maintained, and to promote social progress and better standards of life in larger freedom, AND FOR THESE ENDS to practice tolerance and live together in peace with one another as good neighbors, and to unite our strength to maintain international peace and security, and to ensure, by the acceptance of principles and the institution of methods, that armed force shall not be used, save in the common interest, and to employ international machinery for the promotion of the economic and social advancement of all peoples, HAVE RESOLVED TO COMBINE OUR EFFORTS TO ACCOMPLISH THESE AIMS.

英语长句翻译方法

英语翻译过程中,英语长句的译法大致可采取以下四种方法来进行。

一、顺序法
有的英语句子虽长,但它所叙述的动作基本上是按时间顺序、地点顺序或逻辑关系安排的,这与我们中国人的思维方式相吻合,因此,翻译时可依从原文语序。
例:In Africa I met a boy, who was crying as if his heart would break and said, when I spoke to him, that he was hungry because he had had no food for two days.
分析:此句不算很长,但主句宾语 a boy 细藤结大瓜似地带着一个非限制性定语从句直贯句尾,而从句中又有宾语从句 that he was hungry because he had had no food for two days。此宾语从句中的 because he had had no food for two days 又是一个原因状语从句,整个句框形成三级从句;宾语从句中的谓语动词 said 与它的宾语从句之间又为一时间状语 when I spoke to him 所隔。翻译时若要体现这纷繁复杂的语法关系简直难以行文,因而译文按照原文的顺序,将所发生的事情依序道来,虽未复现原文中诸从句与其被修饰语的关系,但原文的思想内容无一遗漏,叙述脉络清晰,符合逻辑关系和汉语行文习惯。
译文:在非洲,我遇到了一个男孩,他哭得伤心极了,我问他时,他说他饿了,两天没有吃饭了。

二、逆序法
有些英语长句的表达次序与汉语习惯不尽相同,甚至恰恰相反,这就要求我们在翻译时须逆着原文,从后面译起。
例:Is it our sound and considered judgment that the tougher subjects of the classics and mathematics should be thrown aside, as suggested by some educators, for doll-playing?
分析:原文为了句子结构的平衡,用形式主语 it 代替冗长的主语从句,而汉语中没有相同的句子结构,译文逆着原文的顺序,先译主语从句的内容再译主句框架,将原文的意义表达得一清二楚,口气语势也与原文一般无二。
译文:像某些教育家所提议的那样,将古典文学和数学这些难学的课程弃之一边,而去学一些非常轻松的科目,难道这是我们经过深思熟虑的正确判断吗?

三、拆译法

英语长句的翻译基本技巧和方法

英语长句的翻译基本技巧和方法

英语习惯于用长的句子表达比较复杂的概念,而汉语则不同,常常使用若干短句作层次分明的叙述。因此,在进行英译汉时,要特别注意英语和汉语之间的差异,将英语的长句分解翻译成汉语的短句。在英语长句的翻译过程中,归纳出以下的一些方法。

(1)顺序法
当英语长句的内容叙述层次与汉语基本一致时,可以按照英语原文表达的层次顺序翻译成汉语,从而使译文与英语原文的顺序基本一致。例如:
But now it is realized that supplies of some of them are limited, and it is even possible to give a reasonable estimate of their "expectation of life", the time it will take to exhaust all known sources and reserves of these materials.(84年考题)
分析:该句的骨干结构为 "It is realized that…",it 为形式主语,that 引导主语从句以及并列的 it is even possible to… 结构;其中,不定式作主语,the time… 是 "expectation of life" 的同位语,进一步解释其含义,而 time 后面的句子是它的定语从句。五个谓语结构表达了四个层次的意义:A. 可是现在人们意识到;B. 其中有些矿物质的蕴藏量是有限的;C. 人们甚至还可以比较合理地估计出这些矿物质“可望存在多少年”;D. 将这些已知矿源和储量消耗殆尽的时间。根据同位语从句的翻译方法,把第四层意义的表达作适当的调整,整个句子就翻译为:
可是现在人们意识到,其中有些矿物质的蕴藏量是有限的,人们甚至还可以比较合理地估计出这些矿物质“可望存在多少年”,也就是说,经过若干年后,这些矿物的全部已知矿源和储量将消耗殆尽。

(2)逆序法
