Improvements in phrase-based statistical machine translation
- 格式:pdf
- 大小:126.41 KB
- 文档页数:8
Useful phrases of Unit 1-6 (Book Three)Unit 1 Mr. Doherty Builds His Dream LifeGet by过得去, find contentment寻觅心灵的满足, a self-reliant life自力更生的生活, a tough life 艰苦的生活, household routine日常的家务, as the old saying goes正如老话说的那样, love/ enjoy every minute of Sth温馨快乐每一分钟., get through the winter过冬, Ivy League schools常春藤联合学校, with Sb.’s blessings带着。
的祝福, on balance总的来说, be employed full time 担任全职工作, mortgage payments贷款按揭, when it comes to Sth.至于。
, aside from出。
以外, dine out外出就餐, standard of living生活水平, make up the difference in income弥补收入差额, attend the opera and ballet听歌剧看芭蕾演出, a tolerance for solitude耐得住寂寞, a self-sufficient life / self-sufficiency自给自足的生活, resist the temptation to do Sth抵制诱惑, leave with a feeling of sorrow苍然离去, a sense of pride自豪感, once economic conditions improve一旦经济条件好转, earn money赚钱.At that point就在那个时候, pick up提取, cut back减少, tight budget预算紧张, on a small scale 小规模.P. 14-16. vocabulary.1. insurance2. on balance3. aside from4. cut back5. resist6. haul7. supplemented8. sprayed9. wicked 10. illustrated 11. budget 12. digest 13. boundary 14. get by 15. at that point1. cut back / down2. pick up3. get by4. get through5. face up to6. turn in7. turning out8. think up.P. 21. Cloze AGets by, temptation, get through, picked up, improvements, aside from, suspect, supplement, profit, primarily, spraying, stacking.Cloze BWhile, escape, begin, because, quit, start, on, but, be, close, have, cutting, cook, cities, however, family.Unit 2 Thee Freedom GiversA gentle breeze微风stand up for支持, a man of principle有原则的人, a historic site一处历史古迹, liberate slaves解放奴隶, be intent on doing下定决心去做, soft knock, iron molding铸铁铸件, spread the news传播新闻, religious convictions宗教信仰, find refuge寻找避难, the next leg ofthe journey在旅途的下一站, principal routes主要路线,face risks面对风险, impose a fine处以罚款, keep a log of保留日志, road signs路牌, a funeral procession送葬, virgin land未开发的荒地地, in the eyes of在某人看来.Slender woman, give up放弃, ironically具有讽刺意味, forge the Underground Railroad伪造地下铁道, on the side侧面, close in around包围, as for至于, make the best of, at huge risk冒着巨大的风险, pass for传递.come a long way进展;进步P. 50-52. vocabulary.1. decades2. historic3. imposed自己担负的4. racial5. slender6. closing in on7. settlement8. site9. mission 10. authorized 11. terminal 12. make the best of 13. exploits 14. religious 15. on the side1. pass for2. stood up for3. laid down4. take on5. let down6. draw on7. come up8. given up.P. 58. Cloze AUnderground, forged, stand up, transport, compelled, convictions, liberating, mission, abolish, intent on, risk.Cloze BWho, the, along, in, that, through, not, as, referred, escape, where, if, in, even, until, instead, as.Unit 3 The Land of the LockOn the latch门关着但没上锁, carry keys带着钥匙, close up关闭;停歇, well-patrolled urban streets巡逻良好的城市街道, the allegedly tranquil areas据称是安静的地方, the era of …is over。
2020年12月四级考试超全答案四级写作01◆【D i r e c t i o n s】For this part, you are allowed 30 minutes to write a n essayon the change of communication. You should write at least 120 words but no more than180 words.◆【参考范文】With the development of science and technology, we have witnessed the various huge changes of our daily life, among which, the change of communication is striking. However, people’s view on it never come to consensus. Concerning it, both communication online and offline have their merits.For one thing, no one denies that communication online brings great convenience to us, especially to those who have friends or relatives in remote areas. Because the change makes it possible for them to have frequent chat. And, the way we contact with others is diverse. Video calls and voice message can both meet modern people’s satisfaction. For another, the change is also making us disconnected. Due to the availability, people are gradually reluctant to have face-to-face communication with surrounding people, which is isolating us from the people we love.Given the factors above, the change of communication, we have to admit, is more like a double-edged sword. Neither do we discard it nor completely rely on it. Instead, we should make reasonable use of it so as to maximize its benefits.◆【参考译文】随着科学技术的发展,我们见证了我们日常生活中的各种巨大变化,其中沟通方式的变化是引人注目的。
TPO51阅读-3Population Growth in Nineteenth-Century Europe原文 (1)译文 (2)题目 (4)答案 (8)背景知识 (9)原文Population Growth in Nineteenth-Century Europe①Because of industrialization,but also because of a vast increase in agricultural output without which industrialization would have been impossible,Western Europeans by the latter half of the nineteenth century enjoyed higher standards of living and longer,healthier lives than most of the world’s peoples.In Europe as a whole,the population rose from188million in1800to400million in1900.By 1900,virtually every area of Europe had contributed to the tremendous surge of population,but each major region was at a different stage of demographic change.②Improvements in the food supply continued trends that had started in the late seventeenth century.New lands were put under cultivation,while the use of crops of American origin,particularly the potato,continued to expand.Setbacks did occur.Regional agricultural failures were the most common cause of economic recessions until1850,and they could lead to localized famine as well.A major potato blight(disease)in1846-1847led to the deaths of at least one million persons in Ireland and the emigration of another million,and Ireland never recovered the population levels the potato had sustained to that point.Bad grain harvests at the same time led to increased hardship throughout much of Europe.③After1850,however,the expansion of foods more regularly kept pace with population growth,though the poorer classes remained malnourished.Two developments were crucial.First,the application of science and new technology to agriculture increased.Led by German universities,increasing research was devoted to improving seeds,developing chemical fertilizers,and advancing livestock.After 1861,with the development of land-grant universities in the United States that had huge agricultural programs,American crop-production research added to this mix. Mechanization included the use of horse-drawn harvesters and seed drills,many developed initially in the United States.It also included mechanical cream separators and other food-processing devices that improved supply.④The second development involved industrially based transportation.With trains and steam shipping,it became possible to move foods to needy regions withinWestern Europe quickly.Famine(as opposed to malnutrition)became a thing of the past.Many Western European countries,headed by Britain,began also to import increasing amounts of food,not only from Eastern Europe,a traditional source,but also from the Americas,Australia,and New Zealand.Steam shipping, which improved speed and capacity,as well as new procedures for canning and refrigerating foods(particularly after1870),was fundamental to these developments.⑤Europe's population growth included one additional innovation by the nineteenth century:it combined with rapid urbanization.More and more Western Europeans moved from countryside to city,and big cities grew most rapidly of all. By1850,over half of all the people in England lived in cities,a first in human history.In one sense,this pattern seems inevitable growing numbers of people pressed available resources on the land,even when farmwork was combined with a bit of manufacturing,so people crowded into cities seeking work or other resources.Traditionally,however,death rates in cities surpassed those in the countryside by a large margin;cities had maintained population only through steady in-migration.Thus rapid urbanization should have reduced overall population growth,but by the middle of the nineteenth century this was no longer the case.Urban death rates remained high,particularly in the lower-class slums, but they began to decline rapidly.⑥The greater reliability of food supplies was a factor in the decline of urban death rates.Even more important were the gains in urban sanitation,as well as measures such as inspection of housing.Reformers,including enlightened doctors,began to study the causes of high death rates and to urge remediation.Even before the discovery of germs,beliefs that disease spread by"miasmas"(noxious forms of bad air)prompted attention to sewers and open garbage;Edwin Chadwick led an exemplary urban crusade for underground sewers in England in the1830s. Gradually,public health provisions began to cut into customary urban mortality rates.By1900,in some parts of Western Europe life expectancy in the cities began to surpass that of the rural areas.Industrial societies had figured out ways to combine large and growing cities with population growth,a development that would soon spread to other parts of the world.译文十九世纪欧洲的人口增长①由于工业化,也由于农业产量的大幅度增加(如果没有这些就不可能实现工业化),西欧人到了十九世纪后半叶比世界上大多数民族享有更高的生活水平和更长寿,更健康的生活。
关于赞美祖国成就的英语演讲稿Ladies and gentlemen, Today, I am honored to speak about the remarkable achievements of our great nation. As we stand on the cusp of a new era, it is essential to reflect on the milestones we have reached and the progress we continue to make. Our country’s journey is a testament to the unwavering spirit and resilience of its people, and it is with immense pride that I highlight some of our most significant accomplishments.First and foremost, our nation has made unprecedented strides in economic development. From humble beginnings, we have transformed into a global economic powerhouse. Our GDP growth rate consistently ranks among the highest in the world, a clear indicator of our robust and dynamic economy. This economic expansion is not merely a statistic; it translates into tangible improvements in the quality of life for ourcitizens. Poverty rates have plummeted, and the standard of living has risen significantly. Our cities are bustling with activity, and our rural areas are benefiting from development projects that bring infrastructure and opportunities to every corner of the nation.One of the cornerstones of our economic success is our commitment to innovation and technology. We have become a leader in various technological fields, from artificial intelligence to renewable energy. Our research institutions and universities are at the forefront of scientific discovery, driving advancements that not only benefit our country butalso contribute to global progress. The government’s suppor t for start-ups and tech companies has fostered an environment where innovation thrives. As a result, we see homegrown companies competing on the world stage, bringing pride and recognition to our nation.In addition to economic growth, our nation has made significant advancements in education. Recognizing that the future of our country lies in the hands of our youth, we have invested heavily in our education system. From primaryschools to universities, our educational institutions are equipped with the resources and facilities necessary toprovide a world-class education. We have also placed a strong emphasis on vocational training, ensuring that our workforceis skilled and prepared for the demands of the modern economy. As a result, our literacy rates are at an all-time high, and our graduates are making their mark in various fields, both domestically and internationally.Healthcare is another area where our nation has seen remarkable progress. We have implemented comprehensive healthcare reforms that ensure access to quality medical services for all our citizens. Our healthcare system is now ranked among the best in the world, with cutting-edgefacilities and highly trained medical professionals.Preventive care and public health initiatives have significantly improved life expectancy and reduced the prevalence of diseases. Moreover, our pharmaceutical industry is thriving, producing life-saving medications and vaccines that contribute to global health.Our nation’s infrastructure development is nothing sh ort of extraordinary. We have built an extensive network of highways, railways, and airports that facilitate the movement of people and goods across the country. This infrastructure not only supports economic growth but also enhances connectivity and fosters social integration. Our urban centers are models of modern city planning, with efficient public transportation systems, green spaces, and sustainable development practices. In rural areas, infrastructureprojects have brought electricity, clean water, and improved sanitation, drastically improving the quality of life for millions.Environmental sustainability is a key priority for our nation. Recognizing the importance of protecting our natural resources, we have implemented policies and initiatives aimed at reducing our carbon footprint and promoting renewable energy. Our investments in solar, wind, and hydropower have made us a global leader in clean energy production. Additionally, our efforts in reforestation and conservation have preserved our biodiversity and natural beauty for future generations. These initiatives are a testament to our commitment to balancing economic growth with environmental stewardship.Culturally, our nation is a vibrant tapestry of traditions, languages, and customs. We take immense pride in our cultural heritage, which is a source of strength and unity. Our government has actively promoted cultural preservation and the arts, ensuring that our rich history is celebrated and passed down through generations. Festivals, music, dance, and literature thrive, reflecting the diversityand creativity of our people. Our cultural exports, including films and literature, have gained international acclaim, showcasing the unique aspects of our identity to the world.On the global stage, our nation plays a crucial role in promoting peace and cooperation. We are active members of international organizations and participate in diplomatic efforts to address global challenges. Our contributions to peacekeeping missions, humanitarian aid, and disaster relief are recognized and appreciated worldwide. We believe in the power of dialogue and cooperation to resolve conflicts and build a better future for all. Our foreign policy is guidedby principles of mutual respect, non-interference, and a commitment to fostering positive relations with all nations.Social progress is another area where our nation shines. We have made significant strides in promoting gender equality, protecting human rights, and fostering social inclusion. Our legal framework supports the rights of all citizens,regardless of gender, ethnicity, or socioeconomic status. We have implemented policies that empower women, protect children, and support the elderly and disabled. Our societyis becoming more inclusive and equitable, with opportunities for all to thrive and contribute to the nation’s development.In the realm of sports, our nation has achieved remarkable success. Our athletes have excelled in various international competitions, bringing home medals and setting records. Sports are a unifying force, bringing peopletogether and fostering a sense of national pride. Our investment in sports infrastructure and talent development programs has created a generation of athletes who are notonly champions but also role models for the youth. The spirit of sportsmanship and the pursuit of excellence are valuesthat resonate deeply within our society.As we look to the future, our nation remains committed to continuous improvement and innovation. We are embracing thechallenges and opportunities of the 21st century with optimism and determination. Our strategic plans focus on sustainable development, technological advancement, andsocial progress. We are building a society where everyone has the opportunity to reach their full potential and contribute to the nation’s prosperity.In conclusion, the achievements of our great nation are a testament to the dedication, hard work, and resilience of our people. From economic growth to technological innovation, from education to healthcare, from infrastructure to environmental sustainability, we have made remarkable progress. Our cultural heritage and social values continue to inspire and unite us. As we move forward, we do so with confidence and pride, knowing that our best days are yet to come. Together, we will continue to build a nation that is prosperous, inclusive, and respected on the global stage.Thank you.。
中文摘要口外弓肌激动器对安氏II类1分类的疗效分析专业:口腔临床医学申请人:肖文辉指导教师:王大为安氏II豢1分类是正畸临床较为常见的错合畸形,通常分为牙性和骨性,一般认为,安氏ll类1分类错合若不及时矫治有可能妨碍儿童颜面部正常生长发育。
对于安氏II类1分类错合只有作出正确的诊断及治疗计划的制定并积极进行早期治疗,才有可能取得矫治成功。
对于骨性安氏II类错合的生长改良治疗一直是口腔正畸学所密切关注的课题。
自从Andresen发明肌激动器以来,其已成为一种常用的功能矫治器用来治疗II类错合畸形,结果能明显改善面形及建立1类合关系。
关于其对上下颌骨及牙列的作用已有不少研究报道。
一些研究证实肌激动器的治疗会使得上下颌骨向。
卜^,向后旋转。
下颌骨的这种旋转被认为有利于1l类1分类低角病例的面形的改着,但不利于II类合关系的矫治。
为了减少单独应用肌激动器在治疗II类t分类错合时的这一副作用,U1lrichTeuscher于1978年提出在使用肌激动器的同时加用高位头帽口外牵引以抵消这些副作用并成功地应用了这种矫治器(口外弓一肌激动器,Teuscher矫治器)。
目的:关于口外弓一肌激动器治疗前后X线头影测量的比较研究已有报道,但其所引起的形态学变化一直是一个争论的问题,日前尚无定论。
本研究拟采用常规x线头影测量法,结合有限元分析法通过对比口外弓一肌激动器治疗前后的颌面形态学变化研究U外弓一肌激动器矫治安氏II类1分类错合对颌面生长发育的影响,同时评价口外弓~肌激动器在治疗II类1分类错合时的作用机制及疗效。
方法:本研究选择21例采用口外弓一肌激动器治疗的II类1分类错合畸形的青少年患考。
治疗前后均拍摄头颅定位侧位片,所有头颅侧位片的拍摄及测量均由同一医生进行,选择14个常用的头影测量项目测量并计算各项目均值及标准差,并进行小样本t检验及两样本均数比较检验。
研究同时采用有限元分析选择有代表性的x线头颅侧位片解剖标志点17个作为节点,由此划分14个三角形单元,对同一患者矫治前后的2张x线头侧位片,以s点为原点,sN平面为X轴确定座标系,通过图形数字化仪和微机测量各节点治疗前后的x,Y座标值。
Intonational Variation in the British IslesEsther Grabe* & Brechtje Post§*Phonetics Laboratory, University of Oxford§Centre for Speech and Language, University of Cambridge esther.grabe@AbstractModels of intonation are typically based on one dialect and one style and do not account for inter- or intra-speaker variability. Speech data from the IViE corpus, however, demonstrate considerable variation in English intonation that occurs both across and within dialects (IViE = Intonational Variation in English, UK ESRC award R000237145, http://www.phon.ox.ac.k/~esther/ivyweb). In this paper, we introduce the IViE corpus and present a selection of findings. Concentrating on nuclear accents, we provide evidence for (1) variation in the production of nuclear accent types and (2) variation in the phonetic realisation of nuclear accents. We discuss data from seven dialects. The results show that intonational differences between dialects of one language can be greater than intonational differences between dialects of two different languages. They also show that there is considerable intra-dialectal variation.1.IntroductionIn the British Isles, segmental phonetic differences between dialects have been investigated extensively [e.g. 1]. Dialectal variation in intonation has received much less attention. Prior investigations have been mono-dialectal and most provide comparisons only with Southern Standard British English.[e.g. 2-12].There are at least two reasons why we need descriptive data on intonational variation within languages. Firstly, the data are required for work on prosodic typology. This topic is becoming increasingly central in research on prosody [13-17]. Ideally, prosodic typologies are based on quantifiable data on variation across and within languages.Secondly, there is a need for quantifiable data on intonational variation in speech technology. At present, statistical data on variation are scarce. Consequently, the lack of quantitative modelling of intonational variation is widely recognised as a serious gap in speech technology. The provision of appropriate, statistical models of intonation is expected to yield significant improvements in synthesis and recognition.2.Background2.1.Previous studies of intonation in the British IslesA number of English dialects have been investigated previously, including Belfast [2,8,10], Tyneside [3,11], Liverpool [4], Welsh English [5-7], London Jamaican [9], Glasgow [11] and Manchester [12]. The studies on Belfast, in particular, show that the level of variation between intonation systems in the British Isles is considerable. Southern British English speakers produce declaratives with falling intonation and questions without morphosyntactic markers with final rises. Belfast speakers do not appear to make this distinction; both sentence types are produced with rising intonation [8, 19]. Until recently, comparable speech data from other varieties of English have not been available. Consequently, in the IViE project, our aims were (1) to collect comparable speech samples from several English dialects in a range of speaking styles, (2) to make the data available in the public domain, and (3) to provide prosodic annotations and linguistic analyses for a subsection of the data.2.2.The need for directly comparable dataThe methodology employed in the IViE project was taken from previous comparative work on intonation. Grabe [21] showed that apparently conflicting claims about similarities and differences in the intonation systems of the standard varieties of English and German could be resolved if the analysis was based on directly comparable data from near-homogenous speaker groups. These were controlled for dialect, speaking style, age and gender. Grabe showed that her English and German speakers used equivalent sets of intonational building blocks but combined them in different ways. Additionally, cross-linguistic differences emerged in the acoustic-phonetic realisation of intonation. Findings like these can be produced only if comparable data are available: a phonological pattern can be specific to a particular speaking style. In the IViE corpus, high rising terminals [22] occur in free conversation in Cambridge English but not in read speech [18]. Phonetic implementation effects can be dialect-specific also [23] and they can be gender-specific [cf. 24]. Therefore, as a first step, we adopted a controlled approach. We carried out an investigation of intonational variation within one language on the basis of directly comparable data elicited in a range of dialects and speaking styles.3.The IViE corpusThe corpus contains 36 hours of speech data. The dialects in the corpus are ‘modern’ or ‘mainstream’ dialects [25]. Traditional research on dialectology has focussed on speech from older, usually male speakers from rural areas. We recorded adolescent speakers from urban areas. Near-homogeneous groups (six male, six female speakers) were recorded in secondary schools in nine locations: Belfast, Cardiff, Cambridge, Dublin, Leeds, Liverpool, Newcastle, Bradford (British Punjabi English) and London (speakers of West Indian descent). Five speaking styles were recorded:-22 phonetically controlled sentences with a range of grammatical structures,-a read text, the fairy tale Cinderella,-a retold version of the text, assisted by a set of pictures,-a Map task (single sex pairs),-a discussion on a given topic (smoking).A CD-ROM version of the complete corpus was released in June 2001 [18]. An on-line version of the data-base has beenavailable on the internet since August [26]. A range of cross-dialectal analyses using the data have been published in [13,27,28].3.1.Prosodic annotationsA subsection of the data in the corpus are accompanied by a range of machine-readable prosodic annotations. Many machine-readable intonation systems are modelled on ToBI [29] and the IViE system is no exception; H and L symbols are associated with stressed syllables and intonation phrase (IP) boundaries. But unlike ToBI, which is intended for the transcription of standard varieties of English, IViE was developed for inter-dialectal comparisons. Labellers choose transcription symbols for all dialects from a pool of options based on work on British English [21,30]. This approach ensures comparability of transcriptions across dialects. Phonological classifications are based on a range of comparisons of different utterances; intonation patterns produced in identical or comparable contexts are compared within and across dialects [21]. The resulting two-tone phonological transcriptions have the status of hypotheses and they allow for the quantification of particular observations. These hypotheses can be turned into hard evidence through acoustic experimentation and through measurement. Without quantifiable phonological transcriptions, hypotheses are less likely to provide results which are empirically valid.The phonological transcriptions in IViE are underpinned by information on two further transcription tiers: one provides syllable-by-syllable transcriptions of f0 patterns and alignment and the other is intended for the transcription of stressed and accented syllables. A complete IViE transcription includes information on:-the location of the words spoken in the acoustic signal,-the location of stressed and accented syllables,-the pitch movement surrounding accented syllables and IP-boundaries,-and a phonological classification of intonation involving accented syllable and IP-boundaries.A labelling guide for the IViE system is available on the internet [26].4.FindingsLadd [31] proposed a taxonomy of cross-linguistic differences between intonation languages:-Semantic variation: instances where speakers use the same tune for different functions.-Systemic variation: speakers use different tunes for the same function.-Realisational variation: speakers use the same phonological unit, but realise that unit differently.-Phonotactic variation: different restrictions on the way in which phonological units can be combined.We now present evidence for systemic and realisational variation within one language. Evidence for systemic variation is based on the phonological annotations in the corpus. Evidence for realisational variation comes from the acoustic signal [13].The data show that the level of intra-language variation in English is considerable. The variation has two sources. The first is of the type described by Ladd as variation between intonation languages. In different languages, we find different tunes, but in different dialects of English, we find different tunes also. The second source of variation involves the wide range of phonological, phonetic and phonotactic options available to a speaker in a particular context [cf. 20]. Even in productions of relatively tightly controlled texts, speakers from a particular dialect make use of a relatively wide range of prosodic choices.We illustrate these observations with two sets of findings. Following Cruttenden [32], we restrict the presentation to nuclear accents. The first set of data provides evidence for phonological variation. We examine intonation patterns in declaratives and inversion questions in seven dialects and we show that there is intra- as well as inter-dialectal variation. Then we illustrate phonetic (realisational) variation using data from four dialects.4.1.Intonational phonological variationThe following graphs are based on the sentence data in the prosodically annotated section of the IViE corpus. The results are from six speakers from each variety. Each speaker read eight declaratives and three inversion questions. The test sentences were included in a randomised list of sentences with different grammatical structure. Figure 1 shows that in Cambridge declaratives, over 90% of nuclear accents were falls (H*L %). In Belfast, 80% were rise-plateaux (L*H %)1. The remaining Cambridge speakers produced fall-rises (H*L H%), and the remaining Belfast speakers produced rise-plateau-falls (L*H L%) or falls (H*L %).Figure 1. Production of nuclear accents in Cambridge and Belfast. Results for declaratives are shown on the left (N=96) and results for inversion questions on the right (N=36).A comparison of the declaratives in Figure 1 with inversion questions (e.g. May I lean on the railings?) reveals intra-dialectal as well as inter-dialectal differences. The figure shows that speakers’ intonational choices can change if the grammatical function of an utterance changes, but the type of change that takes place is dialect-specific. In Cambridge, a falling nucleus dominates in declaratives, but not in inversion questions. In Belfast, we do not find a comparable difference. Rise-plateaux dominate in declaratives but they also dominate in inversion questions. In Belfast, the only difference between the two types of utterances involves the production of rise-plateau-falls (L*H L%). Belfast speakers produced them in declaratives but not in inversion questions.In sum, the data in Figure 1 show that speakers have a range of intonational options when they produce a particular text. They also reveal cross-linguistic difference in the mapping between grammatical function and intonational phonological choice.Figure 2 shows data from declaratives in seven dialects. In six out of seven dialects, H*L % is dominant. Urban northern rises were produced in the data from Belfast (83%), Newcastle (17%) and Dublin (4%). More information on Urban northern rises in the British Isles is given in [19,28, 31].Figure 3: Nuclear accent production; modal questions.Data from seven dialects of English (N=126).More generally, Figure 3 shows that a falling pattern (H*L %) is dominant in inversion questions only in Dublin.Everywhere else, rising patterns dominate. Urban northern rises are used as far down as Leeds and Bradford. In Cambridge or London, other types of rises dominate.In sum, the data in Figures 2 and 3 reveal broad geographical differences in the distribution of nuclear accent types [cf. 5]. They also show that the picture of cross-dialectal differences changes when the grammatical function of an utterance changes. But note that a change in grammatical function can have a different effect on the production of intonation in different dialects.4.2.Phonetic variation: truncation and compressionGrønnum and Ladd [31,33] have suggested that truncation and compression may be typological parameters in intonation. Grabe [34] showed that in Northern Standard German, on very short IP-final words, speakers truncate falling accents (H*L %). Instead of producing a very rapid fall in f0, they do not complete the pattern. In Southern British English, in an identical context, H*L % is compressed. Grabe’s findings appear to corroborate the view that British English is a compressing language ‘par excellence’ [31]. Dialect differences in the application of truncation and compression, however, have been reported for Swedish [23] and Liverpool English may have truncation [4].As part of the IViE project, we carried out an investigation of truncation and compression in four dialects of English: Cambridge, Leeds, Newcastle, and Belfast. Following [34], the stimuli consisted of the surnames Mr. Sheafer, Mr. Sheaf and Mr. Shift, embedded in identical carrier phrases. The pragmatic intent of the test utterances was cued by identical precursors. Since the surnames exhibit successively less scope for voicing, the speaker is forced either to increase the rate of f0 change from the longest to the shortest word (compression) or to complete less of the pattern (truncation). Details of experimental procedure and measurements are given in [13]. Figure 4 gives the results. Cambridge and Newcastle English compress, but Belfast and Leeds English truncate.Figure 4: Compression and truncation in four varieties of English. C= Compression, T=Truncation.H*L % H*L H%H*L % H*L H% L*H % L*H H% L* H% H* H% H* %T CTCT Liverpool [4]BelfastDublinLeedsNewcastleDublinLeedsNewcastleNewcastleLeedsBelfast5.Summary and ConclusionWe have provided evidence of inter- and intra-dialectal variation in English intonation. Firstly, we showed that there are broad geographical differences in the production of nuclear accents in the British Isles. These do not involve the association of a single type of nuclear accent with a particular utterance type in a particular dialect but a range of possible accent types per dialect. Then we showed that the mapping between grammatical structures and intonational form is dialect specific also. A change in grammatical function can be associated with the production of a different pattern in one dialect but not in another. Finally, we provided evidence of inter-dialect differences in phonetic realisation. In English, truncation and compression are dialect-specific.We conclude that dialect variation is a significant variable in prosodic typology. The intonation systems of dialects of one and the same language can be as different as the intonation systems of different languages. For instance, in read speech, the phonological structure of Cambridge English intonation is closer to that of Northern Standard German than to that of Belfast English. Similarly, Leeds English has truncation, just as Northern Standard German does, but Cambridge English does not. Finally, we conclude that patterns of inter- and intra-speaker variation within and across dialects may be complex but they are not random. Rather, they present a serious challenge to prosodic typologists. We need to consider how models of intonation can accommodate this variation.6.Notes1 The % sign represents an IP boundary without a tone. This approach allows for the transcription of rise-plateaux as L*H %, double-rises as L*H H% and rise-plateau-falls as L*HL%. The three options occur in Belfast English [21].2 Rise-plateau-falls have been described as rise-plateau-‘slumps', e.g. in [19]. In our data, many ‘slumps’ involved drops of around 60 Hz or more in female speakers and drops of around 20 Hz in male speakers.7.References[1]Orton, H., 1962. Survey of English Dialects: AnIntroduction. Arnold.[2]Jarman, E.; Cruttenden, A., 1976. Belfast intonation andthe myth of the fall. JIPA 6: 4-12.[3]Pellowe, J. ; Jones, V., 1978. On intonational variabilityin Tyneside speech. In P. Trudgill (ed.) Sociolinguistic patterns in British Speech. Arnold.[4]Knowles, G.O., 1978. The nature of phonologicalvariables in Scouse. In P. Trudgill (ed.) Sociolinguistic patterns in British English. Arnold.[5]Wells, J.C., 1982. Accents of English: The British Isles(Vol. 2). CUP.[6]Tench, P. 1990. The pronunciation of English inAbercrave. In English in Wales, N. Coupland (ed).Clevedon, Philadelphia: Multilingual Matters.[7]Walters, J.R. 1999. A Study of the segmental andsuprasegmental Phonology of Rhondda Valleys English.PhD: University of Glamorgan.[8]Rahilly, J., 1991. Intonation patterns in normal hearingand postlingually deafened adults in Belfast. PhD: The Queen’s University of Belfast.[9]Sebba, M., 1993. London Jamaican: language systems ininteraction. Longman.[10]Wells, B.; Peppe S., 1996. Ending up in Ulster: prosodyand turn-taking in English dialects. In Prosody and Conversation, Couper-Kuhlen, E.; M. Selting. CUP. [11]Foulkes, P.; Docherty, G., 1999. Urban voices: accentstudies in the British Isles. Arnold.[12]Cruttenden, A., 2001. Mancunian intonation andintonational representation. Phonetica 58: 53-80.[13]Grabe, E.; Post, B.; Nolan; F.; Farrar, F., 2000. Pitchaccent realisation in four varieties of British English.Journal of Phonetics 27, 161-185.[14]Jun, S. (ed), forthcoming. Prosodic Typology andTranscription: A Unified Approach. Oxford: OUP. [15]Fitzpatrick-Cole, J., 1999. The Alpine Intonation of BernSwiss German. Proceedings of the 14th ICPhS, Vol2, 941-944.[16]Auer, P.; Gilles, P.; Peters, J.; Selting, M., 2000.Intonation regionaler Varietäten des Deutschen. In Dialektologie zwischen Tradition und Neuansätzen, D.Stellmacher (ed.). Stuttgart: Steiner, 222-239.[17]Hirst, D.; Di Christo, A., 1998. Intonation Systems. CUP.[18]Grabe, E.; Post, B.; Nolan, F., 2001. IntonationalVariation in English. The IViE Corpus on CD-ROM.Linguistics, Cambridge.[19]Cruttenden, A., 1995. Rises in English. In Studies inGeneral and English Phonetics, J. Windsor-Lewis (ed.).Routledge, 155-173.[20]Peppé, S.; Maxim, J.; Wells, B., 2000. Prosodic Variationin Southern British English, Language and Speech 43(3), 309-334.[21]Grabe, E. 1998a. Comparative intonational phonology:English and German. MPI Series 7, Ponsen en Looien. [22]Warren, P.; Britain, D., 2000. Prosody and intonation ofNew Zealand English. In New Zealand English, A. Bell & K. Kuiper (eds.).Victoria University Press, 146-172. [23]Bannert, R.; Bredvad, A., 1975. Temporal organisation ofSwedish tonal accent: the effect of vowel duration.Working papers 10, Linguistics, Lund, Sweden.[24]Vermillion, P., 2001. The Perception and Production ofIntonational Meaning by British Men and Women.M.Phil, University of London.[25]Trudgill, P., 1998. The dialects of English. Blackwell.[26]/~esther/ivyweb; Labelling guide:/~esther/ivyweb/guide.html[27]Grabe, E.; Post, B.; Nolan, F., forthcoming,downloadable [26]. Modelling intonational variation in English: The IViE system. Proceedings of Prosody 2000, Krakow, Poland.[28]Fletcher, J.; Grabe, E.; Warren, P., forthcoming,downloadable [26]. Intonational variation in four dialects of English: the high rising tune. In [14].[29]Beckman, M.; Ayers, G., 1997. Guidelines for ToBILabelling. /phonetics/E_ToBI [30]Gussenhoven, C., 1984. On the grammar and semanticsof sentence accents, Foris.[31]Ladd, D. R., 1996. Intonational phonology. Cambridge:CUP.[32]Cruttenden, A., 1997. Intonation. Cambridge: CUP.[33]Grønnum, N., 1989. Stress group patterns, sentenceaccents and sentence intonation in Southern Jutland, ARIPUC, 23, 1-85.[34]Grabe, E., 1998b. Pitch accent realisation in English andGerman, Journal of Phonetics, 26, pp.129-144.。
词向量评估方法Word vector evaluation methods play a crucial role in natural language processing and machine learning tasks. 词向量评估方法在自然语言处理和机器学习任务中扮演着至关重要的角色。
With the emergence of various word embedding techniques such asWord2Vec, GloVe, and FastText, it has become essential to have effective evaluation methods to assess the quality of these word vectors. 随着Word2Vec、GloVe和FastText等各种词嵌入技术的出现,发展出有效的评估方法来评估这些词向量的质量已经变得至关重要。
One commonly used method for word vector evaluation is to measure the similarity between word vectors and human judgments. 一个常用的词向量评估方法是通过衡量词向量与人类判断之间的相似性来评估。
This can be done using benchmark datasets such as WordSim-353 and SimLex-999, where human annotators provide similarity scores for pairs of words. 这可以通过使用基准数据集如WordSim-353和SimLex-999来完成,其中人类评价者为单词对提供相似性分数。
The word vectors are then evaluated based on how well they correlate with these human judgments. 然后根据它们与这些人类判断的相关程度来评估词向量的质量。
Improvements in Phrase-Based Statistical Machine TranslationRichard Zens and Hermann NeyChair of Computer Science VIRWTH Aachen Universityzens,ney@cs.rwth-aachen.deAbstractIn statistical machine translation,the currentlybest performing systems are based in some wayon phrases or word groups.We describe thebaseline phrase-based translation system andvarious refinements.We describe a highly ef-ficient monotone search algorithm with a com-plexity linear in the input sentence length.Wepresent translation results for three tasks:Verb-mobil,Xerox and the Canadian Hansards.Forthe Xerox task,it takes less than7seconds totranslate the whole test set consisting of morethan10K words.The translation results forthe Xerox and Canadian Hansards task are verypromising.The system even outperforms thealignment template system.1IntroductionIn statistical machine translation,we are given a sourcelanguage(‘French’)sentence,which is to be translated into a target language(‘English’)sentence Among all possible targetlanguage sentences,we will choose the sentence with thehighest probability:(1)(2)The decomposition into two knowledge sources in Equa-tion2is known as the source-channel approach to statisti-cal machine translation(Brown et al.,1990).It allows anindependent modeling of target language modeland translation model1.The target language2Phrase-Based Translation2.1MotivationOne major disadvantage of single-word based approaches is that contextual information is not taken into account. The lexicon probabilities are based only on single words. For many words,the translation depends heavily on the surrounding words.In the single-word based translation approach,this disambiguation is addressed by the lan-guage model only,which is often not capable of doing this.One way to incorporate the context into the translation model is to learn translations for whole phrases instead of single words.Here,a phrase is simply a sequence of words.So,the basic idea of phrase-based translation is to segment the given source sentence into phrases,then translate each phrase andfinally compose the target sen-tence from these phrase translations.2.2Phrase ExtractionThe system somehow has to learn which phrases are translations of each other.Therefore,we use the follow-ing approach:first,we train statistical alignment models using GIZA++and compute the Viterbi word alignment of the training corpus.This is done for both translation di-rections.We take the union of both alignments to obtain a symmetrized word alignment matrix.This alignment ma-trix is the starting point for the phrase extraction.The fol-lowing criterion defines the set of bilingual phrasesof the sentence pair and the alignment matrix that is used in the translation system.This criterion is identical to the alignment template cri-terion described in(Och et al.,1999).It means that two phrases are considered to be translations of each other,if the words are aligned only within the phrase pair and not to words outside.The phrases have to be contiguous. 2.3Translation ModelTo use phrases in the translation model,we introduce the hidden variable.This is a segmentation of the sentence pair into phrases.We use a one-to-one phrase alignment,i.e.one source phrase is translated by exactly one target phrase.Thus,we obtain:(3)(4)(5)In the preceding step,we used the maximum approxima-tion for the sum over all segmentations.Next,we allow only translations that are monotone at the phrase level. So,the phrase is produced by,the phrase is produced by,and so on.Within the phrases,the re-ordering is learned during training.Therefore,there is no constraint on the reordering within the phrases.(6)(7) Here,we have assumed a zero-order model at the phrase level.Finally,we have to estimate the phrase translation probabilities.This is done via relative frequencies:3RefinementsIn this section,we will describe refinements of the phrase-based translation model.First,we will describe two heuristics:word penalty and phrase penalty.Sec-ond,we will describe a single-word based lexicon model. This will be used to smooth the phrase translation proba-bilities.3.1Simple HeuristicsIn addition to the baseline model,we use two simple heuristics,namely word penalty and phrase penalty:wp(12)pp(13) The word penalty feature is simply the target sentence length.In combination with the scaling factor this re-sults in a constant cost per produced target language word.With this feature,we are able to adjust the sentence length.If we set a negative scaling factor,longer sen-tences are more penalized than shorter ones,and the sys-tem will favor shorter translations.Alternatively,by us-ing a positive scaling factors,the system will favor longer translations.Similar to the word penalty,the phrase penalty feature results in a constant cost per produced phrase.The phrase penalty is used to adjust the average length of the phrases.A negative weight,meaning real costs per phrase,results in a preference for longer phrases.A positive weight, meaning a bonus per phrase,results in a preference for shorter phrases.3.2Word-based LexiconWe are using relative frequencies to estimate the phrase translation probabilities.Most of the longer phrases are seen only once in the training corpus.Therefore,pure relative frequencies overestimate the probability of those phrases.To overcome this problem,we use a word-based lexicon model to smooth the phrase translation probabili-ties.For a source word and a target phrase,we use the following approximation:This models a disjunctive interaction,also called noisy-OR gate(Pearl,1988).The idea is that there are multiple independent causes that can generate an event.It can be easily integrated into the search algorithm.The corresponding feature function is:lex Here,and denote thefinal position of phrase number in the source and the target sentence,respectively,and we define and.To estimate the single-word based translation probabil-ities,we use smoothed relative frequencies.The smoothing method we apply is absolute discounting with interpolation:(14)bigram language model.This is only for notational con-venience.The generalization to a higher order language model is straightforward.For the maximization problem in(11),we define the quantity as the maximum probability of a phrase sequence that ends with the lan-guage word and covers positions to of the source sentence.is the probability of the opti-mum translation.The symbol is the sentence boundary marker.We obtain the following dynamic programming recursion.Here,denotes the maximum phrase length in the source language.During the search,we store back-pointers to the maximizing arguments.After perform-ing the search,we can generate the optimum translation. The resulting algorithm has a worst-case complexity of.Here,denotes the vocabulary size of the target language and denotes the maximum num-ber of phrase translation candidates for a source language ing efficient data structures and taking into ac-count that not all possible target language phrases can oc-cur in translating a specific source language sentence,we can perform a very efficient search.This monotone algorithm is especially useful for lan-guage pairs that have a similar word order,e.g.Spanish-English or French-English.5Corpus StatisticsIn the following sections,we will present results on three tasks:Verbmobil,Xerox and Canadian Hansards.There-fore,we will show the corpus statistics for each of these tasks in this section.The training corpus(Train)of each task is used to train a word alignment and then extract the bilingual phrases and the word-based lexicon.The re-maining free parameters,e.g.the model scaling factors, are optimized on the development corpus(Dev).The re-sulting system is then evaluated on the test corpus(Test). Verbmobil Task.Thefirst task we will present re-sults on is the German–English Verbmobil task(Wahlster, 2000).The domain of this corpus is appointment schedul-ing,travel planning,and hotel reservation.It consists of transcriptions of spontaneous speech.Table1shows the corpus statistics of this task.Xerox task.Additionally,we carried out experiments on the Spanish–English Xerox task.The corpus consists of technical manuals.This is a rather limited domain task. Table2shows the training,development and test corpus statistics.Canadian Hansards task.Further experiments were carried out on the French–English Canadian Hansards Table1:Statistics of training and test corpus for the Verb-mobil task(PP=perplexity).EnglishTrain SentencesWords54992179392763159Trigram PP28.12512628Trigram PP30.5 Table2:Statistics of training and test corpus for the Xe-rox task(PP=perplexity).EnglishTrain SentencesWords66539911050Dev SentencesWords14278–Test SentencesWords8370–Table3:Statistics of training and test corpus for theCanadian Hansards task(PP=perplexity).EnglishTrain SentencesWords22M1002695009043Trigram PP57.7543297646Trigram PP56.7that all these monotonicity consideration are done at the phrase level.Within the phrases arbitrary reorderings are allowed.The only restriction is that they occur in thetraining corpus.Table4shows the percentage of the training corpus that can be generated with monotone and nonmonotonephrase-based search.The number of sentence pairs that can be produced with the nonmonotone search gives an estimate of the upper bound for the sentence error rate ofthe phrase-based system that is trained on the given data. The same considerations hold for the monotone search. The maximum source phrase length for the Verbmobiltask and the Xerox task is12,whereas for the Canadian Hansards task we use a maximum of4,because of the large corpus size.This explains the rather low coverageon the Canadian Hansards task for both the nonmonotone and the monotone search.For the Xerox task,the nonmonotone search can pro-duce of the sentence pairs whereas the mono-tone can produce.The ratio of the two numbers measures how much the system deteriorates by using themonotone search and will be called the degree of mono-tonicity.For the Xerox task,the degree of monotonicity is.This means the monotone search can produce of the sentence pairs that can be produced with the nonmonotone search.We see that for the Spanish-English Xerox task and for the French-English Canadian Hansards task,the degree of monotonicity is rather high. For the German-English Verbmobil task it is significantly lower.This may be caused by the rather free word order in German and the long range reorderings that are neces-sary to translate the verb group.It should be pointed out that in practice the monotonesearch will perform better than what the preceding esti-mates indicate.The reason is that we assumed a perfect nonmonotone search,which is difficult to achieve in prac-tice.This is not only a hard search problem,but also a complicated modeling problem.We will see in the next section that the monotone search will perform very well on both the Xerox task and the Canadian Hansards task.Table4:Degree of monotonicity in the training corpora for all three tasks(numbers in percent).Xerox76.359.7 monotone65.3deg.of mon.87.0Single-Word Based Systems(SWB).First,there is a monotone search variant(Mon)that translates each word of the source sentence from left to right.The second vari-ant allows reordering according to the so-called IBM con-straints(Berger et al.,1996).Thus up to three words may be skipped and translated later.This system will be denoted by IBM.The third variant implements spe-cial German-English reordering constraints.These con-straints are represented by afinite state automaton and optimized to handle the reorderings of the German verb group.The abbreviation for this variant is GE.It is only used for the German-English Verbmobil task.This is just an extremely brief description of these systems.For de-tails,see(Tillmann and Ney,2003).Phrase-Based System(PB).For the phrase-based sys-tem,we use the following feature functions:a trigram language model,the phrase translation model and the word-based lexicon model.The latter two feature func-tions are used for both directions:and. Additionally,we use the word and phrase penalty fea-ture functions.The model scaling factors are optimized on the development corpus with respect to mWER sim-ilar to(Och,2003).We use the Downhill Simplex al-gorithm from(Press et al.,2002).We do not perform the optimization on-best lists but we retranslate the whole development corpus for each iteration of the op-timization algorithm.This is feasible because this system is extremely fast.It takes only a few seconds to translate the whole development corpus for the Verbmobil task and the Xerox task;for details see Section8.In the experi-ments,the Downhill Simplex algorithm converged after about200iterations.This method has the advantage that it is not limited to the model scaling factors as the method described in(Och,2003).It is also possible to optimize any other parameter,e.g.the discounting parameter for the lexicon smoothing.Alignment Template System(AT).The alignment template system(Och et al.,1999)is similar to the sys-tem described in this work.One difference is that the alignment templates are not defined at the word level but at a word class level.In addition to the word-based tri-gram model,the alignment template system uses a class-basedfivegram language model.The search algorithm of the alignment templates allows arbitrary reorderings of the templates.It penalizes reorderings with costs that are linear in the jump width.To make the results as compa-rable as possible,the alignment template system and the phrase-based system start from the same word alignment. The alignment template system uses discriminative train-ing of the model scaling factors as described in(Och and Ney,2002).7.3Verbmobil TaskWe start with the Verbmobil results.We studied smooth-ing the lexicon probabilities as described in Section3.2. The results are summarized in Table5.We see that the Table5:Effect of lexicon smoothing on the translation performance for the German-English Verbmobil task.system mPER NISTunsmoothed21.17.96uniform20.77.99unigram22.37.79 uniform smoothing method improves translation quality. There is only a minor improvement,but it is consistent among all evaluation criteria.It is statistically signifi-cant at the94%level.The unigram method hurts perfor-mance.There is a degradation of the mWER of.In the following,all phrase-based systems use the uniform smoothing method.The translation results of the different systems are shown in Table6.Obviously,the monotone phrase-based system outperforms the monotone single-word based sys-tem.The result of the phrase-based system is comparable to the nonmonotone single-word based search with the IBM constraints.With respect to the mPER,the PB sys-tem clearly outperforms all single-word based systems. If we compare the monotone phrase-based system with the nonmonotone alignment template system,we see that the mPERs are similar.Thus the lexical choice of words is of the same quality.Regarding the other evaluation criteria,which take the word order into account,the non-monotone search of the alignment templates has a clear advantage.This was already indicated by the low degree of monotonicity on this task.The rather free word order in German and the long range dependencies of the verb group make reorderings necessary.Table6:Translation performance for the German-English Verbmobil task(251sentences).system variant mPER NIST SWB Mon29.37.07 IBM25.07.84GE25.37.8337.047.0AT20.68.577.4Xerox taskThe translation results for the Xerox task are shown in Table7.Here,we see that both phrase-based systems clearly outperform the single-word based systems.The PB system performs best on this pared to the AT system,the BLEU score improves by4.1%absolute. The improvement of the PB system with respect to the AT system is statistically significant at the99%level.Table7:Translation performance for the Spanish-English Xerox task(1125sentences).System PER NISTSWB IBM27.68.0026.567.9AT20.18.767.5Canadian Hansards taskThe translation results for the Canadian Hansards task are shown in Table8.As on the Xerox task,the phrase-based systems perform better than the single-word based sys-tems.The monotone phrase-based system yields even better results than the alignment template system.This improvement is consistent among all evaluation criteria and it is statistically significant at the99%level.Table8:Translation performance for the French-English Canadian Hansards task(5432sentences). System Variant PER NIST SWB Mon53.0 5.96IBM51.3 6.2157.827.8AT49.1 6.718EfficiencyIn this section,we analyze the translation speed of the phrase-based translation system.All experiments were carried out on an AMD Athlon with2.2GHz.Note that the systems were not optimized for speed.We used the best performing systems to measure the translation times. The translation speed of the monotone phrase-based system for all three tasks is shown in Table9.For the Xerox task,the translation process takes less than7sec-onds for the whole10K words test set.For the Verbmobil task,the system is even slightly faster.It takes about1.6 seconds to translate the whole test set.For the Canadian Hansards task,the translation process is much slower,but the average time per sentence is still less than1second. We think that this slowdown can be attributed to the large training corpus.The system loads only phrase pairs into memory if the source phrase occurs in the test corpus. Therefore,the large test corpus size for this task also af-fects the translation speed.In Fig.1,we see the average translation time per sen-tence as a function of the sentence length.The translation times were measured for the translation of the5432test sentences of the Canadian Hansards task.We see a clear linear dependency.Even for sentences of thirty words, the translation takes only about1.5seconds.Table9:Translation Speed for all tasks on a AMD Athlon 2.2GHz.Xeroxavg.sentence length13.50.0060.794 words/second1448We described a highly efficient monotone search al-gorithm.The worst-case complexity of this algorithm is linear in the sentence length.This leads to an impressive translation speed of more than1000words per second for the Verbmobil task and for the Xerox task.Even for the Canadian Hansards task the translation of sentences of length30takes only about1.5seconds.The described search is monotone at the phrase level. Within the phrases,there are no constraints on the re-orderings.Therefore,this method is best suited for lan-guage pairs that have a similar order at the level of the phrases learned by the system.Thus,the translation pro-cess should require only local reorderings.As the exper-iments have shown,Spanish-English and French-English are examples of such language pairs.For these pairs, the monotone search was found to be sufficient.The phrase-based approach clearly outperformed the single-word based systems.It showed even better performance than the alignment template system.The experiments on the German-English Verbmobil task outlined the limitations of the monotone search. As the low degree of monotonicity indicated,reordering plays an important role on this task.The rather free word order in German as well as the verb group seems to be dif-ficult to translate.Nevertheless,when ignoring the word order and looking at the mPER only,the monotone search is competitive with the best performing system.For further improvements,we will investigate the use-fulness of additional models,e.g.modeling the segmen-tation probability.Also,slightly relaxing the monotonic-ity constraint in a way that still allows an efficient search is of high interest.In spirit of the IBM reordering con-straints of the single-word based models,we could allow a phrase to be skipped and to be translated later. AcknowledgmentThis work has been partially funded by the EU project TransType2,IST-2001-32091.ReferencesA.L.Berger,P.F.Brown,S.A.D.Pietra,V.J.D.Pietra, J.R.Gillett,A.S.Kehler,and R.L.Mercer.1996. Language translation apparatus and method of using context-based translation models,United States patent, patent number5510981,April.P.F.Brown,J.Cocke,S.A.Della Pietra,V.J.Della Pietra,F.Jelinek,fferty,R.L.Mercer,and P.S.Roossin.1990.A statistical approach to machine putational Linguistics,16(2):79–85, June.G.Doddington.2002.Automatic evaluation of machine translation quality using n-gram co-occurrence statis-tics.In Proc.ARPA Workshop on Human Language Technology.P.Koehn,F.J.Och,and D.Marcu.2003.Statistical phrase-based translation.In Proc.of the Human Lan-guage Technology Conf.(HLT-NAACL),pages127–133,Edmonton,Canada,May/June.D.Marcu and W.Wong.2002.A phrase-based,joint probability model for statistical machine translation. In Proc.Conf.on Empirical Methods for Natural Lan-guage Processing,pages133–139,Philadelphia,PA, July.H.Ney,S.Martin,and F.Wessel.1997.Statistical lan-guage modeling using leaving-one-out.In S.Young and G.Bloothooft,editors,Corpus-Based Methods in Language and Speech Processing,pages174–207. Kluwer.F.J.Och and H.Ney.2002.Discriminative training and maximum entropy models for statistical machine trans-lation.In Proc.of the40th Annual Meeting of the As-sociation for Computational Linguistics(ACL),pages 295–302,Philadelphia,PA,July.F.J.Och,C.Tillmann,and H.Ney.1999.Improved alignment models for statistical machine translation. In Proc.of the Joint SIGDAT Conf.on Empirical Meth-ods in Natural Language Processing and Very Large Corpora,pages20–28,University of Maryland,Col-lege Park,MD,June.F.J.Och.2003.Minimum error rate training in statis-tical machine translation.In Proc.of the41th Annual Meeting of the Association for Computational Linguis-tics(ACL),pages160–167,Sapporo,Japan,July.K.A.Papineni,S.Roukos,T.Ward,and W.J.Zhu.2001. Bleu:a method for automatic evaluation of machine translation.Technical Report RC22176(W0109-022), IBM Research Division,Thomas J.Watson Research Center,September.J.Pearl.1988.Probabilistic Reasoning in Intelligent Systems:Networks of Plausible Inference.Morgan Kaufmann Publishers,Inc.,San Mateo,CA.Revised second printing.W.H.Press,S.A.Teukolsky,W.T.Vetterling,and B.P. Flannery.2002.Numerical Recipes in C++.Cam-bridge University Press,Cambridge,UK.C.Tillmann and H.Ney.2003.Word reordering and a dynamic programming beam search algorithm for sta-tistical machine putational Linguis-tics,29(1):97–133,March.J.Tomas and bining phrase-based and template-based aligned models in statisti-cal translation.In Proc.of the First Iberian Conf.on Pattern Recognition and Image Analysis,pages1020–1031,Mallorca,Spain,June.W.Wahlster,editor.2000.Verbmobil:Foundations of speech-to-speech translations.Springer Verlag, Berlin,Germany,July.R.Zens,F.J.Och,and H.Ney.2002.Phrase-based sta-tistical machine translation.In25th German Confer-ence on Artificial Intelligence(KI2002),pages18–32, Aachen,Germany,September.Springer Verlag.。