Approximate Grammar for Information Extraction
- 格式:pdf
- 大小:322.67 KB
- 文档页数:10
Revision contents:Unit 1 Language and LearningViews on languageViews on language learning1. What are the major views of language? What are their implications to language teaching or learning?2. Some language teachers argue that we should “teach the l anguage”rather than “teach about the language”. What are the major differences between these two approaches to language teaching?3. Audiolingual approach to language learning4.Socio-constructivist theory of language learning emphasizes interaction and engagement with the target language in a social context.5. The quality of a good language teacher includes ethic devotion, professional quality and personal styles.6. One influential idea of cognitive approach to language teaching is that students should be allowed to create their own sentence based on their own understanding of certain rules.Unit 2 Communicative Principles and ActivitiesWhat is communicative compentence? Try to list some of its components.Principles in communicative language teaching/ strong version and week versionList some of the communicative activities.What is a task/its componentsUnit 3The overall language ability required in the 2001 National English Curriculum includes the following aspects language knowledge, language skills, learning strategies, affects and cultural understanding.4. Lesson PlanningWhat is lesson planning?Principles for good lesson planningComponents of a lesson planUnit 5 Classroom ManagementWhat is classroom management?Types of student grouping and their advantages and disadvantagesThe role of the teacher ---- contoller, assessor, organizer, prompter, participant, resource providerThe new curriculum requires the teacher to put on the following new roles: facilitator, guides, and researchers.Classification of questionsHow to deal with errors?Unit 6 Teaching PronunciationCritical Period HypothesisThe goal of teaching pronunciation should be: consistency, intelligibility, and communicative efficiency.List some methods of practicing sounds.Unit 7 Teaching GrammarGrammar presentation methodsGrammar practice is usually divided into two categories, mechanical practice and meaningful practice.Unit 8 Teaching VocabularyWhat does knowing a word involve? Receptive vocabulary and productive vocabulary.List some ways of presenting new wordsHow to consolidate vocabulary?Developing vocabulary building strategiesUnit 9 Teaching ListeningCharacteristics of listening processPrinciples and models for teaching listeningAs far as classroom procedures are concerned, the teaching of listening generally follows three stages: pre-listening stage, while-listening stage, and post-listening stage.Unit 10 Teaching SpeakingWhat are the characteristics of spoken language? Discuss their implications to teaching.Information-gap activitiesList some of the speaking tasks that the students are often asked to do in language classroomUnit 11 Teaching readingThe role of vocabulary in reading: sight vocabularySkills involved in reading comprehensionModels for teaching readingStages involved in Teaching ReadingProblems in reading are often seen as a failure to recognize words that may not exist in the learner’s vocabulary or in understanding grammatical structures that may not have been acquired by the learner. Therefore, the task of teaching reading is seen as teaching vocabulary along with the grammatical structure of the target language. Do you agree with such an opinion? Explain your reasons.In teaching reading, teachers often engage students in pre-reading, while-reading and post-reading activities. What do you think are the major functions of pre-reading activities?Unit 12 Teaching WritingWhat is the main idea of communicative approach to writing?What is the main idea of the process approach to writing?Exercises for the course of English teaching methodologyI. Multiple choiceDirections:Choose the best answer for the following questions and write your answers on the answer sheet.1. What syllabus is designed around grammatical structures, with each lesson teaching a grammar structure, starting with simple ones, and progressing through to more complex ones?A. Structural syllabus.B. Situational syllabus.C. Functional syllabus.2. Which of the following is a communicative activity?A. Listen to the weather broadcast and fill in a form.B. Listen to the weather broadcast and talk about a picnic.C. Transfer the information from the weather broadcast into a table.3. In which of the following situations is the teacher playing the role of a prompter?A. Explain the language points and meanings of words and sentences.B. Give examples of how to do an activity after the explanation and instructions.C. Elicit ideas from students.4. Which of the following is a social interaction activity?A. Information gap.B. Role-play.C. Information transfer.5. What reading approach is based on the assumption of reading as a guessing game?A. The top-down approach.B. The bottom-up approach.C. The interactive approach6. What reading strategy does the following activity help to train?The students were asked to read each paragraph and then match the paragraph with relevant headings.A. Inferring.B. Scanning.C. Skimming.7. Which of the pre-reading activities exemplifies the bottom-up approach?A. The teacher brings in pictures and asks the students to discuss in groups about the life of old people.B. The teacher raises several questions about old people and asks the students to discuss in pairs.C. The teacher presents a picture about the life of old people on the screen and brainstorm vocabulary related to old people’s life.8. What listening skill does the following activity help to train?Listen to the folio-wing text and answer the multiple-choice question.In this dialogue, the speakers are talking about________.A) going to a picnic B) attending a concert C) having a partyA. Listening for gist.B. Listening for specific information.C. Listening for detailed information.9. Which of the following features does spoken English have?A. It is generally produced in fairly simple sentence structures.B. It is produced with little redundancy.C. It is produced with good organization.10. What should a required lesson plan look like?A. a copy of explanation of words and structuresB. a timetable for activitiesC. transcribed procedure of classroom instruction11. For better classroom management, what should the teacher do while the students are doing activities?A. participate in a groupB. prepare for the next procedureC. circulate around the class to monitor, prompt and help12. Which of the following activities can best motivate junior learners?A. gamesB. recitationC. role-play of dialogues13. To cultivate communicative competence, what should correction focus on?A. linguistic formsB. communicative strategiesC. grammatical rules14. Which of the following activity is most productive?A. read the text and then choose the best answer to the questionsB. discuss on the given topic according to the text you have just readC. exchange and edit the writing of your partner15. To help students understand the structure of a text and sentence sequencing, we could use----- for students to rearrange the sentences in the right order.A. cohesive devicesB. a coherent textC. scrambled sentences16. The purpose of the outline------ is to enable the students to have a clear organization of ideas and a structure that can guide them .A. in the actual writingB. in free writingC. in controlled writing17. The grammar rules are often given first and explained to the students and then the students have to apply the rules to given situations. This approach is called .A. deductive grammar teachingB. inductive grammar teachingC. guiding discovery18. It is easier for students to remember new words if they are designed in ------and if they are ------and again and again in situations and contexts.A. context, sameB. context, differentC. concept, difficultII. DefinitionDirections: Define the following terms1. Communicative compentence2. Lesson planning3. Classroom management4. Receptive vocabulary and productive vocabulary.5. Sight vocabulary6. Information-gap activities7. Display questions8. Task9. Audiolingual approach to language learning10.ReadingIII. Blank fillingDirections: fill in blanks according to what you’ve learn in the course of foreign language teaching.1. Socio-constructivist theory of language learning emphasizes interaction and engagement with the target language in a social context.2.The quality of a good language teacher includes ethic devotion, professional quality and personal styles.3.One influential idea of cognitive approach to language teaching is that students should be allowed to create their own sentence based on their own understanding of certain rules.4. The overall language ability required in the 2001 National English Curriculum includes the following aspects language knowledge, language skills, learning strategies, affects and cultural understanding.5. The role of the teacher ---- contoller, assessor, organizer, prompter, participant, resource providerThe new curriculum requires the teacher to put on the following new roles: facilitator, guides, and researchers.6.The goal of teaching pronunciation should be: consistency, intelligibility, and communicative efficiency.7. Grammar practice is usually divided into two categories, mechanical practice and meaningful practice.8. As far as classroom procedures are concerned, the teaching of listening generally follows three stages: pre-listening stage, while-listening stage, and post-listening stage.IV. Problem SolvingDirections: Below are some situations in classroom instruction. Each has at least one problem. First, identify the problem(s). Second, provide your solution (s) according to what you have learned. You should elaborate on the problem(s) and solution(s) properly. Write your answer on the Answer Sheet.1.In one of the lessons. Mr. Li arranged the students into groups to talk about what they want to be when they grow up. To ensure thatthey applied what they learned, he required them to use the expressions in the text. To his surprise, students were not very active and some groups were talking about something else and one group was talking in Chinese.Problems:1) Maybe the topic does not correspond with the students’ current needs. Suppose these students were interested only in getting high scores in examinations, they would not have interest in such a talk.2) The activity is much controlled. They may like to talk about their hobbies, but they have to use the expressions the teacher presents, which to some extent restricts them. That is perhaps why they are not very active.3) If students talk in Chinese, it may be because the talk is a little too demanding for them in terms of language competence. When students have difficulty in expressing themselves in English, they will switch to Chinese.4) Maybe the teacher does not arrange such activities very often in class. The students are not used to such communicative activities and so do not take an active part.Solutions:1)The teacher can ask the students to talk about their hobbies freely without considering the structure2) The teacher can give the task a real purpose. For example, he can ask the students to ask others about their hobbies to form a hobby club.3) It’s better to explain to the students the value of such kind of activity.4) The teacher can circulate around to encourage the students to talk in English.2. To cultivate communicative competence, Mr. Li chose some news reports from China Daily for his middle school students. Problems:1) Authentic materials are desirable in cultivation of communicative competence. But they should correspond to students" ability. News reports from China Daily are too difficult for middle school students.2) The content of news reports may not be relevant to the course requirement of middle school English.Solutions:1) If Mr. Li insists on using the materials from China Daily, it is necessary for him to adapt the material or select those reports which are easier to read and more relevant to students" interests.2) If he can, it is better to select news reports from other newspapers which are relevant to the students" life and study. It is necessaryto bear in mind the students" needs when selecting materials for classroom instruction.(第一项要求写出两点即可,而第二项要求能说出两点。
SCHOOL OF PUBLIC HEALTHWriting GuideThe writing of any report, essay or dissertation is a process of development. It takes time to read, think, make notes and draft a piece of work, but organising your thoughts on paper is an invaluable part of the way we learn complex facts and concepts.The writing processThe first part of the process is to research, read and think about your topic. The next step is to write a rough draft, and then you will spend time working on that to produce the final product. We have briefly covered sources of information if you are not familiar with the writing process or with scientific writing. After that we have provided some hints for you to improve your rough draft. Worrying about your expressions and use of language when you are making notes or writing a rough draft may hold you up. It is when you have your ideas on paper, however rough the form, that you can work on the flow, style and structure of your material. There are many books and websites you can consult if you are not familiar with the process of formal or academic writing. This guide has only distilled a framework of the process for your guidance. The books are listed at the end of this guide, with the best ones marked for you. One of the best introductory websites on writing process is the following:/handouts/general/index.htmlThis site has handout sheets that cover everything from planning and writer’s block through to proofreading. Although the examples may seem banal, the process of planning and writing in covered in detail and may be helpful to you if you feel completely lost about how to write at this level.If you are new to scientific writing there are many articles and websites to help you. The first url will take you to a classic article on scientific writing which is well worth your while studying. It includes some cogitation on how readers interpret texts, as well as some very practical pointers to improving your sentence structure./~andreas/sci.htmlHere is one that covers scientific writing in general, including getting published:/science/writ.htmlWorking with your draftGrammarThis is not a grammar guide. If you are not sure of the elements of grammar, there are many resources available to you on the Web. Here is one for you to browse if you need to brush up on the basics:/handouts/grammar/index.htmlThe next site is the most comprehensive searchable guide to English grammar on the net at time of writing. You will find everything you ever need to know about grammar here, explained clearly and simply:/grammar/Here is another useful simple guide, which has little self tests at the end of each section: /The next guide is an excellent and comprehensive one, specifically for technical and scientific writers:/sp7084/index.htmlFinally, here is another good site that lists some common errors and problems alphabetically. It also has a helpful list of further on-line resources.:8080/~brians/errors/errors.htmlHave a look at each of these guides. You may want to print one, or parts of one, for future reference. The decision about which one will best suit your purpose will depend on your previous knowledge of and confidence with English grammar.Other tipsAssuming you are fairly confident about grammar, or at least know where to find the information you need to correct your errors, here are some other tips for improving your writing.Be simple and conciseExpress yourself in the simplest possible terms. Check each word, phrase and sentence. Can it be shortened or simplified? Is it really necessary? Can a word or expression be deleted without affecting the meaning?Aim for a simple styleA fancy literary or overblown style is not appropriate in scientific writing. Avoid the use of literary allusions, pompous words instead of simple ones, metaphorical expressions and hyperbole (exaggeration for effect). Use short words instead of long ones. For example: do not accomplishcause not aetiologypart not componentuse not utilisationuse not utilisebegin not commence, initiate or inauguratebefore not prior toChoose the concrete over the abstract; words over symbols, initials or abbreviations; English words over foreign, familiar words over unfamiliar. Compare the following pairs of examples:In studies pertaining to the identification of phenolic derivatives, drying of the paper gives lesssatisfactory results.Phenolic derivatives are mo re easily seen and identified if the paper is left wet.It has been a mooted question in the minds of microbiologists whether the gonococcus possesses acapsule.Microbiologists question whether the gonococcus has a capsule.In the present report the results of a series of experiments are described in which wine and beer drinkers were tested to see whether...We tested wine and beer drinkers to see whether…The aetiology of dependency in elderly people, and therefore the need to seek residential care, ismu ltifactorial.There are many reasons why elderly people become dependent and need to seek residential care.Various types of household appliances were found to have differing amounts of sales attraction indifferent areas of Australia.Refrigerators sold better in Queensland, radiators in Victoria.Avoid verbosityVerbosity wastes space. Short articles are more likely to be accepted by journals, whose space is always limited. When unnecessary words are deleted, the grammatical construction and style of expression are improved and the work is easier to read and understand.Some superfluous noun phrases:the amount ofthe case ofthe character ofthe situation wherethe extent ofthe concept ofthe magnitude ofthe purpose ofSuperfluous verbsthe numbers shown in −the numbers inthe data taken/extracted from−the data fromSuperfluous relative pronouns, articles and prepositionsmany of the subjects −many subjectspeople who had been interviewed −people interviewedThe openings `It…' and ‘There…”It was Dr X who first described −Dr X was first to describeThere are some people who think −Some people thinkIt appears that }It goes without saying that } are usually superfluousIt should be noted that }Overweight prepositionsin regard to −on, aboutwith respect to −on, aboutwith reference to −of, on, for, aboutWeak modifiersactually fortunately ratheravailable hopefully severalcertainly particularlyfairly quiteAvoid jargonJargon complicates expression, is often inaccurately used, and may make meanings ambiguous. Use technical words only where necessary and appropriate. Consider this phrase: 'Optimal reaction conditions are approximated when...'. The word 'approximate', a mathematical expression meaning 'approach', is used incorrectly. The phrase is simple and precise when translated into 'The reaction goes most quickly when...'.Fishbein gives this delightful example of what he calls 'gobbledygook': Can you understand it easily? How would you improve it?There is little heuristic value in erecting a specific hierarchy of the neuroses and psychoses as correlated to a multiple stage account of the development and vicissitudes of the libido. Instead it can becontended on both theoretical and clinical grounds (a) that the so-called libidinal stages of development have a temporal existence only in so far as the growing infant become physiologically capable ofexpressing them, and (b) that in later life the 'psychological' counterparts of these libidinal organ-cathexes overlap and mingle in terms of simple 'orality', 'anality' or 'genitality' analogous to committing mayhem on the facts.Avoid ambiguityAre you sure that what you have written means to others what it means to you? If in doubt, ask others to read your work and tell you what it means to th em. Ask yourself: will a reader without experience in this field understand my work? Can it be understood by someone whose English is their second language?Choose the correct wordUse general and technical dictionaries whenever you have doubts about the correct meaning. Guides to accepted usage, such as Fowler’s Modern English Usage or Strunk and White’s The Elements of Style, an early edition of which can be found at this address:http://sut1.sut.ac.th/strunk/are also helpful tools in writing well.Make sure that you are using words correctly. Does the context change the meaning of the word? The following words are often used incorrectly as synonyms:amount alternate minimal equalconcentration alternative negligible equivalentcontent slightlevelvarious imply delete anticipatevarying infer exclude expectvaried eliminatea variety ofdifferentdifferingThe following commonly used words often have an imprecise meaning, depending on their context. If you have used them, check that the meaning you intend is clear and that haven’t used them as ‘filler’ words, substituting for a more precise word.area level situation characternature structure conditions problemsystem fiel d processCommon grammatical problemsAs these are common problems it would be worthwhile for you to check them out on one of the grammar sites listed above if you are not already familiar with them:Dangling or misrelated participlesAfter completing the questionnaires, blood pressures were measured.(Dangling − the test subjects/patients have disappeared completely and there is no indication of whocompleted the questionnaires or whose blood pressure was measured).After completing the questionnaires, the blood pressure of subjects was measured. (Misrelated −although the subjects are there, the participle ‘completing’ is here related to their blood pressure, not to themselves.)After completing the questionnaires, subjects had their blood pressures measured.(Correct)Unidentified or ambiguous antecedents of pronounsFailure of treatment with penicillin could not have been predicted because of the defective assay method used. Unfortunately, this occurs in many hospitals.(Does the 'this' refer to the failure of treatment, the defective assay method, or the inability to predict?) Cimetidine is highly effective in suppressing gastric acid secretion in such cases. It is unfortunate that it is not prescribed more often.(The first it is an ‘opening it’ that has nothing to relate back to, and the second is intended to relate back to c imetidine, but the reader must search past ‘gastric acid secretion’ to find its antecedent.)Unfortunately, this drug is not prescribed often enough.(This is clearer.)Relative pronounsThat, which and who are often misused. If you aren’t clear which one to use where, look it up on one of the grammar web sites listed above.Abstract nouns instead of verbsScientific writing often contains too many abstract nouns that have been derived from verbs. Avoid writing with these; using the original verb is clearer and more vivid. A tip: look out for words ending in ‘-ation’ − they are often abstract nouns derived from verbs.The identification and classification of the various histologic types of lymphomas are vital stepstowards the introduction of new therapies and the reduction of mortality.Identifying and classifying the histologic types of lymphomas are vital steps towards introducing new therapies and reducing mortality.An investigation of the underlying causes was carried out.The underlying causes were investigated. (Or: We investigated the underlying causes.)The following words − ‘empty’ verbs − should raise suspicion that a construction with an abstract noun is lurking:accomplished experienced obtainedachieved facilitated occurredattained given performedcarried out proceeded implementedconducted indicated produceddone involved requiredeffected madeNoun clusters and stacked modifiersIn scientific writing, groups of words are often used together to describe an effect or process. When more than two words are used together the meaning may become ambiguous; it is often unclear what is modifying what. The hyphen can be a great help in clarifying these modifiers.a radium containing argon ionisation chamberThis implies a chamber containing radium that ionises argon. What it is in fact is a chamber that contains both argon and radium. Organic vapours are introduced into the chamber, react with the argon and are ionised. Using hyphens can make this clearer:a radium-containing-argon ionisation chamberHere are some non-scientific examples that make the use of the hyphen clear:A dutch cheese importer is someone from the Netherlands who imports cheese from anywhere.A dutch-cheese importer is anyone who imports cheese from the Netherlands.A small arms dealer is a short person who will sell you any kind of gun.A small-arms dealer is a person of any height who will sell you a handgun.Consistency of styleBe consistent throughout your document with the following:SpellingCheck words with more than one accepted spelling, plural forms and hyphenated forms.CapitalsCheck how you capitalise proper nouns, names of organisations, units, etc for consistency. Abbreviations and InitialsIt is common to use acronyms in reports and articles. Use the words the first time with the acronym in a bracket, then the acronym thereafter. If your report contains many acronyms make a list in an appendix (preferably Appendix A, to make it easy to refer to). Alternatively, when a term is repeated (for example the setting of your study) it can be abbreviated without being reduced to initials. For example, South Western Sydney Area Health Service may be referred to the first time in full, then as 'the Area Health Service' or even 'the Area' or 'the Service', whichever seems appropriate, or as 'South Western Sydney', if the context makes it unambiguous.NumbersGenerally, measurements, percentages and any numbers that are not whole numbers (integers) should be in figures. However, numbers starting a sentence should be expressed in words (or you could rearrange the sentence), as should numbers under 10:Ninety-two patients were followed over a period of three months by 12 researchers.Make sure all abbreviations for units are correct SI unit symbols.PunctuationIf you’re not clear about correct punctuation there is a good basic guide at/One check that writers commonly overlook is to make sure every quotation has a beginning and an end.Other style errors to avoidImpersonal constructions and the passive voiceUse the active voice where possible. It is a common misapprehension that the first person should not be used in scientific writing. It is not unscientific to report what you (or you and your fellow investigators) did in the first person.The present study was undertaken in order to investigate…(Wordy and impersonal)We undertook this study to investigate…(Better)We investigated…(Even better, if the investigators did what they set out to do.)'In the opinion of the present author' is just pompous. Say: 'In my opinion', or leave the phrase out altogether.Using two words instead of oneA laparotomy was performed on the next day and the diagnosis of appendicitis was confirmed.(Clumsy.)A laparotomy the next day confirmed the diagnosis of appendicitis.(Better)Overused phrases, fad words and slangTry and remove these from your writing wherever you can. They weigh your writing down and make it flaccid, unoriginal and uninteresting.to impact −to affect, to have an effectto interface −to work together, to meetthe bottom line is−what this means isrationale−reasonContractionsExcept in rare cases where they are used for rhetorical effect, contractions are too informal for dissertations and essays.don't - do not it's - it isDehumanising wordsperson not individualpatient (or client) not casewoman not female; man not malepatient with diabetes not a diabeticSummaryThe process of getting your thoughts onto paper and structuring them into a coherent piece of work is important in academic work. It is a skill that comes more easily with practice. You will learn from the comments that lecturers make on your work where you need to improve your writing.Other resourcesAs well as the websites listed in this guide, there are many books on writing style and scientific/medical writing. In this guide, we have 'borrowed' ideas and examples from some of these. Those marked with an asterisk are particularly useful.Australian Government Publishing Service. Style manual for authors, editors and printers. 5th ed. Canberra: AGPS Press; 1994.Burnard P. Writing for Health Professionals. 2nd ed. London: Chapman and Hall; 1996*Clanchy J, Ballard B. Essay writing for students. 3rd ed. Melbourne: Longman Cheshire; 1997.*Day RA. How to write & publish a scientific paper. 5th ed. Phoenix, Az. Oryx Press; 1998.. *Eagleson RD, Jones G, Hassall S. Writing in plain English. Canberra: AGPS Press; 1992. Fishbein M. Medical writing: the technic and art. 4th ed. Springfield, Ill: Thomas; 1972. Huth EJ. Writing and publishing in medicine. 3rd ed. Baltimore : Williams & Wilkins; 1999. Lock S. Thorne's better medical writing. 2nd ed. Tunbridge Wells: Pitm an Medical; 1977. Peters P. Strategies for student writers: a guide to writing essays, tutorial papers, exam paper and reports. Brisbane: John Wiley & Sons; 1985.Ross-Larson B. Edit yourself: a manual for everyone who works with words. New York & London: Norton; 1982.Strunk W, White EB. The elements of style. 3rd ed. New York: Macmillan; 1979. Woodford FP (ed). Scientific writing for graduate students. Bethesda, Md: Council of Biology Editors; 1986.*Zeiger M. Essentials of writing biomedical research papers. New York: McGraw Hill; 1991. Susan QuineJuliet RichtersAugust 1993 (updated 1998)Updated January 2002 by Mary-Helen WardUpdated October 2004 by NA。
英语语法——英语时间表达方法总结归纳及练习题答案English Grammar: XXXWhen it comes to expressing time in English。
there are several methods to do so。
Here are some of the most common ways:1.Time Point nsAll times can be read as "hour + minute":6:10 - six ten8:30 - eight XXX2:40 - two forty2.XXXIf the time is within half an hour。
you can use "minute + past + hour":6:10 - ten past six4:20 - XXX10:25 - XXX-XXX3.XXX Half an HourIf the time is beyond half an hour。
you can use "minute + to + hour":10:35 - XXX XXX5:50 - ten to six9:49 - XXX4.Half an HourIf the time is exactly half an hour。
you can use "half + past + hour":11:30 - half past XXX2:30 - half past two5.Quarter HourIf the time is related to 15 minutes。
there are three ways to express it:9:15 - nine XXX3:45 - three forty-five / fifteen to four / a quarter to four6.On the HourTo express that it's exactly on the hour。
Lesson Three1. Translate 1) into Chinese(1) 专业的历史工作者(2) 基于常识的反应(3) 事物的这种状况(4) 意见不一的历史学家(5) 已经准备好了的现成的东西(6) 一个个人喜好不同的问题(7) 截然不同的观点(8) 民间故事(9) 书面文件(10) 过去的遗留物(11) 人的动机和行为(12) 复杂和精细(13) 商船(14) 一旦发生潜艇战(15) 一个粗糙的理论(16) 好战的行为;战争行为(17) 宣传机器(18) 德国外交部长(19) 实力平衡(20) (事物的)因果(21) 海岸炮兵(22) 终极关怀(23) (事物的)近因(24) 人们常说的一句话(25) 不会出错的解释(26) 绝对有效的模式(27) 永不停止的探索(28) 一个难以达到但又十分诱人的目标2)into English1) to gain new insights2) to revise one’s ideas3) to trace the cause4) to begin from this premise5) to to open fire on/at6) to give equal weight to sth7) to support a certain view8) to influence the government9) to destroy the balance of power10) to form a alliance11) to repay the loans12) to contemplate war13) to fill in the gaps14) to conclude the quest15) to view sth. from a certainperspective 16) to benefit from the comparison17) to eliminate differences18) to dig into the problem19) to be immersed in a vast sea20) to stem from a different point ofview21) to be destined to do sth22) to ignore the fact23) to make an assumption24) to defeat the enemy25) to win back one’s lost territory26) to sink a boat27) to intercept the secret message28) to piece together evidence29) to approximate the truth30) to master new techniques2. Give synonyms and antonyms of the following.1) synonyms(1) puzzling, baffling(2) fascinating(3) clear, plain, obvious, noticeable(4) final, last(5) to correct, to change, to alter, to modify, to rewrite(6) to dig into, to investigate, to look into(7) warlike, warring, aggressive, hostile(8) besides, apart from, plus(9) in the case of, should sth occur(10) nevertheless, in spite of that, just the same(11) to end (the search / probing)(12) to refer to(13) convincingly(14) on the whole, generally speaking(15) in addition, besides, apart from that, what’s more(16) through2) antonyms(1) to be praised(2) depressing(3) doubtful(4) unsoundness, weakness(5) conclusion(6) effect(7) disproportionate(8) anti-British(9) to take into account: to ignore(10) a well-developed (theory), a sophisticated (theory)(11) clear, clear-minded(12) non-professional, amateurish3. Replace the words in bold type with words and phrases you knowthat convey more or less the same meaning.1) simple/primitive; told; thick2) pondering/thinking about; future/fate; insignificant3) completely/entirely; different/opposite4) consider/regard; look at; angles/points of view5) knows very well; growing/increasing; complaints6) besides/apart from; easy; in the case of/if there is7) purposely paid no attention to8) generally/on the whole; however/but; come from/originate from9) absolutely reliable; wipe out/get rid of; bound to4. Translate1) The cause of the aircraft crash is so far unknown.2) The cause of global warming is still hotly debated among scientists.3) He devoted all his life to the cause of environmental protection.4) The river has caused us a lot of trouble in history.5) What do you think caused the upsurge in international terrorism?6) We must try and unite with those who have opposed us.7) There is always opposition to any progress and reform.8) Some people are always opposed to new things.9) A lot of those loans were never repaid. That high ratio of bad debts finally led tothe financial crisis in this second economic power in the world.10) The Busine ss Bank now offers a special loan to students who can’t pay for theireducation.11) The boy asked Mrs. Stow for the loan of her binoculars.12) She concluded her speech by saying that she hoped she could come againsome day and see more of the country.13) As soon as they concluded the investigation, they were to report to the SecurityCouncil.14) During his visit, he will conclude a new trade agreement with India.15) Based on those reasonable doubts, the jury had to conclude that the boy wasnot guilty.16) She is flying to New York by way of Tokyo.17) I’d like to say a few words about the situation in the sixties of the last century byway of an introduction to the movie.18) They decided to recall their ambassador by way of protest.5. Put the most appropriate words in the blanks.1) while/although; to2) over/about; with3) to; in4) led to/resulted in/caused5) with; lends/gives/brings6) from; in7) as to; of8) in the even of; survive9) to; to; study10) out; in11) in; weight/priority12) denote; of13) immersed/buried; more or less14) rather; the more15) at; stems from/comes from/originates from/results from16) base; filled in6. Show the difference between the following pairs or groups of works.1) (1) historic (2) historical (3) historical;historical 4) historic ( 5) historic2) (1) economic;economic (2) economical (3) economical (4) economical(5) economic3) (1) limited (2) limit (3) restricted (4) restricted (5) limited (6) restrict4) (1) cause (2) courses (3) course (4) course (5) cause;course5) (1) source (2) sauce (3) source (4) source;sauce6) (1) suppose/assume (2) supposed (3) presume/assume (4) Suppose(5) presumed7. Choose the best word or phrase for each blank from the four supplied in brackets.(1) only(2) tirelessly(3) motion(4) federal(5) Drastic (6) as(7) argued(8) which(9) the 1920s(10) sound(11) a(12) traced(13) lay(14) prepared(15) resulted1.Grammar in context1) (1) therefore: to show result(2) and: to add; however: to contrast(3) yet: to contrast(4) finally: to end a series(5) rather: to introduce a fact that is different from what has just been mentioned(6) at the same time: to indicate time(7) in other words: to explain(8) in the end: to conclude(9) but: to contrast; first: or enumerate; furthermore: to add(10) that is: to explain; but: to contrast; on the other hand: to contrast2. Put in appropriate transitional words or phrases.1) therefore2) therefore3) For example4) on the other hand5) furthermore 6) on the contrary7) In short8) At first, however9) Consequently10) even so11) moreover12) therefore13) however14) in fact15) that is3. Complete the sentences below with “one” or “a/an”.1) a,a2) one, a3) one4) a5) One, a 6) one7) a8) a, a9) One10) One, a11) A12) one13) a, one14) An, a15) a, an16) a, a17) One, a18) one19) One20) one4.translate the sentences1)Heroes and heroines are people with unusual qualities.2)Celebrities are people who become famous because of publicity through the media.3)In China's mainland, "sweetheart" often refers to a person's husband or wife.4)A fair-weather friend is one who will desert you as soon as you are in trouble.5)Broadly speaking, money refers to anything generally accepted in exchange for other goods and services.6)An armchair revolutionary is one who talks about revolution, but who doesn’t put what he says into practice.7)Professor Lu says that a good teacher is one who does all he/she can to make himself/herself unnecessary for the students.8)Economics is defined as the social science that deals with the production, distribution and consumption of goods and services.9)DVD is a disk on which large amounts of information, esp. photographs and video can be stored in a computer.10)The Oxford Advanced Learner's Dictionary defines "workaholic" as "a person who works most of the time and finds it difficult to stop working and do other things".5. Put in a preposition or an adverb in the blanks.1) after, in, of, from, of, down, in, for2) During/In, on3) In,except, of, with4)) for, by, with, At, to, by, With, in, up6. CCDAB DCBAC DCCAB。
Learning to communicate in another language may be challenging, but it is also a very rewarding and enriching experience. It is the best passport to discovering another culture. Here are a few tips we’ve put together to help you make the most of the experience.1. Find out what kind of learner you areAsk yourself, do I learn something better when I see it written down? Do I only need to hear something a few times before I know it? Do I like to learn grammar rules by heart? Are there other strategies that work well for me? If you recognize your strengths, you can use them to work more effectively.When learning a language, it is important to use a variety of strategies (using the book, listening to the recording, rehearsing dialogues, learning vocabulary, writing things down, listing verb forms, etc.) to practice the different skills of listening, speaking, reading and writing. Work out what your preferred learning style is and use it to your advantage.2. Work at your own paceMake the most of the time you have available to study. As a general rule, don’t try and do too much at once. You will often find you can learn more effectively if you study for half an hour or so at regular intervals, rather than try to do a whole unit in one sitting.With the best will in the world, there will be times when you don’t feel like studying. Try not to skip your session, though, and organize your work according to your mood. If you’re tired, choose less demanding tasks such as repeating activities; if you lack concentration, read a foreign magazine article about a subject which really interests you.3. Why not learn with someone else?It helps if you can learn with someone else. If you can persuade a friend or family member to study with you, it will give you extra impetus to keep working. Agree times to meet and set goals for the week, and test each other regularly.4. Remember that you can go a long way with just a little languageEven if you feel unsure about your ability to form correct, complete sentences, you’ll find that it is possible to communicate with just a few words. Above all, don’t worry about getting things wrong:people will still be able to understand you. They will alsoappreciate that you are making the effort to speak their language and will be more receptive. The more confidence you gain in actually communicating, the more fluent you’ll become.5. Don’t get stuck by a word you don’t knowPractice improvising ways of getting your meaning across when speaking spontaneously, even if you don’t know the exact words or phrases. Think of things you might want to say whenever you have spare time – while you’re traveling, for instance. A basic example is the use of tenses. If you don’t know the past tense but want to talk about yesterday, use the verb in the present tense and use the word for ‘yesterday’. With practice, you’ll find that you will improve your ability to approximate and to describe things, even if you are aware that you do not have the exact vocabulary or specific phrases. Use facial expressions, hand movements, anything to get your meaning across. The important thing is to build up your confidence so you’re not afraid of getting involved in a conversation.6. Language learning is also about intuitionGuesswork is an important strategy in learning a new language and you will probably be pleasantly surprised at how often you’re right.When listening to recorded material, you aren’t expected to understand everything first time round. If you play the same piece several times, you will most probably understand something new each time. Learn to make maximum use of all the clues you can pick up. For example, what do the speakers sound like? Happy? Angry? Calm? etc.Also, in most European languages a lot of words have a common origin, which will help you build your knowledge more quickly. After a while you should also be able to identify common patterns between English and the language you’re learning, for example French equivalents of English words ending in –ly often end in –ment.7. Speak, speak, speak!Practice speaking as often as you can – even speaking to yourself is good practice.Try recording yourself whenever you can –especially when doing the pronunciation practice activities. When you listen to it afterwards, don’t worry if you sound hesitant or have made mistakes. It is important to evaluate your performance. Compare yourpronunciation with the master version, see how you can do better and have another go. If you do this several times, you will find that each version is better than the last.Read aloud whenever possible:it will help you memorize vocabulary and structures. Going through the same dialogue several times is a good idea too.8. Build up your vocabularyA wide vocabulary is the key to successful language learning but don’t try to learn too much at once. It’s best to study frequently, for short periods of time. Take a maximum of six or seven items of vocabulary and learn them. Put them into sentences to fix them in your mind, then come back to them later. Much of the vocabulary in the course is presented by topic. Learning vocabulary in this way is usually very effective.9. Get the right toolsYou may find you want a bilingual dictionary to help translate new words and expressions as you expand your vocabulary. When choosing a dictionary, make sure you pick one that gives you plenty of information on usage through illustrative examples, rather than one which only gives translations for each word with no guidance as to which ones to use in which contexts.Alternatively, you may find a vocabulary builder useful. These are usually arranged thematically and allow you to learn lists of words and expressions associated with a particular topic.10. Get used to hearing everyday language at normal speedApart from listening to the course recordings, you could watch films with subtitles, listen to a foreign radio station, or watch foreign language TV stations if you have satellite or cable TV. Even if you don’t understand much of what is being said, it is a good way of getting used to sounds and intonations. Choose programs according to your own interests, you’ll learn much more effectively about subjects you’re keen on. International news is a good thing to listen to, particularly if you have already heard the news in English that day. Pictures will give you clues. You’ll find that you are picking up a lot of vocabulary by making use of the subtitles especially for expressions that occurregularly.11. And most of all, have fun!。
abbreviation 缩写 [省略语]ablative 夺格(的)abrupt 突发音accent 口音/{Phonetics}重音accusative 受格(的)acoustic phonetics 声学语音学acquisition 习得action verb 动作动词active 主动语态active chart parser 活动图句法剖析程序active knowledge 主动知识active verb 主动动词actor-action-goal 施事(者)-动作-目标actualization 实现(化)acute 锐音address 地址{信息科学}/称呼(语){语言学} adequacy 妥善性adjacency pair 邻对adjective 形容词adjunct 附加语 [附加修饰语]adjunction 加接adverb 副词adverbial idiom 副词词组affective 影响的affirmative 肯定(的;式)affix 词缀affixation 加缀affricate 塞擦音agent 施事agentive-action verb 施事动作动词agglutinative 胶着(性)agreement 对谐AI (artificial intelligence) 人工智能 [人工智能] AI language 人工智能语言 [人工智能语言]Algebraic Linguistics 代数语言学algorithm 算法 [算法]alienable 可分割的alignment 对照 [多国语言文章词;词组;句子翻译的] allo- 同位-allomorph 同位语素allophone 同位音位alpha notation alpha 标记alphabetic writing 拼音文字alternation 交替alveolar 齿龈音ambiguity 歧义ambiguity resolution 歧义消解ambiguous 歧义American structuralism 美国结构主义analogy 类推analyzable 可分析的anaphor 照应语 [前方照应词]animate 有生的A-not-A question 正反问句antecedent 先行词anterior 舌前音anticipation 预期 (音变)antonym 反义词antonymy 反义A-over-A A-上-A 原则apposition 同位语appositive construction 同位结构appropriate 恰当的approximant 无擦通音approximate match 近似匹配arbitrariness 任意性archiphoneme 大音位argument 论元 [变元]argument structure 论元结构 [变元结构] arrangement 配列array 数组articulatory configuration 发音结构articulatory phonetics 发音语音学artificial intelligence (AI) 人工智能 [人工智能] artificial language 人工语言ASCII 美国标准信息交换码aspect 态 [体]aspirant 气音aspiration 送气assign 指派assimilation 同化association 关联associative phrase 联想词组asterisk 标星号ATN (augmented transition network) 扩充转移网络attested 经证实的attribute 属性attributive 属性auditory phonetics 听觉语音学augmented transition network 扩充转移网络automatic document classification 自动文件分类automatic indexing 自动索引automatic segmentation 自动切分automatic training 自动训练automatic word segmentation 自动分词automaton 自动机autonomous 自主的auxiliary 助动词axiom 公理baby-talk 儿语back-formation 逆生构词(法)backtrack 回溯Backus-Naur Form 巴科斯诺尔形式 [巴科斯诺尔范式] backward deletion 逆向删略ba-construction 把─字句balanced corpus 平衡语料库base 词基Bayesian learning 贝式学习Bayesian statistics 贝式统计behaviorism 行为主义belief system 信念系统benefactive 受益(格;的)best first parser 最佳优先句法剖析器bidirectional linked list 双向串行bigram 双连词bilabial 双唇音bilateral 双边的bilingual concordancer 双语关键词前后文排序程序binary feature 双向特征[二分征性]binding 约束bit 位 [二进制制;比特]biuniqueness 双向唯一性blade 舌叶blend 省并词block 封阻[封杀]Bloomfieldian 布隆菲尔德(学派)的body language 肢体语言Boolean lattice 布尔网格 [布尔网格]borrow 借移Bottom-up 由下而上bottom-up parsing 由下而上剖析bound 附着(的)bound morpheme 附着语素 [黏着语素]boundary marker 界线标记boundary symbol 界线符号bracketing 方括号法branching 分枝法breadth-first search 广度优先搜寻 [宽度优先搜索]breath group 换气单位breathy 气息音的buffer 缓冲区byte 字节CAI (Computer Assisted Instruction) 计算机辅助教学CALL (computer assisted language learning) 计算机辅助语言学习canonical 典范的capacity 能力cardinal 基数的cardinal vowels 基本元音case 格位case frame 格位框架Case Grammar 格位语法case marking 格位标志CAT (computer assisted translation) 计算机辅助翻译cataphora 下指Categorial Grammar 范畴语法Categorial Unification Grammar 范畴连并语法 [范畴合一语法] causative 使动causative verb 使役动词causativity 使役性centralization 央元音化chain 炼chart parsing 表式剖析 [图表句法分析]checked 受阻的checking 验证Chinese character code 中文编码 [汉字代码]Chinese character code for information interchange 中文信息交换码[汉字交换码]Chinese character coding input method 中文输入法 [汉字编码输入] choice 选择Chomsky hierarchy 杭士基阶层 [Chomsky 层次结构]citation form 基本形式CKY algorithm (Cocke-Kasami-Younger) CKY 算法classifier 类别词cleft sentence 分裂句click 啧音clitic 附着词closed world assumption 封闭世界假说cluster 音群Cocke-Kasami-Younger algorithm CKY 算法coda 音节尾code conversion 代码变换cognate 同源(的;词)Cognitive Linguistics 认知语言学coherence 一致性cohesion 凝结性 [黏着性;结合力]collapse 合并collective 集合的collocation 连用语 [同现;搭配]combinatorial construction 合并结构combinatorial insertion 合并中插combinatorial word 合并词Combinatory Categorial Grammar 组合范畴语法comment 评论commissive 许诺[语行]common sense semantics 常识语意学Communication Theory 通讯理论 [通讯论;信息论]Comparative Linguistics 比较语言学comparison 比较competence 语言知能compiler 编译器complement 补语complementary 互补complementary distribution 互补分布complementizer 补语标记complex predicate 复杂谓语complex stative construction 复杂状态结构complex symbol 复杂符号complexity 复杂度component 成分compositionality 语意合成性 [合成性]compound word 复合词Computational Lexical Semantics 计算词汇语意学Computational Lexicography 计算词典编纂学Computational Linguistics 计算语言学Computational Phonetics 计算语音学Computational Phonology 计算声韵学Computational Pragmatics 计算语用学Computational Semantics 计算语意学Computational Syntax 计算句法学computer language 计算器语言computer-aided translation 计算机辅助翻译 [计算器辅助翻译]computer-assisted instruction (CAI) 计算机辅助教学computer-assisted language learning 计算机辅助语言学习[计算器辅助语言学习] concatenation 串联concept classification 概念分类concept dependency 概念依存conceptual hierarchy 概念阶层concord 谐和concordance 关键词 (前后文) 排序concordancer 关键词 (前后文) 排序的程序concurrent parsing 并行句法剖析conditional decision 条件决定 [条件决策]conjoin 连接conjunction 连接词 (合取;逻辑积;"与";连词)conjunctive 连接的connected speech 连续语言Connectionist model 类神经网络模型Connectionist model for natural language 自然语言类神经网络模型[自然语言连接模型]connotation 隐涵意义consonant 子音 [辅音]constituent 成分constituent structure tree 词组结构树constraint 限制constraint propagation 限制条件的传递 [限定因素增殖]constraint-based grammar formalism 限制为本的语法形式Construct Grammar 句构语法content word 实词context 语境context-free language 语境自由语言 [上下文无关语言]context-sensitive language 语境限定语言 [上下文有关语言;上下文敏感语言] continuant 连续音continuous speech recognition 连续语音识别contraction 缩约control agreement principle 控制一致原理control structure 控制结构control theory 控制论convention 约定俗成[规约]convergence 收敛[趋同现象]conversational implicature 会话含义converse 相反(词;的)cooccurrence relation 共现关系 [同现关系]co-operative principle 合作原则coordination 对称连接词 [同等;并列连接]copula 系词co-reference 同指涉 [互指]co-referential 同指涉coronal 前舌音corpora 语料库corpus 语料库Corpus Linguistics 语料库语言学corpus-based learning 语料库为本的学习correlation 相关性counter-intuitive 违反语感的courseware 课程软件 [课件]coverb 动介词C-structure 成分结构data compression 数据压缩 [数据压缩]data driven analysis 资料驱动型分析 [数据驱动型分析]data structure 数据结构 [数据结构]database 数据库 [数据库]database knowledge representation 数据库知识表示 [数据库知识表示]data-driven 资料驱动 [数据驱动]dative 与格declarative knowledge 陈述性知识decomposition 分解deductive database 演译数据库 [演译数据库]default 默认值 [默认;缺省]definite 定指Definite Clause Grammar 确定子句语法definite state automaton 有限状态自动机Definite State Grammar 有限状态语法definiteness 定指degree adverb 程度副词degree of freedom 自由度deixis 指示delimiter 定界符号 [定界符]denotation 外延denotic logic 符号逻辑dependency 依存关系Dependency Grammar 依存关系语法dependency relation 依存关系depth-first search 深度优先搜寻derivation 派生derivational bound morpheme 派生性附着语素Descriptive Grammar 描述型语法 [描写语法]Descriptive Linguistics 描述语言学 [描写语言学] desiderative 意愿的determiner 限定词deterministic algorithm 决定型算法 [确定性算法] deterministic finite state automaton 决定型有限状态机deterministic parser 决定型语法剖析器 [确定性句法剖析程序] developmental psychology 发展心理学Diachronic Linguistics 历时语言学diacritic 附加符号dialectology 方言学dictionary database 辞典数据库 [词点数据库]dictionary entry 辞典条目digital processing 数字处理 [数值处理]diglossia 双言digraph 二合字母diminutive 指小词diphone 双连音directed acyclic graph 有向非循环图disambiguation 消除歧义 [歧义消除]discourse 篇章discourse analysis 篇章分析 [言谈分析]discourse planning 篇章规划Discourse Representation Theory 篇章表征理论 [言谈表示理论] discourse strategy 言谈策略discourse structure 言谈结构discrete 离散的disjunction 选言dissimilation 异化distributed 分布式的distributed cooperative reasoning 分布协调型推理distributed text parsing 分布式文本剖析disyllabic 双音节的ditransitive verb 双宾动词 [双宾语动词;双及物动词] divergence 扩散[分化]D-M (Determiner-Measure) construction 定量结构D-N (determiner-noun) construction 定名结构document retrieval system 文件检索系统 [文献检索系统] domain dependency 领域依存性 [领域依存关系]double insertion 交互中插double-base 双基downgrading 降级dummy 虚位duration 音长{语音学}/时段{语法学/语意学}dynamic programming 动态规划Earley algorithm Earley 算法echo 回声句egressive 呼气音ejective 紧喉音electronic dictionary 电子词典elementary string 基本字符串 [基本单词串]ellipsis 省略EM algorithm EM算法embedding 崁入emic 功能关系的empiricism 经验论Empty Category Principle 虚范畴原则 [空范畴原理]empty word 虚词enclitics 后接成份end user 终端用户 [最终用户]endocentric 同心的endophora 语境照应entailment 蕴涵entity 实体entropy 熵entry 条目episodic memory 情节性记忆epistemological network 认识论网络ergative verb 作格动词ergativity 作格性Esperando 世界语etic 无功能关系etymology 词源学event 事件event driven control 事件驱动型控制example-based machine translation 以例句为本的机器翻译exclamation 感叹exclusive disjunction 排它性逻辑 “或”experiencer case 经验者格expert system 专家系统extension 外延external argument 域外论元extraposition 移外变形 [外置转换]facility value 易度值feature 特征feature bundle 特征束feature co-occurrence restriction 特征同现限制 [特性同现限制] feature instantiation 特征体现feature structure 特征结构 [特性结构]feature unification 特征连并 [特性合一]feedback 回馈felicity condition 妥适条件file structure 档案结构finite automaton 有限状态机 [有限自动机]finite state 有限状态Finite State Morphology 有限状态构词法 [有限状态词法]finite-state automata 有限状态自动机finite-state language 有限状态语言finite-state machine 有限状态机finite-state transducer 有限状态置换器flap 闪音flat 降音foreground information 前景讯息 [前景信息]Formal Language Theory 形式语言理论Formal Linguistics 形式语言学Formal Semantics 形式语意学forward inference 前向推理 [向前推理]forward-backward algorithm 前前后后算法frame 框架frame based knowledge representation 框架型知识表示Frame Theory 框架理论free morpheme 自由语素Fregean principle Fregean 原则fricative 擦音F-structure 功能结构full text searching 全文检索function word 功能词Functional Grammar 功能语法functional programming 函数型程序设计 [函数型程序设计]functional sentence perspective 功能句子观functional structure 功能结构functional unification 功能连并 [功能合一]functor 功能符fundamental frequency 基频garden path sentence 花园路径句GB (Government and Binding) 管辖约束geminate 重叠音gender 性Generalized Phrase Structure Grammar 概化词组结构语法 [广义短语结构语法] Generative Grammar 衍生语法Generative Linguistics 衍生语言学 [生成语言学]generic 泛指genetic epistemology 发生认识论genetive marker 属格标记genitive 属格gerund 动名词Government and Binding Theory 管辖约束理论GPSG (Generalized Phrase Structure Grammar) 概化词组结构语法[广义短语结构语法]gradability 可分级性grammar checker 文法检查器grammatical affix 语法词缀grammatical category 语法范畴grammatical function 语法功能grammatical inference 文法推论grammatical relation 语法关系grapheme 字素haplology 类音删略head 中心语head driven phrase structure 中心语驱动词组结构 [中心词驱动词组结构] head feature convention 中心语特征继承原理 [中心词特性继承原理] Head-Driven Phrase Structure Grammar 中心语驱动词组结构律heteronym 同形heuristic parsing 经验式句法剖析Heuristics 经验知识hidden Markov model 隐式马可夫模型hierarchical structure 阶层结构 [层次结构]holophrase 单词句homograph 同形异义词homonym 同音异义词homophone 同音词homophony 同音异义homorganic 同部位音的Horn clause Horn 子句HPSG (Head-Driven Phrase Structure Grammar) 中心语驱动词组结构语法human-machine interface 人机界面hypernym 上位词hypertext 超文件 [超文本]hyponym 下位词hypotactic 主从结构的IC (immediate constituent) 直接成份ICG (Information-based Case Grammar) 讯息为本的格位语法idiom 成语 [熟语]idiosyncrasy 特异性illocutionary 施为性immediate constituent 直接成份imperative 祈使句implicative predicate 蕴含谓词implicature 含意indexical 标引的indirect object 间接宾语indirect speech act 间接言谈行动 [间接言语行为]Indo-European language 印欧语言inductional inference 归纳推理inference machine 推理机器infinitive 不定词 [to 不定式]infix 中缀inflection/inflexion 屈折变化inflectional affix 屈折词缀information extraction 信息撷取information processing 信息处理 [信息处理]information retrieval 信息检索Information Science 信息科学 [信息科学; 情报科学] Information Theory 信息论 [信息论]inherent feature 固有特征inherit 继承inheritance 继承inheritance hierarchy 继承阶层 [继承层次]inheritance of attribute 属性继承innateness position 语法天生假说insertion 中插inside-outside algorithm 里里外外算法instantiation 体现instrumental (case) 工具格integrated parser 集成句法剖析程序integrated theory of discourse analysis 篇章分析综合理论[言谈分析综合理论]intelligence intensive production 知识密集型生产intensifier 加强成分intensional logic 内含逻辑Intensional Semantics 内涵语意学intensional type 内含类型interjection/exclamation 感叹词inter-level 中间成分interlingua 中介语言interlingual 中介语(的)interlocutor 对话者internalise 内化International Phonetic Association (IPA) 国际语音学会internet 网际网络Interpretive Semantics 诠释性语意学intonation 语调intonation unit (IU) 语调单位IPA (International Phonetic Association) 国际语音学会IR (information retrieval) 信息检索IS-A relation IS-A 关系isomorphism 同形现象IU (intonation unit) 语调单位junction 连接keyword in context 上下文中关键词[上下文内关键词] kinesics 体势学knowledge acquisition 知识习得knowledge base 知识库knowledge based machine translation 知识为本之机器翻译knowledge extraction 知识撷取 [知识题取]knowledge representation 知识表示KWIC (keyword in context) 关键词前后文 [上下文内关键词] label 卷标labial 唇音labio-dental 唇齿音labio-velar 软颚唇音LAD (language acquisition device) 语言习得装置lag 发声延迟language acquisition 语言习得language acquisition device 语言习得装置language engineering 语言工程language generation 语言生成language intuition 语感language model 语言模型language technology 语言科技left-corner parsing 左角落剖析 [左角句法剖析]lemma 词元lenis 弱辅音letter-to-phone 字转音lexeme 词汇单位lexical ambiguity 词汇歧义lexical category 词类lexical conceptual structure 词汇概念结构lexical entry 词项lexical entry selection standard 选词标准lexical integrity 词语完整性Lexical Semantics 词汇语意学Lexical-Functional Grammar 词汇功能语法Lexicography 词典学Lexicology 词汇学lexicon 词汇库 [词典;词库]lexis 词汇层LF (logical form) 逻辑形式LFG (Lexical-Functional Grammar) 词汇功能语法liaison 连音linear bounded automaton 线性有限自主机linear precedence 线性次序lingua franca 共通语linguistic decoding 语言译码linguistic unit 语言单位linked list 串行loan 外来语local 局部的localism 方位主义localizer 方位词locus model 轨迹模型locution 惯用语logic 逻辑logic array network 逻辑数组网络logic programming 逻辑程序设计 [逻辑程序设计] logical form 逻辑形式logical operator 逻辑算子 [逻辑算符]Logic-Based Grammar 逻辑为本语法 [基于逻辑的语法] long term memory 长期记忆longest match principle 最长匹配原则 [最长一致法] LR (left-right) parsing LR 剖析machine dictionary 机器词典machine language 机器语言machine learning 机器学习machine translation 机器翻译machine-readable dictionary (MRD) 机读辞典Macrolinguistics 宏观语言学Markov chart 马可夫图Mathematical Linguistics 数理语言学maximum entropy 最大熵M-D (modifier-head) construction 偏正结构mean length of utterance (MLU) 语句平均长度measure of information 讯习测度 [信息测度] memory based 根据记忆的mental lexicon 心理词汇库mental model 心理模型mental process 心理过程 [智力过程;智力处理] metalanguage 超语言metaphor 隐喻metaphorical extension 隐喻扩展metarule 律上律 [元规则]metathesis 语音易位Microlinguistics 微观语言学middle structure 中间式结构minimal pair 最小对Minimalist Program 微言主义MLU (mean length of utterance) 语句平均长度modal 情态词modal auxiliary 情态助动词modal logic 情态逻辑modifier 修饰语Modular Logic Grammar 模块化逻辑语法modular parsing system 模块化句法剖析系统modularity 模块性(理论)module 模块monophthong 单元音monotonic 单调monotonicity 单调性Montague Grammar 蒙泰究语法 [蒙塔格语法]mood 语气morpheme 词素morphological affix 构词词缀morphological decomposition 语素分解morphological pattern 词型morphological processing 词素处理morphological rule 构词律 [词法规则] morphological segmentation 语素切分Morphology 构词学Morphophonemics 词音学 [形态音位学;语素音位学] morphophonological rule 形态音位规则Morphosyntax 词句法Motor Theory 肌动理论movement 移位MRD (machine-readable dictionary) 机读辞典MT (machine translation) 机器翻译multilingual processing system 多语讯息处理系统multilingual translation 多语翻译multimedia 多媒体multi-media communication 多媒体通讯multiple inheritance 多重继承multistate logic 多态逻辑mutation 语音转换mutual exclusion 互斥mutual information 相互讯息nativist position 语法天生假说natural language 自然语言natural language processing (NLP) 自然语言处理natural language understanding 自然语言理解negation 否定negative sentence 否定句neologism 新词语nested structure 套结构network 网络neural network 类神经网络Neurolinguistics 神经语言学neutralization 中立化n-gram n-连词n-gram modeling n-连词模型NLP (natural language processing) 自然语言处理node 节点nominalization 名物化nonce 暂用的non-finite 非限定non-finite clause 非限定式子句non-monotonic reasoning 非单调推理normal distribution 常态分布noun 名词noun phrase 名词组NP (noun phrase) completeness 名词组完全性object 宾语{语言学}/对象{信息科学}object oriented programming 对象导向程序设计 [面向对向的程序设计] official language 官方语言one-place predicate 一元述语on-line dictionary 线上查询词典 [联机词点]onomatopoeia 拟声词onset 节首音ontogeny 个体发生Ontology 本体论open set 开放集operand 操作数 [操作对象]optimization 最佳化 [最优化]overgeneralization 过度概化overgeneration 过度衍生paradigmatic relation 聚合关系paralanguage 附语言parallel construction 并列结构Parallel Corpus 平行语料库parallel distributed processing (PDP) 平行分布处理paraphrase 转述 [释意;意译;同意互训]parole 言语parser 剖析器 [句法剖析程序]parsing 剖析part of speech (POS) 词类particle 语助词PART-OF relation PART-OF 关系part-of-speech tagging 词类标注pattern recognition 型样识别P-C (predicate-complement) insertion 述补中插PDP (parallel distributed processing) 平行分布处理perception 知觉perceptron 感觉器 [感知器]perceptual strategy 感知策略performative 行为句periphrasis 用独立词表达perlocutionary 语效性的permutation 移位Petri Net Grammar Petri 网语法philology 语文学phone 语音phoneme 音素phonemic analysis 因素分析phonemic stratum 音素层Phonetics 语音学phonogram 音标Phonology 声韵学 [音位学;广义语音学]Phonotactics 音位排列理论phrasal verb 词组动词 [短语动词]phrase 词组 [短语]phrase marker 词组标记 [短语标记]pitch 音调pitch contour 调形变化Pivot Grammar 枢轴语法pivotal construction 承轴结构plausibility function 可能性函数PM (phrase marker) 词组标记 [短语标记]polysemy 多义性POS-tagging 词类标记postposition 方位词PP (preposition phrase) attachment 介词依附Pragmatics 语用学Precedence Grammar 优先级语法precision 精确度predicate 述词predicate calculus 述词计算predicate logic 述词逻辑 [谓词逻辑]predicate-argument structure 述词论元结构prefix 前缀premodification 前置修饰preposition 介词Prescriptive Linguistics 规定语言学 [规范语言学]presentative sentence 引介句presupposition 前提Principle of Compositionality 语意合成性原理privative 二元对立的probabilistic parser 概率句法剖析程序problem solving 解决问题program 程序programming language 程序设计语言 [程序设计语言]proofreading system 校对系统proper name 专有名词prosody 节律prototype 原型pseudo-cleft sentence 准分裂句Psycholinguistics 心理语言学punctuation 标点符号pushdown automata 下推自动机pushdown transducer 下推转换器qualification 后置修饰quantification 量化quantifier 范域词Quantitative Linguistics 计量语言学question answering system 问答系统queue 队列radical 字根 [词干;词根;部首;偏旁]radix of tuple 元组数基random access 随机存取rationalism 理性论rationalist (position) 理性论立场 [唯理论观点]reading laboratory 阅读实验室real time 实时real time control 实时控制 [实时控制]recursive transition network 递归转移网络reduplication 重叠词 [重复]reference 指涉referent 指称对象referential indices 指针referring expression 指涉词 [指示短语]register 缓存器 [寄存器]{信息科学}/调高{语音学}/语言的场合层级{社会语言学} regular language 正规语言 [正则语言]relational database 关系型数据库 [关系数据库]relative clause 关系子句relaxation method 松弛法relevance 相关性Restricted Logic Grammar 受限逻辑语法resumptive pronouns 复指代词retroactive inhibition 逆抑制rewriting rule 重写规则rheme 述位rhetorical structure 修辞结构rhetorics 修辞学robust 强健性robust processing 强健性处理robustness 强健性schema 基朴school grammar 教学语法scope 范域 [作用域;范围]script 脚本search mechanism 检索机制search space 检索空间searching route 检索路径 [搜索路径]second order predicate 二阶述词segmentation 分词segmentation marker 分段标志selectional restriction 选择限制semantic field 语意场semantic frame 语意架构semantic network 语意网络semantic representation 语意表征 [语义表示]semantic representation language 语意表征语言semantic restriction 语意限制semantic structure 语意结构Semantics 语意学sememe 意素Semiotics 符号学sender 发送者sensorimotor stage 感觉运动期sensory information 感官讯息 [感觉信息]sentence 句子sentence generator 句子产生器 [句子生成程序]sentence pattern 句型separation of homonyms 同音词区分sequence 序列serial order learning 顺序学习serial verb construction 连动结构set oriented semantic network 集合导向型语意网络 [面向集合型语意网络] SGML (Standard Generalized Markup Language) 结构化通用标记语言shift-reduce parsing 替换简化式剖析short term memory 短程记忆sign 信号signal processing technology 信号处理技术simple word 单纯词situation 情境Situation Semantics 情境语意学situational type 情境类型social context 社会环境sociolinguistics 社会语言学software engineering 软件工程 [软件工程]sort 排序speaker-independent speech recognition 非特定语者语音识别spectrum 频谱speech 口语speech act assignment 言语行为指定speech continuum 言语连续体speech disorder 语言失序 [言语缺失]speech recognition 语音辨识speech retrieval 语音检索speech situation 言谈情境 [言语情境]speech synthesis 语音合成speech translation system 语音翻译系统speech understanding system 语音理解系统spreading activation model 扩散激发模型standard deviation 标准差Standard Generalized Markup Language 标准通用标示语言start-bound complement 接头词state of affairs algebra 事态代数state transition diagram 状态转移图statement kernel 句核static attribute list 静态属性表statistical analysis 统计分析Statistical Linguistics 统计语言学statistical significance 统计意义stem 词干stimulus-response theory 刺激反应理论stochastic approach to parsing 概率式句法剖析 [句法剖析的随机方法] stop 爆破音Stratificational Grammar 阶层语法 [层级语法]string 字符串[串;字符串]string manipulation language 字符串操作语言string matching 字符串匹配 [字符串]structural ambiguity 结构歧义Structural Linguistics 结构语言学structural relation 结构关系structural transfer 结构转换structuralism 结构主义structure 结构structure sharing representation 结构共享表征subcategorization 次类划分 [下位范畴化]subjunctive 假设的sublanguage 子语言subordinate 从属关系subordinate clause 从属子句 [从句;子句]subordination 从属substitution rule 代换规则 [置换规则]substrate 底层语言suffix 后缀superordinate 上位的superstratum 上层语言suppletion 异型[不规则词型变化] suprasegmental 超音段的syllabification 音节划分syllable 音节syllable structure constraint 音节结构限制symbolization and verbalization 符号化与字句化synchronic 同步的synonym 同义词syntactic category 句法类别syntactic constituent 句法成分syntactic rule 语法规律 [句法规则]Syntactic Semantics 句法语意学syntagm 句段syntagmatic 组合关系 [结构段的;组合的]Syntax 句法Systemic Grammar 系统语法tag 标记target language 目标语言 [目标语言]task sharing 课题分享 [任务共享]tautology 套套逻辑 [恒真式;重言式;同义反复] taxonomical hierarchy 分类阶层 [分类层次] telescopic compound 套装合并template 模板temporal inference 循序推理 [时序推理] temporal logic 时间逻辑 [时序逻辑]temporal marker 时貌标记tense 时态terminology 术语text 文本text analyzing 文本分析text coherence 文本一致性text generation 文本生成 [篇章生成]Text Linguistics 文本语言学text planning 文本规划text proofreading 文本校对text retrieval 文本检索text structure 文本结构 [篇章结构]text summarization 文本自动摘要 [篇章摘要]text understanding 文本理解text-to-speech 文本转语音thematic role 题旨角色thematic structure 题旨结构theorem 定理thesaurus 同义词辞典theta role 题旨角色theta-grid 题旨网格token 实类 [标记项]tone 音调tone language 音调语言tone sandhi 连调变换top-down 由上而下 [自顶向下]topic 主题topicalization 主题化 [话题化]trace 痕迹Trace Theory 痕迹理论training 训练transaction 异动 [处理单位]transcription 转写 [抄写;速记翻译]transducer 转换器transfer 转移transfer approach 转换方法transfer framework 转换框架transformation 变形 [转换]Transformational Grammar 变形语法 [转换语法]transitional state term set 转移状态项集合transitivity 及物性translation 翻译translation equivalence 翻译等值性translation memory 翻译记忆transparency 透明性tree 树状结构 [树]Tree Adjoining Grammar 树形加接语法 [树连接语法]treebank 树图数据库[语法关系树库]trigram 三连词t-score t-数turing machine 杜林机 [图灵机]turing test 杜林测试 [图灵试验]type 类型type/token node 标记类型/实类节点type-feature structure 类型特征结构typology 类型学ultimate constituent 终端成分unbounded dependency 无界限依存underlying form 基底型式underlying structure 基底结构unification 连并 [合一]Unification-based Grammar 连并为本的语法 [基于合一的语法] Universal Grammar 普遍性语法universal instantiation 普遍例式universal quantifier 全称范域词unknown word 未知词 [未定义词]unrestricted grammar 非限制型语法usage flag 使用旗标user interface 使用者界面 [用户界面]Valence Grammar 结合价语法Valence Theory 结合价理论valency 结合价variance 变异数 [方差]verb 动词verb phrase 动词组 [动词短语]verb resultative compound 动补复合词verbal association 词语联想verbal phrase 动词组verbal production 言语生成vernacular 本地话V-O construction (verb-object) 动宾结构vocabulary 字汇vocabulary entry 词条vocal track 声道vocative 呼格voice recognition 声音辨识 [语音识别]vowel 元音vowel harmony 元音和谐 [元音和谐]waveform 波形weak verb 弱化动词Whorfian hypothesis Whorfian 假说word 词word frequency 词频word frequency distribution 词频分布word order 词序word segmentation 分词word segmentation standard for Chinese 中文分词规范word segmentation unit 分词单位 [切词单位]word set 词集working memory 工作记忆 [工作存储区]world knowledge 世界知识writing system 书写系统X-Bar Theory X标杠理论 ["x"阶理论]Zipf's Law 利夫规律 [齐普夫定律]阅读。
Approximate Grammar for Information ExtractionAbstractIn this paper, we present the concept of Approximate grammar and how it can be used to extract information from a document. As the structure of informational strings cannot be defined well in a document, we cannot use the conventional grammar rules to represent the information. Hence, the need arises to design an approximate grammar than can be used effectively to accomplish the task of Information extraction. Approximate grammars are a novel step in this direction. The rules of an approximate grammar can be given by a user or the machine can learn the rules from an annotated document. We have performed our experiments in both the above areas and the results have been impressive.IntroductionThis paper proposes the idea of approximate grammars and its usage towards information extraction. An informational string is a sequence of words that convey definite information. For example: The string “India won by 4 wickets at Old Trafford” tells us about a match result. A document contains several informational strings that should be extracted. For example: a document on cricket would contain several informational strings like “the match would be held tomorrow at Lords”, “the India-Pakistan match ended in a draw”, “Sachin Tendulkar made 67 in 60 balls”, etc.The information extraction system consists of the following two stages,1. Extraction of atomic information or probe. This atomic information consists of annotated texts, which fall into categories specific to the domain in consideration. For example: player-name, runs, wickets, venue, etc. could be the probes for Cricket.2. Getting informational strings from the atomicinformation or grouping. A document from which information is to be extracted is given to the probe, it first tags atomic information. (Sangal and Bansal, 2001)(Fig1. Information extraction system)The second stage groups the atomic information into several groups, which represent the extracted information. To group these atomic units, we are required to know the structure of the document and have the knowledge of all the possible informational strings. The permitted informational strings can be captured by a grammar.When a document is parsed according to the rules of the grammar, several parses are possible. A normal context-free grammar does not provide a mechanism to select the most appropriate parse from all thepossible parses. To make that possible, we need to attach a probability with every rule of our grammar. Also, in an information string we can have redundant words, words that do not contribute to the informational string. These words can be called noise. In an informational string, apart from the normal atomic information or probes, we can also have patterns that can help us to extract the informational string. These patterns can be called atomic patterns.In a document the informational units are usually close to each other. The maximum length of noise gives us the maximum possible distance between two informational units.From all the above specifications that are useful to extract information, arises the need to have a new type of grammar. Approximate grammar incorporates all the above features and gives us the power, which can be used to do information extraction efficiently.Structure of Approximate GrammarAt this point, one could visualize approximate grammar as a set of probabilistic CFG rules with additional features, which are explained in the later part of the paper. The rule of an approximate grammar can take the following form.non-terminal = terminals | non-terminals | noise | probability of the ruleterminal: probe | string | root“|” denotes OR.Features of Approximate GrammarProbability of a Rule:Probability associated with a particular rule helps the information extraction system to resolve the ambiguity that arises because of several possible parses. For example, let’s take a part of a document,“<name> Sachin Tendulkar </name> made <runs> 55 runs </runs> in that match played at <location> Eden Gardens </location> .The match saw a total of <runs> 300 runs</runs> being made and <wickets> 5 wickets </wickets> falling. “Say, we have two rules:1> IMP -> {name} made {runs} {location}2> IMP -> {location} saw {runs} {wickets}IMP: Important informational stringThe above rules would give us two parses.The probabilities of the rules attached to each of the above rule would give us the score of the final parse. The parse with the highest score would be selected to give the appropriate informational strings.Atomic strings or Words:The atomic strings or words make a rule more specific and thus increases the precision with which an informational string is extracted.For example, considering a part of the document: “<name> Kapil Dev </name> bowled <balls> 5 </balls>”The above string would be extracted using both the following rules,IMP -> {name} {balls}IMP -> {name} bowled {balls}But the second rule is stricter than the first rule and represents the information more precisely.Noise:Noise is defined as the pattern in an informational string, which has no relevance other than fact that it appears with the atomic units. So, between any two atomic units of an informational string, noise is permitted by the grammar.For example:Let, SN -> noiseThen, we give a rule asIMP -> {name} SN bowled SN {balls}We can specify the minimum and maximum lengthsof the noise; this tells us how near or far can the two atomic units or atomic patterns in an informational string are. Roughly speaking, noise cushions the occurrence of the terminals and non-terminals of a grammar production.So, ifSN -> noise; minlength=0, maxlength=20Then, there could be a maximum of 20 characters between {name} and “bowled” as well as between “bowled” and {balls} in the above rule as shown seperated by a semicolon.Terminals:The terminals can be divided into either of the following categories. The attribute ‘cat’ denotes the category.a) Probe: Type of the annotated stringsrepresenting the atomic information.Example: <probe cat=”player_name”>b) String: A particular string the user isexpecting to be in the extracted information.Example: <string cat=”cent” word=”century”>c) Root: A special type of type “string” wherethe user can specify the root-word. Words inthe target document having themorphological root as mentioned fall into thiscategory.Example: <root cat=”score” r-word=”make” |”hit” | ”score” > Then the rule: IMP-> player_name score cent; where ‘player_name’, ‘score’ and ‘cent’ are the category names of the terminals as shown above, would capture the informational string,“<player_name>Tendulkar</player_name><score>made</score> a <cent>century</cent>.Note: For the sake of simplicity, NOISE is not mentioned in the above rule. When NOISE is not mentioned, it is assumed to be there by the parser.Scoring a Parse:Each terminal is associated with a particular confidence with which it has been extracted. Scoring a parse tree takes into account the probability of the rule and the scores of its child nodes.Score of a terminal =Confidence with which a terminal is extracted.Score of a node of a parse tree =Prob. Of rule * Average score of Child nodesExample:Given P-CFG’sS-> NP VP ; prob=0.9NP -> det n ; prob=0.7VP -> v det n ; prob=0.6And for the sentence“The old man the boats”Rules from the dictionaryDet -> “the” ; conf= 0.9Adj -> “old” ; conf=0.6N -> “old” ; conf=0.3N -> “man” ; conf=0.7V -> “man” ; conf=0.25N -> “boats”; conf=0.8Generated Tree(Figure2. Generated Parse Tree for the given example) Advantages of using Approximate Grammars Resolving Ambiguity:The score of a given parse gives the information extraction system, the power to select the appropriate parse from the several possible parses.The following examples explain the above:• Consider a piece of document,“B.A. Ambedkar is highly honored in India.”The probes would tag the above in twopossible ways.One interpretation is,“<degree> B.A</degree> Ambedkar ishighly honored in <country> India</country>”Note that the production,IMP -> {degree} honored {country} ; is nota faulty production because it is valid for thefollowing sentence: “<degree> M.Tech</degree> is honored in <country> India</country> .Another interpretation is,“<name>B.A.Ambedkar</name> is highlyhonored in <country>India</country>”After training a sufficiently large corpus, onewould find out that the probability of theproduction: IMP->{name}honored {country}would have a higher probability than theproduction: IMP->{degree}honored{country}Then, the first interpretation of the probes istaken to be more appropriate, assuming thatthe confidence of extraction of atomic units isthe same.• Given the rules,IMP -> {country} {name}Prob=0.1IMP -> {name} {country}Prob=0.3Then for the piece of document,“<name> Sachin Tendulkar </name> playsfor <country> India </country> that is beingsponsored by <name> Mr. Sahara </name>”.The informational string would beSachin Tendulkar plays for India.Proximity of Atomic Units:In a document, the smaller units that build a bigger unit should be close to each other. The measure of maximum length of noise helps the system to find for units that are close to one another.For example:Given the rules,SN -> noise; minlength=0, maxlength=10IMP -> {name} SN {runs} SN {location}And the piece of a document,“<name> Sachin Tendulkar </name> made <runs> a duck </runs>. Hope he does well in <location> Calcutta </location>.”The above rule will not extract any string from the above piece of document because the length of noise between the runs and location is 22 while the maximum permitted distance is 10.Machine Learning:The machine can be used to learn the rules of approximate grammar by analyzing an annotated text. The probability of the rules and the maximum possible noise between any two atomic units is also learned. These learned rules are then applied to an unannotated text and the informational strings corresponding to the learnt rules are extracted. Implementation of Approximate GrammarsChart Parsing: Chart Parsing is employed to parse the text according to the rules mentioned. A bottom-up approach is implemented. The algorithm of chart parsing provides an efficient way of obtaining multiple parsesThe advantages of using Chart Parsing are:• It avoids multiplication of effort.• It provides a compact representation for ‘local ambiguity’• It provides a representation for ‘partial parses’Moreover, chart parsing provides a general framework in which alternative parsing and search strategies can be compared.Machine Learning to generate approximategrammarsIn this part of the paper, we discuss how we achieved the generation of an approximate grammar by learning patterns from a set of tagged documents. Depending on the value of a parameter, the generated rules can be mapped to the extent of generality of the patterns exhibited in the document.a) Clustering of the patternsAs a primary step, the patterns of the informational strings given are clustered on the same sequence of their tags. Now, for each individual cluster, we aimto generate probabilistic grammar rules whose sum adds up to 1.b) Generating grammars for a specific clusterNow that we have the set of informational strings following a specific pattern of occurrence of the atomic probes, we compute the TRI-data for the cluster. This data contains the TRI-frequency of every consecutive atomic probes. The TRI-frequency can be defined as the number of times the sub-pattern <probe1> "string" <probe2> has occurred in the cluster. The new set of stricter rules is generatedby taking combinations of the various tag offsets found in the cluster. Note that , in the above mentioned general pattern, “string” can be null. Consider the following example,1. <IMP> <name>Dravid</name> hit <runs>67 runs</runs> in the match </IMP>.2. <IMP> <name>Sachin</name> made <runs>56</runs> before dawn </IMP>.3. <IMP> <name>Laxman</name> made <runs>34</runs> at the end of the innings </IMP>. Corresponding TRI data.<name> "hit" <runs> - 1<name> "made" <runs> - 2<runs> "match" </IMP> - 1<runs> "dawn" </IMP> - 1<runs> "innings" </IMP> - 1Generated CFG’s1.IMP -> <name> NOISE <runs> prob=0.52.IMP -> <name> "hit" <runs> "match“ prob=0.05553.IMP -> <name> "hit" <runs> "dawn“ prob=0.05554.IMP -> <name> "hit" <runs> "innings“ prob=0.05555.IMP -> <name> "made" <runs> "match" prob=0.11116.IMP -> <name> "made" <runs> "dawn" prob=0.11117.IMP -> <name> "made" <runs> "innings" prob=0.1111Note: For the sake of simplicity, NOISE is not mentioned in the above rule. When NOISE is not mentioned, it is assumed to be there by the parser. (Fig 3. Generating P-CFG’s by machine learning) RHO: An important parameter in P-CFG generation, which defines a threshold below which the observed patterns in question will be discarded because of their low frequency. Disposal of such patterns is important because these patterns give rise to ambiguous productions, thereby increasing the execution time.We can extract the CFG’s corresponding to the patterns observed based on a scale of generality. This can be done by filtering the TRI’s above a threshold “RHO”. Increasing the value of this parameter results in the generation of the patterns, which are more prevalent in the texts observed. Based on the tagged documents fed as input and the nature of the extracted information, we can fine-tune this parameter to achieve more concise and relevant results.Manually specifying the rulesThe user has the flexibility to express his idea of the information, which is relevant to him through this framework of approximate grammars. In this way, the user can instruct the machine to specifically extract a particular pattern with ease.Experiments and Results Experiments were carried out in the domain of Cricket. The texts were taken from website. For obtaining the annotated texts, we have made use of filters which were developed on the cricket domain. The different filters used are : player_name, runs , wickets, balls , place, tournament, team. Training was done on 80%, and testing on the remaining 20%.We were particularly interested in the outcome and other statistical details of the matches (individual scores of individual players, bowling performancesof bowlers, partnerships, results etc.) and so we considered only the annotated parts where such information was present. Comments and other trivial informational strings regarding players were considered unimportant in our experimentation.EvaluationMachine generated rules captured the essential information in the domain of cricket. Table 1. illustrates the rules specifying just the probes as non-terminals on the right-hand side of the production. Stricter versions of some of the rules might be having strings/roots as the terminals on the right hand side of the production and are listed at the end of the table 1. The probabilities of the rules mentioned are not stated because they assume different values depending on the value of RHO. Note: For the sake of simplicity, NOISE is not mentioned between the probe non-terminals of the rules given below, but it is assumed to be there by the parser.IMP: Extracted rules.IMP-> team team runs runs tment venueIMP-> team player wickets runs player team runsIMP-> player runs balls balls team playerIMP-> runs balls player wickets player player wicketsIMP-> team runs wickets player runs player runsIMP-> team runs team venueIMP-> team venue team runs player wickets runs player wickets runsIMP-> team runs player runs player runsIMP-> player runs runs team runsIMP-> team team wickets runs player runs runsIMP-> player wickets runs player wickets runs team runs IMP-> team runs wickets team player runsIMP-> player runs player runs team ballsIMP-> player runs runs player runs team runs wickets team tment venueIMP-> player runs team wickets player playerIMP-> player runs player runs team runs runs runsIMP-> player team tment runs balls team runs team runs IMP-> player team runs wickets player player wicketsIMP-> player balls runsIMP-> player runs player runsIMP-> player wicketsIMP-> player “haul” team “beat” teamIMP-> player “help” team “beat” teamIMP-> team “regroup” player “bowl” player “ball”IMP-> team “regroup” player “paceman” player “ball”IMP-> team “open” player “bowl” player “ball”IMP-> player “hit” runs “keeper” player “wack” runs (Table 1: Machine generated rules for the domain of cricket) Evaluation Metric:Correctly extracted INFPrecision = ------------------------------Total extracted INFCorrectlyextractedINFRecall = --------------------------------------Total number of correct INFINF: Informational StringsRHO Precision Recall0.30 78 98.60.25 77.8 1000.20 76.4 99.10.15 75.4 99.3(Table 2: Results of the experiments)Table.2 shows the results obtained after extracting the informational strings in the rest 20% of the corpus.Error AnalysisThe misclassification of the probes in the training set brought down the precision to some extent.Consider the piece of misclassified text:<player_name> Hosts Hyderabad </player_name> bundled out <place> Kerala </place> at the <player_name>Gymkhana </player_name> grounds.The above annotated text would generate the production:IMP->player_name place player_name;And eventually, the above production would extract the informational string: <player_name> Patel </player_name> stunning debut at the <place> Lords </place> came as a blow to <player_name> Nasser Hussain </player_name> .The extracted informational string was not considered to be a valid one because we were specifically looking for outcome and other statistical details in a match played.By increasing the value of RHO, we could eliminate the faulty productions generated after machine learning of the training set. Unfortunately, in this process we also lost some important productions because the frequency of some patterns in the training set which occurred in the testing set were very less.Cases where RHO value was less, some valid informational strings were later discarded in the parsing process because of their low score value in the parsing process. This is because another ambiguous competitor of that production had higher frequency in the training set.ConclusionIn this paper, we have proposed a framework to extract information efficiently and conveniently, considering the user’s specifications and the loosely structured document in view. The machine learning algorithm to learn the rules of approximate grammars is giving good results as experimented. Future WorkWe would like to categorize the informational strings into separate categories and train the system to obtain the Approximate Grammar rules for each of the categories. This would enable us to identify the category of the resultant Parse and thus would be more precise. AcknowledgementDetails to be given after blind reviewing is over.References• Briscoe, T. & Carroll, J. [1996], ``AutomaticExtraction of Subcategorization fromCorpora,''MS , /users/ejb/.• Carroll, Glenn and Eugene Charniak, Split-Corpus Grammar Learning from TaggedTexts, Computer Science Department TechReport CS-94-08.• Carroll, Glenn and Eugene Charniak,Learning Probabilistic DependencyGrammars from Labeled Text, Proceedingsfrom AAAI Fall Symposium on ProbabilisticApproaches to Natural Language, AAAIPress, Cambridge, MA 1992.• Carroll, Glenn and Eugene Charniak, TwoExperiments on Learning ProbabilisticDependency Grammars from Corpora,Proceedings from AAAI-92 WorkshopProgram, AAAI Press, San Jose, CA, 1992.• Charniak, E.[1993], Statistical LanguageLearning,MIT, Cambridge, MA."Parsingwith Context-free Grammars and WordStatistics", Department of ComputerScience,Brown University, Technical ReportCS-95-28.• S.Soderland “CRYSTAL: Learning Domain-Specific Text Analysis Rules”.• Dan Klein and Christopher D. Manning. AnO(n3) Agenda-Based Chart Parser forArbitrary Probabilistic Context-FreeGrammars. Stanford Technical Reportdbpubs/2001-16, 2001.• Daniel H. Younger. 1967. Recognition andparsing of context free languages in time n3 .Information and Control, 10:189--208.• Doug Arnold- “Chart Parsing”• W.G.Lehnert, C.Cardie, D.Fisher,J.McCarthy, E.Riloff, S.Soderland -“Evaluating an Information ExtractionSystem”.• Fabio Ciravegna ,“Learning to tag forInformation Extraction from Text” .• S.Soderland- “Learning Text Analysis Rules for Domain Specific Natural LanguageProcessing” .• Roman Yangarber and Ralph Grishman. -“Machine Learning of Extraction Patternsfrom Unannotated corpora:Position statement”• Manning, C. [1993], ``Automatic acquisition of a large subcategorization dictionary fromcorpora,'' Proceedings of the 31st AnnualMeeting of the ACL.• Sangal and Bansal, “A Modular Scalable Architecture for Text Processing : from Textto Trees to Text using XML”。