1 A Morphological Analyzer for Verbal Aspect in American Sign Language


Russian Morpheme Analysis

Morpheme analysis divides a word, viewed from the standpoint of modern Russian, into all of its living morphemes, that is, its meaningful parts: root, prefix, suffix, infix, ending, and so on. Some Russian words are long, and learners often report that they are hard to memorize. Taking such a word apart and explaining what each part contributes removes much of that difficulty and speeds up memorization. When analyzing the morphemes of a word, keep the following points in mind.

(1) Before analyzing a word, first determine its lexical and grammatical meaning. For example, печь as a verb (to bake) and печь as a noun (stove, as in русская печь, the Russian stove) are divided differently. Likewise, косточка in the sense "small bone" is a derived word formed from кость (bone), so the suffix -очк(а) can be separated; in the sense "stone of a fruit" it is a non-derived word and has no such suffix. In a sentence meaning "In the moonlight everything is beautiful", the form ending in -о is a neuter short-form adjective, and -о is an ending. In a sentence meaning "it has never been so good", the same form is an adverb, and -о is a suffix that belongs to the stem. A word with one and the same spelling can therefore be divided in different ways depending on its meaning. Compare завод in "metallurgical plant" with завод in "the toy can be wound up". The former is a primary, non-derived word in modern Russian. The latter is formed from the verb заводить (to wind up) by removing the verbal suffix, and it can be divided into the prefix за- and the root вод.

(2) When decomposing a word, first remove the ending (if there is one) to isolate the stem. Note that inflected words generally have an ending, whereas uninflected words have none and consist entirely of the stem, for example пальто (coat) and the adverbs meaning "up", "down", and "yesterday".

(3) After the ending has been removed, removing further morphemes one at a time gradually exposes the root, which is also the limit of derivational division. Finding the root is the last step in the whole process of morpheme decomposition.

(4) Morphemes that express grammatical meaning sometimes have no outward form. In дом (house), in some short-form adjectives, and in some genitive plural forms, the grammatical meaning of the word (gender, number, and case) is expressed without any overt ending. This is called a zero ending.

(5) A suffix may also be zero. In some masculine past-tense verb forms the past-tense suffix -л does not appear, although it appears in the other past-tense forms of the same verb.

(6) Comparing cognate words is an effective way to locate morpheme boundaries, and it is very helpful for identifying living morphemes. For example, a word meaning "shortening" can be compared with its cognate meaning "in short", and буревестник (petrel), divided as бур-е-вест-ник, can be compared with its cognates буря (storm) and весть (news).

(7) Morphological analysis must be done with care; never divide a word mindlessly or mechanically.
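For example, at first glance звёздочка (little star), ёлочка (small spruce), лодочка (little boat), ласточка (swallow), and сорочка (shirt) seem to follow the same word-formation pattern. The resemblance is only superficial. In звёздочка the suffix is -очк(а); compare звезда. In ёлочка two suffixes can be separated, -оч- and -к(а), because Russian has both ель and ёлка. In лодочка only the suffix -к(а) can be separated, because modern Russian has the word лодка but no word лода. Ласточка and сорочка no longer have any cognates in modern Russian, so both are non-derived words.

(8) Removing morphemes one level at a time, and finding the nearest related word at each level, is another important method of analysis. Take учительство (teachers, as a collective):
Step 1: separate the ending -о.
Step 2: separate the collective suffix -ств-.
Step 3: separate the agent-noun suffix -тель.
Step 4: separate the verbal suffix -и-. The word is now divided as уч-и-тель-ств-о.
Finding the root уч completes the morpheme analysis of the word.

(9) In Russian, as in other languages, loanwords and international words play an important role. Two points apply when analyzing them. First, start from the word-formation structure of Russian, not from that of the source language, because every loanword that enters Russian becomes subject to the rules of Russian. Second, some loanwords and international words should be treated as independent words or roots. Words such as лингвист (linguist), доктор (doctor), and профессор (professor) are non-derived in Russian even though they are derived words in their source languages, because Russian has no words built on roots such as лингв-. Some loanwords and international words can nevertheless be divided, for example авто-портрет (self-portrait), астро-физика (astrophysics), and аэро-флот (civil air fleet), because the second part of each is an independent Russian word.

(10) Words change over the course of history, so dividing a word from the standpoint of modern Russian and dividing it historically give different results. Some words could be divided historically but cannot be divided in modern Russian. Народ (people), природа (nature), родина (homeland), and урожай (harvest) all contain the root род in a historical analysis, but in modern Russian they are treated as words with different roots, each with its own word family.

(11) Not all prefixes and suffixes are equally common. Some morphemes take an active part in word formation, while others form only a limited number of words. Morphemes are accordingly divided into productive and non-productive morphemes. For example, Russian has more than 50 noun suffixes that denote male persons, but only 4 of them are highly productive: -щик (-чик), -ник, -ец, and -ист.

(12) Some morphemes in Russian words have variant forms. There are root variants (for example бер-, бир-, бор-), suffix variants (сбор-щик versus лёт-чик), and prefix variants (as seen in the derivatives of -бирать). Variants of one morpheme have the same meaning but are pronounced differently.

In short, learning to analyze the morphemes of a word helps us understand how words are built, grasp their meaning, and spell them correctly, and it greatly improves the speed and quality with which we memorize Russian vocabulary.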
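The step-by-step analysis of учительство in point (8) can be sketched as a small routine that peels one morpheme at a time off the right edge of the word. This is a minimal illustration, not a general analyzer: the function name `strip_morphemes`, the labels, and the Cyrillic spellings of the four morphemes (-о, -ств-, -тель, -и-) are assumptions made for the example.

```python
def strip_morphemes(word, steps):
    """Remove morphemes from the right edge of `word`, one step at a time.

    `steps` is a list of (label, morpheme) pairs in removal order.
    Returns the remaining root and the removed pairs in removal order.
    """
    removed = []
    for label, morpheme in steps:
        if not word.endswith(morpheme):
            raise ValueError(f"{word!r} does not end with {morpheme!r}")
        word = word[: -len(morpheme)]
        removed.append((label, morpheme))
    return word, removed


# Illustrative step list for one word; a real analysis needs a lexicon of
# cognates to justify each cut, as the text explains.
STEPS = [
    ("ending", "о"),
    ("collective suffix", "ств"),
    ("agent-noun suffix", "тель"),
    ("verbal suffix", "и"),
]

root, removed = strip_morphemes("учительство", STEPS)
# Rebuild the segmented form from left to right.
segmented = "-".join([root] + [m for _, m in reversed(removed)])
print(root, segmented)
```

The routine raises an error as soon as a proposed morpheme does not match, which mirrors the warning in point (7) against mechanical division.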

AX43 Acid/Alkali Concentration Transmitter Instruction Manual

Health and Safety

To ensure that our products are safe and without risk to health, the following points must be noted:
1. The relevant sections of these instructions must be read carefully before proceeding.
2. Warning labels on containers and packages must be observed.
3. Installation, operation, maintenance and servicing must only be carried out by suitably trained personnel and in accordance with the information given.
4. Normal safety precautions must be taken to avoid the possibility of an accident occurring when operating in conditions of high pressure and/or temperature.
5. Chemicals must be stored away from heat, protected from temperature extremes and powders kept dry. Normal safe handling procedures must be used.
6. When disposing of chemicals ensure that no two chemicals are mixed.

Safety advice concerning the use of the equipment described in this manual or any relevant hazard data sheets (where applicable) may be obtained from the Company address on the back cover, together with servicing and spares information.

Russula indocatillus, a New Record Species in China (in English)

Chinese Journal of Tropical Crops, 2021, 42(9): 2542-2548. Received 2021-02-23; revised 2021-03-20. Funding: National Natural Science Foundation of China (No. 31770657, No. 31570544, No. 31900016).

About the author: CHEN Bin (born 1990), male, PhD candidate; research interest: genetic diversity of forest microbial resources.

*Corresponding author: LIANG Junfeng, E-mail: *******************.

Russula indocatillus, a New Record Species in China

CHEN Bin 1,2, SONG Jie 1, WANG Qian 1, LIANG Junfeng 1*
1. Research Institute of Tropical Forestry, Chinese Academy of Forestry, Guangzhou, Guangdong 510520, China; 2. Nanjing Forestry University, Nanjing, Jiangsu 210037, China

Abstract: Russula indocatillus is reported as a species new to China. A detailed morphological description, illustrations and a phylogeny are provided, and comparisons with related species are made. The species is morphologically characterized by a brownish orange to yellow ochre pileus center with a butter yellow to pale yellow margin, a white to cream spore print, subglobose to broadly ellipsoid to ellipsoid basidiospores with bluntly conical to subcylindrical isolated warts, always one-celled pileocystidia, and short, slender, furcated and septated terminal elements of the pileipellis. The combination of detailed morphological features and phylogenetic analysis based on an ITS-nrLSU-RPB2 sequence dataset indicates that the species belongs to Russula subg. Heterophyllidia sect. Ingratae.

Keywords: Russulaceae; new record species; phylogeny; taxonomy

DOI: 10.3969/j.issn.1000-2561.2021.09.014


The Durm German Lemmatizer

Praharshana Perera and René Witte
Universität Karlsruhe
Institut für Programmstrukturen und Datenorganisation (IPD)
Karlsruhe, Germany
witte@a.de
May 28, 2006

Contents
1 The Durm Lemmatizer
  1.1 Overview
  1.2 Setup
2 German Case Tagger
  2.1 Overview
  2.2 Usage
    2.2.1 Initialization Parameters
    2.2.2 Runtime parameters
    2.2.3 Output Annotations
  2.3 Implementation notes
    2.3.1 Probability Files
3 German POS-based Number Tagger
  3.1 Overview
  3.2 Usage
    3.2.1 Runtime parameters
    3.2.2 Output Annotations
  3.3 Implementation notes
4 German Morphological Analyzer
  4.1 Overview
  4.2 Usage
    4.2.1 Runtime parameters
    4.2.2 Output Annotations
  4.3 Implementation notes
5 German Lemmatizer
  5.1 Overview
  5.2 Durm Lexicon
    5.2.1 Evolving the Lexicon
    5.2.2 Manual Correction of Entries
  5.3 Usage
    5.3.1 Runtime parameters
    5.3.2 Output Annotations
  5.4 Implementation notes

About this document

This document contains documentation for the Durm German Lemmatization system. You can get the latest version from a.de/~durm/tm/lemma/.

Acknowledgments

Development of the German Lemmatizer has been supported by the German research foundation (DFG) within the project "Entstehungswissen" (LO296/18-1). The TIGER Treebank (Version 2) has been used for training and evaluation of the Case Tagger.

1 The Durm Lemmatizer

The Durm Lemmatization System performs morphological analysis and lemmatization for German nouns.

Figure 1.1: Annotations generated by the Durm Lemmatizer running in the GATE environment

1.1 Overview

The Durm Lemmatization system consists of a number of GATE components and resources, which have to be used within the GATE (/) architecture. It includes the following components to perform morphological analysis and lemmatization:
• The Case Tagger, which adds case information (Nominativ, Genitiv, Dativ, Akkusativ) to nouns;
• The POS-based Number Tagger, which adds number information (singular, plural);
• The Morphological Analyser, which classifies nouns into morphological classes;
• The German Lemmatizer, which annotates nouns with their lemma.

Additionally, it uses information provided by two other components:
• A POS-tagger for German (currently we only support the STTS tagset as used by the TreeTagger [1])
• The MuNPEx [2] noun phrase chunker for German.

The Durm German lexicon is a main resource in the lemmatization system. It is an automatically created and updated German lexicon containing lemma, number, and case information for nouns.

For a more detailed motivation, as well as the theoretical background, you should read our paper on German lemmatization (Perera and Witte, 2005).

1.2 Setup

You should have a working GATE installation including the TreeTagger. Then set up a pipeline with the following components (Figure 1.2) [3]:

Figure 1.2: Sample pipeline configuration for the lemmatization system

1. Load GATE's sample application for German: german+tagger.gapp (you can find it in your GATE installation under gate/plugins/german/resources/). Note: this pipeline works on the annotation set "NE," so you'll either have to (a) also use "NE" as the input/output annotation set for all downstream components or (b) (perhaps simpler) remove all references to "NE" within the GATE pipeline to make it work on the default annotation set.
2. Add the MuNPEx [4] noun phrase chunker for German (using the main grammar file de-np_main.jape).
3. Add a JAPE-Transducer component with the grammar file DeLem/de_morph_main.jape.
4. Add the Case Tagger component (CaseTagger/build). Here you'll have to set the initialization parameter to the CaseProbs directory containing the probability files. Note: To add this and the next three components from the GUI, use File → Manage CREOLE plugins → Add a new CREOLE repository and then select the indicated build directory.
5. Add the Number Tagger component (Number/build).
6. Add the German Morphological Analyzer component (GermanMorphologicalAnalyzer/build).
7. Add the main German Lemmatizer component (GermanLemmatizer/build). Here you'll have to set the initialization parameter to the file containing the German lexicon: DE-Lexicon/delexicon.txt.
8. Optionally: add a "Document Reset" component to remove the temp annotations NPFNP, PosNumber, Preposition, PN, Case, Num, and Gender (see Figure 1.3).

Figure 1.3: You can remove temporary annotations generated by the Durm lemmatizer with a Document Reset PR component

Now load some German texts and run the pipeline. Enjoy the new annotations for Lemma and DE-Morph.

Footnotes:
[1] http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/DecisionTreeTagger.html
[2] a.de/~durm/tm/munpex/
[3] If you do not know how to add a new CREOLE repository or load new components into a pipeline, please read GATE's user guide first at /sale/tao/index.html.
[4] a.de/~durm/tm/munpex/

2 German Case Tagger

The German case tagger assigns a grammatical case (Nom, Gen, Dat, Akk) to each noun in a document.

2.1 Overview

The case tagger is developed as an additional resource to support the Durm lemmatization system. It is a main component in the Durm lemmatizer, since the lemmatizer requires the grammatical case of nouns in order to determine their base forms. It uses information provided by a POS-tagger (currently, only the TreeTagger with the STTS tagset is supported). It takes sentences tagged for part-of-speech as input and attempts to produce the best case tag for each noun or pronoun in the sentence. The underlying tagging algorithm of the case tagger is based on the stochastic tagging algorithm generally known as the Hidden Markov Model or HMM tagger.

For a morphologically complex language like German, assigning the correct case to each noun is a very difficult task. For example, given the following sentence:

    Sie/NOM AKK trinken Wasser/NOM AKK DAT

Automatically assigning a case tag to each noun or pronoun is not trivial, because the grammatical case for Sie and Wasser is ambiguous. That is, they have more than one possible grammatical case. The pronoun Sie can be either nominative or accusative, and the noun Wasser can take 3 possible values for case: nominative, accusative, and dative. The task of case tagging is to resolve these ambiguities, choosing the proper tag for the context.

2.2 Usage

2.2.1 Initialization Parameters

When initializing the case tagger component, you'll have to set the following parameters:

Name: the component's name, as it appears under Processing Resource. You can leave it empty; it will then default to Case Tagger.
probabilityFiles: path to the directory where the probability files, listed in section 2.3.1, are stored for the calculation of probabilities for stochastic case tagging. This parameter should be set when initializing this component.

2.2.2 Runtime parameters

inputASName: input annotation set
outputASName: output annotation set
extractAnnotations: these are annotations to extract in addition to tokens for the case tagger. Since this component requires NPs in addition to tokens, set this parameter to NP.

2.2.3 Output Annotations

Case: the case (Nom, Gen, Dat, Akk) for every noun in a document.

2.3 Implementation notes

The implementation of the case tagger is based on an HMM tagger. Like other stochastic taggers, it picks the most likely tag for the noun based on probabilities learned from a training corpus. The required probabilities are located as files in the directory given as a parameter when initializing the component. HMM taggers choose the tag sequence that maximizes the following formula:

    P(\text{word} \mid \text{tag}) \cdot P(\text{tag} \mid \text{previous } n \text{ tags})    (2.1)

HMM taggers generally choose a tag sequence for a whole sentence rather than for a single word. The case tagger is based on a trigram HMM and chooses the tag t_i for word w_i that is most probable given the previous two tags t_{i-1} and t_{i-2} and the current word w_i:

    t_i = \arg\max_j \; P(t_j \mid t_{i-1}, t_{i-2}) \, P(w_i \mid t_j)    (2.2)

The interface to the case tagger defines one method, String[] getCaseTags(Document gatedoc, ArrayList tokens); it accepts a GATE document and an ArrayList of tokens as parameters.
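The selection rule in equation (2.2) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Durm implementation: the function name `choose_case_tag`, the dictionary layout, the `<s>` sentence-start marker, and all probability values are invented for the example, and only the case-bearing words of "Sie trinken Wasser" are modelled.

```python
def choose_case_tag(word, prev_tags, transition, emission, tags):
    """Pick argmax_j P(t_j | t_{i-1}, t_{i-2}) * P(w_i | t_j), as in (2.2).

    prev_tags is the pair (t_{i-2}, t_{i-1}); missing probabilities count as 0.
    """
    best_tag, best_score = None, -1.0
    for tag in tags:
        p_transition = transition.get((prev_tags, tag), 0.0)
        p_emission = emission.get((word, tag), 0.0)
        score = p_transition * p_emission
        if score > best_score:
            best_tag, best_score = tag, score
    return best_tag, best_score


TAGS = ("Nom", "Gen", "Dat", "Akk")

# Toy numbers, chosen only to make the example concrete.
transition = {
    (("<s>", "Nom"), "Nom"): 0.05,
    (("<s>", "Nom"), "Gen"): 0.05,
    (("<s>", "Nom"), "Dat"): 0.30,
    (("<s>", "Nom"), "Akk"): 0.60,
}
emission = {
    ("Wasser", "Nom"): 0.4,
    ("Wasser", "Akk"): 0.4,
    ("Wasser", "Dat"): 0.2,
}

# "Sie" was tagged Nom, so the history for "Wasser" is (<s>, Nom).
tag, score = choose_case_tag("Wasser", ("<s>", "Nom"), transition, emission, TAGS)
print(tag, round(score, 3))
```

With these toy values the accusative reading wins because both its transition and emission probabilities are high. A full tagger would not decide word by word like this; it would search over whole tag sequences with the Viterbi algorithm, as the implementation notes describe.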
The token list passed to getCaseTags contains one sentence whose tokens are annotated for part-of-speech. The method returns an array of strings that contains the grammatical case for each noun in the sentence, in the same order as the nouns in the sentence. The underlying implementation of the case tagger calls the methods in the Viterbi interface, which defines the functionality of the Viterbi algorithm (see /wiki/Viterbi_algorithm), which finds the best sequence of case tags for the nouns in the sentence.

2.3.1 Probability Files

n1counts.txt: Unigram counts
n2counts.txt: Bigram counts
n3counts.txt: Trigram counts
context2.txt: Bigram probabilities
context3.txt: Trigram probabilities
WordStat.txt: Lexical probabilities P(NN ∨ NE | tag) for normales Nomen or Eigennamen
adjstat.txt: Lexical probabilities P(ADJA | tag) for attributives Adjektiv
apprstat.txt: Lexical probabilities P(APPR | tag) for Präposition
apprartstat.txt: Lexical probabilities P(APPRARTS | tag) for Präposition mit Artikel
artstat.txt: Lexical probabilities P(ART | tag) for bestimmter oder unbestimmter Artikel
pdatstat.txt: Lexical probabilities P(PDAT | tag) for attribuierendes Demonstrativpronomen
pidatstat.txt: Lexical probabilities P(PIDAT | tag) for attribuierendes Indefinitpronomen mit Determiner
piatstat.txt: Lexical probabilities P(PIAT | tag) for attribuierendes Indefinitpronomen ohne Determiner
pposatstat.txt: Lexical probabilities P(PPOSAT | tag) for attribuierendes Possessivpronomen
pperstat.txt: Lexical probabilities P(PPER | tag) for irreflexives Personalpronomen
prfstat.txt: Lexical probabilities P(PRF | tag) for reflexives Personalpronomen

Visit the JavaDoc documentation: ../../../Gate/CaseTagger/doc/javadoc/index.html

3 German POS-based Number Tagger

The German POS-based number tagger determines the number of the subject noun of a sentence.

3.1 Overview

The POS-based number tagger has been developed as an additional resource to support the Durm lemmatization system. It analyzes a whole sentence using the part-of-speech information provided by a German POS-tagger (currently, only the TreeTagger with the STTS tagset is supported) and the case information provided by the case tagger in order to determine the number of the subject noun of a sentence.

Determining the number of the subject is based on a basic grammatical rule of German: the number of the subject agrees with the number of the main verb in the sentence. With the help of the case tagger we find the subject (case: Nom) of the sentence, and with the help of the part-of-speech tagger we find the main verb. Once we have both the main verb and the subject, we apply a small heuristic to determine the number of the main verb. This is done by checking the suffix of the main verb, since in German most plural verbs have the suffix -en or -n. In this way, we determine the number of the main verb and in turn the number of the subject noun.

3.2 Usage

In a pipeline, this component must be inserted after a POS-tagger and the Case tagger, since it uses information given by these two components.

3.2.1 Runtime parameters

inputASName: input annotation set
outputASName: output annotation set
extractAnnotations: these are annotations to extract in addition to tokens for the POS-based number tagger. Since this component requires NPs in addition to tokens, set this parameter to NP.

3.2.2 Output Annotations

PosNumber: the number (Sg or Pl) for the subject in a sentence.

3.3 Implementation notes

Initially this component determines the main verb of the sentence by looking at the POS tags. This is done by iterating through each token of a sentence and examining the POS of each token until a token with the POS VVFIN (finites Verb, voll), VAFIN (finites Verb, aux), or VMFIN (finites Verb, modal) is found. When the program finds a token with one of these POS tags, this token is assigned as the main verb of the sentence, since in German the POS of a main verb is VVFIN, VAFIN, or VMFIN. After determining the main verb, the program determines the subject of the sentence by looking at the grammatical case tag for each noun. This is again done by iterating through the tokens in a sentence until a match is found for POS=NN and Case=Nom, i.e., a noun with the case tagged as nominative. After finding the subject, it applies the heuristic explained above, looking at the suffix of the verb, in order to determine the number of the subject.

The interface to the POS-Number tagger defines the method public java.lang.String getNumber(java.util.ArrayList tokens), which accepts an ArrayList of tokens as its argument and returns a string representing the position of the subject of the sentence and whether the subject is singular or plural. For example, the string "30" means that the 3rd noun of the sentence is singular. The last digit of the string defines the number, i.e., 0 = singular and 1 = plural.

Visit the JavaDoc documentation: ../../../Gate/Number/doc/javadoc/index.html

4 German Morphological Analyzer

The German morphological analyzer assigns number and gender to nouns in a document.

4.1 Overview

The morphological analyzer considers the context of nouns by analyzing NPs given by the multi-lingual NP chunker (MuNPEx) in order to provide the morphological classification required for lemmatization. The Durm lemmatizer processes nouns considering their morphological features, such as number and gender, and their surrounding context. Since the number and gender information required for lemmatization cannot be determined from the word form alone, the lemmatization algorithm captures the context of nouns by analyzing NP chunks. The algorithm processes NPs with:
• determiners,
• determiners and modifiers,
• modifiers only,
• neither determiners nor modifiers,
in order to compute the features number and gender. This component uses rules and heuristics based on German grammar.

4.2 Usage

Since the main input to this component are NP chunks, this component must be inserted into the pipeline after an NP chunker.

4.2.1 Runtime parameters

inputASName: input annotation set
outputASName: output annotation set
extractAnnotations: these are annotations to extract in addition to tokens for the German morphological analyzer. Since this component requires NPs in addition to tokens, set this parameter to NP.

4.2.2 Output Annotations

Number: the number for nouns in the document.
Gender: the gender for nouns in the document.

4.3 Implementation notes

This component processes nouns in different ways with respect to the context information given by the NP chunker. To facilitate this kind of processing, the interface to the lemmatization algorithm defines one method, public MorphologyImpl classifyMorphology(Annotation token, gate.Document doc), which takes a token within the document currently being processed as an argument and returns an object of type MorphologyImpl, which holds information regarding number, gender etc. This method is then implemented in different ways in different subclasses in the inheritance hierarchy.

In order to decouple the interface from its implementation so that the two can vary independently, we have employed the Bridge design pattern. The class Morphology defines the abstraction interface, which maintains a reference to an object of type implementor, and the class GermanMorphology extends the interface defined by the abstraction. The class MorphologyImpl defines the interface for implementation classes, and its subclasses implement the implementor interface and define their concrete implementation.
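The Bridge arrangement described in the implementation notes of the morphological analyzer can be sketched as follows. The class names Morphology, GermanMorphology, and MorphologyImpl come from the text; everything else is a hypothetical illustration: the two implementor subclasses, the method names, the returned dictionary, and the tiny article table are assumptions, and the real component works on GATE annotations rather than plain strings.

```python
from abc import ABC, abstractmethod


class MorphologyImpl(ABC):
    """Implementor interface: one subclass per kind of NP context."""

    @abstractmethod
    def classify(self, noun, np_tokens):
        """Return a (number, gender) pair; None marks an undecided feature."""


class DeterminerNPImpl(MorphologyImpl):
    """Hypothetical implementor for NPs that start with a determiner."""

    # Toy table: only articles whose number is unambiguous are listed.
    ARTICLES = {"das": ("Sg", "Neut"), "ein": ("Sg", None)}

    def classify(self, noun, np_tokens):
        return self.ARTICLES.get(np_tokens[0].lower(), (None, None))


class BareNounImpl(MorphologyImpl):
    """Hypothetical implementor for NPs without determiner or modifier."""

    def classify(self, noun, np_tokens):
        return (None, None)


class Morphology:
    """Abstraction: holds a reference to an implementor and delegates to it."""

    def __init__(self, impl):
        self._impl = impl

    def classify_morphology(self, noun, np_tokens):
        return self._impl.classify(noun, np_tokens)


class GermanMorphology(Morphology):
    """Refined abstraction: packages the result as named features."""

    def classify_morphology(self, noun, np_tokens):
        number, gender = super().classify_morphology(noun, np_tokens)
        return {"noun": noun, "number": number, "gender": gender}


result = GermanMorphology(DeterminerNPImpl()).classify_morphology("Haus", ["das", "Haus"])
bare = GermanMorphology(BareNounImpl()).classify_morphology("Wasser", ["Wasser"])
print(result, bare)
```

The point of the pattern is visible in the last lines: the caller picks an implementor per NP type, and new context rules can be added as further MorphologyImpl subclasses without touching GermanMorphology.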
Visit the JavaDoc documentation: ../../../Gate/LemmatizationAlg/doc/javadoc/index.html

5 German Lemmatizer

The German Lemmatizer lemmatizes nouns in a document based on the morphological features number, gender, and case, using the morphological classes generated by the German Morphological Analyzer and lookups in the Durm lexicon. Additionally, it inserts new entries into the Durm lexicon and updates existing entries, allowing the lexicon to evolve in both coverage and accuracy.

5.1 Overview

This component is the last component within the Durm lemmatization system, where the actual lemmatization and lexicon generation take place. It determines the lemma of nouns, or possible lemma candidates for them, by applying a simple algorithm that depends on their morphological classes as given by the German Morphological Analyzer. It then uses the lemma given by the lemmatizer to update the lexicon. When it updates the lexicon, it also tries to correct both the existing entries and the entry currently being inserted, using the entries that are already in the lexicon.

The input to this component is a noun with its morphological features number, gender, and case. Based on these features the lemmatizer determines the lemma or lemma candidates. (Lemma candidates are generated for nouns with irregular morphological features, for example, nouns with umlauts. The correct lemma for these nouns can be identified when the same noun appears again in a different context.) Afterwards the noun, with the lemma and the other morphological information, is used to update the lexicon.

5.2 Durm Lexicon

The Durm lexicon is generated automatically from nouns processed by the German Lemmatizer. It grows by updating itself, learning correct values for the lexical entries. The lexicon stores full forms of words with their base form or lemma and other morphological features such as number, gender, and case. Additionally, it also holds information on:
• The number of times that the entry has been found when generating the lexicon
• Inserted time
• Modified time
• Reference to a file, which specifies the documents where the entry has been found
• A lock to specify whether the entry's lemma is correct or manually corrected and therefore need not be updated.

An example entry in the lexicon is shown below:

    Kinder Pl Masc Nom.Akk.Dat Kind 10227/7/2005 20:14:52 21/10/2005 8:47:19 36248 locked

The file reference (36248) points to an entry in an auxiliary file holding information about the documents containing the word:

    36248
    file:/home/user/Testdata/TestCorpus3/Test100.txt
    file:/home/user/Testdata/TestCorpus3/Test102.txt
    ..
    file:/home/user/Downloads/Spiegel/12.05.2005wwwww.txt

Currently the lexicon is available in plain text format. We are also working on making it available in XML format.

5.2.1 Evolving the Lexicon

The lexicon has the capability of self-correction. This feature has been implemented by lexicon update procedures. These procedures are illustrated using examples.

Updating Lemmas

If a new word to be inserted has more than one lemma candidate, the lexicon tries to assign the correct lemma for this new word by looking at the lemmas that are already in the lexicon:

Current state of the lexicon (lemma only)
    Land      Land
    Landes    Land
New entry
    Länder    Lände.Länd
State of the lexicon after update
    Land      Land
    Landes    Land
    Länder    Land

In the same way, if a new word to be inserted has the correct lemma, the lexicon tries to update the words in the lexicon that have more than one lemma using the lemma of the new word:

Current state of the lexicon (lemma only)
    Länder    Lände.Länd
    Ländern   Länder.Lände.Länd
New entry
    Landes    Land
State of the lexicon after update
    Landes    Land
    Länder    Land
    Ländern   Land

Automatic Error Correction

The lemmatization algorithm may produce errors; for example, a plural noun wrongly tagged as singular may not be lemmatized, resulting in a wrong entry. While the lexicon evolves, such errors produced by the algorithm are corrected automatically. If a word that has a wrong entry in the lexicon is entered again with the correct lemma, the word itself and all its inflectional forms will be updated with the correct lemma:

Current state of the lexicon (lemma only)
    Jahr      Jahr
    Jahre     Jahre (wrong)
New entry
    Jahren    Jahre.Jahr
State of the lexicon after update
    Jahr      Jahr
    Jahre     Jahre (wrong)
    Jahren    Jahre.Jahr (two possibilities)
New entry
    Jahre     Jahr (correct lemmatization)
State of the lexicon after update
    Jahr      Jahr
    Jahre     Jahr
    Jahren    Jahr

5.2.2 Manual Correction of Entries

Some entries in the lexicon may need to be corrected manually, or some manual inspection may be done to keep the system from updating correct lemmas. Entries that have been manually corrected or have been determined to be correct are locked. Locked entries are not processed by the lexicon update algorithms.

5.3 Usage

This component must run after the case tagger, the POS-based number tagger, and the morphological analyzer components. When initializing this component, you'll have to set the following parameter:

lexiconPath: path to the directory where the Durm lexicon is stored. This parameter must be set when initializing this component.

5.3.1 Runtime parameters

inputASName: input annotation set
outputASName: output annotation set

5.3.2 Output Annotations

Lemma: the lemma produced by the lemmatizer for nouns in a document.
DE-Morph: an annotation containing values for number, gender, and case for nouns in a document (Figure 5.1).

Figure 5.1: Example output annotation generated by the Durm lemmatizer

5.4 Implementation notes

The lexicon is loaded when the component is initialized. The updates to the lexicon are written to the lexicon during run-time. The lexicon entries are stored in hash tables. In order to support self-correction and fast updates, three hash tables are employed. The first hash table, lexiconEntries, stores lexicon entries, pointing to all their features such as number, gender, case, lemma etc.; the next hash table, entriesLemma, points only to the lemma; and the last hash table, lemmaEntries, contains the lemmas in the lexicon pointing to their respective entries. The lexicon update algorithms are coupled with these hash tables.

Due to its higher accuracy, the lemma produced by the lexicon has precedence over the lemma produced by the algorithm if both are able to determine the lemma.

Visit the JavaDoc documentation: ../../../Gate/GermanLemmatizer/doc/javadoc/index.html

Bibliography

Praharshana Perera and René Witte. A Self-Learning Context-Aware Lemmatizer for German. In Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP 2005), pages 636-643, Vancouver, British Columbia, Canada, October 6-8, 2005. Association for Computational Linguistics. /anthology/H/H05/H05-1080.
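The lexicon update behaviour illustrated in the section on evolving the lexicon (the Land and Jahr examples) can be reproduced with a small sketch. The manual does not spell out the matching rule, so the rule used here is an assumption: an ambiguous entry is resolved when exactly one of its candidates, after umlauts are replaced by plain vowels, equals a lemma already confirmed in the lexicon. The class `Lexicon`, its single dictionary, and the method names are likewise hypothetical; the actual component uses three hash tables.

```python
UMLAUTS = str.maketrans("äöüÄÖÜ", "aouAOU")


class Lexicon:
    """Toy self-correcting lexicon: full form -> list of lemma candidates."""

    def __init__(self):
        self.entries = {}
        self.locked = set()  # manually confirmed forms are never updated

    def _known_lemmas(self):
        # A lemma counts as confirmed when some entry has it as sole candidate.
        return {c[0] for c in self.entries.values() if len(c) == 1}

    def _resolve(self, candidates):
        # Assumed rule: compare umlaut-free candidates with confirmed lemmas.
        hits = {c.translate(UMLAUTS) for c in candidates} & self._known_lemmas()
        return [hits.pop()] if len(hits) == 1 else list(candidates)

    def insert(self, word, candidates):
        if word in self.locked:
            return
        candidates = list(candidates)
        # A single candidate overwrites the old entry; several are resolved.
        self.entries[word] = (
            candidates if len(candidates) == 1 else self._resolve(candidates)
        )
        # Revisit every still-ambiguous, unlocked entry with the new knowledge.
        for other, cands in list(self.entries.items()):
            if other not in self.locked and len(cands) > 1:
                self.entries[other] = self._resolve(cands)


lex = Lexicon()
lex.insert("Jahr", ["Jahr"])
lex.insert("Jahre", ["Jahre"])           # wrong entry produced earlier
lex.insert("Jahren", ["Jahre", "Jahr"])  # stays ambiguous: both are known
ambiguous = list(lex.entries["Jahren"])
lex.insert("Jahre", ["Jahr"])            # correct lemmatization arrives
print(ambiguous, lex.entries)
```

Running the same class on the Land example resolves Länder to Land as soon as Land is a confirmed lemma. Whether a newly inserted single-candidate entry should always overwrite an older one is a design choice made for this sketch; the manual only states that locked entries are exempt.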

Lexicology Test Questions and Answers

[Part 1: Lexicology Exam]

I. Choose the best answer from the four choices. (30')
1. The "s" in "drums" is ___. a. a free morpheme b. a stem c. a root d. an inflectional affix
2. A word is the combination of form and ___. a. spelling b. writing c. meaning d. denoting
3. "Trumpet" is a(n) ___ motivated word. a. morphologically b. semantically c. phonetically d. etymologically
4. ___ is a pair of emotive synonyms. a. "dad" and "father" b. "flat" and "apartment" c. "mean" and "frugal" d. "charge" and "accuse"
5. The word "language" is sometimes used to refer to the whole of a person's language. This is called ___. a. scientific language b. idiolect c. colloquial language d. formal language
6. The meaning of the word "fond" changed from "foolish" to "affectionate" by mode of ___. a. extension b. narrowing c. elevation d. degradation
7. Degradation can be illustrated by the following example: ___. a. lewd → ignorant b. silly → foolish c. last → pleasure d. knave → boy
8. English lexicology embraces morphology, semantics, etymology, stylistics and ___. a. linguistics b. pragmatics c. lexicography d. phonology
9. Which of the following is incorrect? a. "airmail" means "mail by air" b. "reading-lamp" means "lamp for reading" c. "green horn" is the horn green in color d. "hopeless" is "without hope"
10. Which group of the following are perfect homonyms? a. dear (a loved person) / deer (a kind of animal) b. bow (bending the head as a greeting) / bow (the device used for shooting) c. bank (the edge of the river) / bank (an establishment for money business) d. right (correct) / write (put down on paper with a pen)
11. The following are the main sources of homonyms except ___. a. change in meaning b. change in sound c. change in spelling d. borrowing
12. Antonyms can be classified into three major groups except ___. a. evaluative terms b. contrary terms c. complementary terms d. conversive terms
13. "parent/child, husband/wife, predecessor/successor" are ___. a. contrary terms b. contradictory terms c. conversive terms d. complementary terms
14. There are 2 main processes of sense-shift except ___. a. radiation b. concatenation c. borrowing
15. According to morphology, there are 2 types of classifications except ___. a. root antonyms b. derivative antonyms c. contraries
16. There are derivative antonyms except ___. a. pleasant / unpleasant b. polite / impolite c. war / antiwar d. large / small
17. There are complementary antonyms except ___. a. child / girl b. single / married c. dead / alive d. brother / sister
18. There are 3 classifications of homonyms except ___. a. perfect homonyms b. homographs c. homophones d. contrary homonyms
19. Modern English is derived from the language of early ___ tribes. a. Greek b. Roman c. Italian d. Germanic
20. The prehistoric Indo-European parent language is thought to be a highly ___ language. a. inflected b. derived c. developed d. analyzed

II.
1. In modern English one may find some words whose sounds suggest their ___.
2. Lexical meaning itself has two components: conceptual meaning and ___.
3. The meanings of many words often relate directly to their ___. In other words, the history of the word explains the meaning of the word.
4. Part of speech of words, singular and plural meaning of nouns, and tense meaning of verbs all belong to ___ meaning.
5. Lexicology is a branch of linguistics, inquiring into the origins and ___ of words.
6. Generally speaking, linguistics is the ___ study of language.
7. There are two main approaches to the study of English lexicology, that is, ___ and ___.
8. "Tulip" and "rose" are ___ of "flower". "Flower" is the superordinate term and "tulip" and "rose" are the ___ term.
8. At the beginning of the fifth century Britain was invaded by three tribes from northern Europe: Angles, ___ and ___.
9. Four groups of loanwords: ___, ___, ___ and ___.

III. Put the following words into the appropriate blanks. (10')
flock, herd, school, troop, pride
1. a ___ of cattle 2. a ___ of monkeys 3. a ___ of lions 4. a ___ of sheep 5. a ___ of fish

IV. Judge whether each of the following statements is true or false.
1. Relations between meanings of words can be synonymy, antonymy or hyponymy.
2. In semantics, meaning of language is considered as the intrinsic and inherent relation to the physical world of experience.
3. Grammatical meaning refers to the part of the word-meaning which indicates grammatical concepts.
4. The connotative meaning is also known as connotations, which are generally found in the dictionary.
5. "male/female, present/absent" are contrary terms.

V. Define the following terms. (2'+4'=6')
1. word 2. motivation

VI. Answer the following questions. (6'+6'+8'=20')
1. What is the difference between homonyms and polysemy? How to differentiate them?
2. How do linguists divide the history of the English language for analysis?
3. Discuss some of the characteristics of antonyms.

Answers
I. 1. d 2. c 3. c 4. c 5. b 6. c 7. b 8. c 9. c 10. c 11. a 12. a 13. c 14. c 15. c 16. d 17. a 18. d 19. d 20. a
II. 1. meaning 2. associated meaning 3. origins 4. grammatical 5. meanings 6. scientific 7. synchronic, diachronic 8. hyponyms, superordinate 8. Saxons, Jutes 9. aliens, denizens, translation loans, semantic borrowings
III. herd, troop, pride, flock, school
IV. 1. T 2. F 3. T 4. F 5. T
V.
1. A word is a minimum free form, that is to say, the smallest form that may appear in isolation.
2. Motivation accounts for the connection between the linguistic symbol and its meaning. Most words can be said to be non-motivated; that is, the connection between the sign and its meaning has no logical explanation. Nevertheless, English does have words whose meanings can be explained to a certain extent.
VI.
1. Homonyms are different words which happen to share the same form, whereas polysemy refers to the fact that the same word has several distinguishable meanings. They can be distinguished by etymology: homonyms come from different sources, while a polysemous word comes from a single source and has acquired different meanings in the course of development. The second principal consideration is semantic relatedness: the various meanings of a polysemous word are correlated and connected with one another. Additionally, in dictionaries a polysemous word has all its meanings listed under one headword, whereas homonyms are listed as separate entries.
2. Three periods in the development of the English language (vocabulary):
1) Old English or Anglo-Saxon period (449-1100)
(1) Much of the Old English vocabulary was borrowed from Latin, e.g. bargain, cheap, inch, pound; cup, dish, wall, wine, etc.
(2) Old English was a highly inflected language. It had a complete system of declensions of words.
2) Middle English period (1100-1500)
(1) French influence and the Norman Conquest in 1066: law and government administration, military affairs, religion, art
(2) Middle English was changing from a highly inflected language to an analytic language.
3) Modern English period (1500-)

[Part 2: Lexicology Examination Questions]

1. In Old English there was ___ agreement between sound and form. ( ) a. more b. little c. less d. gradual
2. Both LDCE and CCELD are ___. ( ) a. general dictionaries b. monolingual dictionaries c. both a and b d. neither a nor b
3. The word "miniskirt" is ___. ( ) a. morphologically motivated b. etymologically motivated c. semantically motivated d. none of the above
4.
The most important way of vocabulary development in present-day English is ___. ( ) a. borrowing b. semantic change c. creation of new words d. all the above
5. Generalization is a process by which a word that originally had a specialized meaning has now become ___. ( ) a. generalized b. expanded c. elevated d. degraded
6. Some morphemes have ___ as they are realized by more than one morph according to their position in a word. ( ) a. alternative morphs b. single morphs c. abstract units d. discrete units
7. Old English vocabulary was essentially ___ with a number of borrowings from Latin and Scandinavian. ( ) a. Italic b. Germanic c. Celtic d. Hellenic
8. a. semantics b. grammar c. phonetics d. lexicology
9. If two main constituents of an idiom share the same initial sound, it is called ___. ( ) a. repetition b. alliteration c. rhyme d. none of the above
10. Which of the following words is a functional word? ( ) a. often b. never c. although d. desk
11. Rhetorical features are shown in such respects of phonetic and lexical manipulation as well as ___. ( ) a. semantic unity b. structural stability c. idiomatic variation d. figure of speech
12. The advantage of classifying idioms according to grammatical functions is to ___. ( ) a. use idioms correctly and appropriately b. understand idioms correctly c. remember idioms quickly d. try a new method of classification
13. Borrowing as a source of homonymy in English can be illustrated by ___. ( ) a. long (not short) b. ball (a dancing party) c. rock (rock'n'roll) d. ad (advertisement)
14. The change of word meaning is brought about by the following internal factors except ___. ( ) a. the influx of borrowing b. repetition c. analogy d. shortening
15. Which of the following is not a component of linguistic context? ( ) a. words and phrases b. sentences c. text or passage d. time and place

II. Match the words or expressions in column A with those in column B according to 1) types of meaning changes; 2) types of meaning; 3) language branches and 4) meaning and context. (10%)
Column A: 16. Scandinavian ( ) 17. Germanic ( ) 18. extension ( ) 19. narrowing ( ) 21. ambiguity ( ) 22. participants ( ) 23. difference in denotation ( ) 24. appreciative ( ) 25. pejorative ( )
Column B: a. …l (place where things are made) b. grammatical c. double meaning d. Swedish f. Dutch g. determined h. pigheaded i. non-linguistic j. iron (a device for smoothing clothes)

III. Study the following words or expressions and identify 1) types of bound morphemes underlined, and 2) types of word formation or prefixes. (20%)
27. motel ( ) ( ) 29. blueprint ( ) 30. preliminaries ( ) 31. southward ( ) 32. demilitarize ( ) 33. hypersensitive ( ) 34. retell ( ) 35. multi-purposes ( )

IV. Define the following terms. (10%)
36. acronymy 37. native words 38. elevation 39. stylistic meaning 40. monolingual dictionary

V. Answer the following questions. Your answers should be clear and short. Write your answers in the space given below. (10%)
41. How many types of motivation are there in English? Give one example for each type.
42. What are the major sources of English synonyms? Illustrate your points.

VI. Analyze and comment on the following. Write your answers in the space given below. (20%)
43. Analyze the morphological structures of the following words and point out the types of the morphemes: recollection, nationalist, unearthly

Reference answers to the English lexicology exam
I. (30%) 1. a 2. c 3. a 4. c 5. a 6. a 7. b 8. d 9. b 10. c 11. d 12. a 13. b 14. b 15. d
II. (10%) 16. d 17. f 18. a 19. j 20. b 21. c 22. i 23. e 24. g 25. h
III. (20%) 26. bound root 27. (head+tail) blending 28. inflectional affix/morpheme 30. full conversion 31. derivational suffix 32. derivation 33. prefix of degree 34. derivational prefix 35. number prefix
IV. (10%)
36. The process of forming new words by joining the initial letters of names of organizations or special noun phrases and technical terms.
37. Native words, also known as Anglo-Saxon words, are words brought to Britain in the 5th century by the Germanic tribes.
38. The process by which words rise from humble beginnings to positions of importance.
39. The distinctive stylistic features of words which make them appropriate for different contexts.
40. A dictionary written in one language, or a dictionary in which entries are defined in the same language.
V. (10%)
41. There are four types of motivation:
1) onomatopoeic motivation, e.g. cuckoo, squeak, quack, etc.
2) morphological motivation, e.g. airmail, reading-lamp, etc.
3) semantic motivation, e.g. the mouth of the river, the foot of the mountain, etc.
4) etymological motivation, e.g. pen, laconic, etc.
42. Key points: borrowing; dialects and regional English; figurative and euphemistic use of words; coincidence with idiomatic expressions.
VI. (20%)
43.
1) Each of the three words consists of three morphemes: recollection (re+collect+ion), nationalist (nation+al+ist), unearthly (un+earth+ly).
2) Of the nine morphemes, only collect, nation and earth are free morphemes, as they can exist by themselves.
3) All the rest (re-, -ion, -al, -ist, un- and -ly) are bound, as none of them can stand alone as words.

[Part 3: English Lexicology Exam]

I. Write the terms in the blanks according to the definitions. (20 points)
1. A minimal meaningful unit of a language ( )
2. One of the variants that realize a morpheme ( )
3. A morpheme that occurs with at least one other morpheme ( )
4. A morpheme that can stand alone ( )
5. A morpheme attached to a stem alone ( )
6. An affix that indicates grammatical relations ( )
7. An affix that forms new words with a stem or root ( )
8. What remains of a word after the removal of all affixes ( )
9. A form to which affixes of any kind can be added ( )
10. The study of the origins and history of the form and meaning of words ( )

II. Form negatives of each of the following words by using one of these prefixes: dis-, il-, im-, in-, ir-, non-, un-.
(40 points)
smoker, capable, practical, obey, security, relevant, mature, ability, officially, willingness, legal, agreement, logical, loyal, convenient, athletic, moral, regular, honest, like

III. Decide whether the following statements are true or false. (20 points)
1. English is more closely related to German than French.
2. Old English was a highly inflected language.
3. Middle English absorbed a tremendous number of foreign words but with little change in word endings.
4. Conversion refers to the use of words of one class as that of a different class.
5. Words mainly involved in conversion are nouns, verbs, and adverbs.
6. Motivation explains why a particular form has a particular meaning.
7. Unlike conceptual meaning, associative meaning is unstable and indeterminate.
8. Perfect homonyms share the same spelling and pronunciation.
9. Contradictory terms do not show degrees.
10. Antonyms should be opposites of similar intensity.

IV. Study the sentences below and give an antonym to the word in bold type ("clear") in each context. (20 points)
1. The discussion enabled them to have a clear idea of the nature of the problem.
2. They are faced with clear alternatives.
3. His grandfather's mind was not clear during the time he made the will.
4. I'd like to get a clear plastic bag to carry this.
5. Wash the substances with clear cold water.
6. The singer's voice remained pure and clear throughout the evening.
7. All colors were clear; the river below her was brilliant blue.
8. Her eyes behind the huge spectacles are clear and untroubled.
9. Now that I've told her everything, I can leave with a clear conscience.
10. He is a shortish man of clear complexion.

Reference answers: English lexicology
I. 1. morpheme 2. allomorph 3. bound morpheme 4. free morpheme 5. affix 6. inflectional affix 7. derivational affix 8. root 9. stem 10. etymology
II. nonsmoker, incapable, impractical, disobey, insecurity, irrelevant, immature, inability/disability, unofficially, unwillingness, illegal, disagreement, illogical, disloyal, inconvenient, nonathletic, immoral, irregular, dishonest, dislike
III. 1. T 2. T 3. F 4. T 5. F 6. T 7. T 8. T 9. T 10. T
IV. 1. confusing 2. ambiguous 3. muddled 4. opaque 5. dirty 6. harsh 7. dull 8. shifty 9. guilty 10. blemished

TEM-8 (Test for English Majors, Band 8) English Linguistics Knowledge (Morphology): Mock Test 1 (with answers and explanations)

Question types: 3. GENERAL KNOWLEDGE

PART III GENERAL KNOWLEDGE (10 MIN)
Directions: There are ten multiple-choice questions in this section. Choose the best answer to each question.

1. ______ is the study of the way in which morphemes, representations of sounds, are arranged and combined to form words.
A. Lexicology
B. Morphology
C. Phonology
D. Morphological rule
Correct answer: B
Explanation: The question stem is a definition of morphology. Option A is lexicology, option C is phonology, and option D is a morphological rule.
Topic: Morphology

2. Which of the following is CORRECT?
A. Content words of a language are sometimes called closed class words.
B. New words can be added to content words regularly.
C. Open class words consist of "grammatical" or "functional" words.
D. The number of such words as conjunctions, prepositions, articles and pronouns is large and unstable, since many new words are added.
Correct answer: B
Explanation: Because many new words can regularly be added to the class of content words, content words are sometimes also called open class words, so B is correct. Conjunctions, prepositions, articles and pronouns, by contrast, are "grammatical" or "functional" words. They are relatively few in number and, since new words are not usually added to them, they are also called closed class words.

GATE Feature Overview (External)

Noun Phrase Chunker: Marks noun phrases in text.
OntoText Gazetteer: Produces results similar to the ANNIE Gazetteer, but uses a different algorithm.
Flexible Gazetteer: Provides users with the flexibility to choose their own customized input and an external Gazetteer.
Gazetteer List Collector
RASP Parser: RASP (Robust Accurate Statistical Parsing) is a robust parsing system for English. It includes the following four PRs:
RASP2 Tokenizer
RASP2 POS Tagger
RASP2 Morphological Analyser
RASP2 Parser: creates multiple dependency annotations to represent a parse of each sentence.
RASP is only supported on Linux operating systems.
SUPPLE Parser: SUPPLE is a bottom-up parser that constructs syntax trees and logical forms for English sentences. It needs a Prolog interpreter.
Stanford Parser
Similar to the standard JAPE transducer
Plugin

Analysis of the Application of an Automatic Blood Cell Analyzer Combined with Blood Smear Cell Morphology Detection in Routine Blood Examination

PENG Weixiang, FU Yanghong, YANG Xue
Microbiology Laboratory, Department of Clinical Laboratory, Yanda Ludaopei Hospital, Langfang, Hebei Province, 065201 China

Abstract
Objective: To explore the clinical value of an automatic blood cell analyzer combined with blood smear cell morphology in routine blood examination.
Methods: A total of 90 patients with infectious diseases who underwent routine blood examination in the clinical microbiology room of the Department of Clinical Laboratory, Yanda Ludaopei Hospital, Hebei Province, from January to December 2022 were selected by simple random sampling. All selected patients were tested with both the automatic blood cell analyzer and blood smear morphology during routine blood examination. The positive detection rates of conventional blood cells, atypical lymphocytes, nucleated red cells and toxic granules/vacuolar degeneration were recorded and compared between each single test and the combined test.
Results: The positive detection rates of neutrophils, eosinophils, lymphocytes, basophils and monocytes with the automatic blood cell analyzer combined with blood smear cell morphology were higher than those of single detection, and the difference was statistically significant (P < 0.05). The positive rates of the combined test for atypical lymphocytes, nucleated red cells and toxic granules/vacuolar degeneration were 97.78%, 91.11% and 94.44%, respectively, higher than those of the automatic blood cell analyzer alone (68.89%, 63.33%, 75.56%) and of blood smear morphology alone (78.89%, 71.11%, 78.89%), and the differences between groups were statistically significant (χ2 = 26.080, 19.812, 12.944, P < 0.05).
Conclusion: The positive rate of routine blood examination with the automatic blood cell analyzer combined with blood smear cell morphology is significantly higher than that of single detection, and can provide a reference for the diagnosis and treatment of patients' disease.

Keywords: automatic blood cell analyzer; blood smear cell morphology detection; routine blood examination
CLC number: R446. Document code: A. doi: 10.11966/j.issn.2095-994X.2023.09.07.09

Original article. Received: 2023-05-03; revised: 2023-05-24.
About the author: PENG Weixiang (1981-), female, bachelor's degree, supervising laboratory technologist; research interest: clinical indicators of hematological diseases.
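The reported χ2 values can be checked with a short calculation. The sketch below assumes, since the abstract gives only percentages, that each of the three methods was applied to all 90 patients, so that positive counts can be recovered by rounding rate × 90, and that the statistic is an ordinary Pearson chi-square on a 3 × 2 table (method × positive/negative). The function and variable names are illustrative only.

```python
def pearson_chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


N_PATIENTS = 90  # assumption: every method was applied to all 90 patients


def counts(rate_percent):
    """Recover [positive, negative] counts from a reported percentage."""
    positive = round(rate_percent / 100 * N_PATIENTS)
    return [positive, N_PATIENTS - positive]


# Row order: combined test, analyzer alone, blood smear alone.
reported_rates = {
    "atypical lymphocytes": (97.78, 68.89, 78.89),
    "nucleated red cells": (91.11, 63.33, 71.11),
    "toxic granules/vacuolar degeneration": (94.44, 75.56, 78.89),
}

chi_square_by_marker = {
    marker: pearson_chi_square([counts(rate) for rate in rates])
    for marker, rates in reported_rates.items()
}

for marker, value in chi_square_by_marker.items():
    print(f"{marker}: chi-square = {value:.3f}")
```

Under these assumptions the three statistics agree with the values in the abstract to within rounding, which suggests the authors used an uncorrected Pearson test with 2 degrees of freedom.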

Properties of Corn Starch (English)

Some properties of corn starches II:Physicochemical,gelatinization,retrogradation,pasting and gel textural propertiesKawaljit Singh Sandhu,Narpinder Singh*Department of Food Science and Technology,Guru Nanak Dev University,Amritsar,Punjab 143005,IndiaReceived 29August 2005;accepted 30January 2006AbstractThe physicochemical,thermal,pasting and gel textural properties of corn starches from diﬀerent corn varieties (African Tall,Ageti,Early Composite,Girja,Navjot,Parbhat,Partap,Pb Sathi and Vijay)were studied.Amylose content and swelling power of corn starches ranged from 16.9%to 21.3%and 13.7to 20.7g/g,respectively.The enthalpy of gelatinization (D H gel )and percentage of retrogradation (%R )for various corn starches ranged from 11.2to 12.7J/g and 37.6%to 56.5%,respectively.The range for peak viscosity among dif-ferent varieties was between 804and 1252cP.The hardness of starch gels ranged from 21.5to 32.3g.African Tall and Early Composite showed higher swelling power,peak,trough,breakdown,ﬁnal and setback viscosity,and lower D H gel and range of gelatinization.Pear-son correlations among various properties of starches were observed.Gelatinization onset temperature (T o )was negatively correlated to peak-,breakdown-,ﬁnal-and setback viscosity (r =À0.809,À0.774,À0.721and À0.686,respectively,p <0.01)and positively correlated to pasting temperature (r =0.657,p <0.01).D H gel was observed to be positively correlated with T o ,peak gelatinization temperature and (T p )and gelatinization conclusion temperature T c (r =0.900,0.902and 0.828,respectively,p <0.01)whereas,it was negatively corre-lated to peak-and breakdown-(r =À0.743and À0.733,respectively,p <0.01),ﬁnal-and setback viscosity (r =À0.623and À0.611,respectively,p <0.05).Amylose was positively correlated to hardness (r =0.511,p <0.05)and gumminess (r =0.792,p <0.01)of starch gels.Ó2006Elsevier Ltd.All rights reserved.Keywords:Corn starch;Physicochemical;Thermal;Pasting;Gel texture1.IntroductionCorn starch is a 
valuable ingredient to the food industry,being widely used as a thickener,gelling agent,bulking agent and water retention agent (Singh,Singh,Kaur,Sodhi,&Gill,2003).In India,corn has become the third most important food grain after wheat and rice.The demand for corn is increasing in India with the setting up of food processing units involved in the processing of corn.The production of corn in India was 14,000,000Mt against the total world production of 721,000,000Mt (FAO,2004).On the basis of amylose and amylopectin ratio,corn can be separated into normal,waxy and high amylose.In addi-tion,sugary type corn,with high sugar content,also exits (Singh,Sandhu,&Kaur,2005).Normal starch consists of about 75wt%branched amylopectin and about 25wt%amylose,that is linear or slightly branched.Starch granules swell when heated in excess water and their vol-ume fraction and morphology play important roles in the rheological behaviour of starch dispersions (Bagley &Christiansen,1982;Da Silva,Oliveira,&Rao,1997;Evans &Haisman,1979).Starch retrogradation has been deﬁned as the process,which occurs when the molecular chains in gelatinized starches begin to reassociate in an ordered structure (Atwell,Hood,Lineback,Varriano Marston,&Zobel,1988).During retrogradation,amylose forms dou-ble-helical associations of 40–70glucose units (Jane &Robyt,1984)whereas amylopectin crystallization occurs by reassociation of the outermost short branches (Ring0308-8146/$-see front matter Ó2006Elsevier Ltd.All rights reserved.doi:10.1016/j.foodchem.2006.01.060*Corresponding author.Fax:+91183258820.E-mail address:narpinders@ (N.Singh)./locate/foodchemFood Chemistry 101(2007)1499–1507Food Chemistryet al.,1987).Although both amylose and amylopectin are capable of retrograding,the amylopectin component appears to be more responsible for long-term quality changes in foods(Miles,Morris,Orford,&Ring,1985; Ring et al.,1987).Several workers have characterized the pasting properties of starches from diﬀerent corn 
types(Ji et al.,2003;Seetharaman et al.,2001;Yamin,Lee,Pollak, &White,1999)and observed considerable variability in these properties.The viscosity parameters during pasting are cooperatively controlled by the properties of the swol-len granules and the soluble materials leached out from the granules(Doublier,Llamas,&Meur,1987;Eliasson, 1986).Sandhu,Singh,and Kaur(2004)studied the eﬀect of corn types on the physicochemical,thermal,morpholog-ical and rheological properties of corn starches.Textural properties of starch gels are very important criteria,used to evaluate the performance of starch in a food system.Ji et al.(2003)used a texture analyzer for studying the gel properties of starches from selected corn lines and found signiﬁcant diﬀerences among them.Seetharaman et al. (2001)studied the textural properties of13selected Argen-tinian corn landraces and found signiﬁcant variability in hardness between them after storage.The objective of this study was to characterize the corn varieties grown in India on the basis of the physicochemical,thermal,pasting and gel textural properties of their starch.This will be useful in selecting the appropriate variety for end use suitability.2.Materials and methods2.1.MaterialsSix improved corn varieties,viz.,Ageti,Navjot,Parb-hat,Partap,Pb Sathi and Vijay from the2003harvest were obtained from Punjab Agricultural University,Ludhiana, India.Three improved corn varieties,viz.,African Tall, Early Composite and Girja from the2003harvest were obtained from Chaudhary Sarwan Kumar Himachal Pra-desh Agricultural University,Palampur,India.2.2.Starch isolationStarch was isolated from corn grains following the method of Sandhu,Singh,and Malhi(2005).2.3.Physicochemical properties of starch2.3.1.Amylose content(%)Amylose content of the isolated starch was determined by using the method of Williams,Kuzina,and Hlynka (1970).A starch sample(20mg)was taken and10ml of 0.5N KOH was added to it.The suspension was thor-oughly mixed.The dispersed 
sample was transferred to a 100ml volumetricﬂask and diluted to the mark with dis-tilled water.An aliquot of test starch solution(10ml) was pipetted into a50ml volumetricﬂask and5ml of 0.1N HCL was added followed by0.5ml of iodine reagent.The volume was diluted to50ml and the absorbance was measured at625nm.The measurement of the amylose was determined from a standard curve developed using amylose and amylopectin blends.2.3.2.Swelling power(g/g)and solubility(%)Swelling power and solubility of starches were deter-mined in triplicate using the method of Leach,McCowen, and Schoch(1959).2.3.3.TurbidityTurbidity of starch pastes from diﬀerent corn varieties was measured as described by Perera and Hoover(1999). A1%aqueous suspension of starch from each corn variety was heated in a water bath at90°C for1h with constant stirring.The starch paste was cooled for1h at30°C. The samples were stored for5days at4°C and turbidity was determined every24h by measuring absorbance at 640nm against a water blank with a Shimadzu UV-1601 spectrophotometer(Shimadzu Corporation,Kyoto, Japan).2.3.4.Water binding capacity(WBC)WBC of the starches from the diﬀerent corn varieties was determined using the method described by Yamazaki (1953),as modiﬁed by Medcalf and Gilles(1965).A sus-pension of5g starch(dry weight)in75ml distilled water was agitated for1h and centrifuged(3000g)for10min. 
The free water was removed from the wet starch, which was then drained for 10 min. The wet starch was then weighed.

2.4. Thermal properties of starches

The thermal characteristics of the isolated starches were studied by using a differential scanning calorimeter (DSC, model 821e, Mettler Toledo, Switzerland), equipped with a thermal analysis data station. Starch (3.5 mg, dry weight) was loaded into a 40 µl capacity aluminium pan (Mettler, ME-27331) and distilled water was added by Hamilton microsyringe to achieve a starch-water suspension containing 70% water. Samples were hermetically sealed and allowed to stand for 1 h at room temperature before heating in the DSC. The DSC analyzer was calibrated using indium and an empty aluminium pan was used as a reference. Sample pans were heated at a rate of 10 °C/min from 20 to 100 °C. Thermal transitions of starch samples were defined as T_o (onset temperature), T_p (peak of gelatinization temperature) and T_c (conclusion temperature), and ΔH_gel referred to the enthalpy of gelatinization. Enthalpies were calculated on a starch dry weight basis. These were calculated automatically. The gelatinization temperature range (R) and peak height index (PHI) were calculated as 2(T_p − T_o) and ΔH_gel/(T_p − T_o), respectively, as described by Krueger, Knutson, Inglett, and Walker (1987). After conducting thermal analysis, the samples were stored at 4 °C for 7 days for retrogradation studies. The sample pans containing the starches were reheated at the rate of 10 °C/min from 25 to 100 °C after 7 days to measure retrogradation. The enthalpies of retrogradation (ΔH_ret) were evaluated automatically and percentage of retrogradation (%R) was calculated as

\[ \%R = \frac{\text{enthalpy of retrogradation}}{\text{enthalpy of gelatinization}} \times 100. \]

2.5. Pasting properties of starches

The pasting properties of the starches were evaluated with the Rapid Visco Analyzer (RVA-4, Newport Scientific, Warriewood, Australia). Viscosity profiles of starches from different corn varieties were recorded using starch
suspensions (6%, w/w; 28 g total weight). A programmed heating and cooling cycle was used, where the samples were held at 50 °C for 1 min, heated to 95 °C at 6 °C/min, held at 95 °C for 2.7 min, before cooling from 95 to 50 °C at 6 °C/min and holding at 50 °C for 2 min. Parameters recorded were pasting temperature, peak viscosity, trough viscosity (minimum viscosity at 95 °C), final viscosity (viscosity at 50 °C), breakdown viscosity (peak viscosity minus trough viscosity) and setback viscosity (final viscosity minus trough viscosity). All measurements were replicated thrice.

2.6. Textural properties of starch gels

The starch pastes prepared in the RVA were poured into small aluminium canisters and stored at 4 °C to cause gelation. The gels formed in the canisters were evaluated for their textural properties by texture profile analysis (TPA) using the TA/XT2 texture analyzer (Stable Micro Systems, Surrey, England). Each canister was placed upright on the metal plate and the gel was compressed at a speed of 0.5 mm/s to a distance of 10 mm with a cylindrical plunger (diameter = 5 mm). The compression was repeated twice to generate a force–time curve from which hardness (height of first peak) and springiness (ratio between recovered height after the first compression and the original gel height) were determined. The negative area of the curve during retraction of the probe was termed adhesiveness. Cohesiveness was calculated as the ratio between the area under the second peak and the area under the first peak (Bourne, 1968; Friedman, Whitney, & Szczesniak, 1968). Gumminess was determined by multiplying hardness and cohesiveness. Chewiness was obtained by multiplying gumminess and springiness. Five repeated measurements were performed for each sample and their average was taken.

2.7. Statistical analysis

The data reported in all of the tables are an average of triplicate observations and were subjected to one-way analysis of variance (ANOVA). Pearson correlation coefficients (r) for the relationships between all properties were also calculated using Minitab Statistical
Software version 13 (Minitab Inc., USA).

3. Results and discussion

3.1. Physicochemical properties of starches

Amylose content of starches from different corn varieties differed significantly (Table 1). Amylose content of various corn starches ranged between 16.9% and 21.3%; the lowest was observed for African Tall and the highest for Vijay. Seetharaman et al. (2001) reported amylose content in the range of 16.1–23.3% for 35 corn landraces. The ability of starches to swell in excess water and their solubility also differed significantly (Table 1). Swelling power (SP) and solubility can be used to assess the extent of interaction between starch chains within the amorphous and crystalline domains of the starch granule (Ratnayake, Hoover, & Warkentin, 2002). SP was observed to be the highest for Early Composite (20.7 g/g) and the lowest for Parbhat starch (13.7 g/g). Starch swelling occurs concomitantly with loss of birefringence and precedes solubilization (Singh, Sandhu, & Kaur, 2004). Solubility of various corn starches ranged from 9.7% to 15.0% (Table 1). Water binding capacity (WBC) of starches from different corn varieties ranged from 82.1% to 97.7% (Table 1). WBC of starches from Parbhat and Partap were similar (91.1%). The difference in the degree of availability of water binding sites among the starches may have contributed to the variation in WBC among different starches (Wotton & Bamunuarachchi, 1978). The turbidity values of gelatinized starch suspensions from different corn varieties are depicted in Fig. 1. Turbidity values of all starch suspensions increased progressively during storage of starch gels at 4 °C. Early Composite starch showed the lowest turbidity whereas Parbhat starch showed the highest. Turbidity development in starches during storage has been attributed to the interaction of several factors, such as granule swelling, granule remnants, leached amylose and amylopectin, amylose and amylopectin chain length, intra or interbonding, lipid and cross-linking substitution (Jacobson, Obanni, & BeMiller, 1997).

Table 1
Physicochemical properties of starches from different corn varieties

Variety           Amylose content (%)   Swelling power (g/g)   Solubility (%)   WBC (%)
African Tall      16.9a                 19.4d                  13.5c            84.9ab
Ageti             19.4c                 16.8bc                 15.0d            82.3a
Early Composite   17.9b                 20.7e                  12.6bc           90.5c
Girja             16.9a                 17.8c                  11.3b            82.1a
Navjot            19.6c                 14.9ab                 11.6b            97.7d
Parbhat           19.5c                 13.7a                  10.1a            91.1c
Partap            18.5b                 13.8a                  9.7a             91.1c
Pb Sathi          20.9d                 15.9b                  12.0bc           86.0b
Vijay             21.3d                 16.8bc                 13.9c            83.0ab

Values with similar letters in the same column do not differ significantly (p < 0.05).

3.2. Gelatinization properties of starches

The gelatinization temperatures (onset, T_o; peak, T_p; and conclusion, T_c), enthalpy of gelatinization (ΔH_gel), peak height index (PHI) and gelatinization temperature range (R) for starches from different corn varieties, measured using DSC, are presented in Table 2. Significant differences were observed in T_o, T_p and T_c among starches from different corn varieties. The lowest T_o, T_p and T_c of 65.6, 69.9 and 75.1 °C, respectively, were observed for Girja starch, whereas Parbhat starch showed the highest values for the same (Fig. 2). These values are in agreement with those observed for normal corn starches (Ng, Duvick, & White, 1997; Seetharaman et al., 2001). The higher gelatinization temperatures for Parbhat starch indicated that more energy is required to initiate starch gelatinization. ΔH_gel for various corn starches ranged between 11.2 and 12.7 J/g (Table 2). Li, Berke, and Glover (1994) reported ΔH_gel in the range from 8.2 to 12.3 J/g for starches from tropical maize germplasm. The difference in ΔH_gel reflects melting of amylopectin crystallites. The variations in ΔH_gel could represent differences in bonding forces between the double helices that form the amylopectin crystallites, which resulted in different alignment of hydrogen bonds within starch molecules (McPherson & Jane, 1999). PHI, a measure of uniformity in gelatinization, was found to be the lowest for Partap (2.34) starch, whereas it was found to be the
highest for Pb Sathi (2.98).The R value wasfoundFig.1.Eﬀect of storage duration on the turbidity of starch pastes from diﬀerent corn varieties.Table 2Gelatinization properties of starches from diﬀerent corn varieties VarietyT o (°C)T p (°C)T c (°C)D H gel (J/g)PHI R African Tall 67.5c 71.5b 76.5b 11.6ab 2.90b 8.0a Ageti68.3d 73.1cd 79.3d 12.2b 2.54ab 9.6b Early Composite 66.3b 70.6a 75.9b 11.2a 2.60ab 8.6a Girja 65.6a 69.9a 75.1a 11.3a 2.63ab 8.6a Navjot 68.9e 73.8de 79.2d 12.4b 2.53ab 9.8b Parbhat 69.0e 74.0e 79.7d 12.7c 2.54ab 10.0b Partap 68.3d 73.3d 79.3d 11.7ab 2.34a 10.0b Pb Sathi 68.6de 72.7c 77.8c 12.2b 2.98b 8.2a Vijay67.0c71.9b77.9c11.7ab2.39a9.8bT o ,onset temperature;T p ,peak temperature;T c ,conclusion temperature;D H gel ,enthalpy of gelatinization (dwb,based on starch weight);R ,gelatinization range 2(T p ÀT o );PHI,peak height index D H gel /(T p ÀT o ).Values with similar letters in the same column do not diﬀer signiﬁcantly (p <0.05).Fig.2.DSC endotherms of gelatinization of starches from diﬀerent corn varieties:(A)African Tall;(B)Ageti;(C)Early Composite;(D)Girja;(E)Navjot;(F)Partap;(G)Parbhat;(H)Pb Sathi;(I)Vijay.1502K.S.Sandhu,N.Singh /Food Chemistry 101(2007)1499–1507to be the lowest for African Tall and the highest for Parb-hat and Partap starches.The high R values of Parbhat and Partap corn starches suggests the presence of crystallites of varying stability within the crystalline domains of its gran-ule (Hoover,Li,Hynes,&Senanayake,1997).3.3.Retrogradation properties of starchesThe molecular interactions (hydrogen bonding between starch chains)that occur after cooling of the gelatinized starch paste are known as retrogradation (Hoover,2000).The retrogradation properties of various corn starches are presented in Table 3.Retrogradation (%)of starches from diﬀerent corn varieties were %40–60%(Fig.3).Yamin,Svendsen,and White (1997)reported retrograda-tion (%)values between 50%and 60%,for Oh 43normal corn starches inbreds.Retrograded corn 
starches showed lower enthalpy than their native counterparts.This may be due to the weaker starch crystallinity of retrograded starch (Sasaki,Yasui,&Matsuki,2000).D H ret for corn starches ranged from 4.4to 6.9J/g,the lowest for Vijay and the highest for Ageti starch.D H ret of 4.6–6.9J/g has been reported in selected corn lines by Ji et al.(2003).The diﬀerence in D H ret among various corn starches sug-gested diﬀerences in their tendency towards retrogradation.The transition temperatures of retrogradation were found to be lower than the gelatinization temperatures.This might be due to the fact that recrystallization of amylopec-tin branched chains occurred in a less ordered manner in stored gels,as it is present in native form.T o for retrogra-dation ranged between 41.5and 43.1°C,the lowest for Ageti and the highest for Partap starch was observed.T o values of retrogradation in the range between 42.9and 48.1°C for exotic corn inbred lines have been reported by Pollak and White (1997).Girja starch showed the lowest value for T p of retrogradation whereas Partap had the highest value.The range for retrogradation temperature was found to be greater than the gelatinization temperature range.Similar observations have been reported earlier (Karim,Norziah,&Seow,2000).African Tall and Early Composite starches showed the lowest R of retrogradation,whereas Partap and Ageti had the highest values.3.4.Pasting properties of starchesPasting properties of various corn starches have been summarized in Table 4.Signiﬁcant diﬀerence in the pasting properties among diﬀerent corn varieties was observed.All corn starches showed gradual increase in viscosity with increase in temperature (Fig.4).The increase in viscosity with temperature may be attributed to the removal of water from the exuded amylose by the granules as they swell (Ghiasi,Varriano-Marston,&Hoseney,1982).Peak vis-cosity (PV)for various corn starches ranged between 804and 1252cP,the lowest for Partap and the highest for 
Afri-can Tall and Early Composite starches.Ji et al.(2003)reported PV in the range between 152and 222RVU for selected corn lines.Trough viscosity (TV)was found to be the lowest for Pb Sathi (594cP)and the highest for Parbhat (727cP).Breakdown viscosity (BV)(measure ofTable 3Retrogradation properties of starches from diﬀerent corn varieties VarietyT o (°C)T p (°C)T c (°C)D H ret (J/g)R %R African Tall 42.5b 52.4a 62.3ab 5.0ab 19.8a 43.1c Ageti41.5a 52.9ab 63.6bc 6.9c 22.8c 56.5e Early Composite 42.7bc 52.6a 62.0a 4.9ab 19.8a 43.7c Girja 42.4b 52.4a 62.1a 5.0ab 20.0a 44.2cd Navjot 42.5b 53.6bc 63.1b 5.4b 22.2bc 43.5c Parbhat 43.0c 53.6bc 63.4bc 5.2b 21.2b 40.9b Partap 43.1c 54.5c 64.3c 4.9ab 22.8c 41.9bc Pb Sathi 42.5b 53.3b 62.9b 5.7bc 21.6b 46.7d Vijay43.0c53.1b62.4ab4.4a20.2a37.6aT o ,onset temperature;T p ,peak temperature;T c ,conclusion temperature;D H ret ,enthalpy of retrogradation (dwb,based on starch weight);R ,retro-gradation range 2(T p ÀT o );%R ,ratio of enthalpy of retrogradation to enthalpy of gelatinization ·100.Values with similar letters in the same column do not diﬀer signiﬁcantly (p <0.05).Fig.3.DSC endotherms of retrogradation of starches from diﬀerent corn varieties:(A)African Tall;(B)Ageti;(C)Early Composite;(D)Girja;(E)Navjot;(F)Partap;(G)Parbhat;(H)Pb Sathi;(I)Vijay.K.S.Sandhu,N.Singh /Food Chemistry 101(2007)1499–15071503the cooked starch to disintegration)was found to be the lowest for Parbhat and the highest for African Tall starch. 
Final viscosity(FV)(indicates the ability of the starch to form a viscous paste)for diﬀerent corn starches ranged from824to1388cP,the lowest shown by Partap and the highest by African es et al.(1985)reported that increase inﬁnal viscosity might be due to the aggregation of the amylose molecules.Setback viscosity(SV)(measure of synaeresis of starch upon cooling of the cooked starch pastes)for various corn starches diﬀered signiﬁcantly.Par-tap exhibited the lowest setback of141cP,whereas it was found to be the highest for African Tall(726cP).The low PV,BV,FV and SV of Navjot,Parbhat and Partap starches correlate well with their low SP in water.Pasting properties are dependent on the rigidity of starch granules, which in turn aﬀect the granule swelling potential(Sandhya Rani&Bhattacharaya,1989)and amount of amylose leaching out in the solution(Morris,1990).Pasting temper-ature(PT)(temperature at the onset of rise in viscosity)for various corn starches ranged between75.9and83.8°C,the lowest shown by African Tall and Early Composite and the highest by Parbhat starch.The high pasting temperature of Parbhat and Partap starch indicated their higher resistance towards swelling.Seetharaman et al.(2001)reported past-ing temperatures in the range of74.9–84.7°C for Argentin-ian corn landraces.3.5.Gel texture properties of starch gelsThe textural properties of gels from diﬀerent corn starches determined using the texture analyzer are shown in Table5.The textural parameters of corn starch gels from diﬀerent corn varieties varied signiﬁcantly.StarchTable4Pasting properties of starches from diﬀerent corn varietiesVariety PV(cP)TV(cP)BV(cP)FV(cP)SV(cP)P Temp(°C) African Tall1252f662bc590f1388e726f75.9a Ageti1000c652b348c1222c570c77.4b Early Composite1250f671c579f1321d650de75.9a Girja1196e647b549e1324d677e77.4b Nayjot839b697d142b877b180b80.6d Parbhat840b727e113a868b141a83.8e Partap804a676c128ab824a148a83.1ePb Sathi1012c594a418d1214c629d78.3c Vijay1063d686cd377c1345de659de77.5b 
PV,peak viscosity;TV,trough viscosity;BV,breakdown viscosity;FV,ﬁnal viscosity;SV,setback viscosity;P Temp,pasting temperature.Values with similar letters in the same column do not diﬀer signiﬁcantly(p<0.05).Table5Textural properties of starch gels from diﬀerent corn varietiesVariety Hardness(g)Cohesiveness Gumminess Springiness Chewiness Adhesiveness(gs) African Tall21.5a0.418c8.9a0.805d7.2b38.6eAgeti28.0d0.392b10.9bc0.623b 6.8ab15.6aEarly Composite26.0c0.398bc10.3b0.590ab 6.1a32.4dGirja24.1b0.3859.30.716c 6.6ab20.6bNavjot27.5d0.431d11.8c0.626b7.4b32.6dParbhat31.1e0.437d13.6d0.518a7.0b22.9cPartap32.3f0.370a11.9c0.902e10.8c20.6bPb Sathi27.5d0.576e15.8e0.708c11.2c40.6fVijay27.5d0.434d11.9c0.614b7.3b40.3fValues with similar letters in the same column do not diﬀer signiﬁcantly(p<0.05).1504K.S.Sandhu,N.Singh/Food Chemistry101(2007)1499–1507gel from Partap showed the highest hardness (32.3g),whereas African Tall starch gel showed the lowest (21.5g).Seetharaman et al.(2001)reported the hardness of 13selected Argentinian corn landraces in the range between 16.7and 35g.The gel ﬁrmness is mainly caused by retrogradation of starch gels,which is associated with the synaeresis of water and crystallization of amylopectin,leading to harder gels (Miles et al.,1985).Starches that exhibit harder gels tend to have higher amylose content and longer amylopectin chains (Mua &Jackson,1997).Gumminess was found to be the highest for Pb Sathi (15.8)and the lowest for African Tall (8.9)starch gels.Chewiness was found to be the highest for Pb Sathi and the lowest for Early Composite starch gel.The mechanical properties of starch gels depend upon various factors,including the rheological characteristics of the amylose matrix,the volume fraction and the rigidity of the gelati-nized starch granules,as well as the interactions between dispersed and continuous phases of the gel (Biliaderis,1998).These factors are in turn dependent on the amylose content and the structure of the amylopectin (Yamin 
et al.,1999).The values for hardness,cohesiveness,springiness and adhesiveness of starch gels observed in the present study were comparable to those observed earlier for nor-mal corn starches by Liu,Ramsden,and Corke (1999).3.6.Pearson correlations among various properties of corn starchesSeveral signiﬁcant correlations between the physico-chemical,gelatinization,retrogradation,pasting and gel texture properties of the corn starches were observed (Table 6).SP was positively correlated to solubility (r =0.582,p <0.05).Interrelationships between the gelatinization parameters were observed.T o was positively correlated to T p and T c (r =0.970and 0.890,respectively,p <0.01).Ji et al.(2003)showed positive correlations between T o and T p for advanced generations of corn lines.D H gel was observed to be positively correlated with T o ,T p and T c (r =0.900,0.902and 0.828,respectively,p <0.01).Singh,Kaur,Sandhu,Kaur,and Nishinari (2006)observed signif-icant positive correlation between T o ,T p and T c with D H gel for rice starches.PHI was negatively correlated to R (r =À0.869,p <0.01).Relationship between the gelatiniza-tion and retrogradation properties was observed.D H gel was positively correlated to the R of retrogradation.The thermal and pasting properties were observed to be related to each other.T o ,T p and T c of gelatinization were negatively correlated to PV (r =À0.809,À0.898and À0.902,respec-tively,p <0.01),BV (r =À0.774,À0.886and À0.900,respectively,p <0.01),FV (r =À0.721,À0.795and À0.765,respectively,p <0.01)and SV (r =À0.686,À0.779and 0.762,respectively,p <0.01)whereas they were positively correlated to PT (r =0.657,0.750and 0.731,respectively,p <0.01).D H gel was negatively correlated to PV and BV (r =À0.743,À0.733,respectively,p <0.01),FV and SV (r =À0.623and À0.611,respectively,T a b l e 6P e a r s o n c o r r e l a t i o n c o e ﬃc i e n t s b e t w e e n v a r i o u s p r o p e r t i e s o f s t a r c h e s f r o m d i ﬀe r e n t c o r n v a r i e t 
i e sA M Y aS P aS O L aT o aT p aT c aD H g e l aP H I aR aR (1)aP V aB V aF V aS V aP T aH D aS P aÀ0.498S O L a0.1360.582bT o a0.519bÀ0.742cÀ0.227T p a0.560bÀ0.827cÀ0.2580.970cT c a0.584bÀ0.814cÀ0.1740.890c0.968cD H g e l a0.576bÀ0.731cÀ0.1680.900c0.902c0.828cP H I aÀ0.1740.4160.1790.001À0.236À0.432À0.023R a0.444À0.730cÀ0.2420.4540.657c0.788c0.513bÀ0.869cR (1)a0.354À0.717cÀ0.1790.720c0.763c0.805c0.580bÀ0.3230.565bP V aÀ0.5060.956c0.524bÀ0.809cÀ0.898cÀ0.902cÀ0.743c0.475À0.782cÀ0.815cB V aÀ0.4570.933c0.533bÀ0.774cÀ0.886cÀ0.900À0.733c0.558À0.846cÀ0.745c0.985cF V aÀ0.2100.859c0.725cÀ0.721cÀ0.795cÀ0.765cÀ0.623b0.426À0.675cÀ0.703c0.925c0.938cS v aÀ0.1840.832c0.706cÀ0.686cÀ0.779cÀ0.762cÀ0.611b0.495À0.727cÀ0.645c0.904c0.940c0.991cP T a0.251À0.914cÀ0.781c0.657c0.750c0.731c0.620bÀ0.4640.714c0.578bÀ0.903cÀ0.922c0.945cÀ0.943cH D a0.511bÀ0.811c0.516b0.590b0.713c0.787c0.518bÀ0.601b0.781c0.680cÀ0.865cÀ0.856cÀ0.799cÀ0.784c0.838cG M a0.792cÀ0.623bÀ0.3020.663c0.621c0.542b0.660c0.1370.2220.396À0.591À0.505À0.431À0.3600.4900.609baA M ,a m y l o s e c o n t e n t ;S P ,s w e l l i n g p o w e r ;S O L ,s o l u b i l i t y ;T o ,o n s e t t e m p e r a t u r e ;T p ,p e a k t e m p e r a t u r e ;T c ,c o n c l u s i o n t e m p e r a t u r e ;D H g e l ,e n t h a l p y o f g e l a t i n i z a t i o n ;P H I ,p e a k h e i g h t i n d e x ;R ,r a n g e o f g e l a t i n i z a t i o n ;R (1),r a n g e o f r e t r o g r a d a t i o n ;P V ,p e a k v i s c o s i t y ;B D ,b r e a k d o w n v i s c o s i t y ;F V ,ﬁn a l v i s c o s i t y ;S B ,s e t b a c k v i s c o s i t y ;P T ,p a s t i n g t e m p e r a t u r e ;H D ,h a r d n e s s ;G M ,g u m m i n e s s .bC o r r e l a t i o n i s s i g n i ﬁc a n t (p <0.05).c C o r r e l a t i o n i s s i g n i ﬁc a n t (p <0.01).K.S.Sandhu,N.Singh /Food Chemistry 101(2007)1499–15071505。
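The derived quantities defined in the footnotes to Tables 2 and 3 (gelatinization or retrogradation range R, peak height index PHI, and percentage retrogradation %R) can be checked with a few lines of arithmetic. The following is a minimal Python sketch under the stated definitions; the function names are illustrative and not taken from the article.

```python
def temperature_range(t_onset, t_peak):
    """Range R = 2 * (Tp - To), as defined in the table footnotes."""
    return 2 * (t_peak - t_onset)


def peak_height_index(delta_h_gel, t_onset, t_peak):
    """PHI = dH_gel / (Tp - To)."""
    return delta_h_gel / (t_peak - t_onset)


def percent_retrogradation(delta_h_ret, delta_h_gel):
    """%R = (dH_ret / dH_gel) * 100."""
    return delta_h_ret / delta_h_gel * 100


# African Tall values as listed in Tables 2 and 3.
r_gel = temperature_range(67.5, 71.5)
phi = peak_height_index(11.6, 67.5, 71.5)
r_ret = temperature_range(42.5, 52.4)
pct_r = percent_retrogradation(5.0, 11.6)
```

For African Tall these reproduce the tabulated values after rounding to the precision used in the tables.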

Automatic induction of shallow-transfer rules for open-source machine translation

Felipe Sánchez-Martínez, Mikel L. Forcada
Transducens Group, Departament de Llenguatges i Sistemes Informàtics
Universitat d'Alacant, E-03071 Alacant, Spain
{fsanchez,mlf}@dlsi.ua.es

Introduction

Goal
• To automatically infer shallow-transfer rules, to be used in machine translation (MT), from "small" parallel corpora
• Transfer rules are used to:
  – produce grammatically correct translations in the target language (TL)
  – perform some lexical changes, such as preposition changes
  – introduce auxiliary verbs when needed
  – ...

How?
• Adapting the alignment templates already used in statistical MT

Resources
• A sentence-aligned parallel corpus
• A morphological analyzer and a PoS tagger for both languages (the ones used by the MT system in which the inferred rules will be used)

Alignment templates (AT)
• Introduced in the statistical MT framework as a feature function [1]
• Alignment templates (AT) are learned in a 3-stage procedure:
  1. Compute word alignments
  2. Extract aligned phrase pairs (translation units)
  3. Generalize over the extracted phrases using word classes
• AT z = (S_n, T_m, A)
  – S_n: sequence of n SL word classes
  – T_m: sequence of m TL word classes
  – A: alignment information

Extending ATs with restrictions
• ATs are extended to consider a set R of restrictions over the inflection information of non-lexicalized categories
  – AT z = (S_n, T_m, A, R)
• Restrictions are learned from the bilingual dictionary
  – Bilingual entry that does not change inflection information
    <e><p><l>castigo<s n="noun"/></l><r>càstig<s n="noun"/></r></p></e>
    R: w = noun.*
  – Bilingual entry that does change inflection information
    <e><p><l>calle<s n="noun"/><s <r>carrer<s n="noun"/><s </p></e>
• The bilingual dictionary is also used to discard phrase pairs that cannot be reproduced by the MT system

AT for shallow-transfer MT
• Linguistic information used to define word classes:
  – lexicalized categories: categories that are known to be involved in lexical changes, such as prepositions
    ∗ the method can learn not only syntactic changes
• Word class: part of speech with all the inflection information
  – but lexicalized words have their own single class

Example of extracted ATs
[Two alignment-template diagrams; only the word classes and restrictions below are recoverable from the extraction.]
• First AT: SL side el-(art.f.sg) (noun.f.sg) (adj.f.sg), TL side beginning el-(art.m.sg); R = {w2 = noun.m.*, w3 = adj.*}
• Second AT: SL side (verb.pret.3rd.pl) en-(pr) (noun.loc), TL side anar-(vbaux.pres.3rd.pl) ... a-(pr) (noun.loc); R = {w1 = verb.*, w3 = noun.*}

The Apertium open-source MT platform
SL text → morph. analyzer → PoS tagger → structural transfer (with lexical transfer) → morph. generator → post-generator → TL text

AT applicability test
• Restrictions are tested by looking at the bilingual dictionary
• Example:
  – R = {w2 = noun.m.*, w3 = adj.*}
  – Applicable:
    ∗ Input string (Spanish): la señal roja → el-(art.f.sg) señal-(noun.f.sg) rojo-(adj.f.sg)
    ∗ Translation of non-lexicalized words:
      · señal-(noun.f.sg) → senyal-(noun.m.sg)
      · rojo-(adj.f.sg) → vermell-(adj.f.sg)
  – Not applicable:
    ∗ Input string (Spanish): la silla blanca → el-(art.f.sg) silla-(noun.f.sg) blanco-(adj.f.sg)
    ∗ Translation of non-lexicalized words:
      · silla-(noun.f.sg) → cadira-(noun.f.sg)
      · blanco-(adj.f.sg) → blanc-(adj.f.sg)

Code generated for each AT
• Code is generated for each unit in T_m, which depends on the type of word class:
  – non-lexicalized word: the aligned SL (non-lexicalized) lemma is translated and inflection information provided by the TL word class is attached
  – lexicalized word: it is introduced as is; it represents a complete lexical form
• Example:
  – Input: vivir-(verb.pret.3rd.pl) en-(pr) Francia-(noun.loc)
  – Output: anar-(vaux.pres.3rd.pl) viure-(verb.inf) a-(pr) França-(noun.loc)

Rules generation
• A shallow-transfer rule consists of a set of ATs: U = {(S_n, T_m, A, R) ∈ Z : S_n = S_U}, where Z is the whole set of extracted ATs, and S_U is a sequence of SL word classes that all ATs z ∈ U have in common
• Each generated rule consists of code which always applies the most frequent AT z = (S_n, T_m, A, R) ∈ U that satisfies the TL restrictions R
• A "default" AT, which translates word for word, is added with the lowest frequency

Experiments (Spanish–Catalan)
• Lexicalized categories = {prep, pronoun, det, cnj, rel, vbmodal, vbaux}

Training corpus

| Lang. | # sentences | # words |
|---|---|---|
| es | 100834 | 1952317 |
| ca | 100834 | 2032925 |

Evaluation corpus

| Trans. dir. | Corpus | # words |
|---|---|---|
| es-ca | post-edit | 10066 |
| es-ca | parallel | 13147 |
| ca-es | post-edit | 10024 |
| ca-es | parallel | 13686 |

Results (WER)

| Trans. dir. | Evaluation corpus | No rules | AT-based | Hand-coded |
|---|---|---|---|---|
| es-ca | post-edit | 12.6% | 8.5% | 6.7% |
| es-ca | parallel | 26.6% | 20.4% | 20.8% |
| ca-es | post-edit | 11.6% | 8.1% | 6.5% |
| ca-es | parallel | 19.3% | 14.9% | 14.5% |

Discussion
• Significant improvement in translation quality as compared to word-for-word
• Translation quality very close to that obtained using hand-coded transfer rules
• Preliminary results on the Spanish–Portuguese language pair show results in agreement with those provided here
• Future work:
  – Applying shorter ATs inside the same rule when none of the longer ATs can be applied because of TL restrictions not being met
• An open-source implementation of the method can be freely downloaded from /projects/apertium/, package apertium-transfer-tools

References
[1] Och, F. J., H. Ney (2004). "The alignment template approach to statistical machine translation". In Computational Linguistics, 30(4):417–449.
[2] Armentano-Oller, C. et al. (2006). "Open-source Portuguese-Spanish machine translation". In Lecture Notes in Computer Science 3960 (Computational Processing of the Portuguese Language), p. 50–59, Rio de Janeiro, Brazil.
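The AT applicability test described on the poster (checking the restrictions R against the bilingual dictionary) can be illustrated with a small Python sketch. This is not the apertium-transfer-tools implementation: the dictionary contents are just the four entries from the poster's example, and the names `BILINGUAL`, `matches` and `at_applicable` are hypothetical.

```python
# Toy bilingual dictionary: (SL lemma, SL tags) -> (TL lemma, TL tags).
# Entries are the four word pairs from the poster's example only.
BILINGUAL = {
    ("señal", "noun.f.sg"): ("senyal", "noun.m.sg"),
    ("rojo", "adj.f.sg"): ("vermell", "adj.f.sg"),
    ("silla", "noun.f.sg"): ("cadira", "noun.f.sg"),
    ("blanco", "adj.f.sg"): ("blanc", "adj.f.sg"),
}


def matches(tags, pattern):
    """Match TL tags against a restriction such as 'noun.m.*' (assumed prefix match)."""
    if pattern.endswith("*"):
        return tags.startswith(pattern[:-1])
    return tags == pattern


def at_applicable(sl_words, restrictions):
    """Return True if every restricted position translates to matching TL tags.

    sl_words: list of (lemma, tags); restrictions: {1-based position: pattern}.
    """
    for position, pattern in restrictions.items():
        lemma, tags = sl_words[position - 1]
        _, tl_tags = BILINGUAL[(lemma, tags)]
        if not matches(tl_tags, pattern):
            return False
    return True


R = {2: "noun.m.*", 3: "adj.*"}
senal = [("el", "art.f.sg"), ("señal", "noun.f.sg"), ("rojo", "adj.f.sg")]
silla = [("el", "art.f.sg"), ("silla", "noun.f.sg"), ("blanco", "adj.f.sg")]
```

With these inputs, `at_applicable(senal, R)` holds because señal translates to a masculine noun, while `at_applicable(silla, R)` does not because cadira stays feminine.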


1
A Morphological Analyzer for Verbal Aspect in American Sign Language
AARON SHIELD AND JASON BALDRIDGE

1 Introduction

The study of phonology and morphology involves breaking down linguistic signs into successively smaller units (e.g. distinctive features, phonemes, and morphemes), examining how those units influence each other in context, and developing systems that account for the sound changes found universally in spoken languages. Despite the plethora of approaches that have been proposed, work in computational linguistics has shown that all known phonological and morphological processes, from simple concatenative processes to templatic and reduplicative morphology, can be treated as regular relations definable in terms of regular expressions (Beesley & Karttunen, 2003). Different theories organize the information flow in quite different manners, but the solutions they provide can all be encoded in a finite-state manner. This has the tremendous upshot that very efficient morphological analyzers can be produced by compiling a set of regular expressions into a finite-state transducer, using tools such as the Xerox Finite State Toolkit (xfst). These transducers are bi-directional, and so can be used both for generation and analysis of word forms.

Signed languages do not have an auditory component, but they too exhibit phonological and morphological processes (see Sandler & Lillo-Martin 2006 for a review of much of the literature). While morphological analyzers have been built for a wide variety of languages and language types, including most European languages, Turkish, Arabic, Korean, and Japanese (Karttunen, 2003), we are not aware of any for sign languages. Computationally oriented work aimed at creating computer characters capable of signing and/or providing machine translation for sign languages (Veale et al., 1998; Speers 2001; Sáfár & Marshall, 2002; Huenerfauth, 2006) has led to the development of feature-based representations of signs and the creation of computational syntactic grammars and other capabilities, but it has not utilized finite-state methods for handling sign language morphotactics. In general, sign languages have received little attention in computational linguistics.

In this paper, we present a morphological analyzer for American Sign Language (ASL) verbs that mediates underlying lemmas to abstract formal representations of their visual surface realizations. We focus on the morphological effects of aspectual distinctions and handle co-articulation constraints that rescue otherwise unpronounceable forms. Our solution is based on a series of transducers that are composed together in a cascade.

2 The Linguistics of American Sign Language

Signed languages have been an object of linguistic inquiry since the early 1960s, when William Stokoe (1960, 1965) first formalized the notion of a sign as a linguistic unit with internal structure. Previously, signed languages were considered to be rudimentary systems of pantomime having little in common with spoken languages. Stokoe’s work established ASL and other signed languages around the world as complex, natural human languages.

After Stokoe, linguists sought to analyze the formal properties of sign languages. While much progress has been made, particularly in the areas of acquisition and syntax, sign linguistics has proved very challenging in other areas, particularly phonology and morpho-phonology. Since the modality of signed languages is visual-spatial, while that of spoken languages is auditory, the very nature of the two “phonological” systems is quite different. Signed languages are thought to reflect universals of human language, but they also differ in important ways from spoken languages. For example, signed languages tend to exhibit simultaneous (rather than sequential) morphology.
Also, signed languages exhibit a high degree of iconicity, contra the classic Saussurian notion that linguistic symbols are arbitrary.

One problem that linguists (and others) have struggled with in the effort to formalize and understand sign language structure is the lack of a writing system. Various notations have been developed, such as Stokoe notation and SignWriting (Sutton, 1974), but no one has succeeded in spreading a conventional system for representing signs in writing. This fact has the unfortunate consequence that signs are most often represented by analogous words in the ambient spoken language. Thus, a given sign is typically represented by the spoken language word that most closely fits its meaning, written in capital letters to differentiate it from the spoken word itself. For example, the sign for “dog” in ASL is transcribed as DOG; that particular combination of linguistic symbols (D-O-G) has no connection to the actual form of the sign, which is produced by rubbing the thumb and third finger of the dominant hand in neutral space, palm facing out from the signer. This writing convention thus misrepresents the relationship between signifier and signified: the ASL sign DOG does not refer to the English word dog. Both DOG and dog refer to the same concept, but the two linguistic symbols do not have a direct relationship.

3 Representing Signs Formally

Although spoken languages have simultaneous dimensions such as phonetic articulation, stress and intonation contours, the most important properties can be abstractly characterized with sequential written formats. Thus, when creating a morphological transducer, the usual task is to mediate between abstract underlying forms (the lemmas) and surface forms that are phonetic transcriptions or conventionalized written words. For example, an English transducer might map underlying forms such as leaf+Noun+Plural and leave+Verb+3rdSingular to the surface form leaves.
In xfst, regular expressions map abstract features to surface realizations (such as +Plural to the string s) and handle sound alternations (such as ensuring that the output form is leaves and not leafs). Though it is tempting to view such transducers as enacting a sequence of rules, they are actually the result of composing a series of transducers (each defined by an individual regular expression) into a single transducer that implements a regular relation. Because of this property, finite-state morphological transducers are bi-directional and can thus produce the lemma representations from the surface forms as well as producing surface forms from lemmas (Beesley & Karttunen, 2003).

Because sign language has little in the way of sequential morphology, an even more abstract formal representation than transcriptions or written words is necessary. Each parameter of the sign must be represented with a shorthand system that is fairly transparent and comprehensible. Various phonological models of how signs are composed have been proposed, including the Move-Hold model (Liddell & Johnson 1989) and the Hand Tier model (Sandler 1989). These models differ in the way they represent abstract sign features and architecture. We do not commit to a particular underlying phonological form of the sign, but acknowledge that basic parameters must be represented in order to be able to recover sign forms from a text format. We therefore specify the parameters of sign type, handshape, location, palm orientation and (simplified) movement.
The following schema represents the basic parameters that combine to form signs:

• Types: 1-Handed (1H), 2-Handed Symmetrical (2HS), 2-Handed Dominant (2HD)
• Handshapes: A, B, C, 5, E, F, G, H, 3, O, R, V, W, X, Y, 8
• Locations: face, neutral, torso, neck, shoulders, chest, trunk, upper arm, elbow, forearm, wrist
• Palm orientations: up, down, out, in, base
• Movements: touch, twist, reduplication, arc (all +/- values)

Signs can be one-handed, two-handed symmetrical (in which both hands form the same shape and make the same movement), or two-handed dominant (one hand is “dominant”; the non-dominant hand has a limited number of possible handshapes and exhibits no independent motion). For the sake of uniformity, our analysis specifies every sign as having a dominant hand (DH) shape and a non-dominant hand (NDH) shape.

The locations represent all of the possible contrastive locations for signs in ASL. Handshapes are represented by the corresponding ASL number or letter. Palm orientation refers to the way the palm of the signer’s hand faces: up, down, base (i.e., the way the palms face while hanging at rest), outwards from the signer, or inwards toward the signer. Finally, movement has been simplified to four essential distinctive features: touch (whether or not there is contact between the two hands during articulation of the sign), twist (whether or not the articulating hand reverses orientation during the performance of the sign), reduplication (whether or not the sign is iterated more than once), and arc (whether or not the sign follows a path through space). Touch, reduplication, and arc are all well-attested in the literature; twist is a novel feature which we have found useful in characterizing certain phonological phenomena. It should be noted that this characterization of movement is simplified, specifying only the bare bones of movement necessary for producing forms.
However, we believe that this description adequately captures the morpho-phonological problems we address.

Using this notation, the verb SEE is represented as follows:

    <Type:1H DH:Vin NDH:none Loc:face -Touch -Twist -Redup +Arc>

The goal of our analyzer is to produce such representations from underlying word signs like SEE, and vice versa.

4 ASL Verbal Morphology

Although ASL seems to lack morphological tense, it has a complex system of inflectional morphology to show aspect. The kind of movement exhibited by the verb changes depending on the type of aspect, while handshape, location, and palm orientation (typically) remain the same. Klima & Bellugi (1979) individuated the following aspectual distinctions in ASL: protractive, incessant, habitual, continuative, iterative, facilitative, inceptive, and augmentative. In this paper, we consider two of the most common aspectual inflections: habitual and continuative.

The underlying forms of our transducer are taken to be the sign plus aspect. For example, for the verb STUDY, our lexicon will contain the following entries: (a) STUDY Aspect:None, (b) STUDY Aspect:Hab, and (c) STUDY Aspect:Cont (for no aspect, habitual aspect, and continuative aspect, respectively). We create this lexicon in the standard way that such distinctions are produced for spoken language morphology, through rules of word formation (Karttunen, 2006):

    define WordSigns [COOK | FORCE | PLAY | SEE | STUDY ];
    define AspectFeature [Aspect ":" [None | Hab | Cont ] ] ;
    define Lexicon WordSigns " " AspectFeature ;

These regular expressions create a lexicon with 15 entries, three for each verb. We follow Karttunen’s notation for encoding Realizational Morphology in xfst (Karttunen, 2003), which is convenient for a feature-based representation such as ours. Note that certain characters in our representation, such as the colon and whitespace and others used below, are operators in xfst.
They thus need to be surrounded by double quotes in order to be used as literal strings.

These underlying forms must be mapped to their correct surface forms, such as that given for SEE Aspect:None in the previous section. To do this, we start by encapsulating the base forms in the lexicon in brackets (to facilitate later processing) and then producing the base realizations for all verbs. For example, the basic form of the sign STUDY is a two-handed sign: the base hand is a 5-handshape with the palm facing up, and the active hand is a 5-handshape with the palm facing inwards [1]. The bracketing rule and the base form rule for STUDY are the following:

    define BracketedLexicon 0:"[" Lexicon 0:"]" ;

    define StudySign [. .] ->
        "<" Type ":" 2HD " " DH ":" 5in " " NDH ":" 5up " "
        Loc ":" neutral " " "-" Touch " "
        "-" Twist " " "-" Redup " " "-" Arc ">" || $STUDY "]" _ ;

[1] STUDY also contains an internal movement: the fingers of the active hand wiggle. We only account for path, not internal movement, in our analysis. The internal movement does not appear to change with changes in aspectual inflection.

In words, the StudySign rule says that the empty string is replaced by the given features when preceded by a string that contains STUDY and a right bracket immediately in front of it. In essence, such rules define a secondary lexicon that retrieves the feature representation associated with the basic sign stem.

By composing the transducers from these regular expressions together, all three STUDY entries are enriched with brackets and the correct feature specifications. We define similar rewrite rules for the other four verbs.
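The lexicon, bracketing and base-form steps described so far can be emulated outside xfst to see what strings they produce. The following is a minimal Python sketch, not a finite-state implementation: it works on plain strings in the generation direction only, and the names `build_lexicon`, `bracket` and `study_sign` are hypothetical stand-ins for the Lexicon, BracketedLexicon and StudySign definitions.

```python
from itertools import product

WORD_SIGNS = ["COOK", "FORCE", "PLAY", "SEE", "STUDY"]
ASPECTS = ["None", "Hab", "Cont"]

# Base realization of STUDY, copied from the StudySign rule.
STUDY_BASE = "<Type:2HD DH:5in NDH:5up Loc:neutral -Touch -Twist -Redup -Arc>"


def build_lexicon():
    """Emulates: define Lexicon WordSigns " " AspectFeature ;"""
    return [f"{sign} Aspect:{aspect}"
            for sign, aspect in product(WORD_SIGNS, ASPECTS)]


def bracket(entry):
    """Emulates the surface side of BracketedLexicon: wrap the entry in [ ]."""
    return f"[{entry}]"


def study_sign(bracketed):
    """Emulates StudySign: append the base form after the closing bracket
    when the bracketed entry contains STUDY; leave other entries unchanged."""
    if "STUDY" in bracketed and bracketed.endswith("]"):
        return bracketed + STUDY_BASE
    return bracketed


lexicon = build_lexicon()
enriched = [study_sign(bracket(entry)) for entry in lexicon]
```

Here `lexicon` holds the 15 underlying forms, and only the three STUDY entries in `enriched` carry a feature specification, since rules for the other four verbs are not included in this sketch.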
The transducer now contains elements such as:

    [STUDY Aspect:None]<Type:2HD DH:5in NDH:5up Loc:neutral -Touch -Twist -Redup -Arc>

This representation forms the basis for producing the correct surface forms for each aspectual type.

Continuative aspect indicates that a particular action happens through time and is characterized by a prolonged, lengthened path movement (+Arc). Habitual aspect indicates an action happening repeatedly and is characterized by a reduplicated path movement (+Redup +Arc). For regular verbs, habitual and continuative aspectual inflections can be represented by simply changing the movement features appropriately. When the sign STUDY is inflected for habitual aspect, both hands move repeatedly in a circle, while retaining their original handshapes, palm orientations, and locations. The addition of habitual aspect thus changes the Redup and Arc features from - to +. Continuative aspect changes only Arc:

    define Continuative "-" -> "+" || $[Aspect ":" Cont] _ Arc ;
    define Habitual "-" -> "+" || $[Aspect ":" Hab] _ [Redup | Arc] ;

The rules both utilize xfst’s containment operator “$”, which saves us from having to specify what else might occur in the string context preceding the replacement point. This is particularly useful with our non-sequential representations. Typically, rules for spoken language morphology act in a very local fashion in which the context that makes the rule fire is string-adjacent to the change. Even in the case of vowel harmony rules that set off a cascade of vowel changes, each replacement is still locally determined (Beesley & Karttunen, 2003). Our representations are not order dependent, so adjacency is irrelevant. What is necessary instead is the ability to test whether a value exists somewhere in the string, so the “$” operator is perfect for this.
Also note that the +/- notation allows both the arc and reduplication changes to be encoded with a single rule.

With these rules, we obtain the following forms for STUDY:

    [STUDY Aspect:Hab]<Type:2HD DH:5in NDH:5up Loc:neutral -Touch -Twist +Redup +Arc>
    [STUDY Aspect:Cont]<Type:2HD DH:5in NDH:5up Loc:neutral -Touch -Twist -Redup +Arc>

Not all verbal inflection can be modeled so easily; there are constraints in some configurations, which we turn to next.

5 Co-articulation Constraints in ASL Verb Inflection

Several verbs show complications in their phonological form when inflected with continuative or habitual aspect: parameters other than the Redup and Arc features change because some aspects of signs cannot be co-articulated. Here, we give examples for COOK, PLAY, and FORCE and show how they are handled by our transducer.

PLAY in its base form includes a +Twist value, indicating that palm orientation reverses during the enunciation of the sign (specifically, a forearm twist produces oscillations between up and down palm orientations):

    <Type:2HS DH:Ybase NDH:Ybase Loc:neutral -Touch +Twist +Redup -Arc>

Adding continuative aspect changes Arc to + (via the Continuative rule). The forearm twist and the arc movement cannot be pronounced simultaneously on a two-handed symmetrical sign like PLAY, resulting in a -Twist value. The following rewrite rule enacts this change:

    define NoTwistWith2HSArc "+" -> "-" || $[Type ":" 2HS ] _ Twist $["+" Arc] ;

Applying the rule gives the correct surface form for PLAY Aspect:Cont:

    <Type:2HS DH:Ybase NDH:Ybase Loc:neutral -Touch -Twist +Redup +Arc>

COOK in its base form includes a +Touch value, along with +Twist [2]:

    <Type:2HD DH:5down NDH:5up Loc:neutral +Touch +Twist -Redup -Arc>

Adding habitual aspect changes both Redup and Arc to + (via the Habitual rule).
The +Arc and +Twist values make it impossible to retain +Touch; we encode this with the following rule and obtain the correct surface form for COOK Aspect:Hab:

    define NoTouchWithArcTwist "+" -> "-" || _ Touch [$["+" Arc] & $["+" Twist]] ;

    <Type:2HD DH:5down NDH:5up Loc:neutral -Touch +Twist +Redup +Arc>

Note the use of the xfst intersection operator “&” in the rule. The expression

    [$["+" Arc] & $["+" Twist]]

describes all strings which contain both +Arc and +Twist values. The order in which they are encoded in the representation is not important: the rule would match both “+Twist -Redup +Arc” and “+Arc -Redup +Twist”. This makes the approach extensible, since the rule will continue to work even as more features are added or their order in the representation changes.

FORCE in its base form also includes a +Touch value:

    <Type:2HD DH:Cout NDH:5down Loc:neutral +Touch -Twist -Redup -Arc>

Adding habitual aspect to FORCE leads to the non-dominant hand being dropped completely. Unsurprisingly, this change also renders +Touch impossible. We encode this with the following rules and obtain the correct surface form:

    define NoNDHWith5downRedup 5down -> none || NDH ":" _ $["+" Redup] ;
    define NoTouchWoutNDH "+" -> "-" || $[NDH ":" none ] _ Touch ;

    <Type:2HD DH:Cout NDH:none Loc:neutral -Touch -Twist +Redup +Arc>

Having defined the lexicon and the rules, all that remains is to compose the individual transducers together and strip off the word sign and aspectual information to obtain a final transducer that maps underlying representations to the appropriate feature representations that describe the surface visual forms themselves.

[2] Note that the +Twist feature only applies to the dominant hand in a two-handed dominant (2HD) sign, while it applies to both hands in a two-handed symmetrical (2HS) sign.
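The four co-articulation rules can be paraphrased as checks on an unordered feature set. The Python sketch below is a simplified illustration, not the xfst cascade itself: parse, render and constrain are hypothetical names, and a dictionary lookup replaces the containment and intersection operators, so the left and right context positions that the xfst rules specify are deliberately ignored. The checks run in the order in which the rules are composed in the appendix script.

```python
def parse(rep: str) -> dict:
    """Split '<Type:2HD ... -Touch +Arc>' into {feature: value}."""
    feats = {}
    for tok in rep.strip("<>").split():
        if tok[0] in "+-":            # binary feature such as +Arc
            feats[tok[1:]] = tok[0]
        else:                         # valued feature such as NDH:5up
            key, value = tok.split(":")
            feats[key] = value
    return feats


def render(feats: dict) -> str:
    """Inverse of parse; keeps the dictionary's insertion order."""
    toks = [v + k if v in ("+", "-") else k + ":" + v for k, v in feats.items()]
    return "<" + " ".join(toks) + ">"


def constrain(feats: dict) -> dict:
    f = dict(feats)
    # NoTouchWithArcTwist: +Arc and +Twist together rule out +Touch.
    if f["Touch"] == "+" and f["Arc"] == "+" and f["Twist"] == "+":
        f["Touch"] = "-"
    # NoTwistWith2HSArc: a 2HS sign cannot keep +Twist alongside +Arc.
    if f["Type"] == "2HS" and f["Twist"] == "+" and f["Arc"] == "+":
        f["Twist"] = "-"
    # NoNDHWith5downRedup: a 5down non-dominant hand is dropped with +Redup.
    if f["NDH"] == "5down" and f["Redup"] == "+":
        f["NDH"] = "none"
    # NoTouchWoutNDH: no contact is possible without a non-dominant hand.
    if f["NDH"] == "none" and f["Touch"] == "+":
        f["Touch"] = "-"
    return f


if __name__ == "__main__":
    force_hab = "<Type:2HD DH:Cout NDH:5down Loc:neutral +Touch -Twist +Redup +Arc>"
    print(render(constrain(parse(force_hab))))
```

Since the checks consult the dictionary rather than string positions, reordering the features in the input does not change which constraints fire.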
Using xfst, we can apply down the network to get surface forms from underlying forms, and up it to do the reverse:

    xfst[1]: apply down "PLAY Aspect:Cont"
    <Type:2HS DH:Ybase NDH:Ybase Loc:neutral -Touch -Twist +Redup +Arc>
    xfst[1]: apply up "<Type:2HD DH:Cout NDH:none Loc:neutral -Touch -Twist +Redup +Arc>"
    FORCE Aspect:Hab

The feature representations on the lower side of the network could be given to another application, such as a virtual signing avatar, to pronounce it (visually). The upper side provides lemma and aspectual information, which would be useful for ASL dialog systems.

6 Conclusion

We have presented a morphological analyzer for ASL that represents base forms for several ASL verbs and handles morpho-phonological changes when continuative and habitual aspect are added to them. For some verbs, the addition of aspect leads to forms that are impossible to pronounce. These co-articulation constraints are mediated by a cascade of transducers that correct the forms appropriately. By virtue of being a finite-state transducer that implements a regular relation, our analyzer is bi-directional, and thus can be used both for generation and analysis of ASL signs. This makes it potentially useful for a number of practical applications including educational tools for learners of ASL, sign language dialog systems, and machine translation (Speers, 2001; Huenerfauth, 2006). Handling aspect is particularly relevant for educational goals: it is often difficult for (hearing) sign learners to master, yet it is an essential part of sign language grammar.

To have broader applicability, the coverage of the analyzer would need to be greatly expanded. There are several other interesting morpho-phonological alternations in ASL and other signed languages that could be represented with such an analyzer; one broadly-attested and important phenomenon is verb agreement. Many verbs in ASL change their location and directionality depending on the argument structure.
For example, the verb GIVE can inflect by moving from the spatial location of the agent to the spatial location of the recipient. Thus, the path movement of the utterance I-GIVE-YOU starts at the body of the signer and moves outwards, while YOU-GIVE-ME does the opposite. These locations could be included in the encoding of agreeing verbs in xfst. This could be useful, for example, with a grammar for ASL such as that described by Wright (2006), which includes lexical entries that encode such path movements.

Such an analyzer would also need to account for the sequential aspects of ASL signs: as entire sentences are uttered, the features of one sign lead to assimilation in another, much like nasal assimilation in spoken languages. Additionally, and unlike spoken languages, the epenthetic movements between individual signs are always visible, and would thus need to be represented. Our representations and rules, with their heavy use of the containment and intersection operators in xfst (similar to Karttunen’s (2003) rules for Lingala), should extend straightforwardly to this context.

It bears noting as well that we have not accounted for facial marking morphology in our analysis. Facial markings are an important part of sign language grammar (used, e.g., in question-marking, negation, verb agreement marking, and adverbial manner marking) and must eventually be included in more elaborate descriptions of signs. However, this fact does not present problems for our analyzer: facial markings can be represented in the form of the sign or in the rules, and alternations can be handled with transducers in xfst, just as manual signs can.

Our analysis could help in the understanding of the formal properties of verb morpho-phonology in ASL (and perhaps other signed languages). The encoding of the rules in xfst allows us to straightforwardly test their predictions on all forms, and in fact it did highlight errors in the original paper-and-pencil analysis.
Additionally, the overall architecture of representations and rules may also have implications for accounts of simultaneous morpho-phonological phenomena in spoken languages.

The rules as defined are admittedly quite specific to each of the changes they enact. In the interest of creating a more cross-linguistically applicable analysis, it would be preferable to state general constraints on which sign representations are pronounceable. For example, Pfau & Steinbach (2004, 2005) provide an analysis of reciprocals and plurals in German Sign Language which uses language-specific as well as general constraints in the framework of Optimality Theory (OT; Prince & Smolensky, 1993). As xfst is capable of implementing OT analyses (via lenient composition), we intend to explore the possibility of using such cross-linguistic constraints to create analyzers that are less language-specific than the rule-based one presented here. Should these constraints turn out to be robustly attested, an OT-based analyzer could be specified in xfst that could be more easily extended to handle phenomena from signed languages other than ASL.

7 Appendix

We include here the full XFST script.

    # XFST Script for handling ASL verbal aspect.

    # Define what the basic signs are -- basically underlying
    # concepts communicated by the sign so that we can refer to
    # them as macros to produce the actual feature descriptions of
    # the signs.
    define WordSigns [ COOK | FORCE | PLAY | SEE | STUDY ];
    define AspectFeature [ Aspect ":" [None | Hab | Cont ] ] ;
    define Lexicon WordSigns " " AspectFeature;

    # Place brackets around lexical entries to facilitate later
    # cleanup.
    define BracketedLexicon 0:"[" Lexicon 0:"]" ;

    # Map the WordSigns to their Sign descriptions.
    define CookSign [. .] ->
        "<" Type ":" 2HD " " DH ":" 5down " " NDH ":" 5up " "
        Loc ":" neutral " " "+" Touch " " "+" Twist " "
        "-" Redup " " "-" Arc ">" || $COOK "]" _ ;

    define ForceSign [. .] ->
        "<" Type ":" 2HD " " DH ":" Cout " " NDH ":" 5down " "
        Loc ":" neutral " " "+" Touch " " "-" Twist " "
        "-" Redup " " "-" Arc ">" || $FORCE "]" _ ;

    define PlaySign [. .] ->
        "<" Type ":" 2HS " " DH ":" Ybase " " NDH ":" Ybase " "
        Loc ":" neutral " " "-" Touch " " "+" Twist " "
        "+" Redup " " "-" Arc ">" || $PLAY "]" _ ;

    define SeeSign [. .] ->
        "<" Type ":" 1H " " DH ":" Vin " " NDH ":" none " "
        Loc ":" face " " "-" Touch " " "-" Twist " "
        "-" Redup " " "+" Arc ">" || $SEE "]" _ ;

    define StudySign [. .] ->
        "<" Type ":" 2HD " " DH ":" 5in " " NDH ":" 5up " "
        Loc ":" neutral " " "-" Touch " " "-" Twist " "
        "-" Redup " " "-" Arc ">" || $STUDY "]" _ ;

    # Compose these rules together to create a transducer that will
    # map the Englishized word forms to their sign representations.
    define AddSigns CookSign .o. ForceSign .o. PlaySign .o. SeeSign .o. StudySign;

    # Create rules to add morphological changes due to aspect and
    # compose them.
    define Habitual "-" -> "+" || $[Aspect ":" Hab] _ [Redup|Arc];
    define Continuative "-" -> "+" || $[Aspect ":" Cont] _ Arc ;
    define AspectualMorph Habitual .o. Continuative ;

    # Create rules for handling irregular morphology.
    define NoTouchWithArcTwist "+" -> "-" || _ Touch [$["+" Arc] & $["+" Twist]] ;
    define NoTwistWith2HSArc "+" -> "-" || $[ Type ":" 2HS ] _ Twist $["+" Arc] ;
    define NoNDHWith5downRedup 5down -> none || NDH ":" _ $["+" Redup] ;
    define NoTouchWoutNDH "+" -> "-" || $[NDH ":" none ] _ Touch ;
    define Irreg NoTouchWithArcTwist .o. NoTwistWith2HSArc .o. NoNDHWith5downRedup .o. NoTouchWoutNDH ;

    # Compose all morphotactics together.
    define Morph AspectualMorph .o. Irreg ;

    # Strip the Signs.
    define StripSigns "[" ?* "]" -> 0 || _ "<";

    # Compose it all together.
    define ASLVerbalAspect BracketedLexicon .o. AddSigns .o. Morph .o. StripSigns ;

    # Now use it.
    push ASLVerbalAspect
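For readers without access to xfst, the overall mapping defined by the script can be imitated in a few lines of Python. This is a rough sketch under stated assumptions rather than a port: apply_down and apply_up are hypothetical functions named after the xfst commands, the base forms are copied from the *Sign rules, and the reverse direction is obtained by enumerating the fifteen lexicon entries instead of inverting a transducer.

```python
from itertools import product

KEYS = ["Type", "DH", "NDH", "Loc", "Touch", "Twist", "Redup", "Arc"]

# Base forms, copied from the CookSign ... StudySign rules.
BASE = {
    "COOK":  ["2HD", "5down", "5up",   "neutral", "+", "+", "-", "-"],
    "FORCE": ["2HD", "Cout",  "5down", "neutral", "+", "-", "-", "-"],
    "PLAY":  ["2HS", "Ybase", "Ybase", "neutral", "-", "+", "+", "-"],
    "SEE":   ["1H",  "Vin",   "none",  "face",    "-", "-", "-", "+"],
    "STUDY": ["2HD", "5in",   "5up",   "neutral", "-", "-", "-", "-"],
}

# Every upper-side string of the lexicon, e.g. "PLAY Aspect:Cont".
LEXICON = [s + " Aspect:" + a for s, a in product(BASE, ("None", "Hab", "Cont"))]


def apply_down(underlying: str) -> str:
    """Map e.g. 'PLAY Aspect:Cont' to its surface feature string."""
    sign, aspect = underlying.split(" Aspect:")
    f = dict(zip(KEYS, BASE[sign]))
    # AspectualMorph
    if aspect == "Hab":
        f["Redup"] = f["Arc"] = "+"
    elif aspect == "Cont":
        f["Arc"] = "+"
    # Irreg, in the order the rules are composed in the script
    if f["Touch"] == "+" and f["Arc"] == "+" and f["Twist"] == "+":
        f["Touch"] = "-"
    if f["Type"] == "2HS" and f["Twist"] == "+" and f["Arc"] == "+":
        f["Twist"] = "-"
    if f["NDH"] == "5down" and f["Redup"] == "+":
        f["NDH"] = "none"
    if f["NDH"] == "none" and f["Touch"] == "+":
        f["Touch"] = "-"
    valued = [k + ":" + f[k] for k in KEYS[:4]]
    binary = [f[k] + k for k in KEYS[4:]]
    return "<" + " ".join(valued + binary) + ">"


def apply_up(surface: str) -> list:
    """Return every lexicon entry whose surface form equals `surface`."""
    return sorted(u for u in LEXICON if apply_down(u) == surface)


if __name__ == "__main__":
    print(apply_down("PLAY Aspect:Cont"))
    print(apply_up("<Type:2HD DH:Cout NDH:none Loc:neutral -Touch -Twist +Redup +Arc>"))
```

apply_up returns a list because nothing in the sketch guarantees that a surface string corresponds to exactly one underlying form.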