Context-Sensitive Spoken Dialogue Processing with the DOP Model
Rens Bod
Institute for Logic, Language and Computation, University of Amsterdam, Spuistraat 134, NL-1012 VB Amsterdam
& School of Computer Studies, University of Leeds, Leeds LS2 9JT, UK
rens@scs.leeds.ac.uk
Abstract

We show how the DOP model can be used for fast and robust context-sensitive processing of spoken input in a practical spoken dialogue system called OVIS. OVIS, Openbaar Vervoer Informatie Systeem ("Public Transport Information System"), is a Dutch spoken language information system which operates over ordinary telephone lines. The prototype system is the immediate goal of the NWO Priority Programme "Language and Speech Technology". In this paper, we extend the original DOP model to context-sensitive interpretation of spoken input. The system we describe uses the OVIS corpus (which consists of 10,000 trees enriched with compositional semantics) to compute from an input word-graph the best utterance together with its meaning. Dialogue context is taken into account by dividing up the OVIS corpus into context-dependent subcorpora. Each system question triggers a subcorpus by which the user answer is analyzed and interpreted. Our experiments indicate that the context-sensitive DOP model obtains better accuracy than the original model, allowing for fast and robust processing of spoken input.

1. Introduction

The Data-Oriented Parsing (DOP) model is a corpus-based parsing model which uses subtrees from parse trees in a corpus to analyze new sentences. The occurrence-frequencies of the subtrees are used to estimate the most probable analysis of a sentence (cf. Bod 1992, 93, 95, 98; Bod & Kaplan 1998; Bonnema et al. 1997; Chappelier & Rajman 1998; Charniak 1996; Cormons 1999; Goodman 1996, 98; Kaplan 1996; Rajman 1995; Scha 1990, 92; Scholtes 1992; Sekine & Grishman 1995; Sima'an 1995, 97, 99; Tugwell 1995; Way 1999).

To date, DOP has mainly been applied to corpora of trees whose labels consist of primitive symbols (but see Bod & Kaplan 1998, Cormons 1999 and Way 1999 for more sophisticated DOP models). Let us illustrate the original DOP model, called DOP1 (Bod 1992, 93), with a very simple example. Assume a corpus consisting of only two trees:
(1) A corpus of two trees, given here in labelled bracket notation:

    [S [NP Mary] [VP [V likes] [NP John]]]
    [S [NP Peter] [VP [V hates] [NP Susan]]]
New sentences may be derived by combining subtrees from this corpus, by means of a node-substitution operation indicated as °. Node-substitution identifies the leftmost nonterminal frontier node of one tree with the root node of a second tree (i.e., the second tree is substituted on the leftmost nonterminal frontier node of the first tree). Under the convention that the node-substitution operation is left-associative, a new sentence such as Mary likes Susan can be derived by combining subtrees from this corpus, as in figure (2):

(2) [S [NP] [VP [V likes] [NP]]] ° [NP Mary] ° [NP Susan]
    = [S [NP Mary] [VP [V likes] [NP Susan]]]

Other derivations may yield the same parse tree; for instance:
(3) [S [NP] [VP [V] [NP]]] ° [NP Mary] ° [V likes] ° [NP Susan]
    = [S [NP Mary] [VP [V likes] [NP Susan]]]
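To make this combination operation concrete, here is a minimal sketch in Python; it is not part of the original paper, and the tuple encoding of trees and the helper names leftmost_open, substitute and compose are invented here purely for illustration.

    # A minimal sketch of DOP's node-substitution operation (°), assuming a simple
    # tuple encoding: a node is (label, child, ...), a lexical leaf is a plain string,
    # and an open substitution site is a 1-tuple such as ("NP",).

    def leftmost_open(tree):
        """Return the path (child indices) to the leftmost open substitution site,
        or None if the tree has no open nonterminal frontier node."""
        if isinstance(tree, str):            # lexical leaf
            return None
        if len(tree) == 1:                   # open nonterminal frontier node
            return ()
        for i, child in enumerate(tree[1:], start=1):
            path = leftmost_open(child)
            if path is not None:
                return (i,) + path
        return None

    def substitute(tree, sub):
        """Substitute subtree `sub` on the leftmost open substitution site of `tree`."""
        path = leftmost_open(tree)
        assert path is not None, "no open substitution site left"
        def rebuild(node, path):
            if not path:
                # the root label of `sub` must match the label of the substitution site
                assert node[0] == sub[0], "root label mismatch"
                return sub
            i = path[0]
            return node[:i] + (rebuild(node[i], path[1:]),) + node[i + 1:]
        return rebuild(tree, path)

    def compose(*subtrees):
        """Left-associative composition t1 ° t2 ° ... ° tn."""
        result = subtrees[0]
        for sub in subtrees[1:]:
            result = substitute(result, sub)
        return result

    # Derivation (2): [S [NP] [VP [V likes] [NP]]] ° [NP Mary] ° [NP Susan]
    t1 = ("S", ("NP",), ("VP", ("V", "likes"), ("NP",)))
    print(compose(t1, ("NP", "Mary"), ("NP", "Susan")))
    # -> ('S', ('NP', 'Mary'), ('VP', ('V', 'likes'), ('NP', 'Susan')))

Because compose applies ° from left to right, each subtree fills the leftmost open node that remains after the previous substitutions, which is exactly the left-associative convention assumed above.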
DOP computes the probability of a subtree t as the probability of selecting t among all corpus-subtrees that can be substituted on the same node as t. This probability is equal to the number of occurrences of t, | t |, divided by the total number of occurrences of all subtrees t' with the same root label as t. Let r(t) return the root label of t. Then we may write:
P(t) = | t | / Σ t': r(t')=r(t) | t' |

Since each node substitution is independent of previous substitutions, the probability of a derivation D = t1 ° ... ° tn is computed by the product of the probabilities of its subtrees ti:
P(t1 ° ... ° tn) = Πi P(ti)
The probability of a parse tree is the probability that it is generated by any of its derivations. The probability of a parse tree T is thus computed as the sum of the probabilities of its distinct derivations D:
P(T) = ΣD derives T P(D)
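As a worked illustration of these three definitions (subtree, derivation and parse-tree probability), the following Python fragment uses invented subtree identifiers, root labels and counts; none of the numbers come from the paper.

    # A toy illustration of the DOP probability definitions above.
    from math import prod

    # Hypothetical occurrence counts |t| for a handful of subtrees, plus r(t).
    count = {"tS1": 3, "tS2": 2, "tNP_Mary": 4, "tNP_Susan": 1, "tV_likes": 2}
    root  = {"tS1": "S", "tS2": "S", "tNP_Mary": "NP", "tNP_Susan": "NP", "tV_likes": "V"}

    def p_subtree(t):
        """P(t) = |t| / sum of |t'| over all subtrees t' with r(t') = r(t)."""
        total = sum(c for t2, c in count.items() if root[t2] == root[t])
        return count[t] / total

    def p_derivation(subtrees):
        """P(t1 ° ... ° tn) = product of the P(ti)."""
        return prod(p_subtree(t) for t in subtrees)

    def p_parse(derivations):
        """P(T) = sum of P(D) over the distinct derivations D that yield T."""
        return sum(p_derivation(d) for d in derivations)

    # Two derivations assumed to yield the same parse tree:
    print(p_parse([("tS1", "tNP_Mary", "tNP_Susan"),
                   ("tS2", "tNP_Mary", "tV_likes", "tNP_Susan")]))
    # 0.6*0.8*0.2 + 0.4*0.8*1.0*0.2 = 0.096 + 0.064 = 0.16

With these toy counts, the two derivations receive probabilities 0.096 and 0.064, so the parse tree they share receives probability 0.16.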
Usually, several different parse trees can be derived for a single sentence, and in that case their probabilities provide a preference ordering.

Bod (1992) demonstrated that DOP can be implemented using conventional context-free parsing techniques by converting subtrees into rewrite rules. However, the computation of the most probable parse of a sentence is NP-hard (Sima'an 1996). The parse probabilities can be estimated by Monte Carlo techniques (Bod 1995, 98), but efficient algorithms are only known for the most probable derivation of a sentence (the "Viterbi parse" -- see Bod 1995, Sima'an 1995) or for the "maximum constituents parse" of a sentence (Goodman 1996, 98; Bod 1996).

In van den Berg et al. (1994) and Bod et al. (1996), DOP was generalized to semantic interpretation by using corpora annotated with compositional semantics. In the current paper, we extend the DOP model to context-sensitive interpretation in spoken dialogue understanding, and we show how it can be used as an efficient and robust NLP component in a practical spoken dialogue system called OVIS. OVIS, Openbaar Vervoer Informatie Systeem ("Public Transport Information System"), is a Dutch spoken language information system which operates over ordinary telephone lines. The prototype system is the immediate goal of the NWO Priority Programme "Language and Speech Technology" (see Boves et al. 1996).
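To give a feel for the Monte Carlo estimation mentioned above, here is a rough sketch with an invented toy setup rather than the actual algorithm of Bod (1995, 98): it assumes that the derivations of the input sentence and their probabilities P(D) are already available (in practice they come from a derivation forest built by a conventional parser), samples derivations in proportion to P(D), and takes the most frequently sampled parse tree as an estimate of the most probable parse.

    # Rough sketch of Monte Carlo estimation of the most probable parse,
    # assuming the sentence's derivations and their P(D) values are given.
    import random
    from collections import Counter

    # (parse tree yielded, P(D)) for each derivation of the sentence; invented numbers.
    derivations = [("tree_A", 0.096), ("tree_A", 0.064), ("tree_B", 0.120)]

    def estimate_most_probable_parse(derivations, n_samples=10_000, seed=0):
        rng = random.Random(seed)
        trees = [tree for tree, _ in derivations]
        weights = [p for _, p in derivations]
        # sample derivations in proportion to P(D) and tally the trees they yield
        tally = Counter(rng.choices(trees, weights=weights, k=n_samples))
        return tally.most_common(1)[0][0]

    print(estimate_most_probable_parse(derivations))   # -> 'tree_A'

Note that with these invented numbers the single most probable derivation (probability 0.120) yields tree_B, whereas summing over derivations makes tree_A the most probable parse (0.096 + 0.064 = 0.16); this is precisely why the most probable derivation and the most probable parse of a sentence can differ.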