Corpus Linguistics (Huang Ting)
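A Brief History of Corpus Linguistics
The role of corpus linguistics and corpus methods can no longer be ignored today, but the field took a long and winding path to get here.
Taking the publication of Chomsky's Syntactic Structures in 1957 as the dividing line, corpus research before that date is widely referred to by linguists as "early corpus linguistics"; from the 1950s to the 1980s corpus linguistics was at a low ebb; and from the 1980s onward it entered its "revival".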
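1 Early corpus linguistics
Working from authentic language data is a fine tradition among lexicographers and grammarians.
As early as 1747, Samuel Johnson, the pioneer of English dictionary-making, published The Plan of a Dictionary of the English Language, which summarized the best of his predecessors' methods for collecting material. His dictionary contains more than 150,000 illustrative citations, which shows how substantial his collection already was.
The Oxford English Dictionary, completed in 1928, drew on more than 4 million citations and over 11 million slips; 350 volumes of Middle English manuscripts and texts were also specially published for reference during the compilation of the OED.
The second edition of Webster's New International Dictionary drew on more than 1 million citations, and by the time the third edition went to press in 1961, old and new citations together exceeded 10 million.
The English grammarian Jespersen used as many as 300,000 to 400,000 slips in compiling A Modern English Grammar (1909-1949) (Wang Jianxin, 1998: 52).
In the 1940s, the American linguist Boas was already using a corpus-based methodology in his study of American Indian languages, and later structuralist linguists relied on it even more.
The term "corpus linguistics" simply had not yet been coined.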
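The main research areas of early corpus linguistics included the following.
1.1 Language acquisition research
In the 1870s, linguists began to study child language acquisition systematically.
These studies were based on diaries in which parents promptly recorded their children's utterances.
Acquisition research based on primary language data has continued into the modern period (Ingram, 1978).
After the vogue for diary-based studies had passed (it is usually dated from 1876 to 1926), language acquisition research took two main forms: (1) studies of language development and maturation that used large numbers of children of different ages as informants; and (2) longitudinal studies that used a small number of children as informants and tracked and recorded their language over a long period (McEnery & Wilson, 2001: 3).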
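Corpora and Critical Discourse Analysis, Part 1
I. Introduction
In today's society, language is not only a tool of communication but also a reflection of social culture, ideology, and power relations, which makes the study of language especially important.
Corpora and critical discourse analysis are two important approaches to language research that give researchers the means to probe the social, cultural, and psychological layers behind language.
This article introduces the concepts and characteristics of each, describes their applications in language research, and discusses how the two interact.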
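II. Corpora: concept and characteristics
1. The concept of a corpus
A corpus is a large, structured collection of language data used for research in linguistics, language education, translation, and related fields.
By collecting, organizing, and analyzing large numbers of language instances, it gives researchers a rich data resource.
2. Characteristics of a corpus
(1) Scale: a corpus contains a large number of language instances and can reflect how language is actually used.
(2) Structure: the data in a corpus have been organized and annotated, which makes analysis and retrieval easier.
(3) Objectivity: corpus data are objective and help researchers avoid subjective conjecture and bias.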
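III. Critical discourse analysis: concept and characteristics
1. The concept of critical discourse analysis
Critical discourse analysis is a method of language analysis set against the background of society, culture, and ideology; it aims to expose the power relations, ideology, and social inequality behind language.
By analyzing texts, discourse, and communicative processes, it brings out the social, cultural, and psychological meanings of language use.
2. Characteristics of critical discourse analysis
(1) Social orientation: it attends to the relationship between language and society, culture, and ideology.
(2) Critical stance: it focuses on exposing the power relations and ideology behind language.
(3) Comprehensiveness: it has to take text, context, participants, and other factors into account together.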
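IV. Applications of corpora and critical discourse analysis in language research
1. Applications of corpora in language research
(1) Language description and comparison: with a corpus, researchers can gather large numbers of language instances, describe and compare different languages, and reveal their features and regularities.
(2) Language teaching and translation: corpora provide rich data for language teaching and translation and help improve teaching outcomes and translation quality.
(3) Sociolinguistic research: corpora reflect how language is used in society and provide data support for sociolinguistic research.
2. Applications of critical discourse analysis in language research
(1) Exposing social inequality: by analyzing texts and communicative processes, critical discourse analysis exposes the power relations and social inequality behind language.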
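A Review of Corpus Research in China
Abstract
This article reviews the development and current state of corpus research in China and discusses directions for future research.
Drawing on the relevant literature, it summarizes the main achievements and shortcomings of corpus research in China and offers targeted suggestions.
It is intended as a reference for scholars in the field and as a contribution to the further development of corpus research in China.
Keywords: corpus; research in China; development; current state; future research directions
Introduction
A corpus is a collection of language material of a certain size, gathered for language research and representative of the language concerned.
Since the mid-20th century, corpora have been widely used abroad, with notable results in many fields.
In recent years, with the rapid development of linguistics, computational linguistics, and related disciplines in China, corpus research and applications there have also attracted growing attention.
This article focuses on the current state, achievements, and future directions of corpus research in China.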
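Current state of research
1. Development and current state of corpora in China
From the 1980s, small-scale corpus construction and research began to appear in China, for example the State Language Commission's general-purpose Modern Chinese word bank.
As computer technology advanced, corpus construction and research were pursued vigorously from the mid-to-late 1990s onward, across an increasingly wide range of fields.
Today a series of corpora of different sizes and types has been built, such as the Communication University of China's corpus of Chinese broadcast media language and Shanghai Jiao Tong University's Chinese text classification corpus.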
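2. Achievements and shortcomings of corpus-based research in different fields
Corpora are widely used in many fields, including language teaching, lexicography, and language policy research.
In language teaching, corpora provide authentic language material and context, which helps raise learners' interest and comprehension.
In lexicography, they provide large numbers of examples and usages, which helps improve the accuracy and usefulness of dictionaries.
In language policy research, they show how language is actually used and how it is changing, which helps in drawing up sound language policies and development plans.
However, corpus-based research in China is still not rich enough across these fields, and some areas remain unexplored.
For example, the construction and study of domain-specific corpora is still not deep enough, and corpora in some fields are still small and insufficiently representative.
Research using corpora in areas such as second language acquisition and language evolution is also still inadequate.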
3. Applications of corpora in language teaching in China
The use of corpora in language teaching has won wide recognition.
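Corpora and Critical Discourse Analysis
Introduction
Corpus analysis and critical discourse analysis are research methods that currently attract wide attention in linguistics; corpus analysis in particular offers a way to analyze and study language through large-scale real language data.
This article introduces the basic concepts of corpora and critical discourse analysis and discusses the relationship between the two fields and their application in contemporary linguistic research.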
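I. Overview of corpora
A corpus is a collected and organized body of large-scale, naturally occurring language data.
It can be regarded as a real sample of a language, and analyzing it can reveal regularities and patterns of language use.
Building and using corpora draws on linguistics, computer science, statistics, and other fields, so the work is strongly interdisciplinary and applied.
Corpora can be divided into specialized corpora and general corpora.
Specialized corpora focus on the language of a particular domain, such as legal or medical corpora, and can be used to analyze how language is used in that domain.
General corpora cover language from a wide range of domains, such as comprehensive corpora for linguistic research and multilingual contrastive corpora, and can be used to study what is universal and what varies across languages.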
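II. Overview of critical discourse analysis
Critical discourse analysis is a method for studying social and cultural issues that foregrounds the power relations, ideology, and social structures present in discourse.
It typically proceeds through close analysis of discourse in social practice in order to reveal the underlying causal relations of social and cultural phenomena.
Critical discourse analysis attends to the power relations behind discourse and to its implicit ideology.
It looks at the modes of expression used, at who exercises discursive power, and at who consumes the discourse.
Its goal is to understand and explain the social relations and social problems a discourse involves by uncovering its complexity and implicit meanings.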
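III. The relationship between corpora and critical discourse analysis
Corpus research and critical discourse analysis both take real language data as their object of study, but their emphases differ.
Corpus research is mainly concerned with linguistic phenomena and regularities: through statistical and quantitative analysis of large-scale corpus data, it reveals patterns of frequency, distribution, and variation in language use.
It offers a data-based method of describing and analyzing language that makes linguistic research more objective and scientific.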
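The kind of frequency and distribution statistics mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tokenizer is a deliberately simple regular expression for English, the three-sentence corpus is invented for the example, and the function names are hypothetical rather than taken from any corpus tool.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z']+", text.lower())


def frequency_profile(texts):
    """Return {word: (raw_count, per_million, text_range)} for a list of text strings.

    per_million normalizes the raw count by corpus size, so corpora of different
    sizes can be compared; text_range is the number of texts containing the word,
    a simple measure of how widely it is distributed.
    """
    counts = Counter()       # total occurrences of each word
    doc_counts = Counter()   # number of texts in which each word occurs
    total_tokens = 0
    for text in texts:
        tokens = tokenize(text)
        total_tokens += len(tokens)
        counts.update(tokens)
        doc_counts.update(set(tokens))
    return {
        word: (count, count * 1_000_000 / total_tokens, doc_counts[word])
        for word, count in counts.items()
    }


# Hypothetical three-text mini corpus, for illustration only.
corpus = [
    "The court held that the contract was void.",
    "The patient was given the drug twice a day.",
    "She said the contract was fair.",
]
profile = frequency_profile(corpus)
raw, per_million, text_range = profile["contract"]
print(raw, round(per_million), text_range)
```

On a corpus this small the per-million figure is not meaningful in itself; the point is that the same normalization lets counts from a specialized corpus and a general corpus be set side by side.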
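Critical discourse analysis is mainly concerned with the power relations and ideology implicit in discourse.
By analyzing the semantics, pragmatics, and context of discourse, it reveals who exercises discursive power, what a discourse is intended to do, and what influence and effects it has.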
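A Brief Analysis of How Corpus Linguistics Benefits Senior High School English Learners
Author: Lin Ling
Source: Campus English (《校园英语·月末》), 2018, No. 11
[Abstract] Drawing on the theory and specific methods of corpus linguistics, educators have identified a number of problems through classroom practice.
Some of these problems arise in teachers' and students' English teaching and learning activities, and others concern the teacher-student relationship.
Corpus linguistics offers strong guidance for senior high school English teaching and is an effective tool for learning English.
From this perspective, the article analyzes how corpus linguistics benefits senior high school English learners, in the hope of contributing to students' development.
[Keywords] corpus; senior high school English; learners
[About the author] Lin Ling, Teacher Development Center, Qidong, Jiangsu Province.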
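I. Corpus linguistics and the classification of English corpora
Corpus linguistics is the discipline directly concerned with corpora, which hold large amounts of language data.
An English corpus is a body of language material for English.
Depending on how the material has been processed, English corpora fall into two types: tagged and untagged.
By source, they can be divided into native-speaker corpora and learner corpora, and further subdivisions are possible.
The distinction between learner corpora and native-speaker corpora is the most important one.
Learners care most about the accuracy of the material, that is, the standard of language they are learning from.
Such a corpus brings many conveniences to those who need it, above all a wealth of authentic, vivid examples.
Teachers, by contrast, are likely to care more about the accuracy of language description and to treat the corpus as a whole.
Whether a given usage is correct is something learners need to test by trying it out, forming hypotheses, and verifying them.
Teachers can describe features of learners' language, but they cannot make judgments solely on the basis of how much use is made of the corpus.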
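II. Problems facing senior high school English learning
Senior high school English, as the highest level of basic education, aims to produce people who can actually use English.
Under the pressure of entrance examinations, however, students do less and less independent study, and much classroom activity has turned into grammar study.
The senior high school English classroom is left with only listening, speaking, reading, and writing drills and has lost its original appeal.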
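The Mediating Role of Corpora Between Cognitive Linguistics and Second Language Teaching Methodology
[Abstract] This article examines the mediating role of corpora in cognitive linguistics and second language teaching methodology.
It first introduces the concept and significance of corpora, the basic theory of cognitive linguistics, and the development of second language teaching methodology.
It then describes in detail how corpora are applied in cognitive linguistics and second language teaching methodology, including how they promote second language learning and help refine teaching methods.
Finally, it discusses the importance of corpora in both fields and the prospects for corpora in second language teaching, and closes with a summary and outlook.
Through in-depth study and theoretical analysis, the article shows the key role that corpora play in cognitive linguistics and second language teaching methodology and offers a reference for future research and practice.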
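[Keywords] corpus; cognitive linguistics; second language teaching methodology; mediating role; application; facilitation; optimization; importance; prospects; summary and outlook
1. Introduction
1.1 The concept and significance of corpora
A corpus is a database in which large amounts of real language material are collected, organized, and stored; it includes language data in the form of written texts, spoken material, audio, and video.
Its significance lies in giving linguistic research a concrete object of study and giving researchers a large number of authentic contexts of language use.
By analyzing the language data in a corpus, researchers can uncover regularities and features of language and gain a deeper understanding of its structure and function.
Building and using corpora has advanced linguistic theory and has also given important support to applied areas such as language teaching and translation.
In cognitive linguistics, corpora are used to test and enrich theoretical models and to explore language learning and cognitive processes through the analysis of large-scale data.
Language instances from a corpus help researchers test the hypotheses of cognitive theories and deepen their understanding of the mechanisms of language cognition.
Corpora can also be used to study the pragmatic functions of language, showing the role and features of language in communication.
Through building and analyzing corpora, cognitive linguistics can better explain and predict linguistic phenomena, which moves the field forward.
Corpora play an important part in many areas of linguistic research: they supply researchers with rich language data and support deeper understanding and investigation of linguistic phenomena.
In cognitive linguistics and second language teaching methodology, the use of corpora will further advance both theory and practice and offer more scientific and effective support for language learning and teaching.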
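1.2 The basic theory of cognitive linguistics
Cognitive linguistics is the discipline that studies the cognitive processes of human language; its basic theory covers language acquisition, language processing, and language representation, among other areas.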
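Reflections on Corpora in Foreign Language Teaching
1 Introduction
Corpus linguistics emerged in the mid-to-late 20th century. It is a branch of language science that works from large amounts of authentic language data and studies the regularities and patterns of language use, starting from the frequency distribution of linguistic information.
Corpus linguistics has continually renewed views of language and frameworks for describing it; it marks a major methodological breakthrough in language research and has become a mainstream branch of linguistics.
With the rapid development of computer technology, electronic corpora, with their large volume of linguistic information and efficient retrieval, have had a strong influence on language research, and corpora have gradually become an important resource for both theoretical and applied linguistics.
In the 1960s and 1970s, G. Leech and T. Johns pointed out that the application of corpora to language teaching is an important branch of corpus linguistics, because the two interpenetrate and form an integrated whole.
The application of corpora in language teaching can be divided into direct and indirect applications.
Direct application means teaching corpus knowledge and methods directly and using corpus resources in language teaching; indirect application means compiling textbooks and reference books and developing multimedia courseware on the basis of corpus resources.
As a product of information technology, corpus linguistics has had a considerable impact on the research and practice of foreign language teaching.
The regularities of language acquisition tell us that memorizing knowledge about a language does not guarantee that the language will be used correctly.
Real acquisition comes only through large amounts of authentic, natural language input and output.
Under the foreign language teaching model guided by traditional linguistics, learners could acquire the language only by memorizing rules and working through textbook examples.
Applied to foreign language teaching, corpus linguistics supplies both high-frequency linguistic information and an authentic, natural language environment, and so helps learners take the initiative and learn autonomously.
With the help of computers, then, foreign language teachers who analyze, count, and apply the large volume of language material that corpora provide can open a new chapter in classroom teaching.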
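2 The influence of corpus linguistics on foreign language teaching resources
Researchers in foreign language teaching have long debated "what to teach" and "how to teach".
Traditional foreign language teaching stressed the completeness of the grammatical system but often neglected how grammatical rules are applied in specific contexts.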
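A Review of the Past Decade of Corpus Linguistics Research in China
I. Overview
In recent years, with the rapid development of information technology and the arrival of the big data era, corpus linguistics has gained increasing influence in Chinese linguistics.
This article surveys the development of corpus linguistics research in China over the past decade, summarizes its results, analyzes its problems, and looks ahead to future trends.
The review covers corpus construction, corpus linguistic theory, and the application of corpora in language teaching and research, with the aim of giving researchers in China a clear overview of the field and a frame of reference.
By systematically reviewing the past decade of work, the article also hopes to encourage the further development of corpus linguistics in China and to offer new perspectives and methods for innovation in linguistic research.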
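II. Corpus construction and research
Over the past decade, corpus linguistics in China has made notable progress in corpus construction.
Corpora are the basic resource of linguistic research, and their size and quality directly affect the depth and breadth of that research.
During this period, Chinese scholars and institutions invested heavily in corpus building, expanding the range and number of corpora and improving the quality of the data and the precision of annotation.
In terms of corpus types, alongside traditional general corpora there are now corpora devoted to a particular domain or register, such as legal, medical, and social media corpora.
These specialized corpora give research in the relevant fields rich data support.
In terms of size, corpora have kept growing with the development of big data technology.
Large corpora such as the State Language Commission Modern Chinese Corpus and the Ancient Chinese Corpus provide massive resources for linguistic research.
In terms of data quality, corpus construction in China emphasizes authenticity and representativeness.
Rigorous collection and screening procedures are used to ensure the quality and accuracy of the data.
In addition, automatic and semi-automatic methods of preprocessing and annotation have improved processing efficiency and quality.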
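In corpus research, Chinese scholars have made full use of corpus resources for linguistic studies of many kinds.
Statistical analysis of corpora is used to reveal the nature and regularities of linguistic phenomena.
Corpora are also used for contrastive studies, studies of language change, and language teaching research, all of which have deepened linguistic research.
In sum, corpus linguistics in China has achieved a great deal in corpus construction and research over the past decade.
Corpora have grown steadily in size and quality and now provide strong data support for linguistic research.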
Language Learning ISSN 0023-8333
English L1 and L2 Speakers' Knowledge of Lexical Bundles
Tatiana M. Nekrasova
Northern Arizona University
The purpose of the present study is to contribute to the ongoing debate about the use of lexical bundles by first (L1) and second language (L2) speakers of English. The study consists of two experiments that examined whether L1 and L2 English speakers displayed any knowledge of lexical bundles as holistic units and whether their knowledge was affected by the discourse function of the lexical bundles (discourse-organizing or referential). The participants in Experiment 1 (N = 61) completed a gap-filling activity, whereas the participants in Experiment 2 (N = 61) carried out a dictation task. Results showed that the participants' knowledge differed for specific lexical bundles and that, overall, they knew more discourse-organizing bundles than referential bundles. The implications of the study are discussed in terms of current research about the role of frequency-based language chunks in L1 and L2 speech processing in English.
Keywords discourse function; formulaic sequences; frequency-based chunks; language processing; lexical bundles; psycholinguistics
Preliminary results were reported at AAAL 2007 in Costa Mesa, CA and AAAL 2008 in Washington, DC. I am grateful to Kim McDonough for her insightful comments on earlier versions of this article. I also thank Doug Biber, Viviana Cortes, Bethany Gray, and the editor and four anonymous reviewers of Language Learning for their valuable input, Valeria Kashpur for assistance with data collection, and Tony Becker for his assistance with data coding and consistent support of this project through comments and discussion. Any errors are, of course, my own.
Correspondence concerning this article should be addressed to Tatiana M. Nekrasova, Department of English, Northern Arizona University, P.O. Box 6032, Flagstaff, AZ 86011-6032. Internet: tnn8@
Since the late 1970s, linguists have established the importance of formulaic sequences for language processing and production (Hakuta, 1974; Nattinger & DeCarrico, 1992; Peters, 1983; Wong Fillmore, 1976; Wray, 2002). Typically defined as frequent multiword combinations that are stored and retrieved holistically from the mental lexicon at the moment of speech, formulaic sequences have been argued to minimize encoding work for both the speaker and addressee, thus allowing for the construction of fluent spoken discourse (Erman, 2007; Pawley & Syder, 1983; Raupach, 1984; Wood, 2006). In addition, proper use of formulaic sequences has been found to be critical for the acquisition of nativelike language competence (Dufon, 1995; House, 1996).
Formulaic sequences, as a broad category, include many different subclasses: proverbs, lexicalized stems, clichés, speech formulae, idioms, recurring utterances, and others. Wray (2002) provided a list of terms that are used to describe aspects of formulaic language in terms of their place on a continuum from being completely fixed (e.g., idioms and set expressions) to more compositional (e.g., semi-preconstructed phrases, sentence builders, patterns). Although formulaic language has been the focus of linguistic inquiry for several decades, only a few relatively fixed subclasses of formulaic sequences have been targeted in traditional linguistic studies conducted in phraseology and pragmatics. As a result, more compositional subclasses of formulaic sequences that differed from idioms and set expressions in their structural and functional characteristics were largely ignored in linguistic research until the development of corpus-based methods to data analysis. Ever since corpus-driven research revealed that, in addition to idioms and set phrases, a much greater number of language constructions have a tendency to occur together, attempts have been made to formally describe and classify these units in order to examine their formulaic nature and identify their importance for language production and language acquisition. These co-occurring constructions and, more specifically, the question of their psychological reality are the focus of the present study.
Before turning the current discussion to the topic at hand, the following sections provide a brief overview of research conducted on formulaic sequences in
phraseology, pragmatics, and corpus linguistics, thus situating the present study within a broader scope of formulaic language.
Previous Research on Formulaic Sequences
Phraseology
Early research on formulaic language has directed a considerable amount of attention to idioms, which have been traditionally defined as chunks of frozen syntax that are not constructed from the generative grammar rules and are retrieved holistically at the moment of use (e.g., raining cats and dogs, kick the bucket, spill the beans). As nontransparent constructions, idioms were commonly viewed as archetypical formulaic sequences; and as some researchers argued, it was the nontransparency of the idioms that defined their formulaic status (Hudson, 1998; Nattinger & DeCarrico, 1992; Williams, 1994). At the same time, other scholars advocated for a broader definition of an idiom to include partly analyzable constructions together with nontransparent ones (Cowie, 1988; Wray, 2002). Specifically, Cowie (1988) suggested that instead of separating nontransparent idioms from other (partially) transparent expressions, there should be a continuum of idiomatic expressions that would incorporate "very many semantically evolved composites which are still partially analysable" (p. 135). This alternative account of idiomatic expressions expanded the category of formulaic sequences by going beyond traditional idioms and recognizing other types of constructions, more transparent in nature, as formulaic.
Pragmatics
Another subclass of formulaic sequences, speech formulas (or routine formulas), has been largely explored in pragmatic research within the framework of speech acts through the work of the linguists who examined the language of routine social encounters (Coulmas, 1979, 1981; Ferguson, 1976; House, 1996). Speech formulas were identified as set expressions that are tied to particular predictable situations and are used to realize such speech functions as thanking, apologizing, and others (e.g., thank you very much, I am very sorry). Although semantically more transparent compared to traditional idioms, speech formulas acquired their formulaic status from their ability to meet certain functional demands that, subsequently, led to their high predictability and frequency of occurrence in certain types of social situations. At the same time, speech formulas were described to be similar to idioms in terms of their form: both subclasses of formulaic sequences are considered to be relatively fixed, with certain types of speech formulas, however, being defined as more compositional than others (e.g., Nattinger & DeCarrico, 1992; Van Lancker-Sidtis & Rallon, 2004).
Corpus Linguistics
The development of corpus-based techniques introduced a new way to explore a language. Whereas previous language research relied exclusively on native-speaker intuition when describing language units, corpus linguistics brought in a more objective frequency-based approach to not only offer new insights about existing language regularities but also to reveal previously unobserved phenomena (e.g., Conrad, 2000; Lindemann & Mauranen, 2001). Thus, when analyzing a range of oral and written corpora, it became obvious that, in addition to already established classes of formulaic sequences (sayings, proverbs, speech formulae, and idioms), a large number of language units were found to co-occur in preferred order without being governed by specific grammar rules (Altenberg, 1998; Biber & Conrad, 1999; Granger, 1998; Moon, 1998; Sinclair,
1999; Wray, 2002). Observations from corpus-driven research motivated a growing number of studies that explored different structural types of recurrent multiword chunks in various kinds of oral and written corpora: recurrent word combinations (Altenberg, 1998), prefabricated patterns (Granger, 1998), phrasal lexemes (Moon, 1998), highly recurrent word combinations (De Cock, 2000), and lexical bundles (Biber & Conrad, 1999; Biber, Johansson, Leech, Conrad, & Finegan, 1999), all of which are identified in a language on the basis of their frequency of co-occurrence. These structural and semantic distinctions of recurrent multiword constructions from other subclasses of formulaic sequences thus led to questions about the formulaic status of recurrent combinations. However, whereas much research has investigated the psychological reality of traditional idioms and set expressions (see Gibbs & Gonzales, 1985; Gibbs, Nayak, & Cutting, 1989; Swinney & Cutler, 1979; Van Lancker & Kempler, 1987), little has been done to examine the psychological status of recurrent multiword chunks (but see Schmitt, Grandage, & Adolphs, 2004). The present study examines the issue of psychological validity of a particular type of recurrent chunks: lexical bundles. The following sections briefly outline the research on lexical bundles conducted to date and provide the rationale for the present study.
Identification and Characteristic Features of Lexical Bundles
First introduced by Biber and colleagues, lexical bundles are defined as the most frequently occurring sequences "of three or more words that show a statistical tendency to co-occur" (Biber & Conrad, 1999, p. 183). Lexical bundles were initially identified in two major registers of The Longman Grammar of Spoken and Written English (Biber et al., 1999), conversation and academic prose, as units that occurred at least 10 times per million words. One characteristic feature that distinguishes lexical bundles from other types of recurrent multiword chunks is that they are often structurally incomplete
units that occur at the phrase and clause boundaries (e.g., in the case of, the point of view of, I don't know if). Structurally, these units can consist of incomplete nominal chunks (i.e., prepositional or noun phrase: the nature of the, as a result of) or clausal chunks (i.e., verb phrase and the beginning of a complement clause: I don't know how, I thought that was). Furthermore, shorter lexical bundles can often be incorporated within longer lexical bundles, sometimes more than one (e.g., I don't think as a part of well I don't think, I don't think so, but I don't think). In addition, lexical bundles can be classified in terms of their discourse functions, which are described in the next section.
According to Biber, Conrad, and Cortes (2004), the four primary functions of lexical bundles identified in English academic registers and conversation include (a) stance bundles that convey interpersonal meanings, such as attitudes and assessments (e.g., it is important to, I don't think so, I want you to); (b) discourse organizers that help reveal relationships between prior and coming discourse, such as topic introduction and topic elaboration (e.g., nothing to do with, on the other hand, as well as the); (c) referential bundles that perform an ideational function and are used to make direct reference to physical or abstract entities, such as time, place, and text references (e.g., is one of the, in the form of, as a result of, the nature of the); and (d) special conversational bundles that are mostly used in conversation to express politeness, inquiry, and report (e.g., thank you very much, what are you doing, I said to him/her). These four discourse functions of lexical bundles should be distinguished from pragmatic functions that other multiword constructions, such as speech formulas, can have (Coulmas, 1979, 1981). Pragmatic functions are usually associated with highly conventionalized expressions that are more salient and, thus, are used to effectively communicate certain pragmatic meanings, such as expressing requests,
apologies, or gratitude. Unlike speech formulas that are more interactional in nature and are highly dependent on conversational context, the majority of lexical bundles operate on a textual level and are relatively "context-free." For example, whereas the occurrence of a speech formula Nice to see you is closely bound to a social situation of greeting a person, the occurrence of a lexical bundle nothing to do with is not associated with any specific situation and can be equally frequent in a variety of contexts. In this regard, lexical bundles that serve special conversational functions in discourse are more likely to have pragmatic functions.
L1 and L2 Research on Lexical Bundles
First language (L1) research on lexical bundles has mostly focused on the identification of these units and a description of their patterns of occurrence in different English L1 registers, including both written academic prose and conversation (Biber & Conrad, 1999; Biber et al., 1999; Biber, Conrad, & Cortes, 2004). More recent L1 research on lexical bundles has focused on identifying the discourse functions that lexical bundles serve in different texts (Biber, Conrad, & Cortes; Cortes, 2004), as discussed in the previous section. In addition, a number of corpus studies have adopted a contrastive approach to the analysis of the use of corpus-derived constructions (including lexical bundles as well as other recurrent multiword chunks) by comparing L1 and L2 written and oral corpora (De Cock, 2000; De Cock, Granger, Leech, & McEnery, 1998; Granger, 1998; Warga, 2005). These studies indicated that L1 and L2 (second language) speakers' use of recurrent phrases was different both quantitatively and qualitatively (De Cock, 2000; De Cock et al.). More specifically, L2 speakers were found to be unaware of the more common, yet less salient L2 chunks, and in order to compensate for their lack of awareness, they often referred to L1 transfer. The process of L1 transfer was realized in several ways. First, L2 speakers were found to either modify or avoid using
certain L2 constructions that did not have L1 equivalents. Second, L2 speakers tended to overuse those L2 constructions whose L1 equivalents were more common. Finally, L2 speakers showed the misuse of those constructions whose L2 equivalents did not match their L1 counterparts. As De Cock (2000) argued, turning to L1 transfer during L2 production could potentially lead to the "foreign-soundness" of L2 speakers' speech and writing.
Because lexical bundles are defined as combinations that occur frequently in a text or a collection of texts, it is logical to assume that the frequency counts serve as an indication of these units being conventionalized by the speech community, which would suggest their formulaic nature. At the same time, some corpus linguists argue that simple frequency counts do not provide enough grounds to view any corpus-derived construction as formulaic (De Cock, 2000; De Cock et al., 1998). One of the reasons for skepticism is that frequency information may not be relevant to how language structure is represented in one's mind. For example, a combination of the two words "it" and "is" is extremely frequent in the English language, mostly because the individual words included in this combination are closed-class items that occur very frequently in any corpus. Thus, it is very unlikely that this combination is represented in the mind as a holistic unit and can be defined as formulaic. Another reason to question the assumptions about the formulaic nature of lexical bundles comes from their structural and functional differences from other established classes of formulaic sequences. In order to contribute to the existing body of research on lexical bundles and define their place within a broader category of multiword constructions, the present study explores the issue of psycholinguistic validity of lexical bundles.
Research on Psychological Reality of Lexical Bundles
The extent to which lexical bundles (or other recurrent lexical sequences) could be considered to be psycholinguistically real has not
been fully examined, as corpus linguists have become interested in this topic only in the last decade and the majority of lexical bundle studies generally describe the distribution of these units in different registers (see Biber & Conrad, 1999; Biber, Conrad, & Cortes, 2004; Biber et al., 1999; Cortes, 2004).
In the only study to date that has examined the issue of psycholinguistic validity of lexical bundles, Schmitt, Grandage, et al. (2004) questioned whether corpus-derived recurrent clusters (i.e., lexical bundles) are psycholinguistically valid and, therefore, stored and processed holistically. After identifying 25 sequences from previous publications, they created a text about a hitchhiker and embedded the target sequences in it. Both English L1 (n = 34) and L2 (n = 45) participants performed a dictation task during which they listened to the recorded text and orally reconstructed it sentence by sentence. The authors argued that the bundles the participants were able to reproduce could be considered formulaic and were holistically stored in mind. The findings of the study suggested that not all corpus-driven clusters were psycholinguistically valid according to their criteria, with many of them being used idiosyncratically by the individual speakers. The researchers concluded that both corpus and psycholinguistic approaches should be used when deciding whether corpus-driven clusters share the same psycholinguistic characteristics as holistically stored formulaic chunks. By employing the sequences that varied in length, frequency, and transparency of meaning, the study did not provide conclusive evidence to either bridge the two categories (i.e., lexical bundles and formulaic chunks) or distinguish them.
Schmitt, Grandage, et al.'s (2004) study is innovative in that it put a commonly accepted assumption to empirical testing. At the same time, this study displayed several limitations that need to be addressed here. First, not all target sequences employed in the study could qualify as recurrent bundles: Some of them
were much more frequent in the corpus than others (e.g., you know vs. to make a long story short). Second, whereas some of the bundles could be described as more salient in terms of the pragmatic functions they realized in certain language situations (e.g., I don't know what to do, go away, to make a long story short, it's not too bad), other bundles did not have any pragmatic function and served more like cohesive devices in a text (e.g., as shown in figure, is one of the most, what I want to, etc.). Finally, some of the bundles examined in the study were clearly extracted from academic register, whereas other bundles came from and were characteristic of a more informal register (i.e., conversation). Both types of bundles were then embedded in a story about a hitchhiker, a narrative that had a rather informal tone, which, as the authors acknowledged, might have created some difficulties for the academic register bundles to be equally produced by the participants. Thus, the choices the authors made during the initial selection of the target structures and the context in which they were embedded might have contributed to the inconclusive results.
Present Study
The present study was designed to contribute to the debate concerning the psycholinguistic validity of lexical bundles by addressing some of the limitations of the previous research conducted in this area. First, all target structures employed in the study were lexical bundles; thus, they were identified strictly on the basis of frequency counts. Second, all lexical bundles were homogeneous in terms of their functional characteristics: They all performed discourse functions signaling the relationships between different elements (i.e., phrases, clauses, sentences) in a text. None of the bundles had an advantage of expressing a pragmatic function by carrying out the meaning related to a certain conversational context (e.g., See you later in a situation of saying "good-bye" to someone).
Furthermore, the findings of previous L1 corpus-based studies indicated that discourse function of lexical bundles related to the frequency of their use by the participants (Cortes, 2004, 2006). Therefore, the present study also investigates the possible effect of two discourse functions of lexical bundles (referential bundles and discourse organizers) that may affect their production by L1 and L2 English speakers. Finally, an attempt was made to ensure that all contexts in which target lexical bundles were embedded were register-appropriate, that is, both the target lexical bundles and the contexts belonged to the same registers: university teaching and textbooks.
The main purpose of the study was to examine if lexical bundles are recognized by L1 and L2 participants as holistic units and, therefore, have psychological validity. Following Schmitt, Grandage, et al.'s (2004) study, it was assumed that no direct nonlaboratory measure was available to determine whether L1 and L2 participants recognize lexical bundles as holistic units. For that reason, participants' recognition of lexical bundles as holistic units was operationalized as (a) their ability to produce them as fixed units in both short and extended pieces of discourse, (b) their ability to produce lexical bundles in a contextually appropriate manner, and (c) participants' use of lexical bundles to ease the processing burden during text comprehension and subsequent production (see Wray, 2000, 2002; Wray & Perkins, 2000). The study consists of two experiments that employed different measures to assess whether L1 and L2 English speakers have knowledge of lexical bundles as holistic units. Whereas Experiment 1 involved a controlled-production activity (a gap-filling task), Experiment 2 employed an extended production activity (a timed dictation task). Both experiments addressed the same research question: Do English L1 and L2 speakers differ in their knowledge of lexical bundles that serve different discourse functions? Because previous research has indicated
that native and nonnative speakers use lexical bundles differently (De Cock, 2000; Granger, 1998; Warga, 2005), it was predicted that L1 speakers would display greater knowledge of lexical bundles than L2 learners. In addition, because previous corpus-based studies have illustrated that lexical bundles performing certain discourse functions were used more frequently by the participants than lexical bundles performing other functions (Cortes, 2004, 2006), it was predicted that the participants' production of lexical bundles in the present study would be affected by the discourse function performed by these units in context.
Experiment 1
Method
Participants
The participants were L1 English speakers (n = 20), advanced L2 English speakers (n = 18), and intermediate L2 English speakers (n = 23), all of whom were undergraduate and graduate students at a regional university in the western United States. None of the participants were majoring in applied linguistics or TESL. The L1 speakers consisted of 4 males and 16 females, aged between 18 and 45 years (M = 24.3, SD = 7.88). The advanced L2 speakers included 8 males and 10 females, aged between 20 and 43 years (M = 28.44, SD = 8.32), who had completed between 3 and 16 years (M = 10.56, SD = 3.71) of formal high school/college education in English and reported the length of residence in the United States ranging from 1 to 127 months (M = 20.17, SD = 30.56). The intermediate L2 speakers included 12 males and 11 females, aged between 17 and 37 years (M = 20.7, SD = 3.85), who had completed between 4 and 11 years (M = 7.17, SD = 1.99) of formal English instruction, and their length of residence in the United States was reported to be between 2 and 48 months (M = 6.39, SD = 9.65). The advanced and intermediate groups were established on the basis of the participants' enrollment status. Whereas the advanced L2 speakers were degree-seeking undergraduate or graduate students, the intermediate L2 speakers were enrolled in an Intensive English Program. The participants volunteered to take part in the study and were not compensated.
Materials
Gap-filling
task. Following Schmitt, Grandage, et al. (2004), it was assumed that no direct measures were available to assess participants' knowledge of lexical bundles as holistic units. Thus, if the participants could, based on the context, reproduce the missing parts of lexical bundles correctly, it would suggest that they had the knowledge of these units as holistic entities. The gap-filling task was designed to measure whether participants were able to recognize and produce the missing parts of the target lexical bundles based on the surrounding context. The test materials consisted of 32 sentences with embedded lexical bundles that performed two different discourse functions: discourse-organizing bundles (n = 15) and referential bundles (n = 17). The two sets of lexical bundles were matched for frequency. Only these two functional types of lexical bundles were selected to be used in the study because, compared to the stance bundles, they contained less repetition of the same word sequence in the form and, thus, were less synonymous (e.g., stance bundles: I don't know if, I don't know what, I don't know how, etc.). The vast use of synonymous structures in the task could potentially make it difficult for the participants to display their knowledge of a specific lexical bundle in a context that allows multiple alternatives. Finally, special conversational bundles were not targeted in the materials because the Biber, Conrad, and Cortes (2004) original list included only three of these bundles.
The gap-filling task was designed in several steps. First, 32 lexical bundles and their functions (see Appendix A) were identified from Biber, Conrad, and Cortes's (2004) corpus-based study of university discourse. The study was based on an analysis of texts from university registers in the TOEFL 2000 Spoken and Written Academic Language Corpus (Biber, Conrad, Reppen, et al., 2004) and focused on four-word bundles that occurred 40 or more times per million words. Because one's language experience is shaped by information coming from
different types of registers (i.e., academic and everyday language), an attempt was made to match the bundles in both functional categories in terms of their frequency distribution in academic prose as well as conversation. Otherwise, some bundles that were found to be frequent in both academic prose and conversation could potentially be more salient (i.e., more recognizable) than those bundles that were frequent only in academic prose. Thus, 40% of discourse-organizing bundles were frequent in both the academic prose and conversation, whereas 60% of the bundles were frequent only in academic prose. For the referential bundles, the frequency distribution was 41% for the academic prose and conversation and 59% for the academic prose. Next, using the academic subcorpus of The Longman Grammar of Spoken and Written English Corpus (Biber et al., 1999), each lexical bundle was embedded within an attested context in which the function of the bundle (i.e., discourse-organizing or referential) would remain the same as initially identified. Then one content word within a bundle was deleted with space provided to be filled in by the participants. Finally, all test items were randomly ordered and presented as a list of sentences (see Appendix B).

The decision as to which word within a lexical bundle to delete was motivated by two criteria. First, lexical bundles are typically described as incomplete structural units (Biber et al., 1999), so they usually include a limited set of function words, such as articles, particles, and prepositions, that are often used to construct the frame of a lexical bundle (e.g., to __ with the, in the __ of, the __ of the, etc.). Because one frame could be employed in several different lexical bundles, it would be easier for the participants to produce the missing elements of the frame; this, however, would not necessarily illustrate their knowledge of a specific bundle. Thus, the decision was made to delete a content word that is used uniquely in each bundle to explore if the participants could
produce each individual bundle rather than the frame (e.g., the bundles the rest of the and the top of the are created from the same frame the __ of the). Second, each frame could potentially be filled with a number of different content words (e.g., in the absence of, in the form of, in the case of). Therefore, selecting an appropriate content word associated with a particular frame in a certain context would provide more support for the idea that participants recognize a certain lexical bundle as a unit.

The materials were pilot-tested with 10 L1 and 10 L2 speakers of English. Based on the pilot test, a few sentences were judged to be too difficult for intermediate learners and the contexts for the target bundles were replaced. The replaced contexts were also selected from the corpus. A split-half reliability procedure was used to measure the internal consistency among the items in the gap-filling task. Because there were two different structures targeted in the task, separate Guttman split-half coefficients were obtained for discourse-organizing and referential bundles, which were .86 and .84, respectively, suggesting that both sections of the task had sufficient internal consistency.

Design and Procedure

The study had a 2 × 3 mixed design, with the discourse function (discourse-organizing vs. referential) as a within-group variable and participant group (L1 vs. advanced L2 vs. intermediate L2 speakers) as a between-groups variable. The L1 and advanced L2 speakers scheduled a 30-min appointment with the researcher, during which they completed the gap-filling task. The intermediate L2 speakers completed the task during their scheduled English as a second language (ESL) class. The session lasted approximately 30 min.
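The split-half reliability check described above can be illustrated with a small sketch. The excerpt does not say how the items were divided into halves, so the odd/even split used here is an assumption, as are the function name `guttman_split_half` and the toy scores. The formula is the standard Guttman split-half coefficient, 2 × (1 − (var(A) + var(B)) / var(A + B)), where A and B are the participants' scores on the two halves.

```python
from statistics import pvariance


def guttman_split_half(item_scores):
    """Guttman split-half coefficient for a participants-by-items score matrix.

    item_scores: one list per participant, each holding that participant's
    item scores (for example 1 = gap filled correctly, 0 = not).
    The odd/even split is an assumption made for this illustration.
    """
    half_a = [sum(row[0::2]) for row in item_scores]  # items 1, 3, 5, ...
    half_b = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    total = [a + b for a, b in zip(half_a, half_b)]

    var_total = pvariance(total)
    if var_total == 0:
        raise ValueError("total scores have zero variance; coefficient undefined")

    # The ratio is the same for population and sample variances,
    # as long as one kind is used consistently.
    return 2 * (1 - (pvariance(half_a) + pvariance(half_b)) / var_total)


if __name__ == "__main__":
    # Hypothetical scores for five participants on six items.
    toy_scores = [
        [1, 1, 1, 0, 1, 1],
        [0, 0, 1, 0, 0, 0],
        [1, 1, 1, 1, 1, 1],
        [0, 1, 0, 0, 1, 0],
        [1, 0, 1, 1, 0, 1],
    ]
    print(round(guttman_split_half(toy_scores), 2))
```

In a design like the one reported, the function would be called once on the discourse-organizing items and once on the referential items, giving one coefficient per bundle type.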
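The Current State, Trends, and Prospects of Linguistics Research in China

Contents
I. Overview
1.1 The Importance of Linguistics
1.2 Background and Significance of Linguistics Research in China
II. The Current State of Linguistics Research in China
2.1 Development of the Subdisciplines of Linguistics
2.1.1 Phonology
2.1.2 Syntax
2.1.3 Semantics
2.1.4 Pragmatics
2.1.5 Sociolinguistics
2.1.6 Psycholinguistics
2.1.7 Computational Linguistics
2.2 Representative Achievements and Contributions of Linguistics Research in China
2.2.1 Major Research Projects and Results
2.2.2 Academic Papers and Monographs
2.2.3 International Cooperation and Exchange
III. Trends in Linguistics Research in China
3.1 Integration of Technological Innovation and Linguistics Research
3.2 Use of Interdisciplinary Research Methods
3.3 Protection and Use of Language Resources
3.4 Development of Language Intelligence and Natural Language Processing
3.5 Chinese Language Research in the Context of Globalization
IV. Prospects for Linguistics Research in China
4.1 Directions and Priorities for Future Linguistics Research
4.2 Prospects for Integrating Linguistics with Other Disciplines
4.3 The Social Service Function and Applications of Linguistics Research
V. Conclusion
5.1 Summary of Linguistics Research in China
5.2 Suggestions and Reflections on Future Development

I. Overview

With China's rapid economic development and continuing technological progress, linguistics research has become increasingly important both at home and abroad.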
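This paper analyzes the current state, trends, and prospects of linguistics research in China, with the aim of providing a useful reference for the development of the field.
Linguistics research in China still faces a number of challenges. In theoretical research, some results have been achieved, but many questions remain unresolved and call for deeper investigation.
In applied research, notable results have been achieved in some areas, but a considerable gap remains relative to the international state of the art, and basic research and technological innovation need to be strengthened.
In interdisciplinary research, some progress has been made, but exchange and cooperation with other disciplines still need to be strengthened so that efforts reinforce one another.
In talent development, the Chinese linguistics community needs to strengthen training further, raise overall standards, and cultivate more outstanding researchers with an international outlook and the capacity to innovate.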
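The Application of Corpus Linguistics in Register Studies

Summary: Taking the sports register as an example, the language of sporting events is studied on the basis of corpus linguistics. (Li Baojia, 2003) Speech used in the domain of sports constitutes the "sports register"; "sports language" is contained within the "sports register", and the sporting-event register is an important part of it. Because a sports-register corpus covers every aspect of the sports register, and because the vocabulary used in different areas has its own particularities, the corpus material is divided into two major parts, sporting events and event communication, which are counted and studied separately.

Abstract: With the development of computer and network technology, corpus linguistics has become a new field of research.
Starting from the study of register language and taking the sports register as an example, this paper builds a sports-register corpus and uses it for qualitative and quantitative analysis. Through word counts and word-frequency banding it explores the characteristics of sports language, and it extracts specialized vocabulary and compiles it into a word list as a basis for further research.
The results can be applied in language teaching, dictionary compilation, and similar fields, and also offer a reference for machine translation.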
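As a rough illustration of the word counts, frequency banding, and specialized-vocabulary extraction mentioned in the abstract, the sketch below works on pre-tokenized text. Everything in it is an assumption made for illustration: the function names, the band thresholds, the crude add-one smoothing, and the use of a simple relative-frequency ratio against a general reference corpus. The article does not state which statistics or tools it actually used, and Chinese text would first need word segmentation, which is not shown.

```python
from collections import Counter


def frequency_bands(tokens, band_edges=(10, 3)):
    """Count word types and assign each to a frequency band.

    band_edges are hypothetical thresholds in descending order: a word with
    count >= band_edges[0] is in band 1, count >= band_edges[1] in band 2,
    and everything else falls in the last band.
    """
    counts = Counter(tokens)
    bands = {}
    for word, count in counts.items():
        band = len(band_edges) + 1  # default: lowest-frequency band
        for index, edge in enumerate(band_edges, start=1):
            if count >= edge:
                band = index
                break
        bands[word] = band
    return counts, bands


def specialized_words(target_counts, reference_counts, min_ratio=2.0, min_count=2):
    """Return words that are relatively more frequent in the target corpus.

    A word qualifies when its relative frequency in the target corpus is at
    least min_ratio times its (crudely add-one smoothed) relative frequency
    in the reference corpus.
    """
    target_total = sum(target_counts.values())
    reference_total = sum(reference_counts.values())
    selected = []
    for word, count in target_counts.items():
        if count < min_count:
            continue
        target_rate = count / target_total
        reference_rate = (reference_counts.get(word, 0) + 1) / (reference_total + 1)
        if target_rate / reference_rate >= min_ratio:
            selected.append(word)
    return sorted(selected)


if __name__ == "__main__":
    # Toy English data standing in for a segmented sports corpus.
    sports = "the striker scored a goal and the keeper missed the goal".split()
    general = "the cat sat on the mat and the dog sat on a rug".split()
    sports_counts, sports_bands = frequency_bands(sports, band_edges=(3, 2))
    general_counts, _ = frequency_bands(general)
    print(sports_bands)
    print(specialized_words(sports_counts, general_counts))
```

A real study would more likely use an established keyness measure such as log-likelihood rather than a raw ratio; the ratio is used here only to keep the sketch short.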
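Keywords: corpus linguistics; register studies; application

A corpus (plural: corpora or corpuses) is a large electronic text collection of a given size, built by gathering naturally occurring running texts or stretches of discourse according to explicit linguistic principles, using random sampling.
Corpus linguistics is the discipline that studies language on the basis of textual corpus data.
A corpus is a collection of a large body of language material that serves as a carrier of information.
The main aim of using corpora to study language is to describe and explain lexical and syntactic phenomena in a language and to address tasks in natural language processing.
Depending on the research question, researchers can use annotation (that is, adding identifying and classifying tags to the words and other forms in the corpus) together with retrieval tools to analyze the corpus data and examine linguistic phenomena both quantitatively and qualitatively.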
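To make the combination of annotation and retrieval concrete, here is a minimal keyword-in-context (KWIC) sketch over a corpus that is already tagged. The data format (a list of `(word, tag)` pairs), the tag labels such as `NOUN` and `VERB`, and the function name `kwic` are all hypothetical; the article does not name a specific tagset or retrieval tool.

```python
def kwic(tagged_tokens, keyword, tag=None, window=3):
    """Return keyword-in-context lines from a tagged token list.

    tagged_tokens: list of (word, tag) pairs, in running-text order.
    keyword: word form to search for (case-insensitive).
    tag: if given, only hits carrying this annotation tag are returned.
    window: number of words of context to show on each side.
    """
    hits = []
    for index, (word, word_tag) in enumerate(tagged_tokens):
        if word.lower() != keyword.lower():
            continue
        if tag is not None and word_tag != tag:
            continue
        left = " ".join(w for w, _ in tagged_tokens[max(0, index - window):index])
        right = " ".join(w for w, _ in tagged_tokens[index + 1:index + 1 + window])
        hits.append((left, word, right))
    return hits


if __name__ == "__main__":
    # Hypothetical tagged sentence: "run" appears once as a verb, once as a noun.
    tagged = [
        ("They", "PRON"), ("run", "VERB"), ("every", "DET"), ("day", "NOUN"),
        ("after", "ADP"), ("a", "DET"), ("long", "ADJ"), ("run", "NOUN"),
    ]
    for left, node, right in kwic(tagged, "run"):
        print(f"{left:>20} [{node}] {right}")
    # Restricting by tag separates the noun use from the verb use.
    print(kwic(tagged, "run", tag="NOUN", window=2))
```

Counting the hits per tag gives the quantitative side of the analysis, while reading the context lines supports the qualitative side.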
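I. The Current State of Corpus Linguistics Research

Outside China, the application of corpus methods to the study of English for Specific Purposes (ESP) is already well established.
British scholars have used multi-dimensional analysis, a method based on corpora and computer technology, to investigate the linguistic features of two kinds of ESP: biology research articles and history research articles.
(Biber, Conrad, Reppen, Corpus Linguistics, 2000) Several fairly large corpora have now been built, such as the SEU (Survey of English Usage) corpus, the Brown Corpus, and the LOB Corpus.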