Far-Field Speech Terminology
The main specialized terms in far-field speech technology include the following:
Far-field Speech Recognition: capturing a user's voice commands at a distance (typically several meters to over ten meters) through a voice input device (such as a smart speaker or television) and converting them to text or executing the corresponding action.
Voice Wake-up: using a specific wake word or command to bring a device out of sleep into a standby state, where it awaits further instructions from the user.
Microphone Array: an array of multiple microphones used to capture sound signals; signal-processing algorithms applied across the array improve the accuracy and robustness of speech recognition.
Sound Source Localization: using signal-processing algorithms to determine the direction or position of a sound source, so that the device can capture the user's voice commands more accurately.
Noise Reduction Algorithm: signal-processing techniques that reduce the impact of environmental noise on speech recognition and thereby improve recognition accuracy.
Voice Command Set: the set of voice commands a device supports, through which users control its various functions.
These are common terms in far-field speech technology and are important for understanding and applying it.
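Two of the terms above, microphone array and sound source localization, can be made concrete with a small numeric sketch. The following is a minimal, illustrative two-microphone direction-of-arrival estimate based on cross-correlation; the sample rate, microphone spacing, and function name are assumptions for this sketch, not part of the glossary.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s at roughly 20 °C (assumed)
MIC_SPACING = 0.1       # metres between the two microphones (assumed geometry)
SAMPLE_RATE = 16000     # Hz (assumed)

def estimate_doa(sig_a, sig_b):
    """Estimate direction of arrival, in degrees from broadside, for a 2-mic array.

    Uses plain cross-correlation to find the inter-microphone time delay (TDOA),
    then converts the delay to an angle via the far-field approximation
    sin(theta) = c * tau / d.
    """
    corr = np.correlate(sig_a, sig_b, mode="full")
    # Positive lag means sig_a received the wavefront later than sig_b.
    lag = np.argmax(corr) - (len(sig_b) - 1)
    tau = lag / SAMPLE_RATE  # seconds
    sin_theta = np.clip(SPEED_OF_SOUND * tau / MIC_SPACING, -1.0, 1.0)
    return np.degrees(np.arcsin(sin_theta))

# Synthetic check: the same noise burst, received 2 samples later at mic A.
rng = np.random.default_rng(0)
src = rng.standard_normal(1024)
print(f"estimated angle: {estimate_doa(np.roll(src, 2), src):.1f} degrees")
```

Real far-field devices use larger arrays and more robust estimators (e.g., GCC-PHAT), but the delay-to-angle geometry is the same.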
音响专业术语中英对照专业音频术语中英文对照 AAAC automatic ampltiude control 自动幅度控制 AB AB制立体声录音法Abeyancd 暂停,潜态 A-B repeat A-B重复ABS absolute 绝对的,完全的,绝对时间 ABS american bureau of standard 美国标准局ABSS auto blank secrion scanning 自动磁带空白部分扫描 Abstime 绝对运行时间A.DEF audio defeat 音频降噪,噪声抑制,伴音静噪 ADJ adjective 附属的,附件ADJ Adjust 调节 ADJ acoustic delay line 声延迟线Admission 允许进入,供给 ADP acoustic data processor 音响数据处理机ADP(T) adapter 延配器,转接器 ADRES automatic dynamic range expansion system 动态范围扩展系统 ADRM analog to digital remaster 模拟录音、数字处理数码唱盘 ADS audio distribution system 音频分配系统A.DUB audio dubbing 配音,音频复制,后期录音 ADV advance 送入,提升,前置量ADV adversum 对抗 ADV advancer 相位超前补偿器Adventure 惊险效果 AE audio erasing 音频(声音)擦除AE auxiliary equipment 辅助设备 Aerial 天线AES audio engineering society 美国声频工程协会 AF audio fidelity 音频保真度AF audio frequency 音频频率 AFC active field control 自动频率控制AFC automatic frequency control 声场控制 Affricate 塞擦音AFL aside fade listen 衰减后(推子后)监听 A-fader 音频衰减AFM advance frequency modulation 高级调频 AFS acoustic feedback speaker 声反馈扬声器AFT automatic fine tuning 自动微调 AFTAAS advanced fast time acoustic analysis system 高级快速音响分析系统 After 转移部分文件Afterglow 余辉,夕照时分音响效果 Against 以……为背景AGC automatic gain control 自动增益控制 AHD audio high density 音频高密度唱片系统AI advanced integrated 预汇流 AI amplifier input 放大器输入AI artificial intelligence 人工智能 AI azimuth indicator 方位指示器A-IN 音频输入 A-INSEL audio input selection 音频输入选择Alarm 警报器 ALC automatic level control 自动电平控制ALC automatic load control自动负载控制 Alford loop 爱福特环形天线Algorithm 演示 Aliasing 量化噪声,频谱混叠Aliasing distortion 折叠失真 Align alignment 校正,补偿,微调,匹配Al-Si-Fe alloy head 铁硅铝合金磁头 Allegretto 小快板,稍快地Allegro 快板,迅速地 Allocation 配置,定位All rating 全(音)域 ALM audio level meter 音频电平表ALT alternating 震荡,交替的 ALT alternator 交流发电机ALT altertue 转路 ALT-CH alternate channel 转换通道,交替声道Alter 转换,交流电,变换器 AM amperemeter 安培计,电流表AM amplitude modulation 调幅(广播) AM auxiliary memory 辅助存储器Ambience 临场感,环绕感 ABTD automatic bulk tape degausser磁带自动整体去磁电路 Ambient 环境的Ambiophonic system 环绕声系统 Ambiophony 现场混响,环境立体声AMLS automatic music locate system 
自动音乐定位系统AMP ampere 安培 AMP amplifier 放大器AMPL amplification 放大 AMP amplitude 幅度,距离Amorphous head 非晶态磁头 Abort 终止,停止(录制或播放)A-B TEST AB比较试听 Absorber 减震器Absorption 声音被物体吸收 ABX acoustic bass extension 低音扩展AC accumulator 充电电池 AC adjustment caliration 调节-校准AC alternating current 交流电,交流 AC audio coding 数码声,音频编码AC audio center 音频中心 AC azimuth comprator 方位比较器AC-3 杜比数码环绕声系统 AC-3 RF 杜比数码环绕声数据流(接口)ACC Acceleration 加速 Accel 渐快,加速Accent 重音,声调 Accentuator 预加重电路Access 存取,进入,增加,通路 Accessory 附件(接口),配件Acryl 丙基酰基 Accompaniment 伴奏,合奏,伴随Accord 和谐,调和 Accordion 手风琴ACD automatic call distributor 自动呼叫分配器 ACE audio control erasing 音频控制消磁A-Channel A(左)声道 Acoumeter 测听计Acoustical 声的,声音的 Acoustic coloring 声染色Acoustic image 声像 Across 交叉,并行,跨接Across frequency 交叉频率,分频频率 ACST access time 存取时间Active 主动的,有源的,有效的,运行的 Active crossover 主动分频,电子分频,有源分频Active loudsperker 有源音箱 Armstrong MOD 阿姆斯特朗调制ARP azimuth reference pulse 方位基准脉冲 Arpeggio 琶音Articulation 声音清晰度,发音 Artificial 仿……的,人工的,手动(控制)AAD active acoustic devide 有源声学软件 ABC auto base and chord 自动低音合弦Architectural acoustics 建筑声学 Arm motor 唱臂唱机Arpeggio single 琶音和弦,分解和弦 ARL aerial 天线ASC automatic sensitivity control 自动灵敏度控制 ASGN Assign 分配,指定,设定ASP audio signal processing 音频信号处理 ASS assembly 组件,装配,总成ASSEM assemble 汇编,剪辑 ASSEM Assembly 组件,装配,总成Assign 指定,转发,分配 Assist 辅助(装置)ASSY accessory 组件,附件 AST active servo techonology 有源伺服技术A Tempo 回到原速 Astigmatism methord 象散法专业音频术语中英文对照 BB band 频带 B Bit 比特,存储单元B Button 按钮 Babble 多路感应的复杂失真BZ buzzer 蜂音器 Back clampin g 反向钳位Back drop 交流哼声,干扰声 Background noise 背景噪声,本底噪声Backing copy 副版 Backoff 倒扣,补偿Back tracking 补录 Back up 磁带备份,支持,预备Backward 快倒搜索 Baffle box 音箱BAL balance 平衡,立体声左右声道音量比例,平衡连接 Balanced 已平衡的Balancing 调零装置,补偿,中和 Balun 平衡=不平衡转换器Banana jack 香蕉插头 Banana bin 香蕉插座Banana pin 香蕉插头 Banana plug 香蕉插头Band 频段, Band pass 带通滤波器Bandwidth 频带宽,误差,范围 Band 存储单元Bar 小节,拉杆 BAR barye 微巴Bargraph 线条 Barrier 绝缘(套)Base 低音 Bass 低音,倍司(低音提琴)Bass tube 低音号,大号 Bassy 低音加重BATT battery 电池 Baud 波特(信息传输速率的单位)Bazooka 导线平衡转接器 BB base band 基带BBD Bucket brigade 
device 戽链器件(效果器) B BAT Battery 电池BBE 特指BBE公司设计的改善较高次谐波校正程度的系统 BC balanced current 平衡电流BC Broadcast control 广播控制 BCH band chorus 分频段合唱BCST broadcast (无线电)广播 BD board 仪表板Beat 拍,脉动信号 Beat cancel switch 差拍干扰消除开关Bel 贝尔 Below 下列,向下Bench 工作台 Bend 弯曲,滑音Bender 滑音器 BER bit error rate 信息差错率BF back feed 反馈 BF Backfeed flanger 反馈镶边BF Band filter 带通滤波器BGM background music 背景音乐 Bias 偏置,偏磁,偏压,既定程序Bidirectional 双向性的,8字型指向的 Bifess Bi-feedback sound system 双反馈系统Big bottom 低音扩展,加重低音 Bin 接收器,仓室BNG BNC连接器(插头、插座),卡口同轴电缆连接器 Binaural effect 双耳效应,立体声Binaural synthesis 双耳合成法 Bin go 意外现象Bit binary digit 字节,二进制数字,位 Bitstream 数码流,比特流Bit yield 存储单元 Bi-AMP 双(通道)功放系统Bi-wire 双线(传输、分音) Bi-Wring 双线BK break 停顿,间断 BKR breaker 断电器Blamp 两路电子分音 Blanking 关闭,消隐,断路Blaster 爆裂效果器 Blend 融合(度)、调和、混合Block 分程序,联动,中断 Block Repeat 分段重复Block up 阻塞 Bloop (磁带的)接头噪声,消音贴片BNC bayonet connector 卡口电缆连接器 Body mike 小型话筒Bond 接头,连接器 Bongo 双鼓Boom 混响,轰鸣声 Boomy 嗡嗡声(指低音过强)Boost 提升(一般指低音),放大,增强 Booth 控制室,录音棚Bootstrap 辅助程序,自举电路 Bottoming 底部切除,末端切除Bounce 合并 Bourclon 单调低音Bowl 碗状体育场效果 BP bridge bypass 电桥旁路BY bypass 旁通 BPC basic pulse generator 基准脉冲发生器BPF band pass filter 带通滤波器 Both sides play disc stereo system 双面演奏式唱片立体声系统 BPS band pitch shift 分频段变调节器 BR bregister 变址寄存器BR Bridge 电桥 Break 中止(程序),减弱Breathing 喘息效应 B.Reso base resolve 基本解析度Bridge 桥接,电桥,桥,(乐曲的)变奏过渡 Bright 明亮(感)Brightness 明亮度,指中高音听音感觉 Brilliance 响亮BRKRS breakers 断路器 Broadcast 广播BTB bass tuba 低音大喇叭 BTL balanced transformer-less 桥式推挽放大电路BTM bottom 最小,低音 BU backup nuit 备用器件Bumper 减震器 Bus 母线,总线Busbar 母线 Buss 母线Busy 占线 BUT button 按钮,旋钮BW band width 频带宽度,带度 BYP bypass 旁路By path 旁路专业音频术语中英文对照 CCan 监听耳机,带盒 CANCL cancel 删除CANCL Cancelling 消除 Cancel 取消Cannon 卡侬接口 
Canon 规则Cap 电容 Capacitance Mic 电容话筒Capacity 功率,电容量 CAR carrier 载波,支座,鸡心夹头Card 程序单,插件板 Cardioid 心型的CATV cable television 有线电视 Crispness 脆声Category 种类,类型 Cartridge 软件卡,拾音头Carrkioid 心型话筒 Carrier 载波器Cart 转运 Cartridge 盒式存储器,盒式磁带Cascade 串联 Cassette 卡式的,盒式的CAV constant angular velocity 恒角速度 Caution 报警CBR circuit board rack 电路板架 CC contour correction 轮廓校正CCD charge coupled device 电荷耦合器件 CD compact disc 激光唱片CDA current dumping amplifier 电流放大器 CD-E compact disc erasable 可抹式激光唱片CDG compact-disc plus graphic 带有静止图像的CD唱盘CDV compact disc with video 密纹声像唱片CE ceramic 陶瓷 Clock enable 时钟启动Cell 电池,元件,单元 Cellar club 地下俱乐部效果Cello 大提琴CENELEC connector 欧洲标准21脚AV连接器 Cent 音分Central earth 中心接地 CES consumer electronic show (美国)消费电子产品展览会 CF center frequency 中心频率Cross fade 软切换 CH channel 声道,通道Chain 传输链,信道 Chain play 连续演奏Chamber 密音音响效果,消声室 CHAN channel 通道Change 交换 Chapter 曲目Chaper skip 跳节 CHAE character 字符,符号Characteristic curve 特性曲线 Charge 充电Charger 充电器 Chase 跟踪Check 校验 CHC charge 充电CH - off 通道切断 Choke 合唱Choose 选择 Chromatic 色彩,半音Church 教堂音响效果 CI cut in 切入CIC cross interleave code 交叉隔行编码 CIRC circulate 循环Circuit 电路 CL cancel 取消Classic 古典的 Clean 净化CLR clear 归零Click 嘀哒声 Clip 削波,限幅,接线柱CLK clock 时钟信号 Close 关闭,停止CLS 控制室监听 Cluster 音箱阵效果CLV ceiling limit value 上限值 CMP compact 压缩CMPT compatibility 兼容性 CMRR common mode rejection ratio 共模抑制比CNT count 记数,记数器 CNTRL central 中心,中央 CO carry out 定位输出Coarse 粗调 Coax 同轴电缆Coaxial 数码同轴接口 Code 码,编码Coefficient 系数 Coincident 多信号同步Cold 冷的,单薄的 Color 染色效果COM comb 梳状(滤波) COMB combination 组合音色COMBI combination 组合,混合 COMBO combination 配合,组合Combining 集合,结合 COMM communication 换向的,切换装置Command 指令,操作,信号 COMMON 公共的,公共地端Communieation speed 通讯速度选择 COMP comparator 比较器COMP compensate 补偿 Compact 压缩Compander 压缩扩展器 Compare 比拟Compatibility 兼容 Compensate 补偿Complex 全套设备 Copmoser 创意者,作曲者Compressor 压缩器 CON concentric cable 同轴电缆COMP-EXP 压扩器 Compromise (频率)平衡Computer 计算机,电脑Concentric 同轴的,同心的CON console 操纵台 CON controller 控制器Concert 音乐厅效果 Condenser Microphone 电容话筒Cone type 锥形(扬声器) CONFIG 
布局,线路接法Connect 连接,联络 CORR correct 校正,补偿,抵消Configuration 线路布局 Confirmation 确认Consent 万能插座 Console 调音台Consonant 辅音 Constant 常数CONT continuous 连续的(音色特性) CONT control 控制,操纵Contact 接触器 Content 内容Continue 连续,继续 Continue button 两录音卡座连续放音键Contour 外形,轮廓,保持 Contra 次八度Contrast 对比度 CONV convert 变换Contribution 分配 Controlled 可控的Controller 控制器 CONV conventional 常规的CONV convertible 可转换的 Copy 复制Correlation meter 相关表 Coupler 耦合Cover 补偿 Coverage 有效范围CP clock pulse 时钟脉冲 CP control program 控制程序CPU 中央处理器 Create 建立,创造CR card reader 卡片阅读机 CRC cyclic redundancy check 循环冗余校验Crescendo 渐强或渐弱 Crispness 清脆感CRM control room 控制室 CROM control read only memory 控制只读存储器Crossfader 交叉渐变器 Cross-MOD 交叉调制Crossover 分频器,换向,切断 Cross talk 声道串扰,串音Crunch 摩擦音 CU counting unit 计数单元C/S cycle/second 周/秒 CSS content scrambling system 内容加密系统CST case style tape 盒式磁带 CT current 电流 CTM close talking microphone 近讲话筒Cue 提示,选听 Cue clock 故障计时钟Cueing 提示,指出 Cursor 指示器,光标Curve (特性)曲线 Custom 常规CUT 切去,硬切换 early warning 预警专业音频术语中英文对照DE earth 真地,接地 E error 错误,差错(故障显示)EA earth 地线,真地 EAR early 早期(反射声)Earphone 耳机 Earth terminal 接地端EASE electro-acooustic simulators for engineers Eat 收取信号工程师用电声模拟器,计算机电声与声学设计软件EBU european broadcasting union 欧洲广播联盟 EC error correction 误差校正ECD electrochomeric display 电致变色显示器 Echo 回声,回声效果,混响ECL extension zcompact limitter 扩展压缩限制器 Edge tone 边棱音ECM electret condenser microphone 驻极体话筒 Edit 编辑ECSL equivalent continuous sound level 等级连续声级ECT electronec controlled transmission 电控传输ED edit editor 编辑,编辑器 EDTV enhanced definition television增强清晰度电视(一种可兼容高清晰度电视)E-DRAW erasable direct after write 可存可抹读写存储器EE errors excepted 允许误差 EFF effect efficiency 效果,作用Effector 操纵装置,效果器Effects generator 效果发生器 EFM 8/14位调制法EFX effect 效果 EG envelope generator 包络发生器EIA electronec industries association (美国)电子工业协会EIAJ electronic industries association Japan 日本电子工业协会EIN einstein 量子摩尔(能量单位) EIN equivalent input noise 等效输入噪声EIO error in operation 操作码错误Eject 弹起舱门,取出磁带(光盘),出盒 EL electro luminescence 场致发光ELAC electroacoustic 电声(器件) ELEC 
electret 驻极体Electret condenser microphone 驻极体话筒 ELF extremely low frequency 极低频ELEC electronec 电子的 Electroacoustics 电声学EMI electro magnetic interference 电磁干扰 Emission 发射EMP emphasispo 加重 EMP empty 空载Emphasis 加重 EMS emergency switch 紧急开关Emulator 模拟器,仿真设备 EN enabling 启动Enable 赋能,撤消禁止指令 Encoding 编码End 末端,结束,终止 Ending 终端,端接法,镶边ENG engineering 工程 Engine 运行,使用ENG land 工程接地 Enhance 增强,提高,提升ENS ensemble 合奏 ENS envelope sensation 群感Envelopment 环绕感 EQ equalizer 均衡器,均衡EQ equalization 均衡EQL equalization 均衡EOP end of program 程序结束EOP end output 末端输出EOT end of tape 磁带尾端EP extend playing record 多曲目唱片EP extended play 长时间放录,密录EPG edit pulse generator 编辑脉冲发生器EPS emergency power supply 应急电源Equal-loudness contour 等响曲线Equipped 准备好的,已装备Equitonic 全音Equivalence 等效值ER erect 设置ER error 错误,误差ERA earphone 耳机Eraser 抹去,消除Erasing 擦除,清洗Erasure 抹音Erase 消除,消Er early 早期的ERCD extended resolution CD 扩展解析度CDEREQ erect equalizer均衡器(频点)位置(点频补偿电路的中点频率)调整ERF early reflection 早期反射(声)Ernumber 早期反射声量Error 错误,出错,不正确ES earth swith 接地开关ES electrical stimulation 点激励Escqpe 退出ETER eternity 无限Euroscart 欧洲标准21脚AV连接器Event 事件EVF envelope follower包络跟随器(音响合成装置功能单元)EX exciter 激励器EX exchange 交换EX expanding 扩展EXB expanded bass 低音增强EXC exciter 激励器EXCH exchange 转换Exclusive 专用的Excursion 偏移,偏转,漂移,振幅EXP expender 扩展器,动态扩展器EXP export 输出Exponential horn tweeter 指数型高音号角扬声器Expression pedal表达踏板(用于控制乐器或效果器的脚踏装置)EXT extend 扩展EXT exterior 外接的(设备)EXT external 外部的,外接的EXT extra 超过EXTN extension 扩展,延伸(程控装置功能单元)Extract 轨道提出EXTSN extension 扩展,延伸(程控装置功能单元)专业音频术语中英文对照 K-MK key 按键Karaoke 卡拉OK,无人伴奏乐队KB key board 键盘,按钮Kerr 克耳效应,(可读写光盘)磁光效应Key 键,按键,声调Keyboard 键盘,按钮Key control 键控,变调控制Keyed 键控Key EQ 音调均衡kHz Kiloherts 千赫兹Kikll 清除,消去,抑制,衰减,断开Killer 抑制器,断路器Kit 设定Knee 压限器拐点Knob 按钮,旋钮,调节器KP key pulse 键控脉冲KTV karaoke TV 拌唱电视(节目)KX key 键控Lesion 故障,损害Leslie 列斯利(一种调相效果处理方式)LEV level 电平LEVCON level control 电平控制Level 电平,水平,级LF low frequency 低频,低音LFB local feedback 本机反馈,局部反馈LFE lowfrequency response 低频响应LFO low frequency oscillation 低频振荡信号LGD long delay 长延时LH low 
high 低噪声高输出LH low noise high output 低噪声高输出磁带L.hall large hall 大厅效果Lift 提升(一种提升地电位的装置)Lift up 升起Labial 唇音L left 左(立体声系统的左声道)L line 线路L link 链路L long 长(时间)LA laser 激光(镭射)Lag 延迟,滞后Lamp 灯,照明灯Land 光盘螺旋道的肩,接地,真地Lap dissolve 慢转换Lapping SW 通断开关Large 大,大型Large hall 大厅混响Larigot 六倍音Laser 激光(镭射)Latency 空转,待机Launching 激励,发射Layer 层叠控制,多音色同步控制LCD liquid crystal display 液晶显示LCR left center right 左中右LD laser vision disc 激光视盘,影碟机LD load 负载LDP input 影碟输入LDTV low definition television低分辨率数字电视LCD projictor 液晶投影机Lead 通道,前置,输入Lead-in 引入线Leak 漏泄Learn 学习LED light emitting deivce 发光辐射器,发光器件M main 主信道M master 主控M memory 存储器M mix 混频M moderate 适中的M music 音乐Mac manchester auto code 曼切斯特自动码MADI musical audio digital interface 音频数字接口Main 主要的,主线,主通道,电源MAG magnet 磁铁Magnetic tape 磁带Magnetic type recorder 磁带录音机Main 电源,主要的Major chord 大三和弦Make 接通,闭合Makeup 接通,选配Male 插头,插件MAN manual 手动的,手控Manifold technology (音箱)多歧管技术Manipulate 操作,键控MANP 手动穿插Manual 手动的,人工的,手册,说明书March 进行曲Margin (电平)余量Mark 标志,符号,标记Mash 压低,碾碎Masking 掩蔽Master 总音量控制,调音台,主盘,标准的,主的,总路MAR Matrix 矩阵,调音台矩阵(M),编组Match 匹配,适配,配对Matrix quad system 矩阵四声道立体声系统MAX maximum 最大,最大值MB megabytes 兆字节Mb/s megabytes per second 兆字节/秒MC manual control 手控,手动控制MCH multiple chorus 多路合唱MCR multiple cjhannel amplification reverberation 多路混响增强MD mini disc 光磁盘唱机,小型录放唱盘MD moving coil 动圈式MDL modulation delay 调制延时MEAS measure 测量,范围,测试Measure 乐曲的,小节Meas edit 小结编辑MECH mechanism 机械装置MED medium 适中,中间(挡位)Medley 混合Mega bass 超重低音MEM memory 存储器,存储,记忆Member 部件,成员Menu 菜单,目录,表格MEQ mono equalizer 单声道均衡器Merge 合并,汇总,融合Meridian 顶点的,峰值Measure 小结Megaphone 喇叭筒Mel 美(音调单位)Menu 菜单,节目表Message 通信,联系Metal 金属(效果声)Metal tape 金属磁带Meter 电平表,表头,仪表Metronome 节拍器MF matched filter 匹配滤波器MF maveform 波形MF middle frequency 中频,中音MFL multiple flange 多层法兰(镶边)效果MFT multiplprogramming with a fixed number of tasks 任务数量固定的多通道程序设计MIC micro 微米MIC microphone 话筒,麦克风,传声器Michcho level 话筒混响电平Micro monitor amp 微音监听放大器MICROVERB microcomputer reverb 微处理机混响MID middle 中间的,中部的,中音,中频MIDI music instrument digital 
interface电子乐器数字接口MIN minimum 最小,最小值MIN minute 分钟MIND master integrated network device一体化网络总装置Minitrim 微调Minitrue 微机调整Minor chord 小三和弦Mismatch 失配Mistermination 终端失配MIX 混合,音量比例调节Mixer 调音台,混音器MM moving magnet 动磁式MNTR monitor 监控器MNOS metallic nitrogen - oxide semiconductor金属氮氧化物半导体MO magneto optical 可抹可录型光盘MOC magnet oscillator circuit 主振荡电路MOD mode 状态,方式,模式MOD model 型号,样式,模型,典型的MOD modulation 调制Mode 状态,(乐曲的)调式Mode select 方式选择Mush 噪声干扰,分谐波Mush area 不良接收区Music 音乐,乐曲Music center 音乐中心,组合音响Music conductor 音乐控制器MUT mute 静音,哑音,噪声控制Muting 抑制,消除Multiple 复合的,多项的,多重的MV mean value 平均值MV multivibrator 多谐振荡器MW medium wave 中波MXE mono exciter 单声道激励器MXR mixer 混频器专业音频术语中英文对照M-PN normal 正常,普通,标准N negative 阴极,负极ohm 欧姆(电阻的单位)Oboe 双簧管O/C open circuit 开路OCK operation control key 操作控制键OCL output capacitorless 无输出电容功率放大器OCL output control line 输出控制线OCT octave 倍频程,八度音OD operations directive 操作指示OD optical density 光密度OD over drive 过激励OFC office 职能OFC oxygen-free cupreous 无氧铜导线Off 关闭,断开Offering 填入,提供Offset (移相)补偿,修饰,偏置OFHC oxygen free high conductivity copper高导电率无氧铜导线Ohm 欧姆OK 确认OL on line 在线,连机OL over load 过载Omnidirectional 无方向性的On 开,接通Once 一次,单次One-way relay play 单向替换放音Online 联机,联线Only 仅仅,只On-mike 正在送话,靠近话筒One touch 单触连接OP output 信号输出OP over pressure 过压Open 打开,开启Opera 歌剧Operate 操作,运转Operation 操作,运转Operator 操纵器,合成器算子Optical 数码光缆接口Optical master 激光器Option 选型,选择Optimum 最佳状态OPTOISO optoisolator 光隔离器Or 或,或者ORC optimum recording current 磁头最佳记录电流Orchestra 管弦乐器Organ 风琴,元件 Original 原(程序),初始(化)OSC oscillator 振荡器,试机信号(一千赫兹)OSC oscillograph 示波器OSS optimal stereo signal 最佳立体声信号OTL 无输出变压器功率放大器OTR one-touch time recording 单触式定时录像OTR operation temperature range 工作温度范围OTR overload time relay 过载限时继电器OUT output 输出Outage 中断Out-burst 脉冲,闪光,闪亮Outcome 结果,输出,开始Out let 输出端子,引出线Outline 轮廓线Out phase 反向OVDB 重叠录音Overall 轮廓,总体上Overcut 过调制Over drive 过激励Overdubs 叠录Overflow 信号过强Overhang (激励器)低音延伸调节Overhearing 串音OVLD over lode 过载,超负荷Over sampling 过取样Overtone 泛音OVWR overwrite 覆盖式录音Proximity effect 近距离效果Prwsnt 突出感PS position 
位置,状态PSC program switching center 节目切换中心PSL phase sequence logic 相位顺序逻辑PSM peak selector memory 峰值选择存储器PSM phase shifter module 移相模件PSM pitch shift modulation 交频调制PST posterrior 后面PST preset 预置PSU power supply 供电Psychological acoustics 心理声学PT power transformer 电源变压器PT portable 便携式PT pulse timer 脉冲计时器PTD pan turnout piece delay 声像分支延时PTE private 专用的PTN pattern 模式,样式PTN procedure turn 程序变化PU pickup 拾音PU power unit 电源设备Pull 拉,趋向Pull-in 接通,引入Pumping 抽气效应PWR power 电源,功率P-P peak-Peak 峰-峰值PPD pingpong delay 乒乓延时PPG programme pulse generator 脉冲程序发生器PPI peak program indicator 峰值显示器PPI programmable peripheral interface 程序外部接口PPL peak program level 峰值音量电平PPM peak program meter 峰值节目表,峰值音量表PPM pulse phase modulation 脉冲相位调制Pr power rate 功率比PR program register 程序寄存器PRC precision 精确,精细,精密度Pre 前置,预备,之前Pre-delay 预延迟PREAMP preamplifier 前置放大器Preselection 预选Presence 临场效果,现场感Preserve 保存,维持Preset 预置,预调Press 按,压Previous 向前,前位的PRM parameter 参量PRO professional 专业的Process 处理,加工PROCR processor 处理器PROG program 程序,基本音色Prosody 韵律POSTF 后置(万分)Power 电源,功率Power amplifier 功率放大器Power dump 切断电源Power out 功率输出Piower supply 电源供给PP peak power 峰值功率P plug 插头P positive 正极P pulse 脉冲P power 功率PA preamplifier 前置放大器PA public address 扩声Pace 步速,级数Packed cell 积层电池Packing 图像压缩P-I-P picture in picture 画中画PAD 定值衰减,衰减器,(打击乐大按键的)鼓垫Padding 统调,使……平直Paddle 开关,门电路Page 一面,(存储器)页面地址,寻找Pair (立体声)配对,比较PAL phase alternation line 逐行倒相彩色电视制式PAM pulse amplitude modulation 脉幅调制PAM pole amplitude modulation 极点调幅Pan panorama 声像调节,定位,全景Panel 面板,操纵板,配电盘Panotrope 电唱机Paper cone 纸盆Parallel 并联,平衡PAR(PARAM) parameter 参数,参量,系数Parametric 参量的Part 声部数,部分Partial tone 分音,泛音PAS public address system 扩声系统PASC precision adaptive subband coding精密自适应分段编码Pass 通过Passive 被动,被动分频,功率分频Paste 粘贴PAT pattern 模仿,型号,图谱特性曲线Patch 临时,插接线,用连接电缆插入Patch bay 配电盘Patch board 插线板Path 信号通路Pattern 样式,方式,样板Pause 暂停,间歇,停顿PB playback 播放,重放PB push button 按钮开关PBASS proper bass active supply system最佳低音重放系统PBC play back control 重放控制,回放控制PC perceptual coding 感觉编码PC program control 
程序控制PCB printed circuit board 印刷线路板PCC phase correlation cardioid microphone相位相关心形话筒PCM precision capacitor microphone 精密电容话筒PD power divider 功率分配器PD power doubler 功率倍增器PD program directive 程序指令PDS programmable data system 程序可控系统PE program execution 程序执行Peak 峰值,削波(灯)Peak and dip 峰式频率欧洲标准21脚AV接口Pedal 踏板PEM pulse edge molulation 脉冲边缘调制Pentatonic 五声调式PEQ parameter equalizer 参量均衡器PERC percussion 打击乐器PERCUS 打击乐器Performance 施行,表演,表现,演出Permutator 转换开关,变换器Perspective 立体感Perform 执行,完成,施行PFL per fader louder speaker 衰减前监听,预监听PG pulse generator 脉冲发生器PGM program 节目,程序PH phase 相位PHA phase 相位Phantom 幻像电源,幻象供电Phase 相位,状态Phase REV 倒相(电路)Phaser 移相器Phasing 相位校正,移相效果Phon 方(响度单位)Phone 耳机,耳机插口Phoneme 音素Phono(phonograph) 唱机PHS phaser 移相器Physiological acoustics 生理声学PI phase inversion 倒相PIA peripheral interface adapter 外围接口适配器Pianotron 电子钢琴Piano 钢琴Piano whine 钢琴鸣声Pilot 指示器,调节器Pilot jack 监听插孔Pin 针型插口,不平衡音频插口Pink noise 粉红噪声Pipe 管,笛Pitch 音高,音调Pitch shifter 变调器,移频器PK peak 削波(灯),峰值PL phase lock 相位锁定,锁相PL pilot lamp 指示灯PL pre listen 预监听,衰减前监听Placement 连接方式Plate 金属板效果,板混响器Play 播放,重放,弹奏Playback 播放Player 唱机,放音器Plug 插头Plunge 切入Pop filter 噗声滤除器Pops 流行音乐,流行音乐音响效果。
How to Overcome English Pronunciation Difficulties and Improve Oral Fluency

English is widely regarded as one of the most challenging languages to master due to its complex phonetic system. Many non-native English speakers struggle with pronunciation, which hinders their ability to speak fluently. In this article, we will explore effective strategies to overcome pronunciation difficulties and improve oral proficiency.

1. Develop an awareness of phonetic sounds

To improve pronunciation, it is crucial to develop an awareness of the different phonetic sounds in English. Take time to familiarize yourself with the International Phonetic Alphabet (IPA) and learn to associate each sound with its corresponding symbol. Use online resources, textbooks, or language learning apps to practice identifying and producing these sounds correctly.

2. Mimic native speakers

One of the best ways to improve pronunciation is to mimic native speakers. Pay close attention to their intonation, stress patterns, and rhythm. Practice imitating their pronunciation by listening to podcasts, watching movies or TV shows, and repeating after the speakers. This technique helps to train your ear and mouth muscles to produce accurate sounds.

3. Practice tongue twisters and minimal pairs

Tongue twisters and minimal pairs are effective exercises for improving pronunciation. Tongue twisters are phrases or sentences that contain a sequence of similar or challenging sounds. Practice saying them repeatedly to enhance the flexibility and coordination of your tongue and mouth.

Minimal pairs are words that differ by only one sound, such as "ship" and "sheep." Practicing minimal pairs can help you distinguish between similar sounds and improve your overall pronunciation accuracy.

4. Use pronunciation applications or software

In today's digital age, various pronunciation applications and software have been developed to assist language learners. These tools provide interactive exercises, feedback, and audio recordings to help you practice and improve your pronunciation skills.
Some popular apps include ELSA Speak, Sounds: The Pronunciation App, and Forvo Pronunciation.

5. Seek feedback from native speakers or language teachers

Regular feedback from native speakers or language teachers is invaluable for identifying and correcting pronunciation errors. Engage in conversations with native speakers and ask for their guidance and feedback on your pronunciation. If possible, consider enrolling in a language course or working with a language tutor who can provide personalized instruction and corrective feedback.

6. Record and listen to your own voice

Recording and listening to your own voice while speaking English allows you to identify areas for improvement. Pay attention to your pronunciation of specific sounds or words and compare it to native speakers. Repeat the recording multiple times, striving to mimic native-like pronunciation. As you progress, you will notice a gradual improvement in your accuracy and fluency.

7. Practice speaking in a supportive environment

Find opportunities to practice speaking English in a supportive environment. Join conversation clubs, language exchange programs, or online communities where you can interact with other English learners and native speakers. Speaking in real-life situations will help build your confidence, reduce anxiety, and provide valuable practice for improving your overall fluency.

Remember, improving pronunciation takes time, patience, and consistent practice. By incorporating these strategies into your language learning routine, you can successfully overcome pronunciation challenges and enhance your oral proficiency in English. Keep practicing, stay dedicated, and you will see progress over time.
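The minimal-pair exercise described in point 3 can be turned into a simple self-study drill. The word list and function below are hypothetical illustrations, not part of any of the apps named above:

```python
import random

# Hypothetical practice list: each pair differs by exactly one sound.
MINIMAL_PAIRS = [
    ("ship", "sheep"),  # short /ɪ/ vs long /iː/
    ("sit", "seat"),    # short /ɪ/ vs long /iː/
    ("pat", "bat"),     # voiceless /p/ vs voiced /b/
    ("fan", "van"),     # voiceless /f/ vs voiced /v/
]

def drill(pairs, rounds=4, seed=0):
    """Return (pair, target) prompts: the learner says the target word aloud
    and a partner (or a recording played back later) judges which member of
    the pair was actually produced."""
    rng = random.Random(seed)
    prompts = []
    for _ in range(rounds):
        pair = rng.choice(pairs)
        prompts.append((pair, rng.choice(pair)))
    return prompts

for pair, target in drill(MINIMAL_PAIRS):
    print(f"Say: {target!r}  (contrast: {pair[0]} / {pair[1]})")
```

Randomizing which member of the pair is prompted keeps the listener's judgment honest: if they can reliably tell which word you said, your contrast is intelligible.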
Articulatory Tradeoffs Reduce Acoustic Variability During American English /r/ Production

Frank H. Guenther (1,2), Carol Y. Espy-Wilson (3,2), Suzanne E. Boyce (4,2), Melanie L. Matthies (5,2), Majid Zandipour (2,1), and Joseph S. Perkell (2,6)

Journal of the Acoustical Society of America (1999), vol. 105, pp. 2854-2865.

Address correspondence to:
Prof. Frank H. Guenther
Boston University
Center for Adaptive Systems and Department of Cognitive and Neural Systems
677 Beacon Street
Boston, MA 02215
Fax: (617) 353-7755
Email: guenther@

ABSTRACT

The American English phoneme /r/ has long been associated with large amounts of articulatory variability during production. This paper investigates the hypothesis that the articulatory variations used by a speaker to produce /r/ in different contexts exhibit systematic tradeoffs, or articulatory trading relations, that act to maintain a relatively stable acoustic signal despite the large variations in vocal tract shape. Acoustic and articulatory recordings were collected from seven speakers producing /r/ in five phonetic contexts. For every speaker, the different articulator configurations used to produce /r/ in the different phonetic contexts showed systematic tradeoffs, as evidenced by significant correlations between the positions of transducers mounted on the tongue. Analysis of acoustic and articulatory variabilities revealed that these tradeoffs act to reduce acoustic variability, thus allowing relatively large contextual variations in vocal tract shape for /r/ without seriously degrading the primary acoustic cue. Furthermore, some subjects appeared to use completely different articulatory gestures to produce /r/ in different phonetic contexts. When viewed in light of current models of speech movement control, these results appear to favor models that utilize an acoustic or auditory target for each phoneme over models that utilize a vocal tract shape target for each phoneme.

(1) Department of Cognitive and Neural Systems, Boston University
(2) Research Laboratory of Electronics,
Massachusetts Institute of Technology
(3) Department of Electrical and Computer Engineering, Boston University
(4) Department of Communication Sciences and Disorders, University of Cincinnati
(5) Department of Communication Disorders, Boston University
(6) Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology

1. Introduction

The American English phoneme /r/ has long been associated with relatively large amounts of articulatory variability (Alwan, Narayanan, and Haker, 1997; Delattre and Freeman, 1968; Espy-Wilson and Boyce, 1994; Hagiwara, 1994, 1995; Ong and Stone, 1998; Westbury, Hashi, and Lindstrom, 1995, 1998). In fact, the endpoints of the articulatory continuum for /r/ can be analyzed as functionally different articulator configurations that use different primary articulators (tongue tip vs. tongue dorsum). These endpoints have been characterized in the literature as "bunched" (using the tongue dorsum) and "retroflexed" (using the tongue blade/tip). Often, the same speaker will use different types of /r/ in different productions, e.g., in different phonetic contexts. At the same time, the primary acoustic cue for /r/ is relatively simple and stable: a deep dip in the trajectory of the third spectral energy peak of the acoustic waveform, or third formant frequency (F3) (Boyce and Espy-Wilson, 1997; Delattre and Freeman, 1968; Westbury et al., 1995, 1998). Furthermore, no consistent acoustic difference between bunched and retroflexed /r/'s has been discovered. How is it that a speaker can produce perceptually acceptable /r/'s despite using such variable vocal tract shapes?

One possible answer to this question is that the variations in vocal tract shape for /r/ are not haphazard, but are instead systematically related in a way that maintains a relatively stable acoustic signal across productions despite large variations in vocal tract shape. In other words, the different vocal tract shapes used to produce /r/ by a particular subject might involve articulatory tradeoffs, or trading
relations.The concept of articulatory trading relations is illustrated by the following example. Assume that narrowing either of two constrictions at different locations along the vocal tract(call them location1and location2)has the same effect on an important acoustic cue for a phoneme.Assume further that narrowing either constriction causes a reduction in F3.In such a case,one could use different combinations of the two constrictions to achieve the same acoustic effect.For example,to achieve a particular value of F3,one might form a very narrow constriction at location1and a less narrow constriction at location2,or one might alternatively form a very narrow constriction at location2and a less narrow constriction at location1.If a speaker used one of these options in one phonetic context and the other option in a second phonetic context,a negative covariance between the sizes of these two constrictions would be seen across phonetic contexts.The primary purpose of the current study is to investigate the issue of whether the various vocal tract shapes used by an individual to produce/r/in different phonetic contexts exhibit articulatory trading relations that act to maintain a relatively stable acoustic signal.As discussed at the end of this article,this issue has important implications for theories of speech motor control and speech rgely for this reason,several recent experiments have investigated the trading relations issue for phonemes other than/r/(e.g.,de Jong,1997;Perkell et al.,1993;Perkell,Matthies,and Svirsky,1994;Savariaux et al. 1995a),but the results have not been uniform across subjects:although most subjects exhibit expected articulatory trading relations,some others do not.A possible reason for this ambiguity is that these studies have primarily concentrated on one hypothesized trading relationship,and subjects who do not exhibit this trading relation may be exhibit other,unanalyzed trading relations that act to reduce acoustic variability. 
For example, Perkell et al. (1993) investigated a hypothesized trading relation between lip rounding and tongue body raising for the vowel /u/. Three of four subjects showed weak trading relations, but the fourth subject showed the opposite pattern. This fourth subject may have been using other trading relations that overrode the effect of the lip rounding/tongue body raising relationship. In the current study, we employ analysis procedures that allow us to assess the combined effects of multiple articulatory covariances on the variability of the acoustic signal. Furthermore, American English /r/ was chosen[1] because the large amount of articulatory variability associated with /r/ productions should make it easier to detect trading relations if they are present.

2. Methods

2.1. Data collection

An electromagnetic midsagittal articulometer (EMMA) system (Perkell et al., 1992) was used to track the movements of six small (5 mm long x 2.5 mm diameter) transducer coils. The coils were attached in the midsagittal plane to the tongue (3 coils), lips (2 coils), and lower incisor (1 coil) with bio-compatible adhesive. Transducers were also placed on the upper incisor and the bridge of the nose for defining a coordinate system with a maxillary frame of reference. A directional microphone was suspended 14 inches from the subject's mouth, and the acoustic signal was recorded simultaneously with the EMMA signals.
Standard EMMA calibration protocols were completed prior to each experiment (cf. Perkell et al., 1992 for details). The current study focused on the positions of the three tongue transducers, which were located approximately 1, 2.5, and 5 cm back from the tongue tip (with the tongue in a neutral configuration). The seven subjects were young adults, two females (Subjects 2 and 3) and five males. They had no history of speech, language, or hearing deficits or pronounced regional dialects. Each of the seven subjects produced 4-6 repetitions of the carrier phrase "Say ____ for me" for each of the five test utterances: /warav/, /wabrav/, /wadrav/, /wagrav/, and /wavrav/. The articulatory and acoustic data from these utterances were time-aligned to allow direct comparison between the two data types.

2.2. F3 extraction and alignment

The minimum measured F3 value during /r/ production, which can be thought of as the acoustic "center" of /r/, served as a landmark for time-alignment of the data across utterances for each speaker. Formant tracks were computed for all utterances using the ESPS/WAVES formant tracker with a 51.2 ms window and a 3.2 ms frame rate. The F3 minimum was detected using an automatic procedure that first identified all sonorant regions, then located the point of minimal F3 within the relevant sonorant regions. F3 values and transducer positions within a 140 ms time window centered at the F3 minimum were extracted.
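The alignment step just described can be sketched as follows. The function name and the synthetic F3 track are hypothetical, but the 3.2 ms frame rate and the 140 ms window centered on the F3 minimum follow the text:

```python
import numpy as np

FRAME_STEP_MS = 3.2  # formant-track frame rate reported in the paper
WINDOW_MS = 140.0    # analysis window centered on the F3 minimum

def extract_f3_window(f3_track, sonorant_mask):
    """Return (center index, F3 samples) for a WINDOW_MS window centered on
    the minimum F3 found inside the sonorant region (sketch of Sec. 2.2)."""
    f3_track = np.asarray(f3_track, dtype=float)
    sonorant = np.where(sonorant_mask)[0]
    center = sonorant[np.argmin(f3_track[sonorant])]
    # 70 ms on each side of the minimum, rounded to whole frames (22 frames).
    half = int(round((WINDOW_MS / 2) / FRAME_STEP_MS))
    lo, hi = max(0, center - half), min(len(f3_track), center + half + 1)
    return center, f3_track[lo:hi]
```

Restricting the search to sonorant frames mirrors the paper's automatic procedure, which first found sonorant regions and only then located the F3 minimum.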
Extracted F3 traces for some utterances were corrupted due to technical difficulties in automatically tracking low-amplitude and low-frequency values of F3 after stop consonants. Therefore, utterances whose F3 tracks changed by more than 200 Hz in a 3.2 ms time step were eliminated from the study, leaving 12 to 27 analyzed utterances per subject. After this elimination process, the tongue shapes at the F3 minimum of the remaining utterances were visually inspected, and two additional utterances (one each for Subjects 1 and 4) were identified as having articulations that were incorrectly labeled as /r/ by the automatic extraction process. These two utterances were also eliminated from the study.

2.3. Effects of vocal tract shape parameters on F3

The vocal tract shape for /r/ involves a palatal constriction formed by the tongue in the anterior third of the tract. Roughly speaking, the third formant frequency (F3) of /r/ is the resonance resulting from the cavities anterior to the palatal constriction (e.g., Alwan et al., 1997; Espy-Wilson, Narayanan, Boyce, and Alwan, 1997; Stevens, 1998). This part of the vocal tract consists of an acoustic compliance due to a large front cavity volume and two parallel acoustic masses due to natural tapering by the teeth/lips and the palatal constriction behind the front cavity. The resulting resonance frequency is inversely proportional to the square root of the product of the total acoustic mass and the acoustic compliance. Because it is difficult to accurately infer lip aperture from EMMA data, we focus on the effects of the acoustic mass due to the size and location of the palatal constriction. From these considerations, we conclude that the frequency of F3 can be decreased by tongue movements that lengthen the front cavity (thereby increasing the acoustic compliance of the front cavity), lengthen the constriction (thereby increasing the acoustic mass of the constriction behind the front cavity), or decrease the area of the constriction (thereby increasing the acoustic mass of the constriction)[2]. The predicted
effects of these movements on F3 were confirmed using vocal tract area functions derived from structural MRI scans of a speaker producing /r/[3]. Two area functions were derived: one representing a "bunched" /r/ configuration, and one representing a "retroflexed" /r/ configuration. Three manipulations were carried out on each area function to test the effects on F3 predicted from acoustic theory: (i) the palatal constriction was extended backward by narrowing the vocal tract area immediately behind the constriction, (ii) the front cavity was lengthened by displacing the palatal constriction backward, and (iii) the vocal tract area at the palatal constriction was decreased. For all three manipulations, an acoustic signal was synthesized (using S. Maeda's VTCALCS program; Maeda, 1990) and compared to the signal synthesized from the original area function. Each manipulation resulted in a lower F3 in both the bunched and retroflexed /r/ cases, as expected from the acoustic theory analysis. Because all three manipulations act to lower F3, subjects could maintain a relatively stable F3 despite vocal tract shape variations across contexts if these variations involved tradeoffs between the different manipulations. When looking at the different vocal tract shapes for /r/ across contexts, these tradeoffs would be manifested by correlations between constriction length, front cavity length, and constriction area.
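The lumped-element reasoning above can be made concrete with a small sketch. The element values below are arbitrary illustrative numbers (chosen only so the estimate lands near a typical /r/ F3 of roughly 1700 Hz), not quantities derived from the MRI-based area functions, and the sketch shows only the direction of each manipulation's effect, not the VTCALCS simulation used in the study.

```python
import math

def f3_estimate(mass_constriction, mass_lips, compliance_front):
    """Lumped-element resonance of the cavity anterior to the palatal
    constriction: two parallel acoustic masses loading one acoustic
    compliance, f = 1 / (2*pi*sqrt(M_total * C)). Parallel acoustic
    masses combine like parallel inductors."""
    m_total = 1.0 / (1.0 / mass_constriction + 1.0 / mass_lips)
    return 1.0 / (2.0 * math.pi * math.sqrt(m_total * compliance_front))

# Baseline with arbitrary illustrative element values (not measured data).
base = f3_estimate(mass_constriction=4000.0, mass_lips=2000.0,
                   compliance_front=6.6e-12)

# The three manipulations from the text, each expected to lower F3:
longer_constriction = f3_estimate(6000.0, 2000.0, 6.6e-12)    # longer constriction -> more acoustic mass
longer_front_cavity = f3_estimate(4000.0, 2000.0, 9.0e-12)    # longer front cavity -> more compliance
narrower_constriction = f3_estimate(8000.0, 2000.0, 6.6e-12)  # smaller area -> more acoustic mass

assert longer_constriction < base
assert longer_front_cavity < base
assert narrower_constriction < base
```

Since every manipulation enters the formula by increasing either M_total or C, each one necessarily lowers the resonance, which is why tradeoffs among them can hold F3 roughly constant.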
Specifically, the following three correlations would be expected to aid in maintaining a relatively stable F3 across utterances while allowing variations in vocal tract shape:
(1) a negative correlation between constriction length and front cavity length, since increases in constriction length and in front cavity length both act to reduce F3,
(2) a positive correlation between constriction length and constriction area, since increases in constriction length reduce F3 and decreases in constriction area reduce F3, and
(3) a positive correlation between front cavity length and constriction area, since increases in front cavity length reduce F3 and decreases in constriction area reduce F3.

2.4. Predicted articulatory covariances

To determine whether a subject uses any of the three trading relations hypothesized above, we must first describe the trading relations in terms of the x and y coordinates of the tongue transducers. For tongue configurations during /r/ production, a forward movement of the tongue front transducer generally corresponds to a shortening of the front cavity, an upward movement of the tongue front transducer generally corresponds to a decrease in the area of the palatal constriction for /r/, and, since the point of maximal constriction for /r/ is typically anterior to the tongue back transducer, an upward movement of the tongue back transducer generally corresponds to a lengthening of the palatal constriction and possibly a decrease in the area of the constriction. When determining the signs of the transducer coordinate correlations corresponding to the trading relations delineated above, we must take into account that increasing values of the tongue front horizontal position correspond to decreases in front cavity length, and increasing values of the tongue front vertical position correspond to decreases in constriction area. From these considerations, we can surmise that the three trading relation strategies described above should be evidenced by the following articulatory
correlations:
(1) a positive correlation between tongue back height and tongue front horizontal position,
(2) a negative correlation between tongue back height and tongue front height, and
(3) a positive correlation between tongue front horizontal position and tongue front height.
Note that the use of all three trading relations by a single subject is unlikely given that they impose competing constraints; i.e., if tongue back height and tongue front horizontal position are positively correlated as in relation (1), and tongue front horizontal position and tongue front height are positively correlated as in relation (3), it is very likely that tongue back height and tongue front height will also be positively correlated, thus violating relation (2).

2.5. Analysis of articulatory and acoustic variances

To quantify the combined effects of articulatory covariances on F3 variability, an analysis was performed using both acoustic and articulatory data to estimate F3 variance as a function of articulatory variances. The relationship between transducer coordinates and F3 during /r/ can be written for each speaker as follows:

F_3 = A_0 + \sum_{i=1}^{N} A_i c_i + E,    (1)

where the A_i are constants, the c_i are the transducer coordinates, N is the number of transducer coordinates considered in the analysis, and E is a residual term that accounts for the effects on F3 due to all other sources, including articulators not included in the analysis, measurement errors, and nonlinearities in the relationship between F3 and the transducer coordinates. The equation relating F3 variance to articulatory variances at each point in time is then:

\mathrm{Var}(F_3) = \sum_i A_i^2 \mathrm{Var}(c_i) + \mathrm{Var}(E) + 2\sum_{i<j} A_i A_j \mathrm{Cov}(c_i, c_j) + 2\sum_i A_i \mathrm{Cov}(c_i, E).    (2)

To determine the effects of articulatory covariances on F3 variability, we can compare the variance estimate of Equation 2 to the following variance estimate that excludes the covariances between the analyzed transducer coordinates:

\mathrm{Var}(F_3) = \sum_i A_i^2 \mathrm{Var}(c_i) + \mathrm{Var}(E) + 2\sum_i A_i \mathrm{Cov}(c_i, E).    (3)

If the F3 variance estimate in the absence of articulatory covariances (Equation 3) is significantly larger than the variance estimate including the articulatory covariances (Equation 2), we
conclude that the primary effect of the articulatory covariances is a reduction in the variance of F3. Strictly speaking, a comparison of the F3 variance estimates in Equations 2 and 3 tells us only about the effects of the covariances of the linear component of each transducer's relation to F3. However, the relationship between F3 and transducer coordinates should be linear near a particular configuration of the vocal tract, since F3 is presumably a continuous nonlinear function of the vocal tract area function, and such functions are locally linear. One would further expect that the relationship is still approximately linear for the relatively limited range of vocal tract configurations utilized by a particular subject for /r/. The linear approximations reported below captured approximately 80% of the variance when using only three pellet coordinates, providing support for the assertion that the primary effect of articulatory covariances on F3 variance can be captured by considering only the linear component of each transducer's relationship to F3. Furthermore, the sign (positive or negative) of an articulatory covariance's contribution to F3 variance depends only on the sign of the corresponding A_i terms, and we are primarily interested in the sign of the combined effects of articulatory covariances on F3 variance. The expected signs of the A_i for tongue back height, tongue front horizontal position, and tongue front height can be deduced from acoustic theory considerations (Sections 2.3 and 2.4). A_i values were estimated for each subject using multiple linear regression on the acoustic and articulatory data. As discussed in Section 3.4, all 21 estimated A_i values (3 values for each of 7 subjects) were of the sign expected from these acoustic theory considerations.

3. Results

3.1.
Temporal progression of tongue shapes

Figures 1 through 7 show sample lingual articulations used to produce /r/ in the five contexts by the seven subjects. For each context, two schematized tongue shapes and a palatal trace[4] are shown. The tongue shape schematics were formed by connecting the three tongue transducers with straight lines. The solid tongue shape corresponds to the point in time at which F3 reached its minimum value. The tongue shape 70 ms prior to the F3 minimum is indicated by dashed lines. The movement of the tongue can thus be roughly characterized as a transition from the dashed tongue shape to the solid tongue shape. This movement corresponds to the articulation toward the "acoustic center" of /r/; i.e., the portion of the movement up to the point in time of the F3 minimum. Inspection of the lingual articulations for some subjects suggests that these subjects utilize different articulatory gestures, aimed at different vocal tract shapes, to produce /r/ in different phonetic contexts. For example, the backward movement of the tongue, with a slight downward movement of the tongue blade, used by Subject 1 to produce the /r/ in /wadrav/ does not appear to be aimed at the same vocal tract shape for /r/ as the upward movements of the tongue blade used by the same subject to produce /r/ in the /warav/, /wabrav/, and /wavrav/ contexts (Figure 1). Similarly, the downward movement of the tongue blade used by Subject 2 to produce the /r/ in /wadrav/ does not appear to be aimed at the same vocal tract shape as the upward movements of the tongue blade used by the same subject to produce /r/ in /warav/, /wabrav/, or /wavrav/ (see Figure 2). Additional examples of this phenomenon can be seen in Figures 1-7. The possible relevance of these observations to theories of speech motor control will be addressed in the Discussion section.

3.2.
Tongue shapes at acoustic center of /r/

Figure 8 shows tongue configurations at the F3 minimum of /r/ for each of the seven speakers. For each utterance, the three tongue transducer positions are connected by a straight line. The tongue configurations for all repetitions in all phonetic contexts are superimposed for each speaker. Thus, the fact that different numbers of utterances were analyzed for different subjects and contexts is reflected in this figure. As previously reported elsewhere (e.g., Delattre and Freeman, 1968; Hagiwara, 1994, 1995; Ong and Stone, 1998; Westbury, Hashi, and Lindstrom, 1995), a wide range of tongue shapes is seen both within and across subjects. Also of note is the fact that, although most subjects seem to use an approximate continuum of tongue shapes (e.g., S2, S3, S6, and S7), others show a more bimodal distribution of tongue shapes (e.g., S4, S5). Finally, the tongue shapes across subjects appear to form an approximate continuum between a bunched configuration (e.g., S6) and a retroflexed configuration (e.g., S4). A more detailed indication of the effects of the different phonetic contexts on the tongue shapes for /r/ can be gained from Figure 9, which shows the average tongue shapes used by each subject in each phonetic context, coded by phonetic context. Figures 10 through 16 show the corresponding average F3 traces, starting from the point of the F3 minimum for /r/ and continuing for 70 ms, for each speaker, coded by phonetic context. With the exception of the /wadrav/ productions of Subject 2, which had considerably higher values of F3 than the other utterances for that subject, the subjects showed minimum F3 values well below 2000 Hz in all contexts, as expected from earlier studies of /r/ production. Figure 17 shows the midsagittal palatal outline (thick solid line) and mean tongue shapes at the time of the F3 minimum for /r/ for each of the seven subjects. For each subject, mean configurations from two phonetic contexts (solid and dashed lines) are shown to illustrate the range of tongue shapes used by that
subject. Tongue outlines were created by connecting the average positions of the three tongue transducers for a given utterance with a smooth curve to roughly approximate tongue shape[5]. A line was then extended downward from the tongue front transducer position, then forward to the lower incisor transducer position, to provide a rough estimate of the relative size of the front cavity across contexts[6]. Also shown in the upper left corner of this figure are two superimposed, highly schematic vocal tract outlines that illustrate trading relations for maintaining a relatively stable F3. The effect on F3 of the longer front cavity of the dashed outline, which can be roughly characterized as a retroflexed /r/, is counteracted by the effects of the longer and slightly narrower constriction of the solid outline, which can be roughly characterized as a bunched /r/. Similarly, the vocal tract outlines for all subjects indicate that shorter front cavity lengths are accompanied by a compensating increase in constriction length and/or decrease in constriction area. Furthermore, the tongue shapes during /wagrav/ (solid lines) are generally much closer in shape to the tongue shapes for /g/ than are the /r/ shapes for /wabrav/ or /warav/ (dashed lines), suggesting that subjects utilize /r/ configurations that are reached relatively easily in the current phonetic context.

3.3.
Articulatory trading relations

For each subject, Pearson correlation coefficients corresponding to the predicted covariances described in Section 2.4 were estimated across utterances at the point of the F3 minimum and are listed in Table 1. All subjects showed a significant positive correlation between tongue back height (TBY in Table 1) and tongue front horizontal position (TFX), indicative of a trading relation between constriction length and front cavity length. Six of seven subjects also showed a second strong trading relation: five subjects showed a trading relation between constriction length and constriction area, as evidenced by a negative correlation between TBY and tongue front height (TFY), and one subject showed a trading relation between front cavity length and constriction area, as evidenced by a positive correlation between TFX and TFY. One subject (Subject 7) showed only very weak correlations other than the strong trading relation between tongue back height and tongue front horizontal position.

Figures 1-3. Sample lingual articulations used by Subjects 1-3 to produce /r/ in the five phonetic contexts. For each context, two schematized tongue shapes and a palatal trace are shown. Each tongue shape schematic was formed by connecting the three tongue transducers with straight lines. The tongue shape at the F3 minimum for /r/ is drawn with solid lines. The tongue shape 70 ms prior to the F3 minimum is drawn with dashed lines.

3.4.
Analysis of acoustic and articulatory variabilities

The results in Section 3.3 indicate that most subjects exhibited two of the three hypothesized articulatory trading relations that act to reduce acoustic variability. Furthermore, as described in Section 2.4, it is unlikely or impossible for a subject to utilize all three trading relations because they counteract one another. However, it is still possible that the significant correlations that violate the trading relations could effectively "override" the beneficial articulatory tradeoffs, potentially nullifying or even reversing the effect of the utilized trading relations on acoustic variability. It is therefore necessary to estimate the net effect of all three articulatory covariances, as outlined in Section 2.5.

Figures 4-6. Sample lingual articulations used by Subjects 4-6 to produce /r/ in the five phonetic contexts. For each context, two schematized tongue shapes and a palatal trace are shown. Each tongue shape schematic was formed by connecting the three tongue transducers with straight lines. The tongue shape at the F3 minimum for /r/ is drawn with solid lines. The tongue shape 70 ms prior to the F3 minimum is drawn with dashed lines.

Figure 7. Sample lingual articulations used by Subject 7 to produce /r/ in the five phonetic contexts. For each context, two schematized tongue shapes and a palatal trace are shown. Each tongue shape schematic was formed by connecting the three tongue transducers with straight lines. The tongue shape at the F3 minimum for /r/ is drawn with solid lines. The tongue shape 70 ms prior to the F3 minimum is drawn with dashed lines.

Figure 8. Tongue configurations at the F3 minimum of /r/ for each of the seven speakers. For each utterance, the three tongue transducer positions are connected by straight lines. The tongue configurations for all repetitions in all phonetic contexts are superimposed for each speaker.

F3 variance estimates with and without covariance terms (Equations 2 and 3, respectively) were calculated using the
tongue back height, tongue front horizontal position, and tongue front height transducer coordinates. The corresponding F3 standard deviations were then averaged across subjects. The A_i values for each speaker were estimated using multiple linear regression across utterances and time bins and are provided in Table 2; the value of E for a particular time bin was simply the residual of the regression in that time bin. R^2 values for the F3 fit (without the residual term) ranged from 0.75 to 0.87 for the different subjects, with an average R^2 of 0.79. If covariances are high and the actual effect of an articulator's position on F3 is very low, the regression analysis can possibly result in estimates of transducer contributions that have the wrong sign, which could in turn cause some articulatory covariances to decrease estimated F3 variability when in reality they increase or have no significant effect on F3 variability. The fact that none of the transducer contribution estimates produced by the regression had a sign opposite to that expected from acoustic theory considerations and the MRI-based area function analysis indicates that this potential problem did not affect our results. F3 standard deviation estimates with and without covariance terms are shown in Figure 18 as a function of time starting at the F3 minimum for /r/, averaged across subjects. (Standard deviations were plotted in place of variances to produce values whose units are Hz.) Also plotted is the standard deviation obtained from measured values of F3. When articulatory covariances are included, the F3 standard deviation estimate is equal to the measured F3 standard deviation; this is as expected because of the inclusion of the residual term in the variance estimate calculations. The solid line in the figure thus represents both the measured F3 standard deviation and the estimated F3 standard deviation including the covariance terms. When articulatory covariances are removed from the estimates, the estimated F3 standard deviation increases
substantially. The dashed line in Figure 18 represents the estimated F3 standard deviation without covariances, using the three tongue transducer coordinates. According to this estimate, then, the F3 standard deviation would be 105% higher at the acoustic center of /r/ if the articulatory tradeoffs had not been present.

Figure 9. Averaged tongue configurations at the F3 minimum of /r/ for each of the seven speakers. The averaged positions of the three tongue transducers for each of the five phonetic contexts are connected by straight lines.
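The net-effect calculation of Sections 2.5 and 3.4 can be illustrated with a short sketch. The data, coefficients, and covariance structure below are synthetic stand-ins (not the study's EMMA measurements or the Table 2 values); the sketch only shows how the Equation 2 and Equation 3 estimates are compared once the A_i have been estimated by regression.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100  # synthetic "utterances"

# Synthetic stand-ins for the three transducer coordinates (TBY, TFX, TFY)
# with trading relations built in: TBY-TFX positively correlated,
# TBY-TFY negatively correlated.
tby = rng.normal(0.0, 1.0, n)
tfx = 0.7 * tby + rng.normal(0.0, 0.7, n)
tfy = -0.8 * tby + rng.normal(0.0, 0.6, n)
C = np.column_stack([tby, tfx, tfy])

# Synthetic F3 with illustrative coefficients whose signs follow the
# acoustic-theory expectations (negative for TBY and TFY, positive for TFX),
# plus residual noise standing in for E.
f3 = 1700.0 - 60.0 * tby + 40.0 * tfx - 50.0 * tfy + rng.normal(0.0, 20.0, n)

# Estimate A_0 and the A_i by multiple linear regression (Equation 1).
X = np.column_stack([np.ones(n), C])
coefs, *_ = np.linalg.lstsq(X, f3, rcond=None)
A = coefs[1:]
resid = f3 - X @ coefs

# Equation 2: variance estimate including articulatory covariances.
# (For OLS residuals, the Cov(c_i, E) terms are exactly zero.)
cov = np.cov(C, rowvar=False, ddof=0)
var_full = A @ cov @ A + resid.var(ddof=0)

# Equation 3: the same estimate with between-coordinate covariances removed.
var_nocov = (A ** 2) @ np.diag(cov) + resid.var(ddof=0)

print(np.sqrt(var_full), np.sqrt(var_nocov))
```

With the built-in trading relations, the no-covariance standard deviation comes out well above the full estimate, while the full estimate matches the sample standard deviation of the synthetic F3, mirroring the comparison plotted in Figure 18.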
How to Move Beyond "Mute English" in Writing

Below is an essay on overcoming "mute English," intended to encourage readers to express their thoughts and opinions with more confidence:

---
Breaking the Silence: Transforming Mute English Writing
In the realm of language, expression is paramount. Yet many of us find ourselves imprisoned within the confines of what can be described as "mute English": a constrained, stilted form of expression that fails to capture the vibrancy and depth of our thoughts. However, there exists a path towards liberation, a journey of transforming our muted prose into eloquent symphonies of words. Let us embark upon this transformative voyage together.

The first step in breaking the silence of mute English is to cultivate confidence in our own voices. Too often, we hesitate to express our thoughts for fear of ridicule or rejection. Yet it is only by embracing our unique perspectives that we can truly contribute meaningfully to the conversation. Like a bird learning to spread its wings and take flight, we must trust in our ability to soar above the constraints of self-doubt.

Furthermore, we must enrich our vocabulary and hone our linguistic skills. Just as a painter requires a diverse palette of colors to create a masterpiece, so too do we need a rich array of words to convey the nuances of our thoughts. By expanding our lexicon and refining our syntax, we can infuse our writing with depth and clarity, transforming it from a mere whisper into a resounding proclamation.

Moreover, we must learn to infuse our writing with passion and emotion. Language is not merely a tool for communication; it is a vehicle for conveying the depths of human experience. Whether we are writing about love, loss, or triumph, we must imbue our words with the full spectrum of human emotion, allowing our readers to connect with our writing on a profound level.

In addition, we must embrace the power of storytelling. Throughout history, stories have served as a means of transmitting culture, conveying wisdom, and fostering empathy.
By weaving narratives into our writing, we can captivate our readers' imaginations and convey complex ideas in a compelling and accessible manner.

Furthermore, we must learn to embrace feedback as a means of growth. Just as a sculptor refines their work through successive iterations, so too must we refine our writing through the process of revision. By seeking out constructive criticism and actively incorporating feedback into our writing, we can continuously improve our craft and strive towards ever greater heights of eloquence and expressiveness.

In conclusion, the journey towards breaking the silence of mute English is one of self-discovery, growth, and transformation. By cultivating confidence in our voices, enriching our vocabulary, infusing our writing with passion and emotion, embracing the power of storytelling, and embracing feedback as a means of growth, we can transcend the constraints of muted prose and unleash the full potential of our linguistic prowess. So let us raise our voices, and together, let us write a new chapter in the story of English expression.

---
MXL LARGE DIAPHRAGM CONDENSER STUDIO MICS

Studio condenser microphones with both FET and tube circuitry available. FET models require 48V phantom power, while tube models include a power supply.

ITEM / DESCRIPTION / PRICE
MXL-770: Cardioid, 6-micron 20mm gold-sputtered diaphragm FET w/shockmount and case, 30Hz-20kHz, -10dB pad and bass roll-off. 89.95
MXL-2003: Cardioid transformerless condenser w/bass cut and -10dB pad switches, 20Hz-20kHz. 169.95
PRO-PAC-PLUS+: MXL2003 & MXL603S (transformerless cardioid instrument mic), w/shockmounts, cables, windscreens in padded case. 279.95
MXL-2006: Cardioid, true Class 'A' FET preamp w/balanced transformerless output, high-isolation shockmount and case, 30Hz-20kHz. 99.95
MXL-2010: 3-pattern (cardioid/figure-8/omni) capsule, FET preamp w/bass roll-off switch, shockmount & windscreen, 30Hz-20kHz. 169.95
MXL-CUBE: Drum condenser cardioid mic, FET balanced output w/mic clip, 20Hz-20kHz. 99.95
MXL-GOLD-35: Gold-plated cardioid condenser w/shockmount, leather case, windscreen, 20Hz-20kHz. 695.00
MXL-M3: Cardioid condenser, transformer-balanced output, FET preamp, 20Hz-23kHz. 249.00
MXL-R77: Classic body figure-8-patterned ribbon condenser w/wood case, desktop stand, & 25ft XLR cable, 20Hz-18kHz. 399.99
MXL-R77-L: Same as above, but w/custom Swedish Lundahl transformer.
549.95
MXL-R144: Figure-8-patterned ribbon condenser, purple & chrome finish w/shockmount, case, cleaning cloth, 20Hz-17kHz. 99.99
MXL-TROPHY: Cardioid condenser w/side-tapped capsule & removable nameplate, 20Hz-20kHz. 199.95
MXL-V6: Cardioid mic w/balanced FET output stage, case, & clip, 20Hz-20kHz. 299.00
MXL-V67G: Transformer-balanced cardioid mic w/solid-state preamp & gold grill, 30Hz-20kHz. 119.95
MXL-V67I: Same as above but w/2 selectable 1" cardioid capsules (one bright, one warm) & -10dB pad switch. 169.95
MXL-V67Q: X/Y stereo pattern condenser mic w/gold-plated grill, 2 externally biased 22mm pressure-gradient capsules, 10ft custom XLR cable, 20Hz-20kHz. 199.95
MXL-V69-MOGAMI-EDITN: Transformerless tube cardioid condenser w/shockmount, power supply, 15ft 7-pin cable, 15ft XLR cable, cleaning cloth, & case, 20Hz-18kHz. 299.95
MXL-V69XM: Same as above, but w/transformer-balanced output, 20Hz-18kHz. 399.00
MXL-V87: Low-noise transformer-balanced cardioid condenser with FET preamp, w/pop filter & shockmount, 20Hz-20kHz. 299.95
MXL-V88: Low-noise cardioid condenser w/balanced transformerless output, FET preamp, high-isolation shockmount & flight case, 20Hz-20kHz. 199.00
MXL-V177: Low-noise classic body cardioid condenser w/6-micron thick gold-sputtered
diaphragm, 20Hz-20kHz. 299.95
REVELATION: Variable-pattern tube condenser mic w/EF86 pentode tube. 1295.00
GENESIS: Tube condenser cardioid mic w/-10dB pad, dedicated power supply. 595.00

RØDE NT5
The RØDE NT5 is a true condenser microphone; includes two stand mounts and two windscreens. Requires 48V phantom power. Dynamic range >128dB. Available in pairs, or as a single mic.

RØDE NT2A MULTI-PATTERN LARGE DIAPHRAGM CONDENSER MIC
The successor to the very popular NT-2 studio condenser mic, the NT2A now has the internally shock-mounted HF1 dual 1" capsule, the same as is used in the K2 microphone, and new electronics with extremely low 7dBA self-noise. It has a 3-position variable polar pattern for omni, cardioid, and figure-8 patterns, three positions of selectable attenuation (0dB, -5dB, or -10dB) and low-frequency roll-off (flat, 80Hz or 40Hz). The NT-2A comes with a soft pouch and mounting clip.
NT2A: Multi-pattern condenser. 399.00

A microphone with an incredible 5dBA self-noise figure. 48V phantom power, transformerless output, 137dB max. SPL, fixed cardioid pattern.

SCHOEPS COLETTE MODULAR MIC SETS
Microphones consist of a mic capsule and amplifier. The CMC 6 microphone amplifier converts extremely high-impedance signals from the attached capsule to a very low-impedance signal suitable for transmission via mic cable. It also features a symmetrical Class "A" output stage, free of coupling condensers or an output transformer. They are lightweight, have extremely low distortion, and their very low output impedance helps make them insensitive to electrical interference. The CMC 6 works with both 12V and 48V phantom powering. Matte gray options listed below, but also available in nickel.
All sets include SG20 stand clamps and B5 popscreens.

ITEM / DESCRIPTION / PRICE
CMC621G-SET: Includes CMC 6 amplifier & MK21 wide cardioid capsule. 1632.00
CMC622G-SET: Includes CMC 6 amplifier & MK22 open cardioid capsule. 1632.00
CMC64G-SET: Includes CMC 6 amplifier & MK4 cardioid capsule. 1542.00
CMC641G-SET: Includes CMC 6 amplifier & MK41 supercardioid capsule. 1743.00
CMC641CG-SET: Includes CMC 6 amplifier & MK41 supercardioid capsule, CUT 1 filter. 2320.00
CMC65G-SET: Includes CMC 6 amplifier & MK5 omni/cardioid capsule. 2059.00
CMC68G-SET: Includes CMC 6 amplifier & MK8 figure-8 capsule. 1938.00
NEAR-FIELD-OMNI-ST: Includes 2 CMC 6s, matched pair of MK2 omnis. 3150.00
UNIVERSAL-OMNI-H-ST: Includes 2 CMC 6s, matched pair of MK2 H omnis. 3214.00
UNIVERSAL-OMNI-S-ST: Includes 2 CMC 6s, matched pair of MK2 S omnis. 3150.00
WIDE-CARDIOID-ST: Includes 2 CMC 6s, matched pair of MK21 wide cardioids. 3199.00
OPEN-CARDIOID-ST: Includes 2 CMC 6s, matched pair of MK22 open cardioids. 3199.00
CARDIOID-ST: Includes 2 CMC 6s, matched pair of MK4 cardioids. 2999.00
SUPERCARDIOID-ST: Includes 2 CMC 6s, matched pair of MK41 supercardioids. 3439.00
SWITCHABLE-PATTRN-ST: Includes 2 CMC 6s, matched pair of MK5 omni/cardioids. CALL

NEUMANN KM180 SERIES
The KM180 series condensers offer the same transformerless circuitry as the KM100s without the 10dB pad. Frequency response is 20Hz-20kHz. The KM183, KM184 and KM185 have SPL handling of 140dB, 138dB and 142dB respectively.
48V phantom power is required.

ITEM / DESCRIPTION / PRICE
KM183: Omni condenser mic. 899.95
KM184: Cardioid condenser mic. 849.95
KM185-SILVER: Hypercardioid condenser mic. 899.95
EA2124A: Optional suspension mount. 279.01
WNS100: Replacement windscreen. 26.07
WS100: Optional 90mm windscreen. 28.03
SKM183-MT: Stereo mic pair, omni, w/case, black. 1699.95
SKM183-SILVER: Stereo mic pair, omni, w/case. 1699.95
SKM184-MT: Stereo mic pair, cardioid, w/case, black. 1599.95
SKM184-NI: Stereo mic pair, cardioid, w/case, nickel. 1599.95

NEUMANN U87AI 3-PATTERN
A legendary FET large diaphragm transformer-coupled studio mic with omni, cardioid and figure-8 patterns.

MXL SMALL DIAPHRAGM CONDENSER MICS
These small diaphragm condenser mics are quiet, sensitive, rugged enough for the road, and inexpensive. They are an excellent choice for drum overheads and for close recording of acoustic instruments. 48V phantom power required.

ITEM / DESCRIPTION / PRICE
MXL-603PAIR: Stereo pair of transformerless cardioid instrument mics, 30Hz-20kHz, w/MXL-41-603 shockmounts & case. 199.00
MXL-604: Small diaphragm cardioid condenser, 30Hz-20kHz, w/wooden case, mounting clip and omnidirectional capsule. 83.93
MXL-V67N: Small diaphragm instrument condenser, omni/cardioid capsules, 20Hz-20kHz. 119.95
MXL-41-603: Replacement shockmount for MXL-603PAIR. 38.95

NEUMANN TLM102
This mic handles a sound pressure level of 144dB, which is good for recording amps, drums, and other loud instruments.
Acoustic instruments willbenefit from the TLM102’s fast transientresponse. The TLM102 also features a slight boostNEUMANN TLM103/TLM103D These are large diaphragm cardioid mics with a transformerless output stage. Designed primarily for vocals, the TLM03 also excels on piano, percussion, and stringed instru-ments. The TLM103D is a digital version of the mic, designed for home recording and project studios. Its A/D converter receives the output signal directly from the capsule, ensuring that the signal has no coloration and superb transparency. Both mics require 48V phantom power and come with We service many of the major brands that we carry.Call our Authorized Repair Department at ext. 1172Honesty and Valuesince 197173Thisrugged system has separate capsules and powering modulesthat can be combined to produce a wide variety of microphones.A single module and a few capsules can provide the user withflexibility that would otherwise require investing in a numberof individual microphones. It converts quickly from one type ofmicrophone to another by simply threading together varioussystem components. All capsules use back-electret technologyfor uncompromised quality. Output of all powering modules isbalanced, low impedance (200ohms) and terminates in a stan-“Universal” powering module for the sys-tem. Powered by a single 1.5V “AA” battery with a life of 150 hours, or phantom power SHURE KSM42 & KSM44A LARGE Premium side-address mics offering a dual-diaphragm design with an active front, ultra thin (2.5 micron) 24-karat gold, low mass, 1" Mylar® diaphragm. The KSM42 is ideal for vocal applications with a tailored frequency nal pop filtering. The single-pattern design exhibits smooth proximity control & ultra-wide dynamic range. An internal shock KSM42KSM44A SSED SINGLE-DIAPHRAGM For studio recording and live sound production. The KSM32 offers an extended frequency response of 20Hz-20kHz for an open, natural sound. 
The Class “A” transformerless preamp circuitry eliminates cross-over distortion for improved linearity across the full frequency range. mass diaphragm provides extended low frequency response and excellent transient microphones feature 20Hz-20kHz frequency response, a 3-position pad switch, a 3-position LF filter, and include a windscreen and carrying case. The KSM137 is a single pattern cardioid mic, while the KSM141 is a dual ...............................665.00 ...............................870.00 PRICE NDENSER The e614 has a super-cardioid pattern, with a neutral response and moderate sensitivity, which insures optimum isolation from other instruments on stage. Unobtrusive for precise place-ment and powered by external phantom power (P48), the e614 tolerates extremely high SPLs, precisely capturing cymbals and hi-hats better than other mics in its price range. 10-year warranty. Frequency response is 40Hz-20KHz.PRICE A large diaphragm side-address microphone with a car-dioid pickup pattern and a nickel colored finish. It has a large 1" diaphragm precisely sputtered with 24-carat gold. It has a sturdy metal housing and the elastically mounted capsule suppresses handling noise. It has a max SPL level of 140dB and a low self-noise level of 10dB(A). It has a frequency response of 20Hz-20kHz, a sensitivity rating of 25mV/Pa and a dynamic range ING SOON!99.9574at home in sound reinforcement systems or in sound studios and motion picture/ TV scoring stages. When used with the optional windscreen, the SM94 can be used by vocalists and speechmakers who desire a wide, flat z and a uniform SHURE PG27LC & PG42LC SIDE ADDRESSThese mics feature acardioid pickup pattern, large diaphragm capsules withswitchable attenuators. The PG27LC is designed for usewith instruments. 
It has a flat, neutral frequency responsefeatures a voice-tailored frequency response of 20Hz-20kHz as well as an SPL rating PG42LC PG27LC NDENSER ect choice for y demo tapes, les om pow-STC2OMEGA ORPHEUS STC1HELIOS STC10STC2X STC20SATURN studio environments. Features interchangeable capsules for superior versatility. The small diaphragm design provides superior audio with con-sistent, textbook polar responses and is small enough to use in the tightest conditions. The compact side-address design features an innovative lock-ing ring to provide secure connection between capsule and preamplifier. The 20Hz-20kHz frequency response is tailored for wide dynamic range applications for use in high SPL environments. Ships with stand adapter, LET DESIGN BLACK KNIGHT ID PRICE VISIT OR CALL 800-356-5844 FOR MORE INFO We have the largest selection of hard to find items. Call us!Honesty and Valuesince 197176VIOLET DESIGN PEARL SERIES CARDIOID CONDENSER These mics have a rear-ported spherical head and nar-row tapered form reflector to reduce internal resonances and optimize the cardioid polar pattern. The Pearl Standard uses a medium single-diaphragm to provide a modern,airy and detailed sound while the Grand Pearl uses a largedual-diaphragm, providing the warm, classic sound usuallyassociated with vintage studio microphones. Pearl Vocal is ahandheld stage vocal mic, and has special shockmounting forreduced handling noise, a special mid-sized capsule designedfor high gain before feedback with extended flat LF responseand outstanding dynamic range. All models include wood box and stand clip.NES These are side-addressed, cardioid, large dual diaphragm vacuum-tube studio microphones in 3 variations, with acous-tically-transparent head construction and internal shock-mounts. The 6267 vacuum tube preamplifier is a Class “A” fully-discrete circuit with massive heatsink, very high output, flat audio-response and ultra-low noise and distortion. 
The large custom-wound Permalloy humbucking audio trans-former adds even more “analog warmth” to the sound with wide frequency response and minimum saturation. ComesVIOLET DESIGN AMETHYST CONDENSER STUDIO MICSThe Amethyst cardioid studio microphones use fully-discrete,phantom-powered Class “A” transformerless amplifiers. TheAmethyst Standard uses a single diaphragm center-terminatedcapsule which provides modern, detailed sound with airy highs,medium vocal presence and accentuated low-end response.The Amethyst Vintage uses a dual-diaphragm, dual-backplatecapsule which provides a classic, wide-spectrum, vintage micro-phone sound. Both models have a uni-directional cardioid polarpattern with minimal proximity effect and are linear over the wide frontal incidence NDENSER MICS The Flamingo Junior series are side-addressed studio microphones that use a single-diaphragm, cardioid electrostatic transducer which provides vintage tone with transparent highs, optimized vocal presence and fundamental low-frequency register with minimal proximity effect. An integrated capsule shockmount within the head and an included external compact studio shock-mount reduce stand rumble and mechanical shocks. The custom-wound large humbucking audio transformer adds “analog warmth” to the sound. Comes - Globe features transformerless Class “A” discreet electronics, internal shockmounts for both electronics and capsule, and a spherical and acoustically transparent head design. The Globe Standard has a classic wide-spectrum sound ideal for orchestral applications. The Globe Vintage uses a different dual-diaphragm capsule providing a sweet and warm, vin-tage vocal microphone sound. Comes in a wooden box with shockmount.GLOBE-STANDARD An electrostatic multi-pattern studio mic with a vintage sound, warmed by vacuum tube circuitry and an output transformer. Features specially designed large dual-diaphragm transducer, and 9 different polar patterns. 
Its Class “A” fully-discrete internal vacuum tube preamp provides high output, low impedance and low distortion. A large humbucker, custom-wound audio transformer balances the microphone’s output, providing additional analog warmth to the sound. Soft starting, smart power supply and external elastic This large dual-diaphragm cardioid con-denser studio microphone has a warmer smooth tone with very high output and low noise. The fast transient response, crystal clear highs and loud SPL handling make it excellent for recording drums. The multi-layered wedge shaped brass mesh head design effectively reduces plosive sounds and breath, pop or wind noises. The internal solid state preamplifier is a Class“A” fully discrete transformerless circuit, and runs on 48V phantom power.PRICE 679.0099.00ID rating an integrated, tapered reflector which reduces resonances, removes reflections and tern. Integrated damping of the transducer reduces stand rumble, outside infrasonic FRR A compact electrostatic mic preamp body for use with vintage German-made M-series inter-changeable capsule heads, as well as B-series and VIN-seriesinterchangeable capsule heads made by JZ in Latvia. The elec-tronics are phantom powered, with a linear Class “A” discretesolid state transformerless design to provide extremely low self-noise and distortion. The Global Pre provides a low-impedance,balanced output on a gold-plated 3-pin XLR. It comes in a woodenVIN47GLOBAL-PREM COMPATIBLE。
I.J. Intelligent Systems and Applications, 2016, 2, 45-52
Published Online February 2016 in MECS (/)
DOI: 10.5815/ijisa.2016.02.06

Bandwidth Extension of Speech Signals: A Comprehensive Review

N. Prasad
National Institute of Technology Warangal, Warangal-506004, India
E-mail: prasad.niz@

T. Kishore Kumar
National Institute of Technology Warangal, Warangal-506004, India
E-mail: kishorefr@

Abstract—Telephone systems commonly transmit narrowband (NB) speech with an audio bandwidth limited to the traditional telephone band of 300-3400 Hz. To improve the quality and intelligibility of speech degraded by the narrow bandwidth, researchers have tried to standardize the telephone networks by introducing wideband (50-7000 Hz) speech codecs. Wideband (WB) speech transmission requires the transmission network and the terminal devices at both ends to be upgraded to wideband, which turns out to be time-consuming. In this situation, novel bandwidth extension (BWE) techniques have been developed to overcome the limitations of NB speech. This paper discusses the basic principles, realization, and applications of BWE. Challenges and limitations of BWE are also addressed.

Index Terms—Speech bandwidth extension, data hiding, model-based techniques, non-model-based techniques, speech quality, speech intelligibility, wideband speech coding.

I. INTRODUCTION

Due to historical and economic reasons, the audio frequency range of telephone speech is typically limited to approximately the traditional telephone band of 300–3400 Hz. Mobile users experience a muffled speech sound due to this narrow audio bandwidth [1]. The lack of low frequencies results in a thin voice and degrades naturalness and presence, whereas the absence of high frequencies especially deteriorates fricative sounds such as [s] and [f] and causes the characteristic muffled speech quality. This constraint limits the quality of speech received by the listener on either side of the connection.
Furthermore, some of the speaker-specific characteristics of speech are lost.

The technology and standardized methods exist for the transmission of speech of a wider bandwidth. Wideband speech typically refers to an acoustic bandwidth of 50–7000 Hz, which gives substantially better speech quality with a brighter and fuller sound compared to narrowband speech. Until recently, the deployment of wideband speech transmission has been slow, and wideband speech has mainly been used in specific applications such as video conferencing and internet telephony. WB speech transmission requires the transmission network and the terminal devices at both ends to be upgraded to wideband, which is costly for all users across the communication system. Presently, the majority of network operators offer wideband speech services, but the required upgrade of the network and terminal devices at both ends results in a prolonged changeover time.

To improve the quality during the changeover, novel BWE techniques have been developed using digital signal processing. One such method is artificial bandwidth extension (ABWE), in which the missing spectrum is estimated from the narrowband signal. Alternatively, additional information about the missing frequency regions can be transmitted as side information to support the bandwidth extension. In general, this provides better output quality than artificial bandwidth extension, but the transmission of side information requires special arrangements that may be difficult to implement in practice.

The speech bandwidth can be extended to frequencies lower than the narrow band, higher than it, or both [2]. The missing frequency band above the narrow band typically ranges from 3.4 kHz (or 4 kHz) up to 7 kHz (or 8 kHz) and is referred to as the highband (HB).
The missing frequency band below the narrow band typically covers frequencies below 300 Hz and is referred to as the lowband (LB). Bandwidth extension methods towards high frequencies create new content in the frequency band from 3.4 kHz (or 4 kHz) to 7 kHz (or 8 kHz). There are also bandwidth extension methods towards low frequencies, 50-300 Hz. The LB affects the quality and naturalness of speech but has only a minor influence on intelligibility, whereas the HB influences both speech intelligibility and quality.

Bandwidth extension can be implemented in the frequency domain or in the time domain. Frequency domain processing commonly computes the spectrum of the input frame using the fast Fourier transform (FFT), constructs the spectrum of the frame in the spectral domain, and applies the inverse FFT to produce the corresponding time domain signal. Time domain processing typically involves adaptive filters or a filter bank for shaping the extension band spectrum. Frequency domain processing has the benefit of accessing the spectral representation directly, whereas time domain processing has the benefit of a lower overall delay and avoids potential degradations due to the limited accuracy of FFT computation, especially on fixed-point platforms.

The paper is organized as follows. Sect. II discusses ABWE techniques. Sect. III presents the limitations of the ABWE approach. Sect. IV describes BWE with additional information. BWE with embedded additional information is described in Sect. V. Sect. VI presents the challenges of BWE with embedded additional information. Sect. VII presents the applications of BWE techniques. Finally, Sect. VIII presents the concluding remarks.

II. ARTIFICIAL BANDWIDTH EXTENSION

The ABWE methods can be classified into the following categories:
• Model-based techniques
• Non-model-based techniques
A. Model-Based Techniques

A generalized model for ABWE proposed in the literature is shown in Fig. 1. First, the NB signal is upsampled through interpolation, and a feature set is computed from the upsampled NB signal. The extracted feature set is then utilized to estimate parameters for highband shaping. Linear predictive coding (LPC) analysis filtering of the interpolated narrowband signal yields the NB LPC residual signal. The extended excitation signal is obtained by extending the NB LPC residual signal to the missing frequency range (LB and HB), and the LB and HB signals are obtained by shaping this excitation with the estimated shaping parameters. Finally, the output speech signal combines the bandlimited (NB) signal with the artificially generated missing-frequency-range signals.

Fig.1. ABWE according to [1]

The model-based techniques are based on a source-filter model of speech production. This model consists of two parts, an excitation signal and a vocal tract filter. Based on this model, ABWE is divided into two individual tasks:
1. Spectral envelope extension
2. Excitation signal extension

1. Spectral Envelope Extension

The extension of the spectral envelope from the narrowband to the highband is usually made based on some model that maps a narrowband feature vector onto the wideband envelope. The feature vector may consist of features that describe the shape of the envelope directly or other features that are related to the temporal waveform. The existing techniques [2, 3] for estimating the highband spectral envelope are discussed here.

1.1 Codebook Mapping

The codebook mapping techniques make use of two codebooks that are constructed in a training phase utilizing vector quantization to find a representative set of narrowband and wideband speech frames.
A wideband codebook contains a number of wideband spectral envelopes that are parameterized and stored as, e.g., LPC coefficients or line spectral frequencies (LSFs). A narrowband codebook contains the parameters of the corresponding narrowband speech frames. When the bandwidth extension system is used, the best matching entry for each narrowband input frame is identified in the narrowband codebook, and the spectral envelope of the extension band is generated using the corresponding entry in the wideband codebook.

This technique was proposed in [4] and presented with refinements in [5]. Separate codebooks were constructed for unvoiced and voiced fricatives in [5], which enhances the quality of the spectral envelope. In addition, codebook mapping with interpolation was reported to enhance the extension quality in [6]. Furthermore, in [7], codebook mapping with memory was implemented by interpolating the current envelope estimate with the envelope estimate of the previous frame.

1.2 Linear Mapping

Linear mapping is utilized in [8] to estimate the highband spectral envelope. The linear mapping between the input and output parameters (LPCs or LSFs) is denoted as

Y = WX (1)

where the NB parameters are denoted by a vector X = [x1, x2, ..., xn], the corresponding wideband envelope to be estimated is denoted by another vector Y = [y1, y2, ..., ym], and the matrix W is obtained through an off-line training procedure with the least-squares approach that minimizes the model error (Y − WX) over a training data set.

To better reflect the non-linear relationship between the NB and HB envelopes, modifications of the basic linear mapping technique have been presented. The mapping can be implemented by several matrices instead of a single mapping matrix. In [9], the input data are clustered into four clusters, and a separate mapping matrix is created for every cluster.
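To make Eq. (1) concrete, the mapping matrix W can be estimated off-line with ordinary least squares. The sketch below uses synthetic data with hypothetical feature dimensions (10 NB parameters, 18 WB parameters); it is an illustration of the technique, not the training setup of [8]:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: 10 NB envelope parameters per frame,
# 18 WB envelope parameters to estimate, 500 training frames.
n_frames, n_nb, n_wb = 500, 10, 18

X = rng.standard_normal((n_frames, n_nb))            # NB features (one frame per row)
W_true = rng.standard_normal((n_nb, n_wb))           # "ground truth" mapping for the demo
Y = X @ W_true + 0.01 * rng.standard_normal((n_frames, n_wb))  # noisy WB targets

# Off-line training: least-squares solution of Y ≈ X W,
# i.e. the row-wise form of Eq. (1), minimizing ||Y - X W||^2.
W, _, _, _ = np.linalg.lstsq(X, Y, rcond=None)

# Run time: estimate the WB envelope parameters for a new NB frame.
x_new = rng.standard_normal(n_nb)
y_hat = x_new @ W
print(y_hat.shape)  # (18,)
```

Cluster-wise variants such as [9] train one such matrix per cluster of the NB feature space and select the matrix by nearest cluster at run time.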
In [8], an objective analysis showed that the spectral distortion (SD) for codebook mapping and neural networks was larger than for piecewise linear mapping. On the other hand, the objective comparison given in [6] indicates that the performance of codebook mapping is better than that of linear mapping.

1.3 Gaussian Mixture Model

In linear mapping, only linear dependencies between the NB spectral envelope and the HB envelope are exploited. Non-linear dependencies can be included in the statistical model by utilizing a Gaussian mixture model (GMM). A GMM approximates a probability density function as a sum of several multivariate Gaussian distributions, a mixture of Gaussians. GMMs are used in ABWE to model the joint probability distribution of the parametric representations of the NB input and the extension band. Given the input feature vector of a speech frame, the parameters representing the extension band envelope are determined using an error estimation rule.

The GMM is utilized directly in envelope extension to estimate WB LPC coefficients or LSFs from the corresponding NB parameters [10]. The performance of GMM-based spectral envelope extension was then enhanced by using Mel frequency cepstral coefficients (MFCCs) instead of LPC coefficients [11]. GMM mapping with memory further improves performance in terms of log spectral distortion (LSD) and perceptual evaluation of speech quality (PESQ) [12].

The advantage of the GMM in envelope extension methods is that it offers a continuous mapping from NB to WB features, compared to the discrete acoustic space resulting from vector quantization (VQ). Better results were reported for GMM-based methods compared to codebook mapping in [10] in terms of SD, cepstral distance, and a paired subjective comparison.

1.4 Hidden Markov Model

A hidden Markov model (HMM) can be used to estimate the highband spectral envelope. Each state of the HMM represents a typical extension band spectral envelope.
The model also contains state probabilities, transition probabilities between the states, and state-specific models for the probability density function of the input feature vectors. Each probability density function is approximated with a GMM. Given a sequence of input features, the a posteriori probability of each state is calculated from the observation probabilities and the state transition probabilities. The parameters representing the extension band envelope are determined using an error estimation rule. In particular, information in preceding frames is taken into account by the state transitions.

The advantage of the HMM is its capability of implicitly exploiting information from preceding frames to improve the estimation quality. Since each state of the HMM represents a typical highband spectral envelope, a change of HMM state corresponds to a change of the speech envelope. An HMM offers better results at a lower order than a GMM of even higher order. HMM-based ABWE was proposed in [13]. Training of the HMM aided by phonetic transcriptions was introduced in [14]. The Baum-Welch training algorithm was utilized in [15] and shown to outperform the approach presented in [13].

1.5 Neural Networks

Neural networks were proposed in [16] to estimate the highband spectral envelope. The principle of an artificial neural network is inspired by the mechanisms of neural processing in the brain. An estimation task is performed by a large number of interconnected neurons, each of which computes a single output from several inputs by applying a simple linear or nonlinear function to the weighted sum of the inputs. The neurons are typically arranged in a layered structure where each neuron receives inputs from the previous layer and provides outputs to the following layer. The neural network is usually composed of a small number of such layers.
Networks with only forward connections between layers are called feedforward networks, whereas networks containing feedback loops to preceding layers are known as recurrent networks. The training of a neural network involves finding suitable weights for the connections between the neurons in a fixed network topology. Neural networks are commonly trained using a large set of training data and the back-propagation algorithm. In [17], a neural network technique was compared with a codebook mapping technique and attained better cepstral distortion and log spectral distortion. The computational complexity of codebook methods is higher than that of neural network methods.

2. Excitation Signal Extension

The existing techniques [2, 3] for excitation signal extension are discussed here.

2.1 Spectral Folding

Spectral folding is a simple way to extend the excitation signal of a narrowband speech signal from the frequency range 0–4 kHz to 0–8 kHz. In spectral folding, a mirror image of the bandlimited (NB) signal spectrum is generated in the missing frequency range [18]. It has the same effect as upsampling the signal by two. Excitation extension can be implemented by mirroring the FFT coefficients in the frequency domain or by inserting a zero sample after each input sample in the time domain. Spectral folding, which preserves the input spectrum, is computationally simple, and the resulting temporal structure in the highband follows that of the input signal. However, the spectral folding technique has some drawbacks [19]. Due to the limited telephone bandwidth of 300–3400 Hz, there is a spectral gap at around 4 kHz. Also, the harmonic structure is not preserved in the highband.

2.2 Modulation Technique

A translated copy of the upsampled NB spectrum can be produced in the extension band. High-pass filtering is used to prevent overlap between the two spectra (the NB spectrum and the translated spectrum).
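The time-domain zero-insertion form of spectral folding described in Section 2.1 can be sketched in a few lines; a pure tone stands in for the NB excitation, and the 8 kHz/16 kHz sample rates are the usual choices (an illustration, not code from [18]):

```python
import numpy as np

fs_nb = 8000                                  # narrowband sample rate
t = np.arange(fs_nb) / fs_nb                  # one second of signal
nb = np.sin(2 * np.pi * 1000 * t)             # 1 kHz tone as a stand-in for NB excitation

# Insert a zero after each sample: the signal is now at 16 kHz and its
# spectrum contains the original content plus a mirror image around 4 kHz.
folded = np.zeros(2 * len(nb))
folded[::2] = nb

# Locate the two strongest spectral components of the folded signal.
spec = np.abs(np.fft.rfft(folded))
freqs = np.fft.rfftfreq(len(folded), d=1 / (2 * fs_nb))
peaks = sorted(int(round(f)) for f in freqs[np.argsort(spec)[-2:]])
print(peaks)  # [1000, 7000]: the tone and its mirror image at 8000 - 1000 Hz
```

Applied to a real NB excitation, the same operation mirrors 300–3400 Hz content into 4600–7700 Hz, which is why a spectral gap remains around 4 kHz.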
Spectral translation with a modulation frequency equal to 4 kHz is described in [18]. Other modulation frequencies can also be used and allow, e.g., filling the gap in the spectrum at 4 kHz [13]. A drawback of modulation techniques with a fixed frequency is that the harmonic structure is not preserved in the HB.

2.3 Pitch Adaptive Modulation

An adaptive modulation frequency can be used to preserve the harmonic structure of the excitation signal. The modulation frequency should be a multiple of the fundamental frequency of speech (pitch). Pitch adaptive modulation therefore requires an accurate estimate of the fundamental frequency. Pitch detection is a non-trivial task for which there are several basic methods and a number of modifications proposed to improve robustness and accuracy. For example, an advanced pitch estimation algorithm was proposed in [19]. The benefit of the pitch-adaptive modulation approach has been found to be small compared to the complexity of the technique.

2.4 Sinusoidal Synthesis

The harmonic structure of the excitation signal can be preserved by using sinusoidal synthesis [20]. The harmonic structure for the HB is produced by a bank of oscillators with amplitudes, frequencies, and phases that are derived from the NB speech. Spectral flattening is not needed in sinusoidal synthesis, since the spectral envelope can be created directly through the sinusoidal amplitudes. Sinusoidal synthesis is especially suitable for excitation extension to low frequencies below 300 Hz, because the lowband signal primarily consists of a small number of harmonics of the fundamental frequency and the ear is sensitive to the harmonic components in this frequency range.

2.5 Non-Linear Processing

The excitation signal can be extended by applying a nonlinear function to the narrowband excitation. Typical non-linear functions include half-wave rectification, full-wave rectification, a quadratic function, and a cubic function.
Nonlinear processing has the advantage of maintaining the harmonic structure of the excitation signal, i.e., nonlinear processing of a periodic signal generates a spectrum with spectral peaks at integer multiples of the fundamental frequency. However, due to the non-linearity, the energy level generated in the extension band is difficult to control, and subsequent energy normalization is required [17]. Nonlinear functions are often used for the low-frequency extension because they generate a harmonic spectrum, which is perceptually important at low frequencies.

2.6 Noise Modulation

Extending the excitation signal by noise modulation is motivated by the fact that the harmonic structure becomes more noise-like above 4 kHz. Furthermore, above 4 kHz the frequency resolution of the human ear is poor and the pitch harmonics are not resolved individually; instead, the pitch periodicity is conveyed by the temporal envelope of the band-pass speech signal [21]. Therefore, the harmonic spectrum does not need to be reproduced, and the excitation can be extended by modulating HB noise with the temporal envelope of the band-pass speech signal (2.5–3 kHz) [22].

2.7 Noise Excitation

A noise signal has also been used as an excitation in combination with other techniques to avoid an overly periodic excitation at high frequencies [23] or to provide an excitation for unvoiced speech sounds [24]. For the extension from the wideband frequency range to the super-wideband range, a noise excitation may be sufficient, especially if the temporal envelope of the excitation is adjusted.

2.8 Voice Source Modeling

In [25], it was proposed to estimate the wideband voice source signal from the narrowband signal in order to extend the excitation of voiced speech. The technique was found to be especially effective in the low frequency range. Furthermore, the bandwidth extension layer in the ITU-T G.729.1 codec utilizes a lookup table of glottal pulse shapes to reconstruct the excitation for voiced speech.
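The harmonic-generating behaviour of the non-linear processing in Section 2.5, and the need for the energy normalization mentioned there, can be demonstrated with half-wave rectification of a periodic signal (the 200 Hz tone and 16 kHz sample rate are illustrative choices only):

```python
import numpy as np

fs, f0 = 16000, 200                      # sample rate and fundamental (illustrative)
t = np.arange(fs) / fs                   # one second of signal
excitation = np.sin(2 * np.pi * f0 * t)  # periodic stand-in for the NB excitation

rectified = np.maximum(excitation, 0.0)  # half-wave rectification (a non-linear function)

# The non-linearity changes the signal level, so normalize the energy.
rectified *= np.sqrt(np.mean(excitation**2) / np.mean(rectified**2))

# The rectified spectrum has peaks only at multiples of f0 (plus DC).
spec = np.abs(np.fft.rfft(rectified))
freqs = np.fft.rfftfreq(fs, d=1 / fs)
strong = [int(f) for f in freqs[spec > 0.05 * spec.max()]]
print(strong)  # [0, 200, 400, 800]: DC, the fundamental and even harmonics
```

In a real system the extended band would then be high-pass filtered and shaped by the estimated HB envelope; for a general periodic excitation the non-linearity produces energy at integer multiples of the fundamental.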
B. Non-Model-Based Techniques

BWE without making use of a source-filter model has also been proposed in the literature. In [26], the speech signal was modelled as the sum of N amplitude and frequency modulation (AM-FM) signals, and the lost frequencies are determined using an error estimation rule. In [27], a method based on convolutive non-negative matrix factorization was proposed; a set of wideband non-negative bases is determined using recordings of original (WB) speech and bandlimited (NB) speech from a target talker. In [28], a method was proposed that maps each telephony band speech feature to the same feature of the missing frequency band and uses a model for the missing frequency band; using the statistical characteristics of the missing frequencies, the correlation between telephony band speech features and the missing frequency band model is determined. In [29], a method operating in the modified discrete cosine transform (MDCT) domain was proposed.

III. LIMITATIONS OF ABWE

Though informal listening tests [30] show a preference for ABWE systems over conventional NB telephony, their performance is inferior to that of original WB speech. The best results for ABWE have been obtained for systems trained for an individual speaker and for a specific language, but in both cases the quality of the ABWE technique is not better than original WB speech. An information-theoretic measure of the correlation between the telephony band and the HB supports this observation [31].

IV. BWE WITH ADDITIONAL INFORMATION

In BWE techniques utilizing a small amount of additional transmitted information about the missing frequency range, the process is known as BWE with additional (side) information [32, 33]. The additional information transmitted over a separate channel enables a more accurate estimation of the missing frequencies than ABWE. A generalized model for BWE with additional information is illustrated in Fig. 2.
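As a toy illustration of side information (a hypothetical scheme, not one from [32, 33]), the encoder below spends 5 bits on a uniformly quantized highband gain; the decoder recovers it with a guaranteed accuracy that no blind estimate can match:

```python
import numpy as np

def quantize_uniform(x, lo, hi, bits):
    """Map x in [lo, hi] to an integer index of the given bit width."""
    levels = 2 ** bits - 1
    return int(np.clip(round((x - lo) / (hi - lo) * levels), 0, levels))

def dequantize_uniform(idx, lo, hi, bits):
    """Reconstruct the value represented by a quantization index."""
    levels = 2 ** bits - 1
    return lo + idx / levels * (hi - lo)

# Encoder side: 5 bits of side information for a hypothetical HB gain in dB.
hb_gain_db = -12.3
idx = quantize_uniform(hb_gain_db, lo=-40.0, hi=0.0, bits=5)

# Decoder side: dequantize the transmitted index.
recovered = dequantize_uniform(idx, lo=-40.0, hi=0.0, bits=5)
print(idx, round(recovered, 2))  # 21 -12.9
```

The step size here is 40/31 ≈ 1.3 dB; a small per-frame bit budget of this kind (cf. the 23 bits per frame of [36]) can carry a gain plus a coarse envelope description.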
The additional information, when transmitted through a separate channel, compromises the backward compatibility of the communication network. The backward compatibility problem can be solved by embedding the additional information as a digital watermark into the NB speech.

Fig.2. BWE with additional information

V. BWE WITH EMBEDDED ADDITIONAL INFORMATION

A backward compatible WB speech communication system can be achieved using BWE with watermarking technology. Digital watermarking is the art of hiding data in a host signal. The system architecture for BWE with embedded additional information transmission is shown in Fig. 3. Here, after passing through a low-pass filter, the speech signal is downsampled through decimation. The downsampled signal is the host signal for the watermark. In parallel, the BWE encoder analyzes the missing frequency band (HB) spectral envelope of the wideband input signal, determines a set of corresponding parameters, and produces the BWE information. This BWE information is the input to the watermarking encoder, which encodes it and adds it to the host signal.

Fig.3. BWE with embedded additional information transmission according to [34]

The system architecture for BWE with embedded additional information reception is shown in Fig. 4. Here, the watermarked NB speech signal is decoded by the NB decoder, and the telephony band (NB) signal is upsampled through interpolation. In parallel, the watermark detector reconstructs the hidden message (the BWE information) by analyzing the signal mixture. The BWE information is decoded to produce the linear prediction filter coefficients and gain factor of the missing frequency band (HB) signal. The WB signal is produced by adding the missing frequency band (HB) signal to the telephony band (NB) signal.

Fig.4.
BWE with embedded additional information reception according to [34]

If interoperability with existing standard codecs is required, data transmission for BWE with embedded additional information can be accomplished in three different ways: by embedding the additional (side) information in the samples of the speech signal itself, by modifying the encoded bit stream, or, most efficiently, by joint coding and data hiding within the encoder. All these approaches maintain interoperability with the existing networks and codecs but may cause a minor degradation in speech quality. The signal domain approach is problematic because it is not sufficiently robust when coding with code excited linear prediction (CELP) codecs. The third approach of including the data hiding within the encoding process provides the best results but requires codec-dependent processing.

In [35], according to the auditory masking principle, the imperceptible components of the NB signal can be removed, and this removal process provides a concealed transmission medium. The audible components outside the telephone bandwidth are embedded into this hidden channel using a spread spectrum technique and then transmitted. The hidden signal can be extracted at the receiver, resulting in a perceptually better speech signal.

In [36], a frame of the missing frequency band (HB) signal is encoded into 23 bits, and these bits are embedded unnoticeably into 80 bits of the telephone band (NB) speech bit stream. The distortion introduced by the 23 bits deteriorates the NB speech quality. In [37], a frame of the missing frequency band (HB) signal is encoded into 12 bits using acoustic phonetic classification, resulting in significantly better NB speech quality than in [36].

In [38], the missing frequency band (HB) component parameters are extracted, compressed, and embedded unnoticeably into the telephone band (NB) speech bit stream. The missing frequency band (HB) components are extracted in the decoder.
Finally, the output speech signal contains the bandlimited (NB) signal and an artificially generated HB signal.

In [39], the signal spectrum is divided into sub bands, and the data embedding parameters are determined for each adaptively selected sub band. Following the principles of the scalar Costa scheme (SCS), the discrete Hartley transform (DHT) coefficients are altered in order to embed the data.

VI. CHALLENGES OF BWE WITH EMBEDDING ADDITIONAL INFORMATION

One of the challenges of a watermarking system is to provide a virtually inaudible watermark at a high data rate. For consistent results, the embedded additional information requires a data rate of at least 100–300 bits/sec to convey the parameters of every frame of the input speech signal; higher data rates produce better results. Another challenge is that the watermarking system must be robust enough to withstand the strong signal distortion introduced by the transmission medium.

VII. APPLICATIONS

The most important application of speech BWE is to improve the quality and intelligibility of telephone speech. Speech BWE can be utilized in a hands-free system for the car environment [40]. The recognition of WB speech is easier for cochlear implant users than the recognition of NB telephone speech [41]. In [42], it was shown that highband BWE brings a significant enhancement in speech recognition for cochlear implant users. Another application of speech BWE is to enhance WB speech: in [43], a method was proposed in which traditional speech enhancement methods [44] process the NB while the HB is reconstructed with BWE.

Narrowband speech features can also be bandwidth extended for automatic speech recognition (ASR). In [45], it was shown that BWE of ASR features can be used to train wideband ASR systems with narrowband speech, which is beneficial when the amount of WB training speech is insufficient. In this application, no audio signal needs to be generated from the bandwidth extended feature representation.
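The coefficient-domain embedding of [39] discussed above can be sketched with quantization index modulation (QIM), the special case of the scalar Costa scheme with alpha = 1, applied to DHT coefficients. The adaptive sub-band selection of [39] is omitted, and the coefficient indices and step size below are illustrative assumptions.

```python
import math

def dht(x):
    """Discrete Hartley transform (O(N^2), fine for a sketch)."""
    n_len = len(x)
    return [sum(x[n] * (math.cos(2 * math.pi * n * k / n_len)
                        + math.sin(2 * math.pi * n * k / n_len))
                for n in range(n_len)) for k in range(n_len)]

def idht(h):
    """The DHT is self-inverse up to a factor of 1/N."""
    n_len = len(h)
    return [v / n_len for v in dht(h)]

def qim_embed(x, bits, idx, delta=0.5):
    """Quantize coefficient idx[i] to an even (bit 0) or odd (bit 1)
    multiple of delta."""
    h = dht(x)
    for i, b in zip(idx, bits):
        q = round(h[i] / delta)
        if (q & 1) != b:
            q += 1          # nudge to the required parity
        h[i] = q * delta
    return idht(h)

def qim_extract(x, idx, delta=0.5):
    """Blind detection: only the step size and positions are needed."""
    h = dht(x)
    return [round(h[i] / delta) & 1 for i in idx]

frame = [math.sin(0.1 * n) for n in range(32)]
idx = [3, 5, 7, 9]            # illustrative coefficient choice
bits = [1, 0, 1, 1]
marked = qim_embed(frame, bits, idx)
decoded = qim_extract(marked, idx)   # equals bits
```

Decoding needs no access to the original host signal, which is what makes SCS-style schemes attractive for blind watermark detection; the step size delta trades embedding distortion against robustness.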
BWE techniques can also be utilized in WB speech coding.

VIII. CONCLUSIONS

ABWE is used to enhance the speech quality and intelligibility of the bandlimited (NB) speech signal; the enhanced speech signal thus reduces the listening effort. One of the earliest ABWE techniques estimates the parameters of a source-filter model utilizing only the information available in the NB signal. The theoretical upper bound on the performance of this estimation is improved by using additional information extracted from the original speech signal. This additional information, when transmitted through a separate channel, compromises the backward compatibility of the communication network. Further work on BWE with embedded additional (side) information solves the backward compatibility problem by embedding the additional information as a digital watermark into the NB speech.

REFERENCES

[1] P. Jax, "Enhancement of bandlimited speech signals: Algorithms and theoretical bounds," Ph.D. dissertation, RWTH Aachen University, Aachen, Germany, 2002.
[2] Laura Laaksonen, "Artificial bandwidth extension of narrowband speech - enhanced speech quality and intelligibility in mobile devices," Ph.D. dissertation, Aalto University, Finland, 2013.
[3] Hannu Pulakka, "Development and evaluation of artificial bandwidth extension methods for narrowband telephone speech," Ph.D. dissertation, Aalto University, Finland, 2013.
[4] Y. Yoshida and M. Abe, "An algorithm to reconstruct wideband speech from narrowband speech based on codebook mapping," in Proc. of ICSLP, pages 1591–1594, 1994.
[5] Y. Qian and P. Kabal, "Wideband speech recovery from narrowband speech using classified codebook mapping," in Proc. of AICSST, pages 106–111, Australia, 2002.
[6] J. Epps and W. H. Holmes, "A new technique for wideband enhancement of coded narrowband speech," in Proc.
of the IEEE Workshop on Speech Coding, pages 174–176, Finland, 1999.
[7] R. Hu et al., "Speech bandwidth extension by improved codebook mapping towards increased phonetic classification," in Proc. of Interspeech, pages 1501–1504, Portugal, 2005.
[8] Y. Nakatoh et al., "Generation of broadband speech from narrowband speech using piecewise linear mapping," in Proc. of EUROSPEECH, pages 1643–1646, Greece, 1997.
[9] S. Chennoukh et al., "Speech enhancement via frequency bandwidth extension using line spectral frequencies," in Proc. of ICASSP, pages 665–668, USA, 2001.
[10] K. Y. Park and H. S. Kim, "Narrowband to wideband conversion of speech using GMM based transformation," in Proc. of ICASSP, pages 1843–1846, Turkey, 2000.
[11] A. H. NourEldin and P. Kabal, "Mel frequency cepstral coefficient based bandwidth extension of narrowband speech," in Proc. of Interspeech, pages 53–56, Australia, 2008.
[12] A. H. NourEldin and P. Kabal, "Combining frontend based memory with MFCC features for bandwidth extension of narrowband speech," in Proc. of ICASSP, pages 4001–4004, Taiwan, 2009.
[13] P. Jax and P. Vary, "On artificial bandwidth extension of telephone speech," Signal Process., pages 1707–1719, 2003.
[14] P. Bauer and T. Fingscheidt, "An HMM based artificial bandwidth extension evaluated by cross-language training and test," in Proc. of ICASSP, pages 4589–4592, USA, 2008.
[15] G. B. Song and P. Martynovich, "A study of HMM based bandwidth extension of speech signals," Signal Process., pages 2036–2044, 2009.
[16] D. Zaykovskiy and B. Iser, "Comparison of neural networks and linear mapping in an application for bandwidth extension," in Proc. of SPECOM, Greece, 2005.
[17] B. Iser and G. Schmidt, "Neural networks versus codebooks in an application for bandwidth extension of
Impromptu Speech Communication: Course Syllabus

I. Basic Course Information

Course code:
Course title: 即兴口语表达
English title: Impromptu Speech Communication
Course category: required course
Class hours: 112
Credits: 7
Intended students: undergraduate majors in Broadcasting and Hosting Arts
Assessment: examination
Prerequisites: Mandarin Phonetics and Broadcast Vocal Delivery I, Mandarin Phonetics and Broadcast Vocal Delivery II, Fundamentals of Broadcast Creation

II. Course Description

Impromptu Speech Communication is a foundational professional course of the Broadcasting and Hosting Arts major, and also one of the major's core courses.
Grounded in the realities of broadcasting and hosting work in China, and drawing on the relevant principles of journalism and communication studies as well as knowledge from linguistics and the arts, this course takes spoken-language expression as its core and impromptu speech training as its main method. It introduces the basic rules and basic skills of impromptu spoken language, including oral retelling, oral description, oral commentary, as well as impromptu narration, impromptu verbal wit, and impromptu humor.
This course provides students with fairly comprehensive and systematic training in the basic skills of impromptu speech, further consolidates and improves their spoken-language expression techniques, and lays a foundation for hosting radio and television programs.
Impromptu Speech Communication is both a foundational professional course and a core course of the Broadcasting and Hosting Arts major. Grounded in the realities of broadcasting and hosting work, it draws on the principles of journalism and communication as well as relevant knowledge from linguistics and the arts. With spoken-language expression at its core and impromptu speech training as its main method, it introduces the basic rules and basic skills of impromptu spoken language, including oral retelling, oral description, oral commentary, impromptu narration, impromptu verbal wit, and impromptu humor. The course provides comprehensive, systematic training in the basic skills of impromptu oral expression, further develops the skills of spoken-language expression, and lays a foundation for hosting radio and television programs.

III. Course Nature and Teaching Objectives

Impromptu Speech Communication is a foundational professional required course of the Broadcasting and Hosting Arts major, and also one of the major's core courses.