当前位置：文档之家› 基于STM32单片机的嵌入式语音识别系统设计

基于STM32单片机的嵌入式语音识别系统设计

陈心灵1，钱宁博2，胡佳辉1，王战中1

（1.石家庄铁道大学机械工程学院，河北石家庄050043；2.石家庄铁道大学电气与电子工程学院，河北石家庄

050043）

摘要：设计了一款以STM32F103为核心的自然语言识别系统，为满足实时语音识别系统对内存资源和运算速度的要求，基于硬件资源合理

设计语音处理算法，在嵌入式平台上实现了对孤立词语的语音识别。首先根据背景噪声和语音信号的时域特征差异设定相应门限值，从而实现了对语音信号的端点检测。然后针对语音识别中传统梅尔倒谱系数对语音的高频信息敏感度较低，对语音信号分别提取梅尔倒谱系数(MFCC)与翻转梅尔倒谱系数(IMFCC)，结合Fisher 准则构造混合特征参数。最后采用动态时间规整算法实现语音识别。因系统体积小、便携性好等特点，易于实现对不同设备的语音控制，有一定的市场前景。关键词：语音识别；梅尔倒谱系数；翻转梅尔倒谱系数；Fisher 准则；动态时间规整算法；STM32F103

中图分类号：TP391.4

文献标识码：A

文章编号：1009－9492(2019)06－0135－03

Embedded Speech Recognition System Design Based on STM32F103

CHEN Xin-ling 1，QIAN Ning-bo 2，HU Jia-hui 1，WANG Zhan-zhong 1

（1.College of Mechanical Engineering ，Shijiazhuang Tiedao University ，Shijiazhuang 050043，China ；

2.College of Electrical and Electronic Engineering ，Shijiazhuang Tiedao University ，Shijiazhuang 050043，China ）

Abstract:A natural language recognition system is designed based on STM32F103.To meet the requirements of real-time speech recognition system

for memory resources and computing speed ，the speech processing algorithm is designed based on hardware resources and speech recognition of

isolated words is implemented on the embedded platform.Firstly ，the corresponding threshold is set according to the time domain characteristic difference of the speech signal and the background noise and thereby realizing the endpoint detection of the speech signal.Concerning the traditional Mel Frequency Cepstral Coefficient (MFCC)in speech recognition is less sensitive to high frequency signals of speech ，MFCC and IMFCC (Inverted

MFCC)are extracted respectively for the speech signal and the Fisher criterion is used to construct the mixed feature parameters.Dynamic time warping algorithm is used in speech recognition process.Due to the small size of the system and good portability ，it is easy to implement voice control for different devices and has much marker potential.

Key words:speech recognition ；MFCC ；IMFCC ；Fisher criterion ；DTW ；STM32F103

收稿日期：2018－12－22

DOI:10.3969/j.issn.1009-9492.2019.06.045

0引言

语音识别是人机交互很重要的模块，应用领域相当广阔。集成电路的快速发展使得将具有先进功能的语音识别系统固化到更加微小的芯片或模块上成为可能[1]，更便于语音识别系统的推广与使用，嵌入式语音识别技术开发变得更加有价值。

本文设计一个基于STM32F103单片机的嵌入式语音识别系统，包括硬件设计和软件设计

[2-3]

。语音特征提取在传

统梅尔倒谱系数基础上，运用Fisher 比结合梅尔倒谱系数与翻转梅尔倒谱系数，构建了混合特征参数[4]，识别算法采用动态时间规整算法。硬件设计上实现了语音信号采集、语音信号处理、语音信息存储、语音识别结果的显示等功能。

1系统硬件设计

本系统主要由电源部分（LDO ）、主控（STM32F103）、语音采样电路、LCD 显示模块等组成，如图1所示。

1.1MCU 选择

STM32F103开发板基于Cortex-M3处理器，内置2个

12位模数转换器，2个DMA 控制器，共12个DMA 通道，其可以满足本系统中的语音处理需求。1.2采样电路

采样电路选用差分放大电路，抑制共模干扰，放大有用信号，有效地解决采样噪声硬件预处理的问题。其原理图如图2。

在设计过程中，其输出端（即Q1\Q2的C 极）静态工作点为1/2Vcc 最为适宜，能保障其最大动态输出范围。电路设计尽可能使Q1、Q2的静态工作参数一致，构成对

称电路。

图1系统硬件框图

Fig.1The system hardware block

diagram

·135