基于BURG算法的谱估计研究和MATLAB实现毕业论文
- 格式:doc
- 大小:533.00 KB
- 文档页数:28
基于Burg 算法的最大熵谱估计一、 实验目的使用Matlab 平台实现基于Burg 算法的最大熵谱估计二、 Burg 算法原理现代谱估计是针对经典谱估计方差性能较差、分辨率较低的缺点提出并逐渐发展起来的,其分为参数模型谱估计和非参数模型谱估计。
而参数模型谱估计主要有AR 模型、MA 模型、ARMA 模型等,其中AR 模型应用最多。
ARMA 模型功率谱的数学表达式为:212121/1)(∑∑=-=-++=p i i j i q i i j i j e a e b e P ωωωσ其中,P(e j ω)为功率谱密度;s 2是激励白噪声的方差;a i 和b i 为模型参数。
若ARMA 模型中b i 全为0,就变成了AR 模型,又称线性自回归模型,其是一个全极点模型: 2121/)(∑=-+=p i i j i j e a e P ωωσ研究表明,ARMA 模型和MA 模型均可用无限阶的AR 模型来表示。
且AR 模型的参数估计计算相对简单。
同时,实际的物理系统通常是全极点系统。
要利用AR 模型进行功率谱估计,必须由Yule - Walker 方程求得AR 模型的参数。
而目前求解Yule - Walker 方程主要有三种方法: Levinson-Durbin 递推算法、Burg 算法和协方差方法。
其中Burg 算法计算结果较为准确,且对于短的时间序列仍能得到较正确的估计,因此应用广泛。
研究最大熵谱估计时,Levinson 递推一直受制于反射系数K m 的求出。
而Burg 算法秉着使前、后向预测误差平均功率最小的基本思想,不直接估计AR 模型的参数,而是先估计反射系数K m ,再利用Levinson 关系式求得AR 模型的参数,继而得到功率谱估计。
Burg 定义m 阶前、后向预测误差为:∑=-=mi m m i n x i a n f 0)()()( (1)∑=*--=mi m i n x i m a n g m 0)()()( (2) 由式(1)和(2)又可得到前、后预测误差的阶数递推公式:)1()()(11-+=--n g K n f n f m m m m (3))1()()(11-+=--*n g n f K n g m m m m (4)定义m 阶前、后向预测误差平均功率为:∑=+=Nmn m m m n g n f P ])()([2122(5) 将阶数递推公式(3)和(4)代入(5),并令0=∂∂mmK P ,可得∑∑+=--+=*---+--=N m n m m Nm n m m m n g n f n g n f K 12121111])1()([21)1()((6)三、 Burg 算法递推步骤Burg 算法的具体实现步骤:步骤1 计算预测误差功率的初始值和前、后向预测误差的初始值,并令m = 1。
AR 模型的功率谱估计BURG 算法的分析与仿真钱平(信号与信息处理 S101904010)一.引言现代谱估计法主要以随机过程的参数模型为基础,也可以称其为参数模型方法或简称模型方法。
现代谱估计技术的研究和应用主要起始于20世纪60年代,在分辨率的可靠性和滤波性能方面有较大进步。
目前,现代谱估计研究侧重于一维谱分析,其他如多维谱估计、多通道谱估计、高阶谱估计等的研究正在兴起,特别是双谱和三谱估计的研究受到重视,人们希望这些新方法能在提取信息、估计相位和描述非线性等方面获得更多的应用。
现代谱估计从方法上大致可分为参数模型谱估计和非参数模型谱估计两种。
基于参数建摸的功率谱估计是现代功率谱估计的重要内容,其目的就是为了改善功率谱估计的频率分辨率,它主要包括AR 模型、MA 模型、ARMA 模型,其中基于AR 模型的功率谱估计是现代功率谱估计中最常用的一种方法,这是因为AR 模型参数的精确估计可以通过解一组线性方程求得,而对于MA 和ARMA 模型功率谱估计来说,其参数的精确估计需要解一组高阶的非线性方程。
在利用AR 模型进行功率谱估计时,必须计算出AR 模型的参数和激励白噪声序列的方差。
这些参数的提取算法主要包括自相关法、Burg 算法、协方差法、 改进的协方差法,以及最大似然估计法。
本章主要针对采用AR 模型的两种方法:Levinson-Durbin 递推算法、Burg 递推算法。
实际中,数字信号的功率谱只能用所得的有限次记录的有限长数据来予以估计,这就产生了功率谱估计这一研究领域。
功率谱的估计大致可分为经典功率谱估计和现代功率谱估计,针对经典谱估计的分辨率低和方差性能不好等问题提出了现代谱估计,AR 模型谱估计就是现代谱估计常用的方法之一。
信号的频谱分析是研究信号特性的重要手段之一,通常是求其功率谱来进行频谱分析。
功率谱反映了随机信号各频率成份功率能量的分布情况,可以揭示信号中隐含的周期性及靠得很近的谱峰等有用信息,在许多领域都发挥了重要作用。
基于Burg 算法的AR 模型功率谱估计简介摘要:在对随机信号的分析中,功率谱估计是一类重要的参数研究,功率谱估计的方法分为经典谱法和参数模型方法。
参数模型方法是利用型号的先验知识,确定信号的模型,然后估计出模型的参数,以实现对信号的功率谱估计。
根据wold 定理,AR 模型是比较常用的模型,根据Burg 算法等多种方法可以确定其参数。
关键词:功率谱估计;AR 模型;Burg 算法随机信号的功率谱反映它的频率成分以及各成分的相对强弱, 能从频域上揭示信号的节律, 是随机信号的重要特征。
因此, 用数字信号处理手段来估计随机信号的功率谱也是统计信号处理的基本手段之一。
在信号处理的许多应用中, 常常需要进行谱估计的测量。
例如, 在雷达系统中, 为了得到目标速度的信息需要进行谱测量; 在声纳系统中, 为了寻找水面舰艇或潜艇也要对混有噪声的信号进行分析。
总之, 在许多应用领域中, 例如, 雷达、声纳、通讯声学、语言等领域, 都需要对信号的基本参数进行分析和估计, 以得到有用的信息, 其中, 谱分析就是一类最重要的参数研究。
1 功率谱估计简介一个宽平稳随机过程的功率谱是其自相关序列的傅里叶变换,因此功率谱估计就等效于自相关估计。
对于自相关各态遍历的过程,应有:)()()(121lim *k r n x k n x N N x N N n =⎭⎬⎫⎩⎨⎧++∞→∑-= 如果所有的)(n x 都是已知的,理论上功率谱估计就很简单了,只需要对其自相关序列取傅里叶变换就可以了。
但是,这种方法有两个个很大的问题:一是不是所有的信号都是平稳信号,而且有用的数据量可能只有很少的一部分;二是数据中通常都会有噪声或群其它干扰信号。
因此,谱估计就是用有限个含有噪声的观测值来估计)(jw x e P 。
谱估计的方法一般分为两类。
第一类称为经典方法或参数方法,它首先由给定的数据估计自相关序列)(k r x ,然后对估计出的)(ˆk rx 进行傅里叶变换获得功率谱估计。
数字信号处课程小论文题目:功率谱估计方法与实现的研究——对心电信号(ECG)谱估计的研究摘要:心血管疾病是威胁人类生命的最主要疾病之一, 而心电图(ECG)是诊断心血管疾病的主要依据。
对其的特征分析一直是医学信号处理的热点,本文针对心电信号的谱估计做了一些分析讨论,首先是对来自MIT-BIH数据库的心电信号进行了预处理,然后分析了其AR 模型的阶次问题,最后是在MATLAB中,用Burg算法实现了ECG信号的谱估计。
实验结果显示,心电信号的谱能够反应隐藏在心电信号中的疾病问题。
关键词:心电信号谱估计频谱心电图 Burg算法目录一、课题研究背景与意义 (3)1 心电图(ECG)及其谱估计简介 (3)2 功率谱估计简介 (4)3 功率谱估计国内外的研究历史和现状 (5)3.1 基于二阶统计量的功率谱估计的方法 (5)3.1.1 经典功率谱估计方法的原理和算法 (6)3.1.2 现代功率谱估计方法的原理和算法 (7)3.2 基于高阶统计量(HOS)的谱估计方法 (9)3.2.1 非参数估计法 (10)3.2.2 参数模型估计法 (10)3.3 基于分数低阶统计量(FLOS)的谱估计方法 (11)4 总结 (12)5 参考文献 (12)二、心电图谱估计问题的基本方法和技术 (14)1 心电图谱估计研究的现状与意义 (14)2 MIT-BIH 心电图数据库 (15)3 AR模型功率谱估计的有关方法 (15)3.1自相关法 (17)3.2 Burg算法 (18)3.3 改进的协方差方法 (19)3.4 总结概述 (21)4 本文主要的研究内容 (21)三、MATLAB实验与讨论 (22)1 MIT/BIH 心电图数据的读取 (22)2 心电信号的简单预处理 (23)3 AR模型阶次的选取 (24)4 Burg算法的实现 (30)5 心电图谱估计的实现 (32)6 实验结果与分析 (34)四、结束语 (36)参考文献: (36)附件: (38)一、课题研究背景与意义1 心电图(ECG)及其谱估计简介心脏是人体循环系统中的重要器官。
i 0基于Burg 算法的最大熵谱估计实验目的使用Matlab 平台实现基于Burg 算法的最大熵谱估计Burg 算法原理现代谱估计是针对经典谱估计方差性能较差、分辨率较低的缺点提出并逐渐发展起来 的,其分为参数模型谱估计和非参数模型谱估计。
而参数模型谱估计主要有 模型、ARMA 模型等,其中AR 模型应用最多。
ARMA 模型功率谱的数学表达式为:其中,P(e j 「为功率谱密度;s 2是激励白噪声的方差;a i 和b i 为模型参数。
若ARMA 模型中b i 全为0,就变成了 AR 模型,又称线性自回归模型, 其是一个全极点 模型:P(e j )研究表明,ARMA 模型和MA 模型均可用无限阶的 AR 模型来表示。
且 AR 模型的参数 估计计算相对简单。
同时,实际的物理系统通常是全极点系统。
要利用AR 模型进行功率谱估计,必须由Yule - Walker 方程求得AR 模型的参数。
而目前求解Yule - Walker 方程主要有三种方法:Levinson-Durbin 递推算法、Burg 算法和协方差方 法。
其中Burg 算法计算结果较为准确,且对于短的时间序列仍能得到较正确的估计,因此 应用广泛。
研究最大熵谱估计时,Levin so n 递推一直受制于反射系数 K m 的求出。
而Burg 算法秉着 使前、后向预测误差平均功率最小的基本思想,不直接估计AR 模型的参数,而是先估计反射系数K m ,再利用Levin son 关系式求得AR 模型的参数,继而得到功率谱估计。
Burg 定义m 阶前、后向预测误差为:AR 模型、MAP(e j )21b i ei 1a i e ja i e jf m ( n)ma m (i)x( n i)(1)mg m( n) a m(m i)x( n i)i 0由式(1 )和(2)又可得到前、后预测误差的阶数递推公式:f m(n) f m i (n) K mg m i(n 1)g m( n) K m f mi( n)g m 1(n 1)⑷定义m阶前、后向预测误差平均功率为:1N2 2P m - [|f m(门)g m(门)]2n m将阶数递推公式(3)和(4)代入(5),并令一也0,可得K mNf m 1( n)g m 1( n 1)n m 1K m 1 N-[f m 1( n)2 g m 1(n 1)2]2 n m 1三、Burg算法递推步骤Burg算法的具体实现步骤:步骤1计算预测误差功率的初始值和前、后向预测误差的初始值,并令2x(n)f°( n) g°( n) x(n)步骤2求反射系数Nf m 1( n)g m 1( n 1)n m 1 ______________________________K m 1 N-[f m 1(n)2 g m 1(n 1)2]2 n m 1步骤3计算前向预测滤波器系数P。
三种功率谱估计方法性能研究1.前言:我们已经知道一个随机信号本身的傅里叶变换并不存在,因此无法像确定性信号一样用数字表达式来精确表达它,而只能用各种统计平均量来表征它. 其中,自相关函数最能完整地表它他的统计平均量值.而一个随机信号的功率谱密度正是自相关函数的傅里叶变换,可以用功率谱密度来表征它的统计平均谱密度(PSD). 跟据维纳辛钦定理,广义平稳随机过程的功率谱是自相关函数的傅里叶变换,它取决于无数多个自相关函数值. 但对于许多实际应用中,可资利用的观测数据往往是有限的,所以要准确计算功率谱通常是不可能的.比较合理的目标是设法得到功率谱的一个好的估值,这就是功率谱估计. 也就是说,功率谱估计是根据平稳随机过程的有限个观测值,来估计该随机过程的功率谱密度.功率谱估计的评价指标包括客观度量和统计度量. 在客观度量中,谱分析特性是一个主要指标.谱分析是指估计普对真实谱中两个靠的很近的谱峰的分辨能力.统计度量是指估计的偏差,方差,均方误差,一致性等评价指标.但需要注意的是,对统计特性的分析方法只适用于长数据记录.所以,利用统计度量对不同的谱估计方法进行比较是不妥当的,只能用来对某种谱估计方法进行描述,并且一般只用来描述古典谱估计方法,因为现代谱估计方法往往用于短数据情况.功率谱估计可以分为经典谱估计(非参数估计)和现代谱估计(参数估计)。
通常将傅里叶变换为理论基础的谱估计方法叫做古典谱估计或经典谱估计;把不同于傅里叶分析的新的谱估计方法叫做现代谱估计或近代谱估计.前者主要有周期图法,自相关法及其改进方法. 现代功率谱估计方法主要有基于参数模型的自相关法、Burg 算法、改进的协方差方法等,基于非参数模型的MUSIC 算法、特征向量方法等。
本文选取比较有代表性周期图法, Burg 算法、Yule-Wallker 法(自相关法)算法进行计算机仿真,通过仿真发现了这些算法各自的优缺点,并进行归纳总结。
2三种算法的基本理论2.1 周期图法周期图法又称直接法,其具体步骤如下:第一步: 由获得的N 点数据构成的有限长序列()x Nn 直接求傅里叶变换,得频谱()x i N e ω,即()()-1-=0x =N i i N Nn ex n eωω∑ (1)第二步: 取频谱幅度的平方,并除以N,以此作为对()x n 的真实功率谱()i x S eω的估计,即()()21ˆ=i i xN S e X eNωω(2)综上所述,先用FFT 求出宿疾随机离散信号N 点的DFT ,再计算幅频特性的平方,然后除以N ,即得出该随机信号得功率谱估计。
南京邮电大学通达学院毕业设计(论文)题目:基于Matlab的信号功率谱估计专业:通达学院学生姓名:夏丽君班级学号: 11006811指导教师:梁涓指导单位:南京邮电大学日期: 2014 年 11 月 24 日至 2015 年 6 月 12 日摘要数字信号处理(DSP)重要的应用领域之一,是建立在周期信号和随机信号基础上的功率谱估计。
在实际应用中往往不能获得具体信号的表达式,需要根据有限的数据样本来获得较好的谱估计效果,因而谱估计被广泛的应用于各种信号处理中。
本论文研究了功率谱估计的几种常用的方法,包括经典谱估计和现代谱估计的各种方法,并给出仿真程序及仿真图。
经典法主要包括周期图法、自相关法,但这两种方法都存在缺陷,即认为观测数据之外的数据都为零,所以对经典法中的周期图法进行了加窗、平均等修正;现代谱估计的方法分类比较多,AR模型法,MA 模型法和ARMA模型法是现代功率谱估计中最主要的参数模型,本论文着重讨论了AR模型参数法。
同时论文将通过对经典谱估计和现代谱估计的实现方法及仿真图的比较,得出经典功率谱估计方法的方差性较差,分辨率较低,而现代谱估计的目标正是在于努力改善谱估计的分辨率,因此能得到较好的谱估计效果,为此应用更为广泛。
关键字:数字信号处理;功率谱估计;周期图法;自相关法。
ABSTRACTPerhaps one of the more important application areas of digital signal processing(DSP) is builting on the Power Spectral Estimation of periodic and random signals. Actually, we can’t get the expression of a specific signal, so we need to estimate the power spectral of a signal according to some sample data sequences.so spectrum estimation which is widely used in various signal processing.In this thesis, some common methods of Power Spectral Estimation, such as classical spectral estimation and modern spectral estimation, are studied. The quality of each estimation method is derived, simulation program and simulation figure is given. Classical methods of Power Spectral Estimation mainly include the Periodogram and the BT method. But both of them have a common drawback: the data sequences, beyond the area of the observed sequences, are all presumed to zero. So the Windows and the average method are introduced to improve the quality of the Periodogram. Therefore the improvement of The Periodogram estimation method is proposed. The classification of modern spectral estimation methods are more , AR,MA, and ARMA is the most important parameters of modern spectral estimation. This thesis will focus on discussion of AR model parameters method. At the same time , It can be seen from the comparison and realization of classical spectral estimation and modern spectral estimation, classical power spectrum estimation variance is poor, low resolution .The goal of modern spectral estimation is woking to improve the resolution of spectral estimation, better results of the estimation of the power spectrum can be obtained, so it is applied more widely.Keywords: digital signal processing; Power Spectrum Estimation; The Periodogram;the BT methods.目录1.绪论1.1课题背景1.2研究意义1.3研究内容2.功率谱估计的概述2.1随机变量2.2平稳随机信号2.2.1平稳随机信号定义2.2.2平稳随机信号特征2.2.3平稳随机信号的自相关函数2.2.4平稳随机信号的功率谱2.3估计质量的评价标准3. 经典功率谱估计3.1谱估计与相关函数3.1.1相关函数和功率谱3.1.2 相关函数的估计3.2 周期图法3.2.1周期图法的定义3.2.2 周期图的性能3.2.3 周期图法改进措施3.3自相关法3.4直接法和间接法的关系3.5本章小结4. 现代谱估计4.1 平稳随机信号的参数模型4.2 AR模型的正则方程与参数计算4.2.1正则方程求导4.2.2 AR模型参数求解的经典算法4.3 AR模型谱估计的实现及性质4.3.1 谱估计的步骤4.3.2 AR模型谱估计的性质4.4 MA模型谱估计4.5 ARMA模型谱估计4.6 本章小结5. MATLAB下的经典谱与现代谱估计的仿真5.1基于MATLAB经典谱估计的仿真5.2基于MATLAB现代谱估计的仿真结束语致谢参考文献第一章绪论1.1课题背景功率谱估计技术渊源流长,在过去的几十年获得了飞速的发展。
用burg 法实现功率谱估计一.数据为:12()1*2*()j n j n x n A e A e v n ωω=++实现功率谱估计的matlan 源代码:clearFs=500;n=0:1/Fs:0.5;w1=200*pi;A1=5;w2=300*pi;A2=12;xn=A1*(cos(w1*n)+j*sin(w1*n))+A2*(j*sin(w2*n)+cos(w2*n))+randn(size(n));%xn=A1*exp(jw1n)+A2*exp(jw2n);subplot(211);plot(n,xn);xlabel('n');ylabel('xn');title('xn=A1*exp(jw1n)+A2*exp(jw2n)+e(n)');ymax_xn=max(xn)+0.2;ymin_xn=min(xn)-0.2;axis([0 0.5 ymin_xn ymax_xn]);p=floor(length(xn)/3)+2;nfft=1024;[xpsd,f]=pburg(xn,p,nfft,Fs);pmax=max(xpsd);xpsd=xpsd/pmax;xpsd=10*log10(xpsd);subplot(212);plot(f,xpsd);title('Power Spectral estimate with burg');ylabel('Power Spectral estimate(dB)');xlabel('f(Hz)');grid on;ymin_psd=min(xpsd)-1;ymax_psd=max(xpsd)+1;axis([0 Fs/2 ymin_psd ymax_psd]);实验结果:(1) 下图依次是阶数为N/3,N/2二.数据为:x n x n x n x n p =-+-+-()(1)(2)......() Matlab源代码:clearclf'clcN=100;Fs=500;%产生x1信号即AR信号vn=rands(1,N);xn=zeros(1,N);xn(1)=vn(1);xn(2)=vn(2);a1=0.78;a2=-0.96;for n=3:Nxn(n)=vn(n)+a1*xn(n-1)+a2*xn(n-2) endsubplot(211);plot(xn);title('xn(n)=noise(n)+a1*xn(n-1)+a2*xn(n-2)'); p=floor(length(xn)/5);nfft=1024;[xpsd,f]=pburg(xn,p,nfft,Fs);pmax=max(xpsd);xpsd=xpsd/pmax;xpsd=10*log10(xpsd);subplot(212);plot(f,xpsd);title('Power Spectral estimate with burg'); ylabel('Power Spectral estimate(dB)');xlabel('f(Hz)');grid on;ymin_psd=min(xpsd)-1;ymax_psd=max(xpsd)+1;axis([0 Fs/2 ymin_psd ymax_psd]);。
毕业论文声明本人郑重声明:1.此毕业论文是本人在指导教师指导下独立进行研究取得的成果。
除了特别加以标注地方外,本文不包含他人或其它机构已经发表或撰写过的研究成果。
对本文研究做出重要贡献的个人与集体均已在文中作了明确标明。
本人完全意识到本声明的法律结果由本人承担。
2.本人完全了解学校、学院有关保留、使用学位论文的规定,同意学校与学院保留并向国家有关部门或机构送交此论文的复印件和电子版,允许此文被查阅和借阅。
本人授权大学学院可以将此文的全部或部分内容编入有关数据库进行检索,可以采用影印、缩印或扫描等复制手段保存和汇编本文。
3.若在大学学院毕业论文审查小组复审中,发现本文有抄袭,一切后果均由本人承担,与毕业论文指导老师无关。
4.本人所呈交的毕业论文,是在指导老师的指导下独立进行研究所取得的成果。
论文中凡引用他人已经发布或未发表的成果、数据、观点等,均已明确注明出处。
论文中已经注明引用的内容外,不包含任何其他个人或集体已经发表或撰写过的研究成果。
对本文的研究成果做出重要贡献的个人和集体,均已在论文中已明确的方式标明。
学位论文作者(签名):年月关于毕业论文使用授权的声明本人在指导老师的指导下所完成的论文及相关的资料(包括图纸、实验记录、原始数据、实物照片、图片、录音带、设计手稿等),知识产权归属华北电力大学。
本人完全了解大学有关保存,使用毕业论文的规定。
同意学校保存或向国家有关部门或机构送交论文的纸质版或电子版,允许论文被查阅或借阅。
本人授权大学可以将本毕业论文的全部或部分内容编入有关数据库进行检索,可以采用任何复制手段保存或编汇本毕业论文。
如果发表相关成果,一定征得指导教师同意,且第一署名单位为大学。
本人毕业后使用毕业论文或与该论文直接相关的学术论文或成果时,第一署名单位仍然为大学。
本人完全了解大学关于收集、保存、使用学位论文的规定,同意如下各项内容:按照学校要求提交学位论文的印刷本和电子版本;学校有权保存学位论文的印刷本和电子版,并采用影印、缩印、扫描、数字化或其它手段保存或汇编本学位论文;学校有权提供目录检索以及提供本学位论文全文或者部分的阅览服务;学校有权按有关规定向国家有关部门或者机构送交论文的复印件和电子版,允许论文被查阅和借阅。
毕业设计(论文)题目:基于BURG算法的谱估计研究及其MATLAB实现目录一、毕业设计(论文)开题报告二、毕业设计(论文)外文资料翻译及原文三、学生“毕业论文(论文)计划、进度、检查及落实表”四、实习鉴定表XX大学XX学院毕业设计(论文)开题报告题目:基于BURG算法的谱估计研究及其MATLAB实现机电系电子信息工程专业学号:学生:指导教师:(职称:讲师)(职称:)XXXX年XX月X日英文原文Parametric spectral estimation on a single FPGAABSTRACTParametric, model based, spectral estimation techniques can offer increased frequency resolution over conventional short-term fast Fourier transform methods, overcoming limitations caused by the windowing of sampled, time domain, input data. However, parametric techniques are significantly more computationally demanding than the Fourier based methods and require a wider range of arithmetic functionality; for example, operations such as division and square-root are often necessary. These arithmetic processes exhibit communication bottleneck and their hardware implementation can be inefficient when used in conjunction with multipliers. A programmable, bit-serial, multiplier/divider, which overcomes the bottleneck problems by using a data interleaving scheme, is introduced in this paper. This interleaved processor is used to show how the parametric Modified Covariance spectral estimator can be efficiently routed on a field programmable gate array for real-time applications.1. INTRODUCTIONDue to its ease of hardware and software implementation the shortterm fast Fourier transform(STFFT)is widely used for spectral estimation and is known as the conventional method. However, the technique has drawbacks in terms of spectral resolution and accuracy caused by the finite length of the input data sequence used. Windowing of input data causes spectral broadening and Gibb’s phenomenon of spectral leakage can mask the weaker frequency components of the true power spectral density(PSD)[1]. These unwanted effects can be reduced by using longer data sequence lengths, so that the transformed signal becomes a better representation of the infinite data sequence, but in real life this usually is not feasible as the characteristics of the input data may change with time. Over short periods of time the data signals can often be assumed to exhibit wide-sense stationarity, where the signal characteristics are assumed approximately constant but the spectral resolution is therefore limited. In attempts to improve the PSD estimation, windowing functions, Bartlett or Hanning for example, can be used to reduce side-lobe levels but these lower spectral resolution by broadening the main lobe of the PSD[2].Model based, parametric spectral estimation techniques can alternatively be used, where the unrealistic assumption that data is zero outside the window of interest is dropped[1]. Either knowledge of the underlying process or reasonableassumptions about the nature of the unobserved data are used to improve frequencyresolution over the conventional approaches. The computational burden of suchprocessors is however much higher than the STFFT and arithmetic functions such asdivision and square-root often become necessary. In the division and square-rootnon-restoring algorithms there is an inherent dependency that the result bits mustbe computed in a most significant bit(MSB)first manner, with the computation of abit directly dependent upon the result of the previous one[3]. This interdependencymakes it difficult to efficiently realize such arithmetic functions in hardware,and implementations are usually much slower than other basic functions such asmultiplication, addition and subtraction. Communication bottlenecks can thereforeeasily occur in systolic arrays where different types of processors are interconnected.The difficulties with hardware implementation of parametric spectral estimatorshave led to a preference of software implementation on homogeneous DSP networks[4].However, high levels of processing capacity have not been fully reflected in systemthroughput since the increased communication incurred as a result of parallelismis constrained by communication bus performance. This restricts the range ofproblems that can be computed in realtime and the software approach may sometimesbe inadequate for real-time spectral estimation.In this paper, hardware implementation of a parametric spectral estimator isaddressed. A bit-serial processor capable of division and inner product stepcomputation is developed by combining separate processors for these functions. Thedesign uses a high level of pipelining so that division can be computed at a highrate and multiplication is performed on a MSB first data stream, eliminating thebottleneck problem. The high level of pipelining allows many independentcomputations to be performed simultaneously or interleaved. The use of theinterleaving scheme is demonstrated by implementing the design of a ModifiedCovariance type of parametric spectral estimator, to produce a field programmablegate array(FPGA)based system for the spectral analysis of Doppler signals fromultrasonic blood flow detectors.2. MODIFIED COVARIANCE SPECTRAL ESTIMATIONThe model order p=4 Modified Covariance(MC)spectral estimator, proven to beoptimally cost efficient for the blood flow application where mean velocity and flowdisturbance are of interest[5], involves solving the following linear system ofcovariance matrix equations:(1) where each element j i C,is obtained from:∑∑-=--=+•++-•--=110,)()1(21N p n p N n ji j n i n j n i n N C(2) for a window of length N data samples. The k aˆ filter parameter estimates are obtained by solution of the linear system(1), using the Cholesky, forward elimination and back substitution algorithms. The signal white noise varianceestimate, 2ˆσis calculated as: k p k k c a c ,010,02ˆˆ•+=∑=σ (3)and the power spectral density(PSD), )(ˆnMC f P , is obtained from: 2122221ˆ)(ˆ)(ˆ∑=-=+==p k f j k k n n MC ne z z af A f P πσσ(4)Hence, the MC spectral estimator may be partitioned onto four different programming modules:•CMR-calculation of the elements of the covariance matrix and right-hand side vector, 5N multiply accumulates taking into account matrix symmetry.•Cholesky-solution of the linear system of equations, 6 divisions and 10 inner step products for non-square-root Cholesky, 4 divisions and 12 inner step products for solving triangular systems.•WNV-calculation of the white noise variance, 4 multiply accumulates.•PSD-computation of the power spectral density, 4N inner step products for a zero padded DFT, N multiplications to find absolute value of DFT and N/2 divisions for the PSD.The number of samples, over the fixed time duration window of 10ms, is required to be either 64, 128, 256 or 512 depending on Doppler signal conditions. Implementation of the algorithm in Matlab software proved to be in excess of a factor of 310times too slow for real-time operation and that a performance of up to 13.5 MFLOPS/s is required[4]. Execution times of MC algorithm implementation using various topologies of Texas Instruments TMS320C40 DSPs with T8 transputers as routers have also fallen short of the real-time requirements, where processing time is over 150ms too long in the worst case[4][6]. Use of a single DSP in a PC hosted system has been shown sufficient for the smaller N but the specification of N=512 could not be achieved[4], thus prompting consideration of the hardware approach.3. BIT-SERIAL INTERLEAVED PROCESSORStudy of word-parallel systolic implementations of the MC method has shown the method to provide more than adequate throughput for the specified real-time blood flow application but the cost of such a system is very high in terms of arithmetic units, communication burden and control complexity[7]. For example, a systolic array processor for non-square root Cholesky decomposition[8] requires 13 processing elements(PEs), each PE having 2 to 6 ports of either m(single precision)or 2m(double precision)lines, and control is necessary to reverse data streams before back substitution. An alternative way to approach the hardware design involves consideration of bit-serial processing techniques.The nature of multiplication algorithms normally involve the computation of least significant bits(LSBs)first and bit-serial multipliers reflect this in their output ordering. Conversely, division algorithms such as non-restoring are MSB first in nature[9]. Computation of each quotient bit can be performed from m controlled add subtract(CAS)operations, the decision on whether to add or subtract being taken given the result of the previous bit computed(except on the first operation where the signs of the input operands are used to decide). Allowing carries to ripple through therefore leads to a propagation delay greater than m CAS cells. In a bit-serial multiplier, the delay between successive bits being output is likely to be around a single full adder(FA)delay, leading to a maximum clock frequency approximately m times higher and a communication bottleneck with the divider. The clockrate of the divider can be increased to a similar maximum rate as the multiplier by pipelining the carries in each individual CAS stage. However, this means that each output bit is then available only once in every m clock cycles. There is also the problem that data streams must be reversed between multipliers and dividers. One possibility is to use registers and extra control logic to reorder the bit stream from the divider but the operation time is still limited.The efficiency of the divider with the pipelined carry can be greatly improved by using the redundant slots between the output of successive bits to perform other separate divisions. The bit-serial/word-parallel divider shown in[3]allows m+1 individual divisions to be performed simultaneously or interleaved. This decreases the mean division operation time to achieve similar performance to a bit-serial multiplier but there is still the problem of data stream matching when interfacing such devices. One way to tackle this problem is to redesign the multiplier so that it works on a MSB first data stream, rather than storing and reordering the divider outputs which increases latency and control requirement[10]. MSB first multiplication, first demonstrated by McCanny et al.[11], shows it is possible to perform multiplication on positive numbers by summing partial products(PPs)inreverse order to the norm. This also requires inclusion of an MSB first addition unit to ensure that output carries from the PPs are added into the final product. Larsson-Edefors and Marnane[12], extend the concept of MSB first multiplication to the two’s complement number system and show bit-serial architectures for this application. In order to match the divider bit-streams exactly to the multiplier bit-streams it is then just a matter of inserting extra delays along the FA sum pipeline so that the addition of PPs from a number of different multiplications can be performed simultaneously as shown by Bellis et al[13].Study of the bit-serial interleaved divider and multiplier reveals that both architectures show a large degree of similarity. Both work in load/operational phases; the loading networks for the divisor and multiplier both consist of m+1 delay feedback SISO registers and the FA sum/carry pipelines are alike. Both designs also require MSB first, half adder(HA)cell, addition stages; the divider requires m PEs, for 1’s complement error correction which occurs for negati ve dividends, and the multiplier requires m-1 HA PEs to add the output carries from the PPs. Therefore, it is possible to combine the two designs to make a programmable bitserial device which allows m+1 computations to be simultaneously interleaved, as shown in figure 1.The processor has two mode selection inputs DIVi and SUBi, which control four modes of operation i i Y Z X /0±= or i i i Y X Z Z •±=0 where i Z and 0Z are both double precision. Ldi is the load/operational mode select signal for the storage of i Y and i Z over the first m(m+1) clock cycles. Ldi switches into operational mode over the next m(m+1) clock cycles where the remaining data is input and the bulk of the computation is performed in the FA array. All control signals are fully pipelined similarly to the data, allowing the shortest possible block pipeline period of 2m(m+1) clock cycles and continuous input/output of data(i.e. while one block set of m+1 computations are being output, the next block set may be loaded in). The pipeline also allows independent functionality between each of the separate interleaves and on the same interleave a division may immediately follow an inner step product computation and vice-versa.4. INTERLEAVED PROCESSOR BASED MODIFIED COVARIANCE SYSTEMCost-bene fit analysis on systolic array implementation of the CMR and Cholesky sections of the MC spectral estimator shows that a 12 bit fixed point word-length is suf ficient for these computations[7]. Using the bit-serial processor with a 12 bit word-length results in the capacity for interleaving 13 computations.On interleaves 0 to 4 the CMR multiplications are performed over N consecutive block sets, such that the products i n n x x +• are produced on interleave)40(≤≤i i andblockset )10(-≤≤N n n . A bit-serial systolic array provides the correct input data sequencing from consecutive Doppler signal samples and a separate MSB first double precision accumulator, whose architecture is similar to that of HA section in figure1, computes the covariance matrix elements, which are then stored in RAM. The system for computing the CMR calculation is shown in figure 2.The entire Cholesky, forward elimination, back substitution and WNV computations are performed on interleave 5 on the system shown in figure 3. Here division and inner product step computation are necessary. Once the covariance matrix elements are stored in the dual port RAM after block set N the Cholesky decomposition can commence on interleave 5 while in parallel the CMR computation on the next set of data can be processed on interleaves 0 to 4. A ROM block controls the addressing of the dual port RAM for retrieval of stored data to go onto the processor inputs and storage of the processor results. To achieve good dynamic resolution for the low word-length used, a systolic array scaling module is included between the RAM and the processor, whose scaling factors are also produced by the ROM controller along with the mode control. Overall timing in the system is controlled by three counters, qi(range 0 to 12),qb(range 0 to 23) and qw(range 0 to N)corresponding to the interleaves, bit-position and input word.A zero padded point DFT is computed on interleaves 6, 7, 8 and 9. This is basically amatrix vector multiplication and is computed by using the processor in inner product step mode. The system for this section consists of a ROM to provide storage of the twiddle factor matrix n W , another ROM to control the addressing of the twiddle factors for a particular qw and 4 registers which continuously recirculate the filterparameter results(n aˆ)from the Cholesky decomposition stage. On interleave 6 thereal and imaginary parts of the first set of products 1ˆa W i N • are alternately formed.Using a single flip-flop delay the results of these computations are then fed backinto the i Z input of the interleaved processor to be added to the products 2ˆaW i N • and the DFT is built up in this way. The dynamicrange of the PSD computation is quite high compared to the rest of the system, therefore, at this stage a floating point representation of the DFT results is taken using a systolic based conversion circuit. PIPO registers are used to store the 6 bit exponents of the real and imaginary parts of the DFT, whose squares are computed on interleave 10. On interleave 11 the absolute value of the DFT is computed. The maximum of each pair of real and imaginary results from interleave 10 is fed to the i Z input while the other value is piped into the i Y to be appropriately scaled by the difference in the two squared exponents appearing on the i X input. The PSD is then computed on interleave 12, involving N/2 divisions of the WNV formed on interleave 5 with the absolute values from interleave 11. The exponents of the PSD are then easily derived from the exponents of the DFT results.5. CONCLUSIONThis paper has proposed a bit-serial interleaved processor which can be programmed for use in division or inner product step computations. The interleaving idea was introduced in order to perform bit-serial division at the same high clock rate as multiplication without resorting to carry look-ahead schemes to remove the communication bottleneck. The result is a high throughput processor which is cost ef ficient in terms of VLSI implementation, since communication between PEs in the linear array is localised and control is very simple. An application in parametric spectral estimation, namely implementation of theModi fied Covariance spectral estimator, which makes full use of the interleaving scheme, was described. This system has been programmed and simulated using VHDL. Synthesis was targeted to exploit the resources of a Xilinx XC4036EX-2 FPGA. This type of FPGA has dual port RAM capability, where a 16x1 bit dual port RAM can be implemented in a single con figurable logic block (CLB). A dual port RAM cell is an area ef ficient method to implement a 13 or 14 bit SISO register, as used in the interleaving process. Such registers would otherwise have to be implemented using the pairs of flip flops in each CLB, i.e.7 CLBs. CLBs can also be con figured as ROM blocks which are useful for generating the address signals in the Cholesky and PSD modules, and for storage of the DFT twiddle factors. The processor design exhibits mostly localised communication to make use of the fast routing resources between nearest neighbours in the FPGA’s CLB matrix and enable high speed operation. Timing analysis of the FPGA layout shows that the maximum processor clock frequency of 35MHz allows real-time spectral estimation to be performed for the speci fied constraints. There-programmable aspect of the FPGA is also useful; rather than designing control logic to switch between the different values of N, which uses resources and is likely to slow clock speed, a different bit-stream can be downloaded for each N. This idea can also be extended for changing to higher model order estimations where otherwise it would be difficult to paramete rise p in such a system.6. REFERENCES[1] S. M. Kay, Modern Spectral Estimation-Theory & Application. Prentice Hall, 1988.[2] M. Kassam, K. W. Johnston, and R. S. C. Cobbold, “Quantitative estimation of spectral broadening for the diagnosis of carotid arterial disease: Method and in vitro results, ”Ultrasound in Medicine and Biology, vol.11, pp. 425–433, 1985.[3] W .P. Marnane, S. J. Bellis, and P. Larsson-Edefors, “Bitserial interleaved high-speed division, ”Electronics Letters, vol. 33, pp. 1124–1125, June 1997.[4] M. M. Madeira, S. J. Bellis, M. G. Ruano, and W. P. Marnane, “Configurable processing for real-time spectral estimation, ”in Preprints of AARTC98,pp. 209–214, 1998.[5] M. G. Ruano and P. J. Fish, “Cost/benefit criterion for selection of pulsed Doppler ultrasound spectral mean frequency and bandwidth estimation, ”IEE E Transactions on Biomedical Engineering, vol. 40, no. 12, pp. 1338–1341, 1993.[6] M. G. Ruano, D. F. G. Nocetti, P. J. Fish, and P. J. Fleming, “Alternative parallel implementationsof an AR-modified covariance spectral estimator for diagnostic ultrasound blood flow studies, ” Parallel Computing,vol. 19, no. 4, pp. 463476, 1993.[7] S. J. Bellis, P. J. Fish, and W. P. Marnane, “Optimal systolic arrays for real-time implementation of the Modified Covariance spectral estimator, ”Parallel Algorithms and Applications, vol .11, no. 1-2, pp. 71–96, 1997.[8] S. J. Bellis, W. P. Marnane, and P. J. Fish, “Alternative systolic array for non-square-root Cholesky decomposition, ”IEE Proceedings: Computers and Digital Techniques, vol. 144, pp. 57–64, Mar. 1997.[9] K. Hwang, Computer Arithmetic: Principles Architecture and Design. John Wiley & Sons, 1979.中文译文单一FPGA上的参数谱估计摘要参数化谱估计模型技术可以提供针对传统的短期快速傅里叶变换提高频率分辨率的方法,克服了窗口采样,时域,输入数据造成的限制。