Driver Drowsiness Classification Using Fuzzy Wavelet-Packet-Based Feature-Extraction Algorithm
- 格式:pdf
- 大小:744.40 KB
- 文档页数:11
基于驾驶员面部特征的疲劳检测系统研究刘馨雨1 李庭燎21.南京审计大学 统计与数据科学学院 江苏省南京市 2100002.南京审计大学 商学院 江苏省南京市 210000摘 要: 疲劳驾驶在我国的交通事故引发率居高不下,如何对驾驶员进行科学有效的疲劳驾驶检测并及时预警已经成为了当下热议的话题。
为减少疲劳驾驶造成的交通安全风险,本文对基于驾驶员面部特征的检测方法进行了研究,通过对疲劳特征参数的提取克服了单一参数疲劳驾驶判断方法导致的判定精准度低等缺陷,且计算需求小,普适性能强。
关键词:疲劳驾驶 检测 PERCLOS 人眼定位1 引言在交通事故的伤亡事件中,由于驾驶员困倦、疲劳驾驶等原因所致的交通事故发生率迅速增加,并逐步成为引发交通事故的主要原因。
在我国,每年有40%以上的道路交通事故是由大型汽车引起的,其中死亡的比例在21%以上;如果是在高速上,时速超过160公里,那么一旦发生车祸,驾驶员的死亡率就会接近100%[1]。
但是,在发生疲劳驾驶前,通过对驾驶员的实时监测,可以获取其以前的疲劳特性,并对其进行实时提醒,从而对其进行疲劳预警,如此一来,就可以将交通事故的发生率降到最低。
迄今为止,相关研究学者明确了许多人眼定位的方法。
在此之中,具有较高代表性的方法为:Anber Salma等采用基于面部特征的Alexnet混合驾驶员疲劳和分心检测模型[2];Zhang Tao通过分析非侵入性头皮EEG信号来探讨基于样本熵特征的多核算法对疲劳和正常受试者的分类性能,构建基于样本熵的多通道脑电真实驾驶疲劳检测方法[3];Bala等相关学者在研究中明确了基于遗传算法等来支持分析眼睛定位[4];Wang等学者建立基于相位滞后指数的图形注意网络以用于检测驾驶疲劳。
[5];MamunurRashid等学者基于随机子空间K-Nn的集成分类器,利用选定的脑电通道进行驾驶员疲劳检测[6]。
总而言之,现在已开发的许多的眼部定位算法,其基本都具有较多的计算量,部分算法的实际应用十分困难,部分算法对脸部图像的转动、移动等的改变十分敏感,进而将致使算法效率的下降。
铁路平交道口智能集控预警系统周 帅,周 培,武恒颉,周灵钰(南京铁道职业技术学院,南京 210056)摘要:传统铁路道口的安全防护措施因其低效性和高昂的成本,以及潜在的安全隐患,已经无法满足当前的发展需求。
为应对这一挑战,提出一种创新性的铁路道口无线通信智能集控预警系统,旨在确保铁路道口工作人员和交通参与者的生命财产安全。
结合远距离L o R a无线射频通信技术、基于O p e n M V的车辆与行人目标识别技术,计算机联锁技术等设计一款低功耗、易部署的无人值守的铁路道口远程集中管理系统,从而提升铁路运营效率,并增强铁路运输的安全保障水平。
关键词:铁路道口;无线通信;OpenMV;计算机联锁中图分类号:U284.15+1 文献标志码:A 文章编号:1673-4440(2024)03-0106-07Intelligent Centralized Control System for Early Warning of RailwayLevel CrossingsZhou Shuai, Zhou Pei, Wu Hengjie, Zhou Lingyu(Nanjing V ocational Institute of Railway Technology, Nanjing 210056, China) Abstract: The traditional safety protection measures for railway crossings can no longer meet the current development needs due to their low efficiency, high cost and potential safety hazards. To meet this challenge, this paper proposes an innovative intelligent centralized control system based on wireless communication for early warning of railway crossings, which aims to guarantee the life and property safety of the staff and traffi c participants at railway crossings. The LoRa long-range radio-frequency communication technology, OpenMV-based vehicle and pedestrian target recognition technology and computer based interlocking technology are combined to design a low-power, easy-to-deploy remote centralized management system for unattended railway crossings, which can improve the effi ciency of railway operation, and enhance the safety of railway transport.Keywords: railway crossing; wireless communication; OpenMV; computer based interlocking DOI: 10.3969/j.issn.1673-4440.2024.03.020收稿日期:2023-10-07;修回日期:2024-02-19基金项目:南京铁道职业技术学院大学生创新训练项目(yxkc2022026)发明专利:2023实用新型专利(ZL202320583648.0)第一作者:周帅(2003—),男,主要研究方向:城市轨道车辆应用技术,邮箱:*****************。
收稿日期:2020-03-06基金项目:国家自然科学基金项目(61771178)作者简介:李冰(1994-),男,杭州电子科技大学电子信息学院硕士研究生,研究方向为机器学习;陈龙(1979-),男,博士,杭州电子科技大学电子信息学院教授、硕士生导师,研究方向为嵌入式系统设计与应用、神经网络与机器学习。
本文通信作者:陈龙。
0引言近年来,随着汽车等交通工具的普及,交通事故发生率越来越高。
其中驾驶员疲劳驾驶是引发交通事故的主因。
疲劳驾驶指驾驶在长时间连续驾车后,生理机能进入失调状态,从而影响正常驾驶[1]。
目前对于疲劳驾驶的检测方法多为单一信号检测,常用检测方法主要包括:①基于驾驶员生理信号的检测。
该方法主要通过检测不同疲劳状态下人体脑电信号、心电信号或呼吸信号等出现的变化,判别驾驶员疲劳状态。
该方法需在驾驶员体表安装相应检测装置,这会影响驾驶员安全驾驶,可行性较低;②基于驾驶员面部表情特征的检测,该方法通过视频监测驾驶员面部表情特征,如眨眼、打哈欠等动作,并运用图像处理等方法提取分析数据以判断驾驶员疲劳状态。
该方法受驾驶员佩戴口罩、眼镜等行为的影响较大,疲劳检测准确度会随之降低。
针对目前疲劳驾驶检测方法存在的问题,本文采用多源信息结合的方式,将多种检测方法的优势结基于正则极限学习机的驾驶员疲劳状态分类方法李冰,陈龙(杭州电子科技大学电子信息学院,浙江杭州310018)摘要:为避免接触式疲劳检测方法给驾驶员带来干扰,解决单一信号源对于反映疲劳程度可靠性低的问题,实现对疲劳状态高精度、高速度的检测,提出一种基于正则极限学习机的驾驶员疲劳状态分类方法。
该方法通过多普勒雷达模块采集驾驶员生理信号,包括呼吸信号和心跳信号,作为神经网络输入数据。
通过多源信息结合的方式提高疲劳状态检测可靠性。
设计正则极限学习机(RELM )模型对数据集进行训练。
实验结果显示,基于RELM 算法模型检测驾驶员疲劳状态的准确率达92%。
基于EEG小波包子带能量比的疲劳驾驶检测方法叶柠;孙宇舸【期刊名称】《东北大学学报(自然科学版)》【年(卷),期】2012(033)008【摘要】驾驶员在从正常驾驶状态向疲劳驾驶状态变化的过程中,其脑电信号中的慢波逐渐增加,快波逐渐减少;针对这一特点,提出了一种基于小波包子带能量比的疲劳驾驶状态检测方法.采集和分析受试者模拟驾驶过程中的脑电信号,利用小波包分解系数计算出β波与慢波的能量比,将其作为疲劳指标F值.实验结果表明,尽管不同受试者的F值存在较大差异,但是对于同一受试者而言,F值随着驾驶时间的延长和疲劳程度的增加而逐渐降低,其相对于正常驾驶状态的衰减程度能够有效反映驾驶人的疲劳程度.%As a driver's state varies from normal driving to fatigue driving, the slowty claanging and the rapidly changing waves, which are in his/her EEG(electroencephalogram), increase and decrease gradually, respectively. So, it is possible to detect fatigue driving based on the energy ratio of wavelet packet sub-bands. The EEGs of some subjects were acquired and analyzed during their simulating driving. The energy ratio of β waves to slowly changing waves was computed from the decomposition coefficients of wavelet packets and it was defined as a fatigue index F. Experimental result shows that F value is dependent on subjects, and there are big difference among the subjects' F values. But for any subject,his/her F value decreases slowly with the increase of driving time andfatigue level. The attenuation level of F value compared with that of normal state can reflect the driver's fatigue level.【总页数】5页(P1088-1092)【作者】叶柠;孙宇舸【作者单位】东北大学信息科学与工程学院,辽宁沈阳110819;东北大学信息科学与工程学院,辽宁沈阳110819【正文语种】中文【中图分类】TP391【相关文献】1.基于EEG频谱特征的驾驶员疲劳监测研究 [J], 胡淑燕;郑钢铁2.基于EEG的驾驶疲劳识别算法及其有效性验证 [J], 郭孜政;牛琳博;吴志敏;肖琼;史磊3.基于EEG与EOG信号的疲劳驾驶状态综合分析 [J], 王福旺;王宏;罗旭4.基于EEG的驾驶疲劳监测与音乐播放控制系统 [J], 尹晶海;陈钰华;陈瑜5.基于驾驶行为的疲劳驾驶检测方法初探 [J], 牛超因版权原因,仅展示原文概要,查看原文内容请购买。
Driver Drowsiness Classification Using Fuzzy Wavelet-Packet-Based Feature-Extraction Algorithm Rami N.Khushaba∗,Sarath Kodagoda,Sara Lal,and Gamini DissanayakeAbstract—Driver drowsiness and loss of vigilance are a major cause of road accidents.Monitoring physiological signals while driving provides the possibility of detecting and warning of drowsi-ness and fatigue.The aim of this paper is to maximize the amount of drowsiness-related information extracted from a set of elec-troencephalogram(EEG),electrooculogram(EOG),and electro-cardiogram(ECG)signals during a simulation driving test.Specifi-cally,we develop an efficient fuzzy mutual-information(MI)-based wavelet packet transform(FMIWPT)feature-extraction method for classifying the driver drowsiness state into one of predefined drowsiness levels.The proposed method estimates the required MI using a novel approach based on fuzzy memberships providing an accurate-information content-estimation measure.The quality of the extracted features was assessed on datasets collected from31 drivers on a simulation test.The experimental results proved the significance of FMIWPT in extracting features that highly corre-late with the different drowsiness levels achieving a classification accuracy of95%–97%on an average across all subjects.Index Terms—Biosignal processing,driver drowsiness,feature extraction.I.I NTRODUCTIOND ROWSINESS is an intermediate state between wakeful-ness and sleep that has been defined as a state of progres-sive impaired awareness associated with a desire or inclination to sleep[1].In certain tasks,such as driving,drowsiness is consid-ered as a significant risk factor that substantially contributes to the increasing number of motor vehicle accidents each year[2]. Critical aspects of driving impairments associated with drowsi-ness are slow reaction times,reduced vigilance,and deficits in information processing that all lead to an abnormal drivingManuscript received June1,2010;revised August6,2010;accepted September4,2010.Date of publication September20,2010;date of current version December17,2010.This work is supported in part by the ARC Centre of Excellence programme,funded by the Australian Research Council and the New South Wales State Government,and in part by and the Centre for Intelligent and Mechatronics Systems,Sydney,Australia.Asterisk indicates corresponding author.∗R.N.Khushaba is with the ARC Centre of Excellence for Au-tonomous Systems,Faculty of Engineering and Information Technology,Uni-versity of Technology,Sydney,Broadway NSW2007,Australia(e-mail: rkhushab@.au).S.Kodagoda and G.Dissanayake are with the ARC Centre of Excellence for Autonomous Systems,Faculty of Engineering and Information Technology, University of Technology,Sydney,Broadway NSW2007,Australia(e-mail: sakoda@.au;gdissa@.au).l is with the Department of Medical and Molecular Biosciences,Faculty of Science,University of Technology,Sydney,Broadway NSW2007,Australia (e-mail:l@.au).Color versions of one or more of thefigures in this paper are available online at .Digital Object Identifier10.1109/TBME.2010.2077291behavior[3],[4].Driver drowsiness is usually used interchange-ably with the term driver fatigue;however,each of these terms has its own meaning.Fatigue is considered as one of the factors that can lead to drowsiness and is a consequence of physical labor or a prolonged experience,and is defined as a disinclina-tion to continue the task at hand[5].Driver fatigue is believed to account for35%–45%of all vehicle accidents[6].Some au-thors distinguish fatigue from drowsiness as the former does not fluctuate rapidly,over periods of a few seconds,as drowsiness. Usually,rest and inactivity relieves fatigue,however,it makes drowsiness worse[7].Different studies have been reported on driver drowsiness detection including methods identifying physiological associa-tions between driver drowsiness/fatigue and the corresponding patterns of the electroencephalogram(EEG)(brain activity), electroocculogram(EOG)(eye movement),and electrocardio-gram(ECG)(heart rate)signals[8]–[12].Most of these studies reported that the physiological approach to drowsiness detection can provide very accurate results as strong correlation between these signals and the driver’s cognitive state was found in many studies[9],[13].Specifically,many of these studies suggested that the change in the cognitive state can be associated with significant changes in the EEG frequency bands,such as delta (δ:0–4Hz),theta(θ:4–8Hz),alpha(α:8–13Hz),and beta(β: 13–20Hz)[14],[15],or their combinations[16],[17],and with changes in the eyelid parameters extracted from the EOG[18]. Additionally,heart rate variability was found to be applicable for the detection of drowsiness and fatigue using the ECG power spectrum[19].The crucial step in most of the aforementioned reported stud-ies includes the extraction of a set of features that correlate with drowsiness.As an example,Fu et al.[20]utilized probabilistic principal components analysis to extract features from62EEG channels to distinguish awake,drowsy,and sleep in a driving simulation study.A significant drop in classification accuracy from97%to89%was reported when classifying the extracted features into three classes rather than two binary classes(ei-ther awake and sleep)due to thefluctuations of the drowsy state.It was also reported that features extracted from high-frequency bands are more stable than features extracted from the low-frequency bands[20].A smaller number of channels was reported in other studies,for example,Lin et al.[21]uti-lized32EEG/EOG channels and2ECG channels,and employed power spectrum and independent component analysis with fuzzy neural networks for alertness estimation.A maximum testing accuracy of91.3%on an average across several subjects was re-ported.On the other hand,Yeo et al.[22]employed support vec-tor machines for classifying different frequency-spectrum-based0018-9294/$26.00©2011IEEEfeatures reporting very high classification results of99%accu-racy with19EEG channels and an EOG recording.The authors also reported that the achieved classification accuracy would be compromised,if less EEG channels were used[22].Despite the aforementioned results,there are a number of lim-itations associated with many of these previous studies.These include,most previous studies have attempted to use the same estimators for all subjects.However,the relatively large individ-ual variability in EEG dynamics accompanying loss of alertness means that,for many drivers,group statistics cannot be used to accurately predict changes in alertness and performance as denoted by Jung et al.[23].Another limitation in studies relat-ing drowsiness/fatigue to EEG/EOG/ECG spectrum is the use of a small number of spectral bands defined a priori,rather than exploring the benefits of using the full spectrum[14]–[17].It is also noticed that conventional techniques like the fast Fourier transform(FFT)were mostly employed for detecting changes in the spectral components of the corresponding signals[17],[23]. However,it is generally known that the FFT is not very suitable to extract features localized simultaneously in the time and fre-quency domains.The features extracted by FFT are of global nature either in time or frequency domains so that the interpre-tation of the results may not be straightforward.Additionally, the FFT is more suitable for analyzing stationary signals,while physiological signals like the EEG,tend to be nonstationary ones.This might in turn limit the accuracy of the drowsiness-detection system that is based on the aforementioned estimators and methods.On the other hand,it is generally known that methods like the wavelet transform and the wavelet packet transform(WPT)are more suitable than FFT when dealing with nonstationary sig-nals.Thus,an attempt for drowsiness detection with WPT-based feature-extraction method was utilized by Zhang et al.[24] on13EEG channels,1EOG,and1ECG channels.However, Shannon entropy was applied on the WPT to select the represen-tative features,but it is generally known that Shannon entropy is not suitable for classification problems but for compression problems[25].Thus,kernel-feature-projection techniques were necessary to provide an average accuracy of91%as the origi-nal features were not powerful representatives of the underlying problem.The aim of this paper is to develop an automated method of distinguishing different levels of drowsiness based on the EEG/EOG/ECG signals collected while performing simulated driving.Specifically,we propose to employ the WPT to con-struct features that highly correlate with alertness and the dif-ferent levels of drowsiness.The WPT is chosen due to its ability to deal with stationary,nonstationary,or transitory characteris-tics of different signals including abrupt changes,spikes,drifts, and trends[26].In order to automatically select the frequency components that are most suitable for constructing the drowsi-ness/alert features,a novel mutual information(MI)estimation measure is developed based on a generalization of the concept of fuzzy entropy.The justification for the need of such a measure is that it reduces the computational cost associated with esti-mating the MI(when comparing with other methods,such as Parzen density estimators and kernel density estimators for MI calculation)while providing an accurate estimation of the in-formation contents of the different features.Thus,different fea-tures are extracted for each subject using the proposed method that automatically optimizes the full frequency spectrum of the aforementioned signals for the best representation for detecting drowsiness.Additionally,unlike most attempts in the literature necessitating a large number of channels,only3EEG,1EOG, and1ECG channels are utilized in this paper to provide a more practical approach.The structure of the subsequent sections of the paper is as follows:Section II review the available WPT-based feature-extraction methods and justifies the need for the new method. Section III presents the proposed fuzzy-entropy-based MI mea-sure.The data-collection procedure is described in Section IV. Section V describesfirst the data-collection procedure and then presents the experimental results.Finally,a conclusion is pro-vided in Section VI.II.B ACKGROUNDA.Wavelet-Packet TransformBiomedical signals usually consist of brief high-frequency components closely spaced in time,accompanied by long last-ing,low-frequency components closely spaced in frequency. Wavelets are considered appropriate for analyzing such signals as they exhibit good frequency resolution along withfinite-time resolution;thefirst to localize the low-frequency com-ponents and the second to resolve the high-frequency compo-nents[27].The wavelet-packets transform,referred to as WPT subsequently,was introduced by Coifman et al.[28]by gener-alizing the link between multiresolution approximations and wavelets.The WPT may be thought of as a tree of sub-spaces,withΩ0,0representing the original signal space,i.e., the root node of the tree.In a general notation,the node Ωj,k,with j denoting the scale and k denoting the subband index within the scale,is decomposed into two orthogonal subspaces:an approximation spaceΩj,k→Ωj+1,2k plus a detail spaceΩj,k→Ωj+1,2k+1[29].This is done by divid-ing the orthogonal basisφj(t−2j k)k∈ZofΩj,k into two new orthogonal basesφj+1(t−2j+1k)k∈ZofΩj+1,2k andψj+1(t−2j+1k)k∈ZofΩj+1,2k+1[30],whereφj,k(t)and ψj,k(t)are the scaling and wavelet functions,respectively,that are given in[30]asφj,k(t)=1|2j|φt−2j k2j(1)ψj,k(t)=1|2j|ψt−2j k2j(2)where the dilation factor2j,also known as the scaling parameter, measures the degree of compression or scaling.On the other hand,the location parameter2j k,also known as translation parameter,determines the time location of the wavelet.The difference to the wavelet transform is that,for the sub-sequent decomposition levels,the WPT not only decomposes the approximation coefficients,but also the detail coefficients.KHUSHABA et al.:DRIVER DROWSINESS CLASSIFICATION USING FUZZY W A VELET-PACKET-BASED FEATURE-EXTRACTION ALGORITHM123Fig.1.Example of the wavelet-packet decompositions ofΩ0,0into tree-structured subspaces.This process is repeated J times,where J≤log2N,with N being the number of samples in the original signal.This in turn results in J×N coefficients.Thus,at resolution level j,where j=1,2,...,J,the tree has N coefficients divided into2j co-efficient blocks or crystals.This iterative process generates a binary wavelet-packet tree-like structure,where the nodes of the tree represent subspaces with different frequency localiza-tion characteristics.This is shown schematically in Fig.1with three levels of decomposition[29].B.Wavelet-Packet-Based Feature-Extraction MethodsThe problem of WPT-based feature extraction can be de-composed into two main tasks:feature construction and bases selection.In the feature-construction step,the goal is to utilize the WPT coefficients generated at each of the WPT tree sub-spaces(shown in Fig.1)to construct variables or properties that can best represent the classes of the signals at hand.On the other hand,the bases-selection problem is related to the iden-tification of the best bases(given an ensemble of bases)from which the constructed features can highly discriminate between the signals belonging to different problem classes.Early contri-butions to thisfield included the work presented by Coifman and Wickerhauser[31]in which they proposed the joint best basis (JBB)method utilizing Shannon entropy as a cost function for bases selection.A limitation of JBB method is that it is most suitable for compression tasks rather than classification tasks. The justification is that classification problems need a measure to evaluate the power of discrimination of each subspace in the tree-structured subspaces rather than the efficiency in represen-tation.To overcome such a limitation,the local discriminant basis(LDB)algorithm was proposed by Saito[25]employing the symmetric relative entropy.Both JBB and LDB are con-cerned with the energy levels of signals and were reported by Deqiang et al.[26]to exhibit some drawbacks when it comes to accentuating the discriminatory power essential in classifica-tion tasks.A fuzzy-set-based criterion was proposed by Deqiang et al.[26]in their fuzzy wavelet-packet method(or simply the FWP)to aid in the selection of the best basis showing better performance than JBB and LDB when classifying biomedical signals.On the other hand,the optimal wavelet packet(OWP) method proposed by Wang et al.[32]using Davies–Bouldin criterion managed to outperform the FWP when classifying biomedical signals.The justification for such a performance is that the feature representation in OWP attempts to overcome the lack of the shift-invariance property observed in the feature representation of FWP,JBB,and LDB as stated by Wang et al.[32].However,the distances between the classes or clusters’centers across each feature(or dimension)are used to judge on the features suitability for classification in both OWP and FWP(in the preprocessing step).It is generally known that considering distance measures alone may not be very useful to judge the corresponding features ability in separating different classes[33].A justification is that classes’means alone cannot be considered to estimate the classification error,since the error also depends on the overlap between the class likelihoods.Thus, a more powerful measure is required to identify the information content of the different features.C.Mutual Information-Based Information EstimationOne of the most suitable approaches for estimating the ar-bitrary dependency between random variables is based on the concept of MI.The concept of MI is tightly interconnected with the concept of uncertainty,where the MI between two random variables is viewed as the capacity to reduce the uncertainty about those variables[37],[38].The entropy,being a measure of uncertainty,is usually used to represent the MI between each feature or variable(denoted as f)and the class label(denoted as C)according to the following:I(f;C)=H(f)+H(C)−H(f,C)=H(f)−H(f|C)(3) where H(f)and H(C)are the marginal entropies of f and C,respectively,and H(f,C)and H(f|C)are the joint and conditional entropies of f and C,respectively.According to Fano[34],maximizing the MI between different features and the desired target can achieve the lowest probability of error.How-ever,in order to estimate the probability distribution function associated with a random variable,a certain density-estimation method needs to be employed for which many variants exist in the literature.Examples include the histograms,k-nearest neighbors(k NN),Parzen,and kernel density estimators[35]. The simplest and most widely used is the histogram approach in which one simply constructs the bins of the histogram by count-ing the number of occurrences of the values.Unfortunately, this method requires a huge number of samples to capture the density accurately.However,this will in turn lead to higher computational complexities.As an alternative to the aforemen-tioned approaches in MI computation,we propose to employ fuzzy memberships for the calculation of MI,or the associated entropies required for MI computation(depending on the ap-proach utilized).The justification behind the proposed approach is that fuzzy entropy and MI measures are able to reflect the ac-tual distribution of the classification patterns and this will in turn reduce the generated decision regions for classification[36]and124IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING,VOL.58,NO.1,JANUARY2011 will also provide more accurate estimation of the informationcontent.III.M UTUAL I NFORMATION-B ASED W A VELET-P ACKETF EATURE E XTRACTIONThere are two classical theories of uncertainty within whichwe can define the entropy,thefirst and the most well known isbased on the notion of probability,while the second is basedon the notion of possibility[38].In the probabilistic approach,Shannon entropy is a well-known measure of uncertainty and isextensively covered in the literature.An extension to Shannonentropy is the concept of fuzzy entropy,in which fuzzy sets areused to aid the estimation of the entropies.It should be high-lighted that the measuring of the fuzzy entropy is quite differentfrom the classical Shannon entropy since fuzzy entropy containsfuzziness uncertainties(possibilistic),while Shannon entropycontains randomness uncertainties(probabilistic).However,awell-defined fuzzy entropy must satisfy the four Luca–Terminiaxioms[39]given as follows.Thefirst task was to estimatethe required memberships of the samples along each dimension(or feature)in all of the problem classes.Several methods werereported in the literature for estimating the membership func-tions including the k NN approach[40],the well-known fuzzyc-means method(employed within the FWP)[41],and manyother variants[33],[42].Given the high computation cost as-sociated with the k NN approach and the singularity problem ofthe fuzzy c-means,we propose to use the following approachfor estimating the membership values.Given a universal set with elements x k distributed in a patternspace as X={x1,x2,...,x l},where k=1,2,...,l with lbeing the total number of patterns.For simplicity,It will beuseful to describe the membership value that the k th vector hasin the i th class with the following notation:μik=μi(x k)∈[0,1].(4)Denote the mean of the data sample that belong to class i asx i and the radius of the data as rr=max x i−x k σ.(5)Then the fuzzy membershipμik can be calculated as followsμik=x i−x k σr+−2m−1(6)where m is the fuzzification parameter,and >0is a small value to avoid singularity,andσis the standard deviation involved in the distance computation.Finally,the membership of each of the samples in all of the problem classes is normalized according to ci=1μik=1.A.Fuzzy Entropy and Mutual InformationLet X={x1,x2,...,x n}be a discrete random variable with afinite alphabet set containing n symbols,and letμA(x i)be the membership degree of the element x i to fuzzy set A,and F be a set-to-point mapping F:G(2X)→[0,1].Hence,F is a fuzzy set defined on fuzzy sets.F is an entropy measure if it satisfies the following Luca–Termini axioms[36],[39],[43]:1)F(A)=0iff A∈2X,where A is a nonfuzzy set and2Xindicates the power set of set A.2)F(A)=1iffμA(x i)=0.5for all i;3)F(A)≤F(B)if A is less fuzzy than B,i.e.,ifμA(x i)≤μB(x i)whenμB(x i)≤0.5andμA(x i)≥μB(x i)when μB(x i)≥0.5;4)F(A)=F(A c);where A c=(1−μA(x1),...,1−μA(x n)).Shannon entropy satisfies the aforementioned four De Luca–Termini axioms, where for a discrete random variable X with a probability mass function p(x i),Shannon entropy is defined byH(X)=−ip(x i)log2p(x i).(7)Using the proposed membership function in(6),we construct c-fuzzy sets along each specific feature f,each of these will in turn reflect the membership degrees of the samples in each of the c problem classes.The fuzzy equivalent to the joint probability of the training patterns that belong to class i is given here asP(f,c i)=k∈A iμikNP(8)where P(f,c i)can be interpreted as the degree by which the samples that are predefined to belong to class i does actually contribute to that specific class.A i is the set of indices of the training patterns belonging to class i,and NP is the total number of patterns.The joint fuzzy entropy of the elements of class i, denoted as H(f,c i),is then equal toH(f,c i)=−P f,cilog P f,ci.(9) In order to account for the entropy along all c-classes,the aforementioned entropy has to be summed along the universal set to generate the complete fuzzy entropy H(f,C)H(f,C)=ci=1H(f,c i).(10)The aforementioned entropy satisfies the four De Luca–Termini axioms and is termed as the joint fuzzy entropy.The aforementioned equations can be applied on the samples along each feature,thus computing the entropies associated with each feature.In order tofind the marginal entropy H(f)of each feature, we add the estimated membership values of the samples along each of the c-fuzzy sets S i as follows:P(f Si)=kμikNP.(11) Then the marginal entropy is found byH(f)=−P fS ilog P fS i.(12) Similarly,in order to construct the class marginal entropy H C,wefirstfind the fuzzy equivalent to the class probability P(c i).This is found by adding the estimated membership values along all of the generated fuzzy sets shown as follows:P(c i)=k∈A i,∀SμikNP.(13)KHUSHABA et al.:DRIVER DROWSINESS CLASSIFICATION USING FUZZY W A VELET-PACKET-BASED FEATURE-EXTRACTION ALGORITHM125 Then the marginal class entropy is found byH(C)=−P ci log P ci.(14)The required MI between each feature and the class label is then measured using(3)for each of the features.Finally,to compute the MI between each two variables(to measure the dependency between the two variables),one can multiply the corresponding membership vales from each feature along each fuzzy set and then employ(11)and(12)again tofind I(f1;f2) by using(3).B.Proposed Feature-Extraction MethodGiven a dataset of collected signal records,the WPT trans-form is applied on each record of the data acquired from each channel to generate a complete tree up to a level J decom-position.Considering each of the subspacesΩj,k of the WPT decomposition tree as a feature space,then features were simply constructed by using the normalizedfilter bank energy.This is simply formed for each subspace by accumulating the squares of the transform coefficients for that subspace divided by the number of coefficients in that subspace.Additionally,the log-arithmic operator was applied to normalize the distribution of the generated features as given belowEΩj,k =logn(w Tj,k,nx)2N/2j(15)where EΩj,k is the normalized logarithmic energy of thewavelet-packet coefficients extracted from the subspaceΩj,k. w j,k x is the wavelet-packet transformed signal(or simply the coefficients)evaluated at subspaceΩj,k,and N/2j is the number of the coefficients in that specific subspace.For a dataset of n features f i,where i=1,2,...,n,one can either utilize the estimated fuzzy MI measure or its normalized variant asF i=I(C;f i)H(f i)i=1,2,...,n.(16)Using(16),calculate the fuzzy-set-based criterion on all the features to evaluate their classification ability.Finally,rank the features according to the aforementioned measure,then remove their ascending and descending nodes or features,and choose the remaining features for classification.The algorithm is then summarized,in a similar manner to that of the OWP and FWP algorithms but with different feature representation and informa-tion measure than those utilized by these algorithms,as follows: Algorithm:The Fuzzy MI-based Wavelet-PacketAlgorithm-FMIWPT)Given a training dataset consisting of labeled original signals, 1)Step0:For each labeled original signal,perform a fullWPT decomposition to the maximum level J.For all j= 0,1,...,J and k=0,1,2,...,2j−1,construct features according to(15).2)Step1:Construct the associated fuzzy sets and computethe fuzzy entropies and MI.Then calculate F(Ωj,k)ac-cording to(16),whereΩj,k is the subspace representing each of the features.3)Step2:Determine the optimal WPT decomposition X∗,being the one that corresponds to the maximum value of F.4)Step3:In descending order,sort the subspaces by F,Ω={Ω(1),Ω(2),...,Ω(l)}.Let X∗=∅.5)Step4:Movefirst element inΩto X∗.6)Step5:∀Ω(k)∈Ω,ifΩ(k)is an ascendant(father)ordescendant(child)(direct or indirect)ofΩ(j1),remove Ω(k)fromΩ.7)Step6:ifΩ=∅,stop.Otherwise go to(3)and continue.8)Step7:The set X is thefinal FMIWPT-baseddecomposition.The aforementioned algorithm is applied to optimize the WPT tree of each data channel.That is,the aforementioned algorithm is utilized to extract features from each channel and then these features are all concatenated to form one large feature vector that will be used for classification.IV.D ATA-C OLLECTION P ROCEDUREThirty-one subjects(volunteer drivers,all males)aged be-tween20–69years were recruited to perform a driving simu-lation task.All participants provided informed consent prior to participating in the study.Lifestyle appraisal questionnaire was used as a selection criteria,which required participants to have no medical contraindications such as severe concomitant dis-ease,alcoholism,drug abuse,and psychological or intellectual problems likely to limit compliance[44].Most of the studies were conducted between9:00AM and1:00PM with the total study time involving2–3h per subject.This included completing questionnaires,physiological sensor attachment,and perform-ing the driving simulator task.The required ethical approval was obtained from the University Human Ethics Committee. Participants were asked to refrain from consuming caffeine, tea,or food,as well as smoking approximately4h and alco-hol24h before the study,and reported compliance with these instructions.STISIM driver,from Systems Technology,Inc.(STI),was utilized as driving simulator.The video display showed other cars,the driving environment,the current speed,pedestrians, and other road stimuli.The driving simulator equipment con-sisted of a large display unit with in-built steering wheel,brakes, and accelerator,as shown in Fig.2.All subjects were given in-structions on the operation of the simulator prior to the study. Two driving sessions were completed by each of the drivers. The initial driving session was approximately25min of alert driving,with a track involving many cars and stimuli on the road to serve as the baseline measure.The alert driving session was followed by monotonous driving session,in which partic-ipants were required to drive continuously for approximately 1h.This session involved the participants driving with very few road stimuli in a track resembling country-side driving. Simultaneous physiological measurements were recorded during the driving sessions.FlexComp Infiniti encoder,from。