of PCA KPCA and ICA for dimensionality reduction in SVM


Transient Stability Assessment Based on Kernel Principal Component Analysis and Deep Belief Network

TANG Wenquan, XU Wu, WEN Cong, GUO Xing (School of Electrical and Information Technology, Yunnan Minzu University, Kunming 650500, China)

Abstract: To address the poor real-time performance and high error rate of power system transient stability assessment, a transient stability assessment method that combines kernel principal component analysis with a deep belief network is proposed. First, a set of feature vectors reflecting the transient stability of the power system is constructed. Second, kernel principal component analysis is used to extract features from this set, which reduces the dimensionality of the feature vectors and filters out redundant features; the reduced feature vectors are then passed to the deep belief network. Finally, the network is trained. Training consists of pre-training and fine-tuning to optimize the network parameters, which improves the assessment accuracy of the deep belief network. Simulation results on the New England 10-machine 39-bus system show that the method effectively reduces the dimensionality of the input data, removes redundant features, lowers the error rate and test time of transient stability assessment, and judges the stability state of the power system accurately and quickly.

Key words: power system transient stability assessment; kernel principal component analysis; feature dimensionality reduction; deep belief network

CLC number: TM712; Document code: A; Article ID: 1673-6540(2021)01-0046-07; doi: 10.12177/emca.2020.1%5

Introduction

… complexity, assessing power system transient stability accurately and quickly has become increasingly difficult [1].


Application of 2DPCA Based Techniques in DCT Domain for Face Recognition

Messaoud Bengherabi1, Lamia Mezai1, Farid Harizi1, Abderrazak Guessoum2, and Mohamed Cheriet3

1 Centre de Développement des Technologies Avancées, Algeria, Division Architecture des Systèmes et MultiMédia, Cité 20 Aout, BP 11, Baba Hassen, Algiers, Algeria
bengherabi@, l_mezai@yahoo.fr, harizihourizi@yahoo.fr
2 Université Saad Dahlab de Blida, Algeria, Laboratoire Traitement de signal et d'imagerie, Route De Soumaa BP 270 Blida
guessouma@
3 École des Technologies Supérieur, Québec, Canada, Laboratoire d'Imagerie, de Vision et d'Intelligence Artificielle, 1100, Rue Notre-Dame Ouest, Montréal (Québec) H3C 1K3 Canada
mohamed.cheriet@gpa.etsmtl.ca

E. Corchado et al. (Eds.): CISIS 2008, ASC 53, pp. 243–250, 2009. © Springer-Verlag Berlin Heidelberg 2009

Abstract. In this paper, we introduce 2DPCA, DiaPCA and DiaPCA+2DPCA in the DCT domain for face recognition. The 2D DCT transform is used as a preprocessing step; 2DPCA, DiaPCA and DiaPCA+2DPCA are then applied to the upper left corner block of the global 2D DCT transform matrix of the original images. The ORL face database is used to compare the proposed approach with the conventional ones without DCT under four matrix similarity measures: Frobenius, Yang, Assembled Matrix Distance (AMD) and Volume Measure (VM). The experiments show that, in addition to the significant gain in both training and testing times, the recognition rate using 2DPCA, DiaPCA and DiaPCA+2DPCA in the DCT domain is generally better than, or at least competitive with, the recognition rates obtained by applying these three 2D appearance based statistical techniques directly to the raw pixel images, especially under the VM similarity measure.

Keywords: Two-Dimensional PCA (2DPCA), Diagonal PCA (DiaPCA), DiaPCA+2DPCA, face recognition, 2D Discrete Cosine Transform (2D DCT).

1 Introduction

Different appearance based statistical methods for face recognition have been proposed in the literature. The most popular ones are Principal Component Analysis (PCA) [1] and Linear Discriminant Analysis (LDA) [2], which process images as 2D holistic patterns. However, a limitation of PCA and LDA is that both involve eigen-decomposition, which is extremely time-consuming for high dimensional data.

Recently, a new technique called two-dimensional principal component analysis (2DPCA) was proposed by J. Yang et al. [3] for face recognition. Its idea is to estimate the covariance matrix from the original 2D training image matrices, resulting in a covariance matrix whose size is equal to the width of the images, which is quite small compared with the one used in PCA. However, the projection vectors of 2DPCA reflect only the variations between the rows of images, while discarding the variations between columns. A method called Diagonal Principal Component Analysis (DiaPCA) was proposed by D. Zhang et al. [4] to resolve this problem. DiaPCA seeks the projection vectors from diagonal face images [4] obtained from the original ones, so that the correlation between the variations of rows and those of columns is taken into account. An efficient 2D technique that results from the combination of DiaPCA and 2DPCA (DiaPCA+2DPCA) is also proposed in [4].

The discrete cosine transform (DCT) has been used as a feature extraction step in various studies on face recognition. This results in a significant reduction of computational complexity and better recognition rates [5, 6]. The DCT provides excellent energy compaction, and a number of fast algorithms exist for calculating it.

In this paper, we introduce 2DPCA, DiaPCA and DiaPCA+2DPCA in the DCT domain for face recognition. The DCT transform is used as a feature extraction step; 2DPCA, DiaPCA and DiaPCA+2DPCA are then applied only to the upper left corner block of the global DCT transform matrix of the original images.
Our proposed approach is tested against conventional approaches without DCT under four matrix similarity measures: Frobenius, Yang, Assembled Matrix Distance (AMD) and Volume Measure (VM).

The rest of this paper is organized as follows. In Section 2 we review the 2DPCA, DiaPCA and DiaPCA+2DPCA approaches and the different matrix similarity measures. In Section 3 we present our contribution. In Section 4 we report the experimental results and highlight a possible perspective of this work. Finally, in Section 5 we conclude the paper.

2 Overview of 2DPCA, DiaPCA, DiaPCA+2DPCA and Matrix Similarity Measures

2.1 Overview of 2DPCA, DiaPCA and DiaPCA+2DPCA

2.1.1 Two-Dimensional PCA

Given M training face images, denoted by m×n matrices $A_k$ (k = 1, 2, …, M), two-dimensional PCA (2DPCA) first uses all the training images to construct the image covariance matrix G given by [3]

$G = \frac{1}{M}\sum_{k=1}^{M}(A_k - \bar{A})^{T}(A_k - \bar{A})$    (1)

where $\bar{A}$ is the mean image of all training images. The projection axes of 2DPCA, $X_{opt} = [x_1, \ldots, x_d]$, are then obtained by solving the algebraic eigenvalue problem $G x_i = \lambda_i x_i$, where $x_i$ is the eigenvector corresponding to the i-th largest eigenvalue of G [3]. The low dimensional feature matrix C of a test image matrix A is extracted by

$C = A X_{opt}$    (2)

In Eq. (2) the dimension of the 2DPCA projector $X_{opt}$ is n×d, and the dimension of the 2DPCA feature matrix C is m×d.

2.1.2 Diagonal Principal Component Analysis

Suppose that there are M training face images, denoted by m×n matrices $A_k$ (k = 1, 2, …, M). For each training face image $A_k$, we calculate the corresponding diagonal face image $B_k$ as defined in [4]. Based on these diagonal faces, the diagonal covariance matrix is defined as [4]:

$G_{DIAG} = \frac{1}{M}\sum_{k=1}^{M}(B_k - \bar{B})^{T}(B_k - \bar{B})$    (3)

where $\bar{B} = \frac{1}{M}\sum_{k=1}^{M} B_k$ is the mean diagonal face. According to Eq. (3), the projection vectors $X_{opt} = [x_1, \ldots, x_d]$ can be obtained by computing the d eigenvectors corresponding to the d largest eigenvalues of $G_{DIAG}$. The training faces $A_k$ are projected onto $X_{opt}$, yielding m×d feature matrices

$C_k = A_k X_{opt}$    (4)

Given a test face image A, first use Eq. (4) to get the feature matrix $C = A X_{opt}$; a matrix similarity metric can then be used for classification.

2.1.3 DiaPCA+2DPCA

Suppose the n×d matrix $X = [x_1, \ldots, x_d]$ is the projection matrix of DiaPCA. Let $Y = [y_1, \ldots, y_q]$ be the projection matrix of 2DPCA, computed as follows. When the height m is equal to the width n, Y is obtained by computing the q eigenvectors corresponding to the q largest eigenvalues of the image covariance matrix $\frac{1}{M}\sum_{k=1}^{M}(A_k - \bar{A})^{T}(A_k - \bar{A})$. When the height m is not equal to the width n, Y is obtained by computing the q eigenvectors corresponding to the q largest eigenvalues of the alternative image covariance matrix $\frac{1}{M}\sum_{k=1}^{M}(A_k - \bar{A})(A_k - \bar{A})^{T}$. Projecting the training faces $A_k$ onto X and Y together yields the q×d feature matrices

$D_k = Y^{T} A_k X$    (5)

Given a test face image A, first use Eq. (5) to get the feature matrix $D = Y^{T} A X$; a matrix similarity metric can then be used for classification.

2.2 Overview of Matrix Similarity Measures

An important aspect of 2D appearance based face recognition approaches is the similarity measure between matrix features used at the decision level. In our work, we have used four matrix similarity measures.

2.2.1 Frobenius Distance

Given two feature matrices $A = (a_{ij})_{m \times d}$ and $B = (b_{ij})_{m \times d}$, the Frobenius distance [7] is given by:

$d_F(A, B) = \left(\sum_{i=1}^{m}\sum_{j=1}^{d}(a_{ij} - b_{ij})^{2}\right)^{1/2}$    (6)

2.2.2 Yang Distance Measure

Given two feature matrices $A = (a_{ij})_{m \times d}$ and $B = (b_{ij})_{m \times d}$, the Yang distance [7] is given by:

$d_Y(A, B) = \sum_{j=1}^{d}\left(\sum_{i=1}^{m}(a_{ij} - b_{ij})^{2}\right)^{1/2}$    (7)

2.2.3 Assembled Matrix Distance (AMD)

A distance metric called the assembled matrix distance (AMD), which calculates the distance between two feature matrices, was proposed recently by Zuo et al. [7]. Given two feature matrices $A = (a_{ij})_{m \times d}$ and $B = (b_{ij})_{m \times d}$, the assembled matrix distance $d_{AMD}(A, B)$ is defined as follows:

$d_{AMD}(A, B) = \left(\sum_{j=1}^{d}\left(\sum_{i=1}^{m}(a_{ij} - b_{ij})^{2}\right)^{p/2}\right)^{1/p}, \quad p > 0$    (8)

It was experimentally verified in [7] that the best recognition rate is obtained when p ≤ 0.125, and that the rate decreases as p increases. In our work the parameter p is set to 0.125.

2.2.4 Volume Measure (VM)

The VM similarity measure is based on the theory of high-dimensional geometry space. The volume of an m×n matrix A of rank p is given by [8]

$\mathrm{Vol}\,A = \sqrt{\sum_{(I,J) \in N} \det{}^{2}(A_{IJ})}$    (9)

where $A_{IJ}$ denotes the submatrix of A with rows I and columns J, N is the index set of the p×p nonsingular submatrices of A, and if p = 0, then Vol A = 0 by definition.

3 The Proposed Approach

In this section, we introduce 2DPCA, DiaPCA and DiaPCA+2DPCA in the DCT domain for face recognition. The DCT is a popular technique in image and video compression, and was first applied to image compression in 1974 by Ahmed et al. [9]. Applying the DCT to an input sequence decomposes it into a weighted sum of basis cosine sequences. Our methodology uses the 2D DCT as a feature extraction or preprocessing step; 2DPCA, DiaPCA and DiaPCA+2DPCA are then applied to the w×w upper left block of the global 2D DCT transform matrix of the original images. In this approach, we keep only a sub-block containing the first coefficients of the 2D DCT matrix, as shown in Fig. 1, because the most significant information is contained in these coefficients.

Fig. 1. Feature extraction in our approach

With this approach, and contrary to what is presented in the literature on DCT-based face recognition, the 2D structure is kept while the dimensionality reduction is carried out. Then 2DPCA, DiaPCA and DiaPCA+2DPCA are applied to the w×w block of 2D DCT coefficients. The training and testing block diagrams describing the proposed approach are illustrated in Fig. 2.

Fig. 2. Block diagram of 2DPCA, DiaPCA and DiaPCA+2DPCA in DCT domain. Training: training data → 2D DCT → w×w block of 2D DCT coefficients → training algorithm based on 2DPCA, DiaPCA or DiaPCA+2DPCA → trained model. Testing: test image → 2D DCT → w×w block of 2D DCT coefficients → projection of the DCT block of the test image using the eigenvectors of 2DPCA, DiaPCA or DiaPCA+2DPCA → comparison using Frobenius, Yang, AMD or VM → decision.

4 Experimental Results and Discussion

In this part, we evaluate the performance of 2DPCA, DiaPCA and DiaPCA+2DPCA in the DCT domain and compare it to the original 2DPCA, DiaPCA and DiaPCA+2DPCA methods. All the experiments are carried out on a Pentium 4 PC with a 3.2 GHz CPU and 1 GB of memory. Matlab [10] is used to carry out these experiments. The database used in this research is the ORL [11] (Olivetti Research Laboratory) face database. This database contains 400 images of 40 individuals; for each person there are 10 different images of size 112×92 pixels. For some subjects, the images were captured at different times. The facial expressions and facial appearance also vary. Ten images of one person from the ORL database are shown in Fig. 3.

In our experiment, we have used the first five image samples per class for training and the remaining images for testing, so the total numbers of training samples and test samples were both 200. Without DCT, the diagonal covariance matrix is 92×92 and each feature matrix is 112×p, where p varies from 1 to 92.
However, with DCT preprocessing the dimension of these matrices depends on the w×w DCT block, where w varies from 8 to 64. We have calculated the recognition rate of 2DPCA, DiaPCA and DiaPCA+2DPCA with and without DCT.

In this experiment, we have investigated the effect of the matrix metric on the performance of the 2D face recognition approaches presented in Section 2. We see from Table 1 that the VM provides the best results whereas the Frobenius distance gives the worst ones. This is justified by the fact that the Frobenius metric is just the sum of the Euclidean distances between two feature vectors in a feature matrix, so this measure is not compatible with the high-dimensional geometry theory [8].

Fig. 3. Ten images of one subject in the ORL face database, (a) Training, (b) Testing

Table 1. Best recognition rates of 2DPCA, DiaPCA and DiaPCA+2DPCA without DCT

Methods        Frobenius        Yang              AMD (p = 0.125)   Volume Distance
2DPCA          91.50 (112×8)    93.00 (112×7)     95.00 (112×4)     95.00 (112×3)
DiaPCA         91.50 (112×8)    92.50 (112×10)    91.50 (112×8)     94.00 (112×9)
DiaPCA+2DPCA   92.50 (16×10)    94.00 (13×11)     93.00 (12×6)      96.00 (21×8)

Tables 2 and 3 summarize the best performances under different 2D DCT block sizes and different matrix similarity measures.

Table 2. 2DPCA, DiaPCA and DiaPCA+2DPCA under different DCT block sizes using the Frobenius and Yang matrix distances. Entries are the best recognition rate (feature matrix dimension).

2D DCT       Frobenius                                          Yang
block size   2DPCA           DiaPCA          DiaPCA+2DPCA       2DPCA           DiaPCA          DiaPCA+2DPCA
8×8          91.50 (8×8)     91.50 (8×6)     91.50 (6×6)        93.50 (8×6)     93.50 (8×5)     93.50 (8×5)
9×9          92.00 (9×9)     92.00 (9×5)     92.00 (9×5)        93.00 (9×6)     95.00 (9×9)     95.00 (9×9)
10×10        91.50 (10×5)    92.00 (10×5)    92.00 (10×5)       94.50 (10×6)    95.50 (10×9)    95.50 (10×9)
11×11        92.00 (11×8)    91.50 (11×5)    92.00 (9×5)        94.00 (11×6)    95.50 (11×5)    95.50 (11×5)
12×12        92.00 (12×8)    91.50 (12×10)   91.50 (9×5)        94.50 (12×6)    95.50 (12×5)    95.50 (12×5)
13×13        91.50 (13×7)    92.00 (13×11)   92.00 (12×11)      94.50 (13×6)    95.00 (13×5)    95.00 (11×5)
14×14        92.00 (14×7)    91.50 (14×7)    92.00 (12×7)       94.50 (14×6)    94.50 (14×5)    95.00 (12×5)
15×15        91.50 (15×5)    91.50 (15×5)    92.00 (13×15)      94.00 (15×9)    94.50 (15×5)    95.50 (12×5)
16×16        92.50 (16×10)   91.50 (16×11)   92.00 (4×10)       94.00 (16×7)    94.50 (16×5)    95.00 (12×5)
32×32        92.00 (32×6)    91.50 (32×6)    92.00 (11×7)       93.00 (32×6)    93.50 (32×5)    95.00 (12×5)
64×64        91.50 (64×6)    91.00 (32×6)    92.00 (14×12)      93.00 (64×7)    93.50 (64×5)    95.00 (12×5)

From these four tables, we notice that, in addition to the importance of the matrix similarity measure, the use of DCT always gives better performance in terms of recognition rate. This holds for all matrix measures; we only have to choose the DCT block size and an appropriate feature matrix dimension. An important remark is that a block size of 16×16 or less is sufficient to obtain the optimal performance, which results in a significant reduction in training and testing time.

Table 3. 2DPCA, DiaPCA and DiaPCA+2DPCA under different DCT block sizes using the AMD distance and VM similarity measure on the ORL database. Entries are the best recognition rate (feature matrix dimension).

2D DCT       AMD                                                VM
block size   2DPCA           DiaPCA          DiaPCA+2DPCA       2DPCA           DiaPCA          DiaPCA+2DPCA
8×8          94.00 (8×4)     95.00 (8×6)     95.00 (7×5)        96.00 (8×3)     93.50 (8×4)     93.50 (8×4)
9×9          94.50 (9×4)     94.50 (9×5)     94.50 (9×5)        95.00 (9×4)     95.00 (9×5)     95.00 (9×5)
10×10        94.50 (10×4)    95.50 (10×5)    96.00 (9×7)        95.00 (10×3)    95.00 (10×4)    95.00 (10×4)
11×11        95.50 (11×5)    96.00 (11×5)    96.50 (9×6)        94.50 (11×3)    95.50 (11×3)    95.50 (11×3)
12×12        95.50 (12×5)    96.50 (12×7)    96.50 (9×7)        95.50 (12×5)    96.00 (12×5)    96.00 (11×5)
13×13        96.00 (13×4)    95.50 (13×5)    95.50 (12×5)       96.00 (13×9)    96.00 (13×5)    96.50 (10×5)
14×14        96.00 (14×4)    95.00 (14×5)    95.50 (10×5)       95.00 (14×3)    95.50 (14×5)    96.50 (10×5)
15×15        96.00 (15×4)    95.00 (15×5)    96.00 (9×7)        96.00 (15×8)    96.00 (15×5)    96.50 (10×5)
16×16        96.00 (16×4)    95.50 (16×5)    96.50 (12×5)       95.50 (16×8)    96.00 (16×5)    96.50 (10×5)
32×32        95.50 (32×4)    95.00 (32×9)    96.00 (11×5)       95.00 (32×3)    95.50 (32×5)    96.50 (9×5)
64×64        95.00 (64×4)    94.50 (64×9)    96.00 (12×5)       95.00 (64×3)    95.00 (64×5)    96.50 (21×5)

This significant gain in computation is better illustrated in Tables 4 and 5, which give the total training and total testing times (in seconds) for the 200 training and 200 test samples of the ORL database under 2DPCA, DiaPCA and DiaPCA+2DPCA without and with DCT, respectively. We should mention that the computation of the DCT was not taken into consideration when computing the training and testing times of the DCT based approaches.

Table 4. Training and testing time without DCT using the Frobenius matrix distance

Methods                2DPCA            DiaPCA           DiaPCA+2DPCA
Training time in sec   5.837 (112×8)    5.886 (112×8)    10.99 (16×10)
Testing time in sec    1.294 (112×8)    2.779 (112×8)    0.78 (16×10)

Table 5. Training and testing time with DCT using the Frobenius distance and the same matrix-feature dimensions as in Table 2

2D DCT       Training time in sec               Testing time in sec
block size   2DPCA   DiaPCA   DiaPCA+2DPCA      2DPCA   DiaPCA   DiaPCA+2DPCA
8×8          0.047   0.047    0.047             0.655   0.704    0.61
9×9          0.048   0.048    0.124             0.626   0.671    0.656
10×10        0.047   0.048    0.094             0.611   0.719    0.625
11×11        0.048   0.047    0.063             0.578   0.734    0.5
12×12        0.063   0.046    0.094             0.641   0.764    0.657
13×13        0.062   0.047    0.126             0.642   0.843    0.796
14×14        0.079   0.062    0.14              0.656   0.735    0.718
15×15        0.094   0.078    0.173             0.641   0.702    0.796
16×16        0.125   0.141    0.219             0.813   0.827    0.829

We can conclude from this experiment that the proposed approach is very efficient in weakly constrained environments, which is the case for the ORL database.

5 Conclusion

In this paper, 2DPCA, DiaPCA and DiaPCA+2DPCA are introduced in the DCT domain. The main advantage of the DCT transform is that it discards redundant information and can be used as a feature extraction step, so computational complexity is significantly reduced. The experimental results show that, in addition to the significant gain in both training and testing times, the recognition rate using 2DPCA, DiaPCA and DiaPCA+2DPCA in the DCT domain is generally better than, or at least competitive with, the recognition rates obtained by applying these three techniques directly to the raw pixel images, especially under the VM similarity measure. The proposed approaches will be very efficient for real time face identification applications such as telesurveillance and access control.

References

1. Turk, M., Pentland, A.: Eigenfaces for Recognition. Journal of Cognitive Neuroscience 3(1), 71–86 (1991)
2. Belhumeur, P.N., Hespanha, J.P., Kriegman, D.J.: Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection. IEEE Trans. on Patt. Anal. and Mach. Intel. 19(7), 711–720 (1997)
3. Yang, J., Zhang, D., Frangi, A.F., Yang, J.Y.: Two-Dimensional PCA: A New Approach to Appearance-Based Face Representation and Recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 26(1), 131–137 (2004)
4. Zhang, D., Zhou, Z.H., Chen, S.: Diagonal Principal Component Analysis for Face Recognition. Pattern Recognition 39(1), 140–142 (2006)
5. Hafed, Z.M., Levine, M.D.: Face recognition using the discrete cosine transform. International Journal of Computer Vision 43(3) (2001)
6. Chen, W., Er, M.J., Wu, S.: PCA and LDA in DCT domain. Pattern Recognition Letters 26(15), 2474–2482 (2005)
7. Zuo, W., Zhang, D., Wang, K.: An assembled matrix distance metric for 2DPCA-based image recognition. Pattern Recognition Letters 27(3), 210–216 (2006)
8. Meng, J., Zhang, W.: Volume measure in 2DPCA-based face recognition. Pattern Recognition Letters 28(10), 1203–1208 (2007)
9. Ahmed, N., Natarajan, T., Rao, K.: Discrete cosine transform. IEEE Trans. on Computers 23(1), 90–93 (1974)
10. Matlab, The Language of Technical Computing, Version 7 (2004),
11. ORL. The ORL face database at the AT&T (Olivetti) Research Laboratory (1992), /facedatabase.html
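The 2DPCA projection of Section 2.1.1 and the assembled matrix distance of Section 2.2.3 can be sketched in a few lines. This is only an illustrative sketch, not the authors' Matlab code: it assumes NumPy, the function names `fit_2dpca`, `project` and `amd_distance` are hypothetical, and the distance follows the definition in [7], under which p = 2 gives the Frobenius distance and p = 1 the Yang distance.

```python
import numpy as np

def fit_2dpca(images, d):
    """Return the mean image and the n x d projection matrix X_opt."""
    A = np.asarray(images, dtype=float)        # shape (M, m, n)
    mean = A.mean(axis=0)
    centered = A - mean
    # G = (1/M) * sum_k (A_k - mean)^T (A_k - mean), an n x n matrix
    G = np.einsum("kij,kil->jl", centered, centered) / A.shape[0]
    eigvals, eigvecs = np.linalg.eigh(G)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:d]      # indices of the d largest
    return mean, eigvecs[:, order]

def project(image, X_opt):
    """Feature matrix C = A X_opt, of shape m x d."""
    return np.asarray(image, dtype=float) @ X_opt

def amd_distance(A, B, p=0.125):
    """Assembled matrix distance between two m x d feature matrices."""
    diff = np.asarray(A, dtype=float) - np.asarray(B, dtype=float)
    col_norms = np.sqrt((diff ** 2).sum(axis=0))   # Euclidean norm of each column
    return float((col_norms ** p).sum() ** (1.0 / p))

# Toy data standing in for w x w DCT blocks or raw images (random, not ORL).
rng = np.random.default_rng(0)
train = rng.normal(size=(20, 6, 5))
mean_image, X_opt = fit_2dpca(train, d=2)
C0 = project(train[0], X_opt)
C1 = project(train[1], X_opt)
distance = amd_distance(C0, C1)
```

A nearest-neighbour classifier would then assign a test image to the class of the training feature matrix with the smallest distance.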

KPCA

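4. Application of KPCA in an Electronic Nose System
Sensor array signals → feature extraction → SVM multi-class classification model → output of classification results
Feature extraction methods compared: KPCA, PCA, ICA, and the original features
Data set (number of samples in each subset):
Gas            HCHO   C6H6   C7H8   CO   NH3   NO2
Training set   156    99     40     35   29    18
Testing set    52     33     13     12   10    6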
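2. Analysis of the PCA Principle
Let $X = (X_1, \ldots, X_p)^T$, and write $E(X) = \mu$, $\mathrm{Cov}(X) = \Sigma$.
$Y_1 = a_{11}X_1 + a_{12}X_2 + \cdots + a_{1p}X_p = a_1^T X$
$Y_2 = a_{21}X_1 + a_{22}X_2 + \cdots + a_{2p}X_p = a_2^T X$
…
$Y_p = a_{p1}X_1 + a_{p2}X_2 + \cdots + a_{pp}X_p = a_p^T X$
(4) Let $\mathrm{Var}(Y_1) = \lambda$; then, with $a_1^T a_1 = 1$,
$\lambda = a_1^T \Sigma a_1$, that is, $\Sigma a_1 = \lambda a_1$.
Here $\lambda$ is an eigenvalue of the covariance matrix of X, and $a_1$ is the corresponding eigenvector. When $\lambda$ is largest, the variance of $Y_1$ is at its maximum, so the corresponding eigenvector $a_1$ is the direction of the first principal axis, and $Y_1 = a_1^T X$ is called the first principal component.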
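The step from Var(Y_1) = λ to the eigenvalue equation is not spelled out on the slide. One standard way to fill it in uses a Lagrange multiplier. The unit-norm constraint on a_1 is the usual convention and is an assumption here, since the slide text is incomplete at this point.

```latex
\begin{aligned}
\max_{a_1}\ & \operatorname{Var}(Y_1) = a_1^{T}\Sigma a_1
  \quad \text{subject to } a_1^{T}a_1 = 1,\\
L(a_1,\lambda) &= a_1^{T}\Sigma a_1 - \lambda\,(a_1^{T}a_1 - 1),\\
\frac{\partial L}{\partial a_1} &= 2\Sigma a_1 - 2\lambda a_1 = 0
  \;\Longrightarrow\; \Sigma a_1 = \lambda a_1,\\
\operatorname{Var}(Y_1) &= a_1^{T}\Sigma a_1 = \lambda\,a_1^{T}a_1 = \lambda .
\end{aligned}
```

Every stationary point is therefore an eigenvector of Σ, and the variance attained there equals its eigenvalue, so the maximum is reached at the eigenvector with the largest eigenvalue.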
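5. Summary
PCA:
● A linear mapping method; it ignores relationships of order higher than 2 among the data
● Operates on the feature dimension
● The new features are linear combinations of the original features, so their physical meaning is clear
KPCA:
● A nonlinear extension of PCA that extracts principal components by a nonlinear method
● Operates on the sample dimension (the number of features equals the number of input samples)
● The physical meaning of the new features is unclear
● Suited to nonlinear feature extraction problems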
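To make the "sample dimension" point concrete, here is a minimal KPCA sketch. It is an illustration under stated assumptions rather than the method used in the slides: it assumes NumPy and a Gaussian kernel, and the names `gaussian_kernel_matrix`, `kpca_fit_transform` and the width `sigma` are hypothetical. The eigen-decomposition is carried out on the M×M kernel matrix, so its size depends on the number of samples and not on the number of original features.

```python
import numpy as np

def gaussian_kernel_matrix(X, sigma):
    """K[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2)) for the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))

def kpca_fit_transform(X, n_components, sigma=1.0):
    """Project the training samples onto the leading kernel principal components."""
    X = np.asarray(X, dtype=float)
    M = X.shape[0]
    K = gaussian_kernel_matrix(X, sigma)
    ones = np.full((M, M), 1.0 / M)
    # Centre the mapped data in feature space without computing the mapping.
    Kc = K - ones @ K - K @ ones + ones @ K @ ones
    eigvals, eigvecs = np.linalg.eigh(Kc)          # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    lam, u = eigvals[order], eigvecs[:, order]
    # With coefficients alpha = u / sqrt(lam), the projection Kc @ alpha
    # of the training samples simplifies to u * sqrt(lam).
    return u * np.sqrt(np.maximum(lam, 0.0))

# Toy data: 12 samples with 3 features each (random, for illustration only).
rng = np.random.default_rng(1)
samples = rng.normal(size=(12, 3))
Z = kpca_fit_transform(samples, n_components=2, sigma=1.5)
```

The returned columns are the nonlinear principal components of the training set; projecting new samples would additionally require the kernel values between each new sample and the training samples.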
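In short: finding the principal components amounts to an eigenvalue decomposition of the covariance matrix of the original data matrix. Once the eigenvalues are sorted in descending order, the eigenvectors belonging to the first k eigenvalues give the best k-dimensional projection directions.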
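The procedure in this summary can be sketched directly. This is a minimal illustration that assumes NumPy; the names `pca_fit` and `pca_transform` are hypothetical and not taken from the slides.

```python
import numpy as np

def pca_fit(X, k):
    """Return the mean, the k leading axes (as columns) and their eigenvalues."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)       # covariance of the features
    eigvals, eigvecs = np.linalg.eigh(cov)     # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:k]      # sort descending, keep k
    return mean, eigvecs[:, order], eigvals[order]

def pca_transform(X, mean, axes):
    """Project samples onto the selected principal axes."""
    return (np.asarray(X, dtype=float) - mean) @ axes

# Points lying exactly on the line y = 2x: one direction carries all variance.
data = np.array([[t, 2.0 * t] for t in range(5)])
mean, axes, variances = pca_fit(data, k=1)
scores = pca_transform(data, mean, axes)
```

For this toy data the single retained axis is parallel to (1, 2), and the scores are the signed positions of the points along that line.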

Principal Manifolds and Probabilistic

INTRODUCTION
sources. ICA's proficiency in "blind source separation" [20] has found a particular niche in the analysis of EEG [23] and fMRI [25] signals of the brain. Nonlinear PCA (NLPCA) [21], [12], nonlinear Principal Surfaces [15], [16], "kernel" PCA [40], and nonlinear latent variable models [14] are various extensions of these linear techniques. In the following section, we will review some of these principal manifolds, their derivation, and consequent statistical properties. In Section 3, an alternative technique using probabilistic subspaces is presented and its performance is compared to principal manifolds in Section 4. We conclude with a discussion of the pros and cons of the different recognition techniques in Section 5.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE,

Theory and Methods of Manifold Learning

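Research on Intrinsic Dimension
❖ PCA determines the reduced dimension from the variance ratio
❖ ISOMAP estimates the dimension from the elbow in the variance loss
❖ Other approaches: nearest-neighbor fractal dimension, packing numbers, geodesic minimum spanning tree
Research on intrinsic dimension based on packing numbers
Quantitative research
❖ How does the intrinsic dimension of a high-dimensional data set affect the manifold structure in the high-dimensional space? No general study exists.
… Isomap algorithm: automatically selects the neighborhood parameter for Isomap
Wang Jing, Zhang Zhenyue, Zha Hongyuan. Adaptive Manifold Learning, 2004: adaptively selects the neighborhood parameter at each sample point
Zhang Junping: uses an ensemble approach to reduce the instability of manifold learning
❖ Manifold learning for data streams
❖ Isomap is built on MDS and aims to preserve the intrinsic geometry of the data points, that is, the geodesic distance between any two points.
❖ Its main difference from MDS is that the distance matrix constructed by MDS reflects the Euclidean distances between sample points, whereas the distance matrix constructed by Isomap reflects the geodesic distances between them.
❖ The geodesic distance is approximated as follows: between a sample point and the points in its neighborhood, the geodesic distance is replaced by the Euclidean distance; between a sample point and the points outside its neighborhood, it is replaced by the length of the shortest path between them on the manifold.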
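The geodesic approximation described above can be sketched with the standard library alone. This is an illustrative sketch, not code from the slides: the function name `geodesic_distances`, the use of k nearest neighbours to define the neighbourhood, and the choice of the Floyd–Warshall algorithm for shortest paths are all assumptions made for the example.

```python
import math

def geodesic_distances(points, k):
    """Approximate geodesic distances through a k-nearest-neighbour graph."""
    n = len(points)
    euclid = [[math.dist(p, q) for q in points] for p in points]
    # Start with no edges: infinite distance everywhere except the diagonal.
    g = [[0.0 if i == j else math.inf for j in range(n)] for i in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i), key=lambda j: euclid[i][j])
        for j in others[:k]:
            # Inside a neighbourhood the Euclidean distance is used directly.
            g[i][j] = g[j][i] = euclid[i][j]
    # Outside it, take the shortest path through the graph (Floyd-Warshall).
    for via in range(n):
        for i in range(n):
            for j in range(n):
                if g[i][via] + g[via][j] < g[i][j]:
                    g[i][j] = g[i][via] + g[via][j]
    return g

# Five points on a unit semicircle, at 0, 45, 90, 135 and 180 degrees.
arc = [(math.cos(math.radians(a)), math.sin(math.radians(a))) for a in range(0, 181, 45)]
geo = geodesic_distances(arc, k=2)
straight = math.dist(arc[0], arc[-1])   # Euclidean distance between the endpoints
along_graph = geo[0][-1]                # graph approximation of the geodesic
```

Here the two endpoints are 2 apart in a straight line, while the path through the graph is longer because it has to follow the arc. Isomap then applies MDS to the matrix `geo` instead of the Euclidean distance matrix.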
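Manifold Learning
❖ Basic idea: every manifold in a high-dimensional space has a corresponding manifold in a low-dimensional space. Once a smooth mapping between them is found, the original high-dimensional data can be mapped to its counterpart in the low-dimensional target space.
❖ The essence of a manifold is locality: in mathematical terms, it is a topological space that admits local coordinates. "Local coordinates" allow a problem to be decomposed into local problems for computation, while the topological space guarantees that the local results can be pieced together reasonably and smoothly to reveal the global structure of the problem.
Methods of Manifold Learning
❖ ISOMAP
❖ LLE
❖ HLLE
❖ LE
❖ LTSA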

EEG-based estimation of mental fatigue by using KPCA–HMM and complexity parameters

EEG-based estimation of mental fatigue by using KPCA–HMM and complexity parametersJianping Liu a,Chong Zhang b,Chongxun Zheng b,*a Department of Communication Engineering,The Engineering College of the Armed Police Force,710086Xi’an,Chinab Key Laboratory of Biomedical Information Engineering of Education Ministry,Xi’an Jiaotong University,710049Xi’an,China1.IntroductionMental fatigue is a common physiological phenomenon,and isinevitable for ofﬁce workers in general,which affects theindividual’s life quality on different aspects.It is usuallyaccompanied with a sense of weariness,reduced alertness,andreduced mental performance,which would lead the accidents inlife,decrease productivity in workplace and harm the health.When people become fatigued,they usually experience difﬁcultiesfor maintaining task performance at an adequate level[1].Inindustry,many incidents and accidents are related to mentalfatigue as the result of sustained performance[2].It is important tomanage and cope with mental fatigue so that workers do not harmtheir health.Therefore,the management of mental fatigue isimportant from the viewpoint of occupational risk protection,productivity,and occupational health.To date,many methods have been proposed to estimate themental fatigue.A large number of previous studies use behavioralindices or subjective measures such as reaction time,error ratio orsubjective scales.A recent tendency in ergonomic research is tochoose more objective measures to assess the mental fatigue state.These approaches focus on measuring physiological changes ofpeople,such as the electrooculogram(EOG),respiratory signals,heart beat rate,skin electric potential,and particularly,electroen-cephalographic(EEG)activities as a means of detecting the mentalfatigue states[3,4].Although numerous physiological indicatorsare available to describe an individual’s mental fatigue state,theEEG signals may be the most promising,predictive and reliable one[5,6].The EEG is widely regarded as the 
physiological‘‘goldstandard’’for the assessment of mental fatigue.In present,many scholars have studied the mental fatigueinduced by one single task,such as driving task,hypoxia,etc.S.K.L.Lal investigated the use of EEG as a fatigue countermeasure duringdriver fatigue[7]. C.Papadelis et ed the nonlinearelectroencephalography parameters,i.e.approximate entropy toassess hypoxia-induced EEG alterations,and found these com-plexity parameters can assess the different hypoxic levels reliablyand effectively[8].B.T.Jap used EEG spectral components to studythe EEG activities change during a monotonous driving session[9].C.Papadelis used the Shannon entropy,the Kullback–Leiblerentropy and the cross-approximate entropy to analyze the EEGdata from sleep-deprived subjects exposed to realﬁeld drivingconditions,and found these EEG parameters can assess effectivelythe brain activity alterations that occur a few seconds beforesleeping/drowsiness events in driving[10,11].However,mentalfatigue is a complex phenomenon and it is affected by manyfactors.Thus,in order to study the sensitivity of nonlinearcomplexity parameters to different types of mental fatigue,wehaveﬁrst designed three different cognitive tasks to induce threeBiomedical Signal Processing and Control5(2010)124–130A R T I C L E I N F OArticle history:Received28February2008Received in revised form30December2009Accepted5January2010Available online10February2010Keywords:Mental fatigueElectroencephalogram(EEG)Approximate entropy(ApEn)Kolmogorov complexity(Kc)KPCA–HMMA B S T R A C TTwo complexity parameters of EEG,i.e.approximate entropy(ApEn)and Kolmogorov complexity(Kc)areutilized to characterize the complexity and irregularity of EEG data under the different mental fatiguestates.Then the kernel principal component analysis(KPCA)and Hidden Markov Model(HMM)arecombined to differentiate two mental fatigue states.The KPCA algorithm is employed to extractnonlinear features from the complexity parameters of EEG and improve the 
generalization performanceof HMM.The investigation suggests that ApEn and Kc can effectively describe the dynamic complexity ofEEG,which is strongly correlated with mental fatigue.Both complexity parameters are signiﬁcantlydecreased(P<0.005)as the mental fatigue level increases.These complexity parameters may be used asthe indices of the mental fatigue level.Moreover,the joint KPCA–HMM method can effectively reducethe dimensionality of the feature vectors,accelerate the classiﬁcation speed and achieve higherclassiﬁcation accuracy(84%)of mental fatigue.Hence KPCA–HMM could be a promising model for theestimation of mental fatigue.Crown Copyrightß2010Published by Elsevier Ltd.All rights reserved.*Corresponding author.Tel.:+862982669055;fax:+862983237910.E-mail addresses:yuxiaolinzhang@(J.Liu),xlyu.czhang@(C.Zhang),cxzheng@(C.Zheng).Contents lists available at ScienceDirectBiomedical Signal Processing and Controlj o u r n a l h o m e p a g e:w w w.e l se v i e r.co m/l oc a t e/b s pc1746-8094/$–see front matter.Crown Copyrightß2010Published by Elsevier Ltd.All rights reserved.doi:10.1016/j.bspc.2010.01.001different types of mental fatigue,and then used the nonlinear complexity parameters to analyze different type of mental fatigue.In this study,two complexity parameters,i.e.approximate entropy(ApEn)and Kolmogorov complexity(Kc)are used to quantify the complexity and irregularity of EEG data under two mental fatigue states,i.e.one before performing a2-h mental task, and one after.For comparison,Tsallis entropy(TE)is also employed to analyze the change of mental fatigue[16].Then we propose a novel scheme of training Hidden Markov Model(HMM)for estimation of mental fatigue using the complexity parameters in ﬁve frequency bands of EEG.Considering the high-dimensionality and nonlinear nature of EEG data[12–15],the kernel principal component analysis(KPCA)is adopted to extract nonlinear features from the complexity parameters of EEG,and then to train the HMM.Thus,KPCA and 
HMM are combined to differentiate two mental fatigue pared with previous studies,the presented comprehensive methods would make the mental fatigue estimation much more reliable and accurate since many methods are combined.2.Materials and methods2.1.SubjectsFifty male right-dominated graduate students,between20and 27years old(M=23.0years,SD=1.6),participated in this study. Personal data(handedness,past medical history,medical family history,etc.)were acquired with a standardized interview before EEG recordings.All subjects were in good health.None of them reported on any cardiovascular disease or neurological disorders in the past or had taken any drugs known to affect the EEG.Subjects did not work night shifts and had normal sleep time.All of them were accustomed to use the computer mouse and agreed to join the study.2.2.Experiment and data acquisitionThe experimental tasks were three types of simple cognitive tasks.Theﬁrst type of task was a vigilance task.Three random numbers displayed at the same time on the CRT screen and changed once every second randomly.The subjects were asked to click the right mouse button promptly,as three different odd numbers,such as1,7,9,appeared.Sixteen subjects participated in this experiment.The second type of task was the addition and subtraction arithmetic calculation of four one-digit numbers.They were displayed on a computer monitor continuously until the subject responded.The participants solved the problemsﬁrstly, and then decided whether the result was less than,equal to,or greater than the target sum provided.Sixteen subjects participated in this experiment.The third type of task was a simple switch task.A white square,subdivided into four subsquares,was displayed continuously at the screen center.Stimulus images were presented in turn,and the image was starting from the upper left subsquare with clockwise fashion.The stimulus images were number from zero to nine randomly.The color of the stimulus images was red or blue 
randomly. The subjects then had to press the left or right mouse button according to the image color when the stimulus image appeared in either of the two upper subsquares, or according to whether the number was odd or even when the stimulus appeared in either of the two lower subsquares.

Eighteen subjects participated in this experiment. All subjects performed the cognitive task until either they quit from exhaustion or 2 h elapsed. The response time and the number of error trials, if any, were recorded. Subjects were required to abstain from alcohol and caffeine-containing substances for 24 h before the experiment. Subjects were told that the study was aimed at investigating the neural correlates of cognitive control; they were unaware that the study was about mental fatigue. To avoid the influence of circadian fluctuations, the experiments were scheduled for the same time of day. The experimental session started at about 8:00. There was no clock or watch in the laboratory, and the subjects had no knowledge of the experimental duration.

Subjects were seated in a dimly lit, sound-attenuated, electrically shielded room. Before starting the experiment, the subjects completed a brief demographic questionnaire (age, handedness, hours of sleep, etc.) and confirmed that they understood the instructions. First, the psychological self-report measures of sleepiness and fatigue were administered. Subjective sleepiness was assessed by means of the Stanford Sleepiness Scale and the Karolinska Sleepiness Scale, and subjective fatigue was measured with the Samn–Perelli checklist, Li's subjective fatigue scale and Borg's CR-10 scale [4,17–20]. Subsequently, the subjects were asked to simply relax and try to think of nothing in particular, and the EEG was recorded in the eyes-closed resting state for 5 min before the experimental session started. They then performed the cognitive task either until 2 h elapsed or until volitional exhaustion occurred. Subjects were instructed to respond as quickly as possible while maintaining a high level of accuracy. A similar EEG recording was made immediately after the completion of the cognitive task, and the same psychological rating was carried out. The measurements were thus taken at two epochs: pre-task (before the task) and post-task (immediately after the task).

EEGs were recorded with a Neuroscan 32-channel system (Neuroscan, El Paso, TX, USA) using the international 10–20 lead system. The Fp2, Fp1, F4, F3, A2, A1, C4, C3, P4, P3, Fz, Cz and Pz leads were used with Ag/AgCl electrodes. Recordings were referenced to linked mastoids. Two additional bipolar pairs of electrodes were placed to record horizontal and vertical EOG. Skin impedance was below 5 kΩ on all electrodes. Physiological signals were band-pass filtered from 0.01 to 100 Hz. The signal was sampled at 500 Hz and digitized at 16 bit. Eye movement contamination was removed from the EEG signal by an adaptive filter based on the least mean square algorithm. Artifacts were rejected by visual inspection of the EEG.

2.3. Feature extraction based on complexity parameters

Two complexity parameters, approximate entropy (ApEn) and Kolmogorov complexity (Kc), are used to quantify the complexity of EEG under the two mental fatigue states [21–25]. In the present study, after artifact detection and ocular correction, 1 min of EEG data from each trial for each subject in the pre-task and post-task sessions is selected for analysis. The EEG signal is then downsampled to 250 Hz before it is analyzed using wavelet packet and nonlinear complexity methods. The first 10 s of EEG data is chosen as the basic data segment, and the segment is shifted in steps of 1 s. By shifting the data segment step by step over the whole trial, 5100 data segments are obtained. Wavelet packet analysis is performed on every EEG data segment [26,27]. Daubechies 10 is adopted as the mother wavelet.
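The segmentation described above (a 10 s window shifted in 1 s steps over data sampled at 250 Hz) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `sliding_segments` and the synthetic one-minute record are assumptions made for the example.

```python
def sliding_segments(signal, fs=250, window_s=10, step_s=1):
    """Cut a 1-D sequence into overlapping windows.

    fs is the sampling rate in Hz; window_s and step_s are given in seconds.
    Only full-length windows are returned.
    """
    win = int(window_s * fs)    # samples per window (2500 for 10 s at 250 Hz)
    step = int(step_s * fs)     # samples per shift (250 for 1 s at 250 Hz)
    last_start = len(signal) - win
    return [signal[s:s + win] for s in range(0, last_start + 1, step)]


# Hypothetical one-minute record: sample indices stand in for EEG amplitudes.
one_minute = list(range(60 * 250))
segments = sliding_segments(one_minute)
print(len(segments))  # → 51
```

With these parameters one minute of data yields 51 segments. The reported total of 5100 segments would be consistent with 100 such one-minute records, although the text does not spell out that breakdown.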
After eight-octave wavelet packet decomposition, the EEG components of the following five frequency bands are obtained: total (0.5–30 Hz), delta (0.5–3.5 Hz), theta (4–7 Hz), alpha (8–12 Hz), and beta (13–30 Hz). Then ApEn and Kc are calculated (Appendices A and B) for all EEG data segments in each of the five frequency bands.

2.4. Reducing the dimensionality of feature vectors based on KPCA algorithm

KPCA is a technique that generalizes linear PCA to the nonlinear case by using the kernel method [28,29]. As a nonlinear feature extractor, it has proven powerful as a preprocessing step for classification algorithms. The feature vectors preprocessed by KPCA have a lower dimension, which can improve the generalization and speed of the subsequent classification. The Gaussian function is selected as the kernel function for the KPCA algorithm. Here, we adopt KPCA to extract nonlinear features from the complexity parameters of EEG as follows.

J. Liu et al. / Biomedical Signal Processing and Control 5 (2010) 124–130

Given a set of $M$ centered complexity parameters of EEG $x_k$, $k = 1, \ldots, M$, $x_k \in \mathbb{R}^N$, $\sum_{k=1}^{M} x_k = 0$, KPCA first maps each complexity parameter $x_k$ into the higher-dimensional feature space $F$ and then computes the covariance matrix

$$\hat{C} = \frac{1}{M} \sum_{i=1}^{M} \phi(x_i)\,\phi(x_i)^{T} \qquad (1)$$

Here $\phi(x_i)$ is the centered nonlinear mapping of the input variable $x_i \in \mathbb{R}^N$ into the higher-dimensional feature space $F$. We then have to find eigenvalues $\lambda \ge 0$ and nonzero eigenvectors $V$ satisfying

$$\lambda V = \hat{C} V \qquad (2)$$

Note that all solutions $V$ with $\lambda \neq 0$ lie in the span of the mappings $\phi(x_1), \ldots, \phi(x_M)$. Consequently, Eq. (2) is equivalent to

$$\lambda\,(\phi(x_k) \cdot V) = (\phi(x_k) \cdot \hat{C} V) \quad \text{for all } k = 1, \ldots, M \qquad (3)$$

Also, there exist coefficients $\alpha_i$ $(i = 1, \ldots, M)$ such that

$$V = \sum_{i=1}^{M} \alpha_i\,\phi(x_i) \qquad (4)$$

Combining Eqs. (1), (3) and (4) yields

$$\lambda \sum_{i=1}^{M} \alpha_i\,(\phi(x_k) \cdot \phi(x_i)) = \frac{1}{M} \sum_{i=1}^{M} \alpha_i \left( \phi(x_k) \cdot \sum_{j=1}^{M} \phi(x_j) \right) (\phi(x_j) \cdot \phi(x_i)) \quad \forall k = 1, \ldots, M \qquad (5)$$

Further, we define an $M \times M$ kernel matrix $K$ such that

$$K_{ij} = (\phi(x_i) \cdot \phi(x_j)) = K(x_i, x_j) \qquad (6)$$

Here, the kernel function $K(x_i, x_j)$ is introduced so that the mapping $\phi(x_i)$ from $x_i$ is implicit [28,29]. $K(x_i, x_j)$ is a kernel function satisfying Mercer's condition [30,31]. The Gaussian function is selected as the kernel function in this paper, i.e. $K(x_i, x_j) = \exp\left(-\frac{1}{2\sigma^2}\,\|x_i - x_j\|^2\right)$. As $K$ is symmetric, it has a set of eigenvectors which span the complete space, thus:

$$M \lambda \alpha = K \alpha \qquad (7)$$

Therefore, we only need to diagonalize $K$. The normalization condition for $\alpha^p, \ldots, \alpha^M$ is

$$\lambda_k\,(\alpha^k \cdot \alpha^k) = 1 \qquad (8)$$

Finally, we can extract principal components by computing the projection of $\phi(x)$ onto the eigenvectors $V^k$ in the high-dimensional space $F$ $(k = p, \ldots, M)$:

$$(V^k \cdot \phi(x)) = \sum_{i=1}^{M} \alpha_i^k\,K(x_i, x) \qquad (9)$$

Therefore, we only choose the first $n$ nonlinear principal components, e.g. the directions which describe a desired percentage of the data variance, and thus work in an $n$-dimensional subspace of the feature space $F$. This allows us to construct a novel mental fatigue classifier, called KPCA–HMM in this study, in which a preprocessing layer extracts nonlinear complexity features for the subsequent classification of mental fatigue.

2.5. Classification using Hidden Markov Model (HMM)

The HMM can be seen as a finite automaton containing $s$ discrete states, which emits a feature vector at every time point depending on the current state. Each feature vector is modeled using $m$ Gaussian mixtures per state. The transition probabilities between states are described using a transition matrix. During the training phase, the expectation maximization (EM) algorithm introduced by Dempster et al. [32] is used to estimate the transition matrix and the Gaussian mixtures. The EM algorithm is started from randomly selected values for the transition matrix and an initial estimate of the mixtures. The estimation formulas guarantee a monotonic increase of the likelihood $P(v \mid \mathrm{HMM})$ until a local or global maximum is reached, which ends the training phase. The Gaussian mixtures are approximated based on a $k$-means clustering of the feature vectors. The clustering is performed using
the Euclidean distance, which requires feature vector components whose mean and variance lie within the same numerical range. The mean and variance of all feature vectors belonging to one cluster are then used to model the Gaussian mixtures with a diagonal covariance matrix. This modeling is feasible only for uncorrelated feature vector components. To meet both requirements, normalized and uncorrelated data, the whitening transformation [33] is performed. The original data $V = (v(1), v(2), \ldots, v(T))$ of length $T$ is transformed into $\tilde{V} = (\tilde{v}(1), \tilde{v}(2), \ldots, \tilde{v}(T))$ using

$$\tilde{V} = \Phi D^{1/2} V \qquad (10)$$

where $\Phi$ and $D$ are the eigenvector and eigenvalue matrices, respectively, of the covariance matrix of $V$.

Two HMMs, one representing the normal state (HMM$_N$) and one representing the fatigue state (HMM$_F$), are trained using the EEG data segments recorded during the corresponding mental fatigue states. The parameters of the models are estimated from the given training data and are then used to classify the same training data. Finally, HMM$_N$ and HMM$_F$ are re-estimated using only the correctly classified trials. Classification of an unknown EEG data segment is based on selecting the maximum single-best-path probability $P_p(\mathrm{HMM})$ calculated via the Viterbi algorithm [34]. Calculating $P_p(V \mid \mathrm{HMM}_N)$ and $P_p(V \mid \mathrm{HMM}_F)$ for all EEG segments results in a propagation of these probabilities, which allows classification sample by sample.

Cross-validation is a standard test of classification ability that uses various combinations of testing and training data sets [35,36]. A 5-fold cross-validation test is applied. We randomly select 80% of the recorded data sets for training the classifier and keep the remaining 20% for testing; the classification accuracy is calculated with different random selections of the training and testing sets. To classify the test vectors given by our 5-fold cross-validation scheme, the likelihood that they belong to each of the two HMMs is calculated. Each test vector is assigned to the mental fatigue state whose HMM gives the higher likelihood.

Three measures, accuracy (Ac), specificity (Sp) and sensitivity (Se), are used to assess the performance of the five classifiers [37]:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \times 100\% \qquad (11)$$

$$\mathrm{Specificity} = \frac{TN}{TN + FP} \times 100\% \qquad (12)$$

$$\mathrm{Sensitivity} = \frac{TP}{TP + FN} \times 100\% \qquad (13)$$

where TP is the number of true positives, i.e. the HMM identifies a normal state as normal; TN is the number of true negatives, i.e. the HMM recognizes a fatigue state as fatigue; FP is the number of false normal identifications; and FN is the number of false fatigue identifications. Accuracy indicates overall detection accuracy; specificity is defined as the ability of the classifier to recognize a fatigue state, whereas sensitivity indicates the classifier's ability not to generate a false detection (normal state). Fig. 1 shows the schematic diagram for KPCA–HMM.

3. Results

3.1. Subjective evaluation of mental fatigue

The results of the comparison of several subjective scores between the two sessions are shown in Fig. 2. The self-report questionnaires reveal that subjects are not fatigued or sleepy before the task and are moderately to extremely fatigued and sleepy after the task. Compared with the pre-task, the subjective scores increase significantly (P < 0.005) after the completion of the task, which indicates that a continuous long-term cognitive task leads to an increase in fatigue and sleepiness.

3.2. Complexity parameters of EEG

To eliminate the influence of parameter fluctuations, the mean value within 1 min is calculated for statistical analysis. The results of the comparison of ApEn and Kc in the total, alpha and beta frequency bands between the two sessions are shown in Fig. 3. Compared with the pre-task, the mean values of ApEn and Kc in the total frequency band on all electrodes decrease significantly (P < 0.005), the mean values of ApEn and Kc in the beta frequency band on parietal electrodes decrease significantly (P < 0.05), the mean value of ApEn in the alpha frequency band on prefrontal electrodes decreases significantly (P < 0.05), while the mean value of Kc in the alpha frequency
band on all electrodes decreases significantly (P < 0.05) after the completion of the task. However, the mean values of ApEn and Kc in the theta and delta frequency bands on all electrodes do not change significantly, and Tsallis entropy (TE) in the five frequency bands on all electrodes does not change significantly.

3.3. The classification results by KPCA–HMM

The subjective measures show that the level of both subjective sleepiness and fatigue increases significantly after long-term cognitive work. The subjects are not fatigued or sleepy before the task, which corresponds to a normal arousal state, and are moderately to extremely fatigued and sleepy after the task. To differentiate the normal state from the fatigue state, KPCA–HMM is applied after the complexity parameters are extracted. The classification accuracy is observed for various numbers of features extracted using KPCA and PCA, respectively. The average classification accuracies for the two different HMMs are shown in Fig. 4.

Fig. 4 illustrates the extent to which the number of feature dimensions influences classification accuracy. The accuracy varies with the number of feature dimensions. When the dimensionality is more than 5, KPCA–HMM achieves a classification accuracy higher than 82%. The maximal classification accuracy (84%) is reached when the number of feature dimensions equals 17. Moreover, the performance of KPCA–HMM exceeds that of PCA–HMM to a great extent. The average classification time of KPCA–HMM is greatly reduced compared with HMM.

In Table 1, the performances of three HMMs based on ApEn and Kc of EEG are compared with that of the HMM based on TE. According to the results, the TE indices of EEG cannot differentiate mental fatigue effectively; the classification accuracy is below 65%. However, KPCA–HMM based on ApEn and Kc of EEG classifies mental fatigue effectively, achieving a maximum classification accuracy of 84%. Moreover, we observe that the Ac, Sp and Se of KPCA–HMM are much higher than those of PCA–HMM. This demonstrates that KPCA–HMM based on ApEn and Kc of EEG is a useful method for detecting mental fatigue.

Fig. 1. Schematic diagram for KPCA–HMM: (1) eye movement contamination is removed by adaptive filtering methods; (2) features are extracted using complexity measures; (3) the KPCA algorithm is used to reduce the dimensionality of features; (4) two HMMs are trained, HMM$_N$ corresponding to the normal state and HMM$_F$ to the fatigue state; (5) the final mental state is decided by the likelihood scores of the two HMMs.

Fig. 2. Comparison of several subjective scores on mental fatigue between two sessions: pre-task (before task) and post-task (immediately after task). SSS, Stanford Sleepiness Scale; KSS, Karolinska Sleepiness Scale; SPC, Samn–Perelli checklist; SFS, Li's subjective fatigue scale; CR-10, Borg's CR-10 scale. **P < 0.005, statistical significance of the difference between the two sessions.

4. Discussion

The rhythm of EEG changes with the activity of the brain cortex. When the pace of electric activity of many neurons in the brain cortex tends to be uniform, a rhythm with low frequency and high amplitude appears. This phenomenon is called synchronization. When the pace of electric activity of many neurons is nonuniform, the brain cortex is characterized by a rhythm with high frequency and low amplitude. This phenomenon is called desynchronization. In general, desynchronization indicates that the excitement of the cortex is increased, whereas synchronization indicates that the cortex progresses to an inhibited state [38]. EEG desynchronization is induced by information input, which implies less coordination between the ongoing EEG processes; more independent neural processes contribute to complex brain dynamics, so that the desynchronized EEG shows more irregular behavior [39,40]. A synchronized EEG marks specific cortical areas at rest or in an idling state [40,41], which
implies more coordination between the ongoing processes, so that the synchronized EEG shows more regular behavior. Unlike the classical band power method, ApEn and Kc characterize the complexity of EEG and reflect the desynchronized and synchronized processes from another viewpoint. In the present study, the mean values of ApEn and Kc in the total frequency band on all electrodes decrease significantly after long-term cognitive work, which indicates that mental fatigue results in an increase in the degree of synchronization of the brain cortex, a shift to slow, high-amplitude waves in the EEG and a decrease in cortical activation. The two complexity parameters, ApEn and Kc, behave consistently, which indicates that both can quantify well the complexity change of the dynamic EEG during mental fatigue.

In addition, since the EEG signals were recorded in the eyes-closed resting state, the alpha frequency band is the basic rhythm of the EEG. The ApEn and Kc of the alpha frequency band can therefore effectively reflect the change in the total excitation and inhibition level of the brain. In this study, the change in the alpha frequency band is consistent with that in the total frequency band of the EEG.

Previous studies showed that changes in the beta frequency band are closely related to mental fatigue. The beta frequency band is generally considered the fast wave, which is related to the alertness level. Some previous studies demonstrated that the activity of the beta band decreases significantly as the alertness level declines after a long-term monotonous driving task or sleep deprivation [9,42]. In this study, the ApEn and Kc of the beta frequency band show a significant change after a long sustained cognitive task. These results indicate that the subjects' alertness level declines greatly and that the excitement level of the brain decreases after the completion of the task. It is probably the complexity change of the alpha and beta frequency bands that results in the complexity change of the total frequency band of the EEG.

EEG data is a nonstationary time series which contains noise and artifacts. Since TE is only a generalization of Shannon entropy, it cannot effectively characterize the complexity change of EEG in mental fatigue. However, as nonlinear complexity measures, ApEn and Kc can effectively reveal the regularity and randomness in a time-varying EEG arising from the brain system and provide information on the dynamics of the specific regional brain subsystem. Moreover, they are more suitable for

Fig. 3. Comparison of ApEn and Kc in the total, alpha and beta frequency bands between two sessions, pre-task (before task) and post-task (immediately after task): (a), (c) and (e) describe the complexity change of EEG by ApEn in the total, alpha and beta frequency bands, respectively; (b), (d) and (f) quantify the complexity change of EEG by Kc in the total, alpha and beta frequency bands, respectively. Data are presented as mean ± SEM. *P < 0.05, **P < 0.005, statistical significance of the difference between the two sessions.
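The KPCA steps of Section 2.4 above (Gaussian kernel matrix, Eq. (6); eigenproblem, Eq. (7); normalization, Eq. (8); projection, Eq. (9)) can be sketched with NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `kpca_gaussian`, the random toy data and `sigma = 1.0` are hypothetical, and the kernel matrix is centered explicitly, a step the derivation takes for granted by assuming centered mappings.

```python
import numpy as np


def kpca_gaussian(X, n_components, sigma=1.0):
    """Return the projections of the rows of X onto the leading kernel
    principal components, together with the matching eigenvalues."""
    M = X.shape[0]

    # Eq. (6): K_ij = exp(-||x_i - x_j||^2 / (2 sigma^2))
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-d2 / (2.0 * sigma ** 2))

    # Center the mapped data in feature space (assumed, not shown in the text).
    one = np.full((M, M), 1.0 / M)
    Kc = K - one @ K - K @ one + one @ K @ one

    # Eq. (7): diagonalize the symmetric kernel matrix. The eigenvalues of Kc
    # play the role of M * lambda in the text.
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:n_components]
    lam, alpha = eigvals[order], eigvecs[:, order]

    # Eq. (8): rescale so that lam_k * (alpha_k . alpha_k) = 1.
    alpha = alpha / np.sqrt(lam)

    # Eq. (9): projection of each training point, sum_i alpha_i^k K(x_i, x).
    return Kc @ alpha, lam


rng = np.random.default_rng(0)
X = rng.normal(size=(40, 6))      # hypothetical feature vectors
Z, lam = kpca_gaussian(X, n_components=3)
print(Z.shape)  # → (40, 3)
```

The columns of `Z` are mutually uncorrelated, which is the property the text relies on when it feeds the reduced features to a model with diagonal covariance matrices. In practice `sigma` and the number of retained components would have to be tuned, as the reported dependence of accuracy on dimensionality suggests.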

Face Recognition Based on DWT, 2DPCA and KPCA

GAN Junying, LI Gaoshang (School of Information, Wuyi University, Jiangmen, Guangdong 529020, China)

Abstract: The face image is compressed by the discrete wavelet transform (DWT) and its low-frequency component is extracted, which effectively removes the influence of the high-frequency component of the face image. Two-dimensional principal component analysis (2DPCA) is then used to extract features from the low-frequency component, after which kernel principal component analysis (KPCA) is applied to extract features again. Finally, face recognition is performed by a minimum-distance classifier. Experimental results on the Olivetti Research Laboratory (ORL) face database show that the algorithm improves the face recognition rate and effectively reduces the amount of computation and the computational complexity.

Keywords: wavelet transform; 2DPCA algorithm; KPCA algorithm; face recognition

CLC number: TP391.41. Document code: B. Article ID: 1004-373X(2009)20-051-03

Received: 2009-04-28. Supported by the Natural Science Foundation of Guangdong Province (07010869, 032356).

0 Introduction

In recent years, face recognition technology has developed considerably, and many excellent methods and algorithms have been proposed [1,2].

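PCA, K-PCA, ICA: do you really know them?

Today we look at how PCA, K-PCA and ICA relate to one another, and at how to implement these models in R.

Principal component analysis (PCA) is a commonly used method of data analysis. Through a linear transformation, PCA turns the original data into a representation whose dimensions are linearly uncorrelated. It extracts the principal components (the main information), discards redundant (secondary) information, and is often used to reduce the dimensionality of high-dimensional data. In essence, it takes the directions of largest variance as the main features and decorrelates the data along the orthogonal directions, so that the data show no correlation between different orthogonal directions. It is mainly applied to dimensionality reduction of linear data with a Gaussian distribution.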
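The decorrelation property described above can be checked numerically. The following is a minimal sketch in Python with NumPy rather than R, and the toy data and variable names are assumptions made for illustration: the data are centered, the covariance matrix is diagonalized, and the covariance of the projected data turns out to be diagonal.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical correlated 2-D data: independent noise mixed by a matrix A.
A = np.array([[2.0, 0.0],
              [1.5, 0.5]])
X = rng.normal(size=(500, 2)) @ A.T

# Center the data and diagonalize its covariance matrix.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)

# Sort the directions by decreasing variance.
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Project onto the principal directions; the new coordinates are uncorrelated.
scores = Xc @ eigvecs
score_cov = np.cov(scores, rowvar=False)
print(np.round(score_cov, 6))
```

The off-diagonal entries of `score_cov` vanish up to rounding, and the diagonal holds the eigenvalues, i.e. the variance captured along each principal direction. Keeping only the first columns of `scores` gives the reduced representation.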

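Kernel principal component analysis (K-PCA) is an upgraded version of PCA that mainly removes the restriction to linear data: it can map data that are not linearly separable onto a new low-dimensional subspace in which they are suitable for linear classification. In essence it is the same as PCA.

Independent component analysis (ICA) is an analysis procedure that separates, or approximately separates, the source signals when only the mixed signals are known and the source signals, the noise and the mixing mechanism are not. It is a powerful method in the field of blind signal analysis and also a way of finding the latent factors of non-Gaussian data.

Differences between ICA and PCA:

1) PCA reduces the dimensionality of the original data and extracts uncorrelated attributes, whereas ICA reduces the dimensionality of the original data and extracts mutually independent attributes.

2) The goal of PCA is to find a set of components that minimizes the reconstruction error, i.e. that best represents the features of the original object. The goal of ICA is to find a set of components that are maximally independent of one another, which can reveal hidden factors. The condition imposed by ICA is therefore stronger than that of PCA.

3) ICA looks for the directions of maximal independence, and its components are independent; PCA looks for the directions of maximal variance, and its components are orthogonal.

4) ICA regards the observed signal as a linear combination of several statistically independent components, and what ICA performs is an unmixing process. PCA is an information-extraction process that reduces the dimensionality of the original data, and it has become the preprocessing step by which ICA standardizes the data.

Next, we describe how these algorithms are implemented in R:

1. PCA: the core function is prcomp, which is provided by the stats package shipped with base R, so no extra installation is needed. For the detailed steps, see our earlier tutorial "Principal Component Analysis in R" (《R语言之主成分分析》).

2. KPCA requires installing the BKPC package, whose kPCA function performs kernel principal component analysis.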

Nonlinear component analysis as a Kernel eigenvalue problem

$$C = \frac{1}{M} \sum_{j=1}^{M} x_j x_j^{\top}. \qquad (1)$$

To do this, one has to solve the eigenvalue equation

$$\lambda v = C v \qquad (2)$$

for eigenvalues $\lambda \ge 0$ and $v \in \mathbb{R}^N \setminus \{0\}$. As $C v = \frac{1}{M} \sum_{j=1}^{M} (x_j \cdot v)\, x_j$, all solutions $v$ with $\lambda \neq 0$ must lie in the span of $x_1, \ldots, x_M$, hence (2) in that case is equivalent to

$$\lambda\,(x_k \cdot v) = (x_k \cdot C v) \quad \text{for all } k = 1, \ldots, M. \qquad (3)$$

In the remainder of this section, we describe the same computation in another dot product space $F$, which is related to the input space by a possibly nonlinear map

$$\Phi : \mathbb{R}^N \to F, \quad x \mapsto X. \qquad (4)$$

Note that $F$, which we will refer to as the feature space, could have an arbitrarily large, possibly infinite, dimensionality. Here and in the following, upper case characters are used for elements of $F$, while lower case characters denote elements of $\mathbb{R}^N$. Again, we assume that we are dealing with centered data, i.e. $\sum_{k=1}^{M} \Phi(x_k) = 0$; we shall return to this point later. Using the covariance matrix in $F$,

$$\bar{C} = \frac{1}{M} \sum_{j=1}^{M} \Phi(x_j)\,\Phi(x_j)^{\top} \qquad (5)$$

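Research on Kernel PCA Feature Extraction Methods and Their Applications

Master's thesis, Nanjing University of Aeronautics and Astronautics

List of Figures

Fig. 2.1 Sample face images from the ORL face database
Fig. 3.1 Schematic of the kernel-method framework
Fig. 3.2 Flowchart of the KPCA algorithm
Fig. 4.1 Schematic of the image inspection system
Fig. 4.2 Photograph of the VS-808HC miniature color high-definition industrial camera
Fig. 4.3 Photograph of the MV-E8800 PCI-E 8-channel high-definition real-time image acquisition card
Fig. 4.4 Comparison of the cumulative contribution rates of the principal components for PCA and KPCA
Fig. 5.1 Location of the main lubricating-oil filter in an aero-engine
Fig. 5.2 Images of the oil filter and metal debris
Fig. 5.3 Oil filter image
Fig. 5.4 Structural schematic of the oil-filter cleaning and wear-debris collection system
Fig. 5.5 Photograph of the oil-filter cleaning and wear-debris collection system
Fig. 5.6 Structural design drawing of the oil-filter cleaning equipment
Fig. 5.7 Wear-debris image of the oil filter with the magnetic ring
Fig. 5.8 Wear-debris image of the oil filter with the magnetic ring removed
Fig. 5.9 Photograph of the DH-HV02 series digital camera
Fig. 5.10 Typical image of a normal oil filter
Fig. 5.11 Typical image of an abnormal oil filter
Fig. 5.12 Comparison of the cumulative contribution rates of the principal components for PCA and KPCA
Fig. 6.1 ZT-3 multifunctional rotor simulation test rig
Fig. 6.2 ZT-3 multifunctional rotor fault simulation test rig
Fig. 6.3 Schematic of signal acquisition for the experimental setup
Fig. 6.4 Spectrum of the preprocessed imbalance fault signal (100 points)
Fig. 6.5 Spectrum of the preprocessed misalignment fault signal (100 points)
Fig. 6.6 Spectrum of the preprocessed rub-impact fault signal (100 points)
Fig. 6.7 Spectrum of the preprocessed oil-whirl fault signal (100 points)
Fig. 6.8 Projection of the feature vectors onto the principal directions of PCA
Fig. 6.9 Projection of the feature vectors onto the principal directions of KPCA1
Fig. 6.10 Projection of the feature vectors onto the principal directions of KPCA2
Fig. 6.11 Projection of the feature vectors onto the principal directions of KPCA3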


Abstract

Recently, support vector machine (SVM) has become a popular tool in time series forecasting. In developing a successful SVM forecaster, the first step is feature extraction. This paper proposes the application of principal component analysis (PCA), kernel principal component analysis (KPCA) and independent component analysis (ICA) to SVM for feature extraction. PCA linearly transforms the original inputs into new uncorrelated features. KPCA is a nonlinear PCA developed by using the kernel method. In ICA, the original inputs are linearly transformed into features which are mutually statistically independent. By examining the sunspot data, the Santa Fe data set A and five real futures contracts, the experiment shows that SVM with feature extraction using PCA, KPCA or ICA can perform better than SVM without feature extraction. Furthermore, among the three methods, KPCA feature extraction performs best, followed by ICA feature extraction. © 2003 Elsevier B.V. All rights reserved.
0925-2312/03/$ - see front matter © 2003 Elsevier B.V. All rights reserved. doi:10.1016/S0925-2312(03)00433-8
L.J. Cao et al. / Neurocomputing 55 (2003) 321 – 336
A comparison of PCA, KPCA and ICA for dimensionality reduction in support vector machine
Keywords: Support vector machines; Principal component analysis; Kernel principal component analysis; Independent component analysis
∗ Corresponding author. E-mail address: caolj@.sg (L.J. Cao).
1. Introduction

Recently, support vector machine (SVM) has become a popular tool in time series forecasting [18–20,26,27], due to its remarkable characteristics such as good generalization performance, the absence of local minima and the sparse representation of the solution. Unlike most traditional methods, which implement the Empirical Risk Minimization Principle, SVM implements the Structural Risk Minimization Principle, which seeks to minimize an upper bound of the generalization error rather than the training error [34]. This eventually results in better generalization performance of SVM than of other traditional methods. As training an SVM is equivalent to solving a linearly constrained convex quadratic programming problem, the solution of SVM is always globally optimal and free from local minima. Moreover, the solution is determined only by the support vectors, which are a subset of the training data points, so it can be represented sparsely.

In developing an SVM forecaster, the first important step is feature selection (new features are selected from the original inputs) or feature extraction (new features are transformed from the original inputs). In the modeling, all available indicators can be used as the inputs of SVM, but irrelevant or correlated features could deteriorate the generalization performance of SVM. In the framework of SVM, several approaches for feature selection are available. In [3], Bradley and Mangasarian find that SVM with a 1-norm regularization term is an indirect approach to feature selection. In [9], the recursive feature elimination method is proposed for feature selection in SVM. Weston et al. [36] also propose a gradient descent method for feature selection in SVM. In our previous work [28,29], saliency analysis and the genetic algorithm were also shown to be useful for selecting important features in SVM. In summary, all these approaches are in the domain of feature selection.
Principal component analysis (PCA) is a well-known method for feature extraction. By calculating the eigenvectors of the covariance matrix of the original inputs, PCA linearly transforms a high-dimensional input vector into a low-dimensional one whose components are uncorrelated. Nonlinear PCA has also been developed using different algorithms [6]. Kernel principal component analysis (KPCA) is one type of nonlinear PCA, developed by generalizing the kernel method to PCA [23]. The kernel method was originally used for SVM. Later, it was generalized to many algorithms that can be expressed in terms of dot products, such as PCA. Specifically, KPCA first maps the original inputs into a high-dimensional feature space using the kernel method and then calculates PCA in that feature space. The linear PCA in the high-dimensional feature space corresponds to a nonlinear PCA in the original input space. Recently, another linear transformation method called independent component analysis (ICA) has also been developed [4]. Instead of producing merely uncorrelated components, ICA attempts to achieve statistically independent components in the transformed vectors. ICA was originally developed for blind source separation. Later, it was generalized for feature extraction [2,10,13]. The purpose of this paper is to compare the performance of PCA, KPCA and ICA for feature extraction in the context of SVM. By using one of the three methods, the original higher-dimensional inputs will be transformed into other lower-dimensional