
A Survey of High-Dimensional Data Visual Analysis Methods Based on Subspace Clustering

2018, 54(13)

田帅,陈谊

TIAN Shuai, CHEN Yi

北京工商大学计算机与信息工程学院食品安全大数据技术北京市重点实验室,北京100048

Beijing Key Laboratory of Big Data Technology for Food Safety, School of Computer and Information Engineering, Beijing Technology and Business University, Beijing 100048, China

TIAN Shuai, CHEN Yi. A survey of high dimensional data visual analysis methods based on subspace clustering. Computer Engineering and Applications, 2018, 54(13): 19-26.

Abstract: With the rapid development of information technology and the advent of the big data era, data increasingly exhibit complex characteristics such as high dimensionality and nonlinearity. For high-dimensional data, it is often difficult to find feature regions that reflect distribution patterns in the full-dimensional space, and most traditional clustering algorithms scale well only to low-dimensional data. As a result, the clusters that traditional algorithms produce on high-dimensional data may not meet current needs. Subspace clustering algorithms instead search for clusters that exist in subspaces of the high-dimensional data, dividing the original feature space into different feature subsets so as to reduce the influence of irrelevant features while preserving the main features of the original data. Subspace clustering can uncover information that is not easily revealed in high-dimensional data, and visualization techniques can then display the internal structure of the data attributes and dimensions, which provides an effective means for visual analysis of high-dimensional data. This paper summarizes recent research progress on high-dimensional data visual analysis methods based on subspace clustering, covering three kinds of methods: those based on feature selection, on subspace exploration, and on subspace clustering. It then analyzes the associated interactive analysis methods and applications, and discusses future development trends of visual analysis methods for high-dimensional data.

Key words: high dimensional data; visual analysis; subspace exploration; subspace clustering



Document code: A  CLC number: TP391  doi: 10.3778/j.issn.1002-8331.1802-0186

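Funding: National Key Technology R&D Program of the 12th Five-Year Plan (No.2012BAD29B01-2); National Science and Technology Basic Work Program (No.2015FY111200); Beijing Science and Technology Program project (No.Z151100001615041); Open Fund of the State Key Laboratory of Virtual Reality Technology and Systems (No.BUAA-VR-17KF-07); 2018 Graduate Research Capability Improvement Program project.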

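About the authors: TIAN Shuai (born 1991), male, master's student, CCF student member; main research interests: visualization and visual analytics. CHEN Yi (born 1963), corresponding author, female, Ph.D., professor, doctoral supervisor, CCF distinguished member; main research interests: information visualization and visual analytics, food safety data analysis, and virtual reality. E-mail: chenyi@…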

Received: 2018-02-26  Revised: 2018-05-15  Article ID: 1002-8331(2018)13-0019-08

Computer Engineering and Applications 计算机工程与应用
