
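A Clustering Method for Gene Expression Data Based on Linear Manifold
Cite this article: LI Gangguo, WANG Zhengzhi, WANG Guangyun, NI Qingshan, QIANG Bo. A Clustering Method for Gene Expression Data Based on Linear Manifold [J]. Journal of National University of Defense Technology (国防科技大学学报), 2010, 32(4): 150-156.
Authors: LI Gangguo (黎刚果), WANG Zhengzhi (王正志), WANG Guangyun (王广云), NI Qingshan (倪青山), QIANG Bo (强波)
Affiliation: College of Mechatronics Engineering and Automation, National University of Defense Technology, Changsha, Hunan 410073, China
Funding: Supported by the National Natural Science Foundation of China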
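Abstract: Because gene expression data are sparse and noisy, conventional clustering algorithms perform poorly on them. To address this problem, a new linear manifold method is proposed. Its basic idea is to search the dataset for line manifold clusters and then merge some of them to construct higher-dimensional manifold clusters. The algorithm uses the tangent distance and the normal distance as the distance measures for linear manifolds, exploits spatial neighbor information, and takes the mean expression level of the clustered genes as the transition vector, which improves clustering accuracy. Experimental results show that the algorithm is more accurate than other clustering algorithms and maintains high accuracy on noisy data. When applied to HeLa gene expression data, it yields clusters with significant biological meaning. These results demonstrate the applicability and effectiveness of the proposed algorithm for clustering gene expression data.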

Keywords: gene expression data; linear manifold; subspace clustering; line manifold
Received: 1/5/2010

A Clustering Method for Gene Expression Data Based on Linear Manifold
LI Gangguo, WANG Zhengzhi, WANG Guangyun, NI Qingshan and QIANG Bo. A Clustering Method for Gene Expression Data Based on Linear Manifold [J]. Journal of National University of Defense Technology, 2010, 32(4): 150-156.
Authors: LI Gangguo, WANG Zhengzhi, WANG Guangyun, NI Qingshan and QIANG Bo
Institution: College of Mechatronics Engineering and Automation, National University of Defense Technology, Changsha 410073, China
Abstract: Conventional clustering methods perform poorly on gene expression data because the data are inherently sparse and noisy. A new linear manifold clustering method is proposed to address this problem. The basic idea is to search for the line manifold clusters hidden in a dataset and then fuse some of them to construct higher-dimensional manifold clusters. The method uses the orthogonal distance and the tangent distance as the linear manifold distance metrics, exploits spatial neighbor information, and takes the real gene expression profile as the transition vector. Experimental results show that the method outperforms competing clustering methods in both clustering accuracy and robustness to noise. Moreover, the method obtains clusters with significant biological meaning on HeLa gene expression data. These results demonstrate that the proposed method is suitable and effective for clustering gene expression data.
Keywords: gene expression data; linear manifold; subspace clustering; line manifold
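
The abstract names two distance measures for a linear manifold, a tangent distance and a normal (orthogonal) distance, without giving their formulas. The sketch below shows the standard geometric decomposition for the simplest case, a line manifold, and is not taken from the paper: the function names `mean_vector` and `line_manifold_distances`, the toy profiles, and the choice of the cluster mean as the offset vector are all assumptions made for illustration.

```python
import math

def mean_vector(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def line_manifold_distances(x, offset, direction):
    """Return (tangent_distance, normal_distance) of x relative to the
    line {offset + t * direction}. `direction` need not be unit length.

    Assumption: "tangent" is the distance travelled along the line from
    the offset point, "normal" is the orthogonal distance to the line.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0:
        raise ValueError("direction must be non-zero")
    unit = [d / norm for d in direction]
    diff = [xi - oi for xi, oi in zip(x, offset)]
    # Signed coordinate of x along the line, measured from the offset.
    along = sum(di * ui for di, ui in zip(diff, unit))
    # What remains after removing the component along the line.
    residual = [di - along * ui for di, ui in zip(diff, unit)]
    normal = math.sqrt(sum(r * r for r in residual))
    return abs(along), normal

# Hypothetical toy "expression profiles" lying on the line x = y, z = 0.
cluster = [[0.0, 0.0, 0.0], [1.0, 1.0, 0.0], [2.0, 2.0, 0.0]]
offset = mean_vector(cluster)      # cluster mean used as the offset vector
direction = [1.0, 1.0, 0.0]

tangent, normal = line_manifold_distances([3.0, 3.0, 1.0], offset, direction)
print(tangent, normal)             # approximately 2.828 and 1.0
```

For a manifold of dimension greater than one, the same split applies with an orthonormal basis in place of the single unit direction: the tangent distance is the norm of the projection onto the basis, and the normal distance is the norm of the remainder.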
This article is indexed in CNKI, Wanfang Data, and other databases.