DocumentCode :
742379
Title :
Hybrid Fuzzy Cluster Ensemble Framework for Tumor Clustering from Biomolecular Data
Author :
Zhiwen Yu ; Hantao Chen ; Jane You ; Guoqiang Han ; Le Li
Author_Institution :
High Educ. Megacenter, South China Univ. of Technol., Guangzhou, China
Volume :
10
Issue :
3
fYear :
2013
Firstpage :
657
Lastpage :
670
Abstract :
Cancer class discovery using biomolecular data is one of the most important tasks for cancer diagnosis and treatment. Tumor clustering from gene expression data provides a new way to perform cancer class discovery. Most existing research works adopt single clustering algorithms to perform tumor clustering from biomolecular data, and these algorithms lack robustness, stability, and accuracy. To further improve the performance of tumor clustering from biomolecular data, we introduce fuzzy theory into the cluster ensemble framework and propose four kinds of hybrid fuzzy cluster ensemble frameworks (HFCEFs), named HFCEF-I, HFCEF-II, HFCEF-III, and HFCEF-IV, to identify samples that belong to different types of cancers. HFCEF-I and HFCEF-II differ in the ensemble generator approach they adopt to generate the set of fuzzy matrices in the ensemble. Specifically, HFCEF-I applies the affinity propagation algorithm (AP) to perform clustering on the sample dimension and generates a set of fuzzy matrices in the ensemble based on the fuzzy membership function and the base samples selected by AP. HFCEF-II adopts AP to perform clustering on the attribute dimension, generates a set of subspaces, and obtains a set of fuzzy matrices in the ensemble by performing fuzzy c-means on the subspaces. HFCEF-III and HFCEF-IV consider the characteristics of both HFCEF-I and HFCEF-II: HFCEF-III combines HFCEF-I and HFCEF-II in a serial way, while HFCEF-IV integrates them in a concurrent way. The HFCEFs adopt suitable consensus functions, such as the fuzzy c-means algorithm or the normalized cut algorithm (Ncut), to summarize the generated fuzzy matrices and obtain the final results. The experiments on real data sets from the UCI machine learning repository and on cancer gene expression profiles illustrate that 1) the proposed hybrid fuzzy cluster ensemble frameworks work well on real data sets, especially biomolecular data, and 2) the proposed approaches are able to provide more robust, stable, and accurate results when compared with state-of-the-art single clustering algorithms and traditional cluster ensemble approaches.
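The subspace-based idea described in the abstract (fuzzy c-means on attribute subspaces, then a fuzzy c-means consensus over the resulting fuzzy matrices) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names `fuzzy_c_means` and `ensemble_cluster` are hypothetical, the attribute subspaces are passed in by hand instead of being derived with affinity propagation, and the consensus step simply concatenates the membership matrices before clustering them again.

```python
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, seed=0):
    """Plain fuzzy c-means; returns (memberships, centers)."""
    rng = random.Random(seed)
    n, d = len(points), len(points[0])
    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(iters):
        # Each center is the mean of the points weighted by membership^m.
        centers = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][j] for i in range(n)) / total
                 for j in range(d)]
            )
        # Memberships follow from relative distances to the centers.
        for i in range(n):
            dist = [
                max(sum((points[i][j] - centers[k][j]) ** 2
                        for j in range(d)) ** 0.5, 1e-12)
                for k in range(c)
            ]
            u[i] = [
                1.0 / sum((dist[k] / dist[l]) ** (2.0 / (m - 1.0))
                          for l in range(c))
                for k in range(c)
            ]
    return u, centers


def ensemble_cluster(points, subspaces, c, seed=0):
    """Assumed simplification: one fuzzy matrix per hand-picked subspace,
    concatenated and summarised by a second fuzzy c-means run."""
    blocks = []
    for t, cols in enumerate(subspaces):
        sub = [[p[j] for j in cols] for p in points]
        u, _ = fuzzy_c_means(sub, c, seed=seed + t)
        blocks.append(u)
    # Row i of the combined matrix holds sample i's memberships in every block.
    combined = [sum((b[i] for b in blocks), []) for i in range(len(points))]
    final_u, _ = fuzzy_c_means(combined, c, seed=seed)
    # Harden the consensus memberships into crisp labels.
    return [max(range(c), key=lambda k: row[k]) for row in final_u]
```

Calling `ensemble_cluster(samples, [[0, 1], [2, 3]], c=2)` on a list of four-attribute samples would return one crisp label per sample. Label values are arbitrary, so only the grouping is meaningful.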
Keywords :
bioinformatics; fuzzy set theory; genetics; learning (artificial intelligence); molecular biophysics; pattern clustering; tumours; UCI machine learning repository; affinity propagation algorithm; biomolecular data; cancer gene expression profiles; ensemble generator; fuzzy c-means algorithm; fuzzy matrices; hybrid fuzzy cluster ensemble framework; normalized cut algorithm; tumor clustering; Cluster ensemble; cancer discovery; gene expression profiles; tumor clustering;
fLanguage :
English
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher :
IEEE
ISSN :
1545-5963
Type :
jour
DOI :
10.1109/TCBB.2013.59
Filename :
6517195