Title :
Microarray Time-Series Data Clustering Using Rough-Fuzzy C-Means Algorithm
Author :
Maji, Pradipta ; Paul, Sushmita
Author_Institution :
Machine Intell. Unit, Indian Stat. Inst., Kolkata, India
Abstract :
Clustering is one of the important analyses in functional genomics; it discovers groups of co-expressed genes from microarray data. In this paper, the application of the rough-fuzzy c-means (RFCM) algorithm is presented to discover co-expressed gene clusters. One of the major issues of RFCM-based microarray data clustering is how to select the initial prototypes of the different clusters. To overcome this limitation, a method is proposed to select initial cluster centers. It enables the RFCM algorithm to converge to an optimum or near-optimum solution and helps to discover co-expressed gene clusters. A method based on Dunn's cluster validity index is also introduced to identify optimum values of the different parameters of the initialization method and the RFCM algorithm. The effectiveness of the RFCM algorithm, along with a comparison with other related methods, is demonstrated on five yeast gene expression time-series data sets using the Silhouette index, the Davies-Bouldin index, and gene ontology based analysis.
Keywords :
fuzzy set theory; genomics; pattern clustering; rough set theory; time series; Davies-Bouldin index; Dunn cluster validity index; Silhouette index; functional genomics; gene ontology based analysis; microarray time-series data clustering; rough-fuzzy c-means algorithm; Algorithm design and analysis; Approximation methods; Clustering algorithms; Gene expression; Indexes; Ontologies; Partitioning algorithms; Clustering; Fuzzy Sets; Microarray; Rough Sets
Conference_Title :
Bioinformatics and Biomedicine (BIBM), 2011 IEEE International Conference on
Conference_Location :
Atlanta, GA
Print_ISBN :
978-1-4577-1799-4
DOI :
10.1109/BIBM.2011.14