DocumentCode
3074429
Title
A Novel Approach for Automatic Number of Clusters Detection in Microarray Data Based on Consensus Clustering
Author
Vinh, Nguyen Xuan ; Epps, Julien
Author_Institution
Sch. of Electr. Eng. & Telecommun., Univ. of New South Wales, Sydney, NSW, Australia
fYear
2009
fDate
22-24 June 2009
Firstpage
84
Lastpage
91
Abstract
Estimating the true number of clusters in a data set is one of the major challenges in cluster analysis. Yet in certain domains, knowing the true number of clusters is of high importance. For example, in medical research, detecting the true number of groups and sub-groups of cancer would be of utmost importance for their effective treatment. In this paper we propose a novel method to estimate the number of clusters in a microarray data set based on the consensus clustering approach. Although the main objective of consensus clustering is to discover a robust and high-quality cluster structure in a data set, closer inspection of the set of clusterings obtained can often give valuable information about the appropriate number of clusters present. More specifically, the set of clusterings obtained when the specified number of clusters coincides with the true number of clusters tends to be less diverse. To quantify this diversity we develop a novel index, namely the Consensus Index (CI), which is built upon a suitable clustering similarity measure such as the well-known Adjusted Rand Index (ARI) or our recently developed information-theoretic index, the Adjusted Mutual Information (AMI). Our experiments on both synthetic and real microarray data sets indicate that the CI is a useful indicator for determining the appropriate number of clusters.
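The idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the CI formula, so the sketch assumes that the CI for a candidate number of clusters K is the average pairwise ARI over a set of clusterings generated with that K, and that the K with the highest CI is selected. The function names (`adjusted_rand_index`, `consensus_index`, `estimate_k`) and the toy label lists are hypothetical, and the step that generates the clusterings is left out.

```python
from collections import Counter
from itertools import combinations
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand Index between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: treat as trivially identical
    # Pairs of items placed together in both, in a, and in b respectively.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / total_pairs
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both labelings are trivial
    return (sum_cells - expected) / (max_index - expected)


def consensus_index(clusterings):
    """Assumed CI: mean pairwise ARI over a set of clusterings."""
    pairs = list(combinations(clusterings, 2))
    if not pairs:
        raise ValueError("need at least two clusterings")
    return sum(adjusted_rand_index(u, v) for u, v in pairs) / len(pairs)


def estimate_k(clusterings_by_k):
    """Return the candidate K whose clusterings agree most (highest CI)."""
    return max(clusterings_by_k, key=lambda k: consensus_index(clusterings_by_k[k]))


# Toy, hand-made clusterings (hypothetical data, not from the paper).
clusterings_by_k = {
    2: [[0, 0, 0, 1, 1, 1], [1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]],
    3: [[0, 0, 1, 1, 2, 2], [0, 1, 1, 2, 2, 0], [0, 1, 2, 0, 1, 2]],
}
best_k = estimate_k(clusterings_by_k)
```

In this toy input the three clusterings for K = 2 are identical up to relabeling, so they are the least diverse and K = 2 is selected. AMI could be substituted for ARI in `consensus_index` without changing the rest of the sketch.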
Keywords
information theory; pattern clustering; adjusted mutual information; adjusted rand index; automatic number; cluster analysis; clustering similarity measure; clusters detection; consensus clustering approach; consensus index; high quality cluster structure; information theoretic based index; microarray data set; robust cluster structure; Bioinformatics; Biomedical engineering; Cancer detection; Clustering algorithms; Clustering methods; Inspection; Medical treatment; Mutual information; Robustness; Shape; adjusted mutual information (AMI); gene clustering; model selection; number of cluster detection;
fLanguage
English
Publisher
IEEE
Conference_Titel
Bioinformatics and BioEngineering, 2009. BIBE '09. Ninth IEEE International Conference on
Conference_Location
Taichung
Print_ISBN
978-0-7695-3656-9
Type
conf
DOI
10.1109/BIBE.2009.19
Filename
5211310