Title :
Comparative study on dimension reduction techniques for cluster analysis of microarray data
Author :
Araújo, Daniel ; Neto, Adrião Dória ; Martins, Allan ; Melo, Jorge
Author_Institution :
Dept. of Comput. & Autom., Fed. Univ. of Rio Grande do Norte, Natal, Brazil
Date :
July 31 2011-Aug. 5 2011
Abstract :
This paper studies the impact of dimension reduction techniques (DRTs) on the quality of partitions produced by cluster analysis of microarray datasets. We tested seven DRTs on four microarray cancer datasets and ran four clustering algorithms on both the original and the reduced datasets. Overall, the results showed that using DRTs improves the performance of all algorithms tested, especially those in the hierarchical class. Although Principal Component Analysis (PCA) is the most widely used DRT, it was outperformed by nonlinear methods and did not provide a substantial performance increase for the clustering algorithms. In contrast, t-distributed Stochastic Neighbor Embedding (t-SNE) and Laplacian Eigenmaps (LE) achieved good results on all datasets.
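The comparison described in the abstract (reduce the data, cluster it, score the partition) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' experimental setup: the data is a synthetic stand-in generated with scikit-learn's `make_blobs`, only three of the seven DRTs are shown, Ward-linkage agglomerative clustering stands in for the hierarchical class, and the adjusted Rand index is an assumed quality measure (the record does not say which validation indexes the paper uses).

```python
from sklearn.cluster import AgglomerativeClustering
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE, SpectralEmbedding
from sklearn.metrics import adjusted_rand_score

# Assumption: synthetic stand-in for a microarray matrix
# (90 samples, 500 "genes", 3 classes); not one of the paper's datasets.
X, y_true = make_blobs(n_samples=90, n_features=500, centers=3,
                       cluster_std=3.0, random_state=0)

# Three of the DRTs named in the abstract; parameter values are illustrative.
reducers = {
    "PCA": PCA(n_components=2, random_state=0),
    # Laplacian Eigenmaps; n_neighbors exceeds the cluster size (30) so the
    # neighbour graph stays connected on this well-separated toy data.
    "LE": SpectralEmbedding(n_components=2, n_neighbors=40, random_state=0),
    "t-SNE": TSNE(n_components=2, perplexity=20, random_state=0),
}


def cluster_and_score(data):
    """Cluster with Ward-linkage hierarchical clustering and return the ARI."""
    labels = AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(data)
    return adjusted_rand_score(y_true, labels)


# Baseline on the original data, then one score per reduced representation.
scores = {"original": cluster_and_score(X)}
embeddings = {}
for name, reducer in reducers.items():
    embeddings[name] = reducer.fit_transform(X)
    scores[name] = cluster_and_score(embeddings[name])

for name, score in scores.items():
    print(f"{name:>8}: ARI = {score:.3f}")
```

On data this cleanly separated every method is expected to do well, so the sketch shows the shape of the protocol rather than the differences between DRTs reported in the paper.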
Keywords :
pattern clustering; principal component analysis; statistical distributions; Laplacian eigenmaps; data cluster analysis; dimension reduction technique; microarray cancer dataset; t-distributed stochastic embedding; Algorithm design and analysis; Cancer; Clustering algorithms; Indexes; Kernel; Partitioning algorithms
Conference_Title :
Neural Networks (IJCNN), The 2011 International Joint Conference on
Conference_Location :
San Jose, CA
Print_ISBN :
978-1-4244-9635-8
DOI :
10.1109/IJCNN.2011.6033447