DocumentCode :
2509410
Title :
Gene Expression Analysis Using Clustering
Author :
Dhiraj, Kumar ; Rath, Santanu Kumar ; Pandey, Abhishek
Author_Institution :
Dept. of Comput. Sci. & Eng., Nat. Inst. of Technol. Rourkela, Rourkela, India
fYear :
2009
fDate :
11-13 June 2009
Firstpage :
1
Lastpage :
4
Abstract :
Data mining has become an important topic in the effective analysis of gene expression data because of its wide application in the biomedical industry. In this paper, the k-means clustering algorithm is studied extensively for gene expression analysis. To demonstrate the effectiveness of the k-means algorithm on a wide variety of data sets, we have chosen two pattern recognition data sets and thirteen microarray data sets with both overlapping and non-overlapping cluster boundaries, where the number of features/genes ranges from 4 to 7129 and the number of samples ranges from 32 to 683. The number of clusters ranges from two to eleven. We use the clustering error rate (or, equivalently, the clustering accuracy) as the evaluation metric to measure the performance of the k-means algorithm.
Keywords :
data mining; genetics; lab-on-a-chip; medical computing; pattern clustering; biomedical industry; data mining; gene expression analysis; k-means clustering algorithm; microarray data sets; nonoverlapping cluster boundaries; overlapping cluster boundaries; pattern recognition; Breast; Cancer; Clustering algorithms; Clustering methods; Fungi; Gene expression; Iris; Lungs; Partitioning algorithms; Pattern recognition
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Bioinformatics and Biomedical Engineering, 2009. ICBBE 2009. 3rd International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-2901-1
Electronic_ISBN :
978-1-4244-2902-8
Type :
conf
DOI :
10.1109/ICBBE.2009.5162877
Filename :
5162877