DocumentCode :
1636479
Title :
Feature Selection for Cancer Classification on Microarray Expression Data
Author :
Hsu, Hui-Huang ; Lu, Ming-Da
Author_Institution :
Dept. of Comput. Sci. & Inf. Eng., Tamkang Univ., Taipei
Volume :
3
fYear :
2008
Firstpage :
153
Lastpage :
158
Abstract :
Microarrays are an important tool in gene analysis research. They can help identify genes that might cause various cancers. In this paper, we use feature selection methods and the support vector machine (SVM) to search for disease-causing genes in microarray data of three different cancers. The feature selection methods are based on Euclidean distance (ED) and the Pearson correlation coefficient (PCC). We investigate the effect on prediction results of training the SVM with different numbers of features and different kinds of kernels. The results show that the linear kernel is the best-suited kernel for this problem. Also, equal or higher accuracy can be achieved with only 15 to 100 features selected from the 7129 or more features of the original data sets.
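The abstract does not spell out how ED and PCC are turned into gene scores, so the following is only a minimal sketch of one plausible filter-style reading, not the authors' procedure. It assumes binary class labels coded 0/1, scores each gene by the absolute PCC between its expression column and the label vector, and, for ED, by the distance between the min-max scaled column and the label vector (smaller is better). The names `pearson`, `euclidean`, `rank_genes`, and the toy data are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant vector: treat as uncorrelated
    return cov / (sx * sy)

def euclidean(x, y):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def rank_genes(samples, labels, k, method="pcc"):
    """Return the column indices of the k best-scoring genes.

    samples: list of rows, one row of expression values per sample.
    labels:  list of 0/1 class labels, one per sample (assumption).
    """
    scores = []
    for g in range(len(samples[0])):
        column = [row[g] for row in samples]
        if method == "pcc":
            # Assumption: a strong correlation of either sign is informative.
            score = abs(pearson(column, labels))
        else:
            # Assumption: scale the column to [0, 1] so it is comparable
            # to the 0/1 label vector; a smaller distance ranks higher.
            lo, hi = min(column), max(column)
            span = (hi - lo) or 1.0
            scaled = [(v - lo) / span for v in column]
            score = -euclidean(scaled, labels)
        scores.append((score, g))
    # Highest score first; ties broken by gene index.
    scores.sort(key=lambda t: (-t[0], t[1]))
    return [g for _, g in scores[:k]]

# Toy data: 4 samples x 3 genes (made up for illustration only).
toy_samples = [
    [1.0, 2.0, 5.0],
    [1.2, 2.1, 4.8],
    [3.9, 2.0, 1.0],
    [4.1, 2.1, 1.1],
]
toy_labels = [0, 0, 1, 1]
top_by_pcc = rank_genes(toy_samples, toy_labels, k=2, method="pcc")
top_by_ed = rank_genes(toy_samples, toy_labels, k=1, method="ed")
```

In this reading, only the selected columns would then be passed to an SVM classifier, with the kernel varied as the abstract describes. Note that the ED variant sketched here penalizes genes that are anti-correlated with the label, whereas the PCC variant does not; which behavior the paper intends is not stated in the abstract.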
Keywords :
cancer; data handling; feature extraction; genetics; medical diagnostic computing; pattern classification; support vector machines; Euclidian distance; Pearson correlation coefficient; cancer classification; disease-causing genes; feature selection; gene analysis; linear kernel; microarray expression data; support vector machine; Bioinformatics; Cancer; Data analysis; Data mining; Diseases; Filters; Gene expression; Kernel; Support vector machine classification; Support vector machines; Cancer Classification; Feature Selection; Microarray; Pearson Correlation Coefficient; Support Vector Machine;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Intelligent Systems Design and Applications, 2008. ISDA '08. Eighth International Conference on
Conference_Location :
Kaohsiung
Print_ISBN :
978-0-7695-3382-7
Type :
conf
DOI :
10.1109/ISDA.2008.198
Filename :
4696454