DocumentCode :
2870726
Title :
A Novel Multiclass Gene Selection Method based on SVM/MLP Cross Validation
Author :
Zhang, Junying ; Zhang, Hongyi ; Liu, Shenling ; Wang, Yue Joseph
Author_Institution :
Sch. of Comput. Sci. & Eng., Xidian Univ., Xi'an
fYear :
2006
fDate :
25-28 June 2006
Firstpage :
2205
Lastpage :
2210
Abstract :
Gene selection is one of the major challenges of biochip technology. It addresses the curse of dimensionality, which is especially severe in DNA microarray datasets that contain thousands of genes but only a few experiments (samples), and it supports gene diagnosis, where a small gene subset is sufficient to diagnose disease. This paper presents a gene selection method that trains linear SVM (support vector machine) and nonlinear MLP (multi-layer perceptron) classifiers and tests them with cross validation to find a gene subset that is optimal or suboptimal for diagnosing binary or multiple disease classes. First, genes are selected incrementally with a linear SVM classifier for each pair of disease classes, with generalization ability tested by leave-one-out cross validation. The union of these pairwise subsets is then used as the initial gene subset for discriminating all disease classes, from which genes are deleted one by one decrementally by removing the gene which brings the greatest decrease of the generalization power after the removal, where generalization is measured by leave-one-out and leave-4-out cross validation. On real DNA microarray data with 2308 genes and only 64 labelled samples belonging to 4 disease classes, only 6 genes are selected as diagnostic genes. These diagnostic genes are tested with a 6-2-4 MLP under both leave-one-out and leave-4-out cross validation, resulting in no misclassification.
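The two-stage procedure described in the abstract (pairwise incremental selection, then decremental pruning of the union) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: a nearest-centroid classifier stands in for the paper's linear SVM so that the example needs no external libraries, only leave-one-out validation is used, the pruning step drops a gene only while accuracy does not fall below its starting level (an assumed stopping rule), and every function name and the toy data are hypothetical.

```python
from itertools import combinations


def nearest_centroid_predict(train_X, train_y, x):
    """Stand-in classifier (NOT the paper's SVM): pick the closest class centroid."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, lab in zip(train_X, train_y) if lab == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_accuracy(X, y, genes):
    """Leave-one-out accuracy using only the gene columns listed in `genes`."""
    if not genes:
        return 0.0
    Xg = [[row[g] for g in genes] for row in X]
    hits = 0
    for i in range(len(Xg)):
        train_X = Xg[:i] + Xg[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += nearest_centroid_predict(train_X, train_y, Xg[i]) == y[i]
    return hits / len(Xg)


def forward_select(X, y, n_genes, target=1.0):
    """Incrementally add the gene that most improves leave-one-out accuracy."""
    chosen, best = [], 0.0
    while best < target:
        candidates = [g for g in range(n_genes) if g not in chosen]
        if not candidates:
            break
        # Ties on accuracy are broken in favour of the lowest gene index.
        score, neg_g = max((loo_accuracy(X, y, chosen + [g]), -g) for g in candidates)
        if score <= best:
            break
        chosen.append(-neg_g)
        best = score
    return chosen


def backward_eliminate(X, y, genes):
    """Drop genes one by one while accuracy stays at or above the starting level."""
    genes = list(genes)
    baseline = loo_accuracy(X, y, genes)
    while len(genes) > 1:
        score, neg_g = max(
            (loo_accuracy(X, y, [h for h in genes if h != g]), -g) for g in genes
        )
        if score < baseline:
            break
        genes.remove(-neg_g)
    return genes


def select_genes(X, y):
    """Union of pairwise forward selections, pruned on the full multiclass problem."""
    n_genes = len(X[0])
    union = set()
    for a, b in combinations(sorted(set(y)), 2):
        idx = [i for i, lab in enumerate(y) if lab in (a, b)]
        union.update(forward_select([X[i] for i in idx], [y[i] for i in idx], n_genes))
    return backward_eliminate(X, y, sorted(union))


# Hypothetical toy data: 9 samples, 4 genes, 3 classes (not from the paper).
X_toy = [
    [0, 1, 5, 3], [1, 0, 4, 9], [2, 2, 6, 1],        # class A
    [10, 0, 5, 8], [11, 1, 6, 2], [15, 2, 4, 4],     # class B
    [9, 10, 4, 7], [12, 11, 5, 2], [14, 12, 6, 5],   # class C
]
y_toy = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
selected = select_genes(X_toy, y_toy)
print(selected)
```

Scoring each candidate subset by cross-validated accuracy is what makes this a wrapper method: the classifier can be swapped for a linear SVM or an MLP without changing the search loops.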
Keywords :
DNA; biotechnology; genetics; multilayer perceptrons; support vector machines; DNA microarray dataset; MLP; SVM; biochip technology; gene diagnosis; linear support vector machine; multiclass gene selection; nonlinear multi-layer perceptron; Biology computing; Computer science; Data analysis; Data engineering; Diseases; Gene expression; Support vector machine classification; Testing; DNA microarray data; cross validation; curse of dimensionality; diagnostic genes; gene selection
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Mechatronics and Automation, Proceedings of the 2006 IEEE International Conference on
Conference_Location :
Luoyang, Henan
Print_ISBN :
1-4244-0465-7
Electronic_ISBN :
1-4244-0466-5
Type :
conf
DOI :
10.1109/ICMA.2006.257654
Filename :
4026440