Title :
A mixed integer programming approach for gene selection
Author :
Lizhen Shao ; Jieli Wang ; Guangda Hu ; Jiwei Liu
Author_Institution :
Key Lab. of Adv. Control of Iron & Steel Process (Minist. of Educ.), Univ. of Sci. & Technol. Beijing, Beijing, China
Abstract :
Gene expression data for cancer classification typically contain far fewer samples than genes. Feature selection is therefore an essential and challenging pre-processing step for removing irrelevant or redundant genes before classification. In this paper, we model the gene selection problem as a mixed integer programming problem based on the 1-norm support vector machine (SVM). This problem is difficult to solve because the number of integer variables (usually tens of thousands or even hundreds of thousands) is very large compared to the desired number of genes. To solve it, we propose an iterative mixed integer optimization algorithm that gradually selects a subset of genes. We test the proposed approach on colon cancer and leukemia gene expression datasets. The results show that the proposed algorithm performs better than the Fisher criterion, t-statistics, the standard 2-norm SVM, and SVM recursive feature elimination (SVM-RFE). The selected gene subset yields better classification accuracy and better generalization capability.
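The abstract does not give the formulation, so the following is only a rough sketch of the general idea and not the authors' algorithm. It assumes the textbook 1-norm SVM (minimize the 1-norm of the weights plus C times the total slack) solved as a plain linear program, and it replaces the paper's mixed integer step with a simple shrinking loop that keeps the genes with the largest absolute weights. The function names `one_norm_svm` and `select_genes`, and the parameters `C` and `keep_fraction`, are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def one_norm_svm(X, y, C=1.0):
    """Fit a 1-norm SVM by linear programming (labels y must be +1/-1).

    Solves: min ||w||_1 + C * sum(xi)
            s.t. y_i * (x_i . w + b) >= 1 - xi_i,  xi_i >= 0
    using the split w = p - q with p, q >= 0.
    """
    n, d = X.shape
    Y = y.reshape(-1, 1).astype(float)
    # Variable order: p (d), q (d), b (1), xi (n).
    c = np.concatenate([np.ones(2 * d), [0.0], C * np.ones(n)])
    # Margin constraints rewritten as A_ub @ z <= b_ub.
    A_ub = np.hstack([-Y * X, Y * X, -Y, -np.eye(n)])
    b_ub = -np.ones(n)
    bounds = [(0, None)] * (2 * d) + [(None, None)] + [(0, None)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    w = res.x[:d] - res.x[d:2 * d]
    b = res.x[2 * d]
    return w, b


def select_genes(X, y, k, C=1.0, keep_fraction=0.5):
    """Shrink the gene set step by step until k genes remain.

    Assumption: a heuristic stand-in for the paper's iterative mixed
    integer optimization; requires 0 < keep_fraction < 1.
    """
    active = np.arange(X.shape[1])
    while active.size > k:
        w, _ = one_norm_svm(X[:, active], y, C)
        n_keep = max(k, int(active.size * keep_fraction))
        top = np.argsort(-np.abs(w))[:n_keep]  # largest |w| first
        active = np.sort(active[top])
    return active
```

Calling `select_genes(X, y, k=10)` on a samples-by-genes matrix `X` with labels in {+1, -1} returns the column indices of the 10 retained genes. A true mixed integer formulation would instead add binary indicator variables and a cardinality constraint, which needs a MIP solver.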
Keywords :
cancer; integer programming; iterative methods; medical computing; pattern classification; support vector machines; 1-norm support vector machine; SVM; cancer classification; colon cancer gene expression datasets; gene expression data; gene selection; iterative mixed integer optimization algorithm; leukemia cancer gene expression datasets; mixed integer programming approach;
Conference_Title :
Computational Problem-solving (ICCP), 2013 International Conference on
Conference_Location :
Jiuzhai
DOI :
10.1109/ICCPS.2013.6893583