Title :
Information gain with chaotic genetic algorithm for gene selection and classification problem
Author :
Yang, Cheng-San ; Chuang, Li-Yeh ; Li, Jung-Chike ; Yang, Cheng-Hong
Author_Institution :
Inst. of Biomed. Eng., Nat. Cheng-Kung Univ., Tainan
Abstract :
In microarray data classification, selecting relevant genes poses a formidable challenge to researchers because of the high dimensionality of the features, the multi-class categories involved, and the usually small sample size. To analyze microarray data correctly, feature (gene) selection aims to identify those subsets of differentially expressed genes that are potentially relevant for distinguishing the sample classes. In this paper, information gain and a chaotic genetic algorithm are proposed to select the relevant genes, and a K-nearest neighbor method with leave-one-out cross-validation serves as the classifier. The genetic algorithm is modified with a chaotic mutation operator to increase population diversity. The experimental results show that the proposed method not only effectively reduced the number of gene expression levels, but also achieved lower classification error rates.
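The pipeline named in the abstract (an information gain filter, a genetic algorithm with chaotic mutation, and K-nearest neighbor scored by leave-one-out cross-validation) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract gives no parameters, so the median-split discretization, the logistic map with r = 4, the elitist selection, the single-point crossover, and every function name here are assumptions.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Information gain of one gene after a median split (assumed discretization)."""
    threshold = sorted(values)[len(values) // 2]
    low = [y for v, y in zip(values, labels) if v < threshold]
    high = [y for v, y in zip(values, labels) if v >= threshold]
    conditional = sum(len(part) / len(labels) * entropy(part)
                      for part in (low, high) if part)
    return entropy(labels) - conditional


def loocv_error(samples, labels, mask, k=1):
    """Leave-one-out error rate of K-nearest neighbor on the genes where mask is 1."""
    genes = [j for j, bit in enumerate(mask) if bit]
    if not genes:
        return 1.0  # an empty gene subset is treated as the worst case
    errors = 0
    for i, x in enumerate(samples):
        dists = sorted(
            ((sum((x[j] - z[j]) ** 2 for j in genes), labels[t])
             for t, z in enumerate(samples) if t != i),
            key=lambda pair: pair[0],
        )
        vote = Counter(lbl for _, lbl in dists[:k]).most_common(1)[0][0]
        errors += vote != labels[i]
    return errors / len(samples)


def chaotic_mutation(mask, state, rate=0.1):
    """Flip bits driven by a logistic-map sequence (assumed chaotic source)."""
    child = list(mask)
    for j in range(len(child)):
        state = 4.0 * state * (1.0 - state)  # logistic map with r = 4
        if state < rate:
            child[j] = 1 - child[j]
    return child, state


def select_genes(samples, labels, top_m=4, pop_size=6, generations=5, seed=0):
    """Filter genes by information gain, then search subsets with a small GA."""
    rng = random.Random(seed)
    n_genes = len(samples[0])
    gains = [information_gain([s[j] for s in samples], labels)
             for j in range(n_genes)]
    kept = sorted(range(n_genes), key=lambda j: gains[j], reverse=True)[:top_m]
    reduced = [[s[j] for j in kept] for s in samples]

    def fitness(mask):
        # Lower error first, then fewer genes.
        return (loocv_error(reduced, labels, mask), sum(mask))

    population = [[rng.randint(0, 1) for _ in kept] for _ in range(pop_size)]
    state = 0.37  # arbitrary logistic-map seed in (0, 1), away from fixed points
    for _ in range(generations):
        population.sort(key=fitness)
        parents = population[: pop_size // 2]  # elitist selection (assumption)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(kept))  # single-point crossover
            child, state = chaotic_mutation(a[:cut] + b[cut:], state)
            children.append(child)
        population = parents + children
    best = min(population, key=fitness)
    return [kept[j] for j, bit in enumerate(best) if bit], fitness(best)[0]
```

Calling `select_genes(samples, labels)` on a list of expression vectors returns the indices of the chosen genes and the leave-one-out error of the best subset. The filter step shrinks the search space before the genetic algorithm runs, which is the usual motivation for pairing a filter with a wrapper on high-dimensional, small-sample data.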
Keywords :
bioinformatics; error statistics; feature extraction; genetic algorithms; genetics; pattern classification; sampling methods; K-nearest neighbor method; chaotic genetic algorithm; chaotic mutation operator; error rate; gene classification problem; gene expression level; gene feature selection problem; information gain; leave-one-out cross-validation method; microarray data classification problem; multiclass category; population diversity; sample size; Biomedical engineering; Chaos; Chemical engineering; Data engineering; Error analysis; Filters; Gene expression; Production; Proteins; K-nearest neighbor; feature selection; microarray data
Conference_Title :
Systems, Man and Cybernetics, 2008. SMC 2008. IEEE International Conference on
Conference_Location :
Singapore
Print_ISBN :
978-1-4244-2383-5
ISSN :
1062-922X
DOI :
10.1109/ICSMC.2008.4811433