Title of article :
Feature Selection For Genomic Data By Combining Filter And Wrapper Approaches
Author/Authors :
ALI EL AKADI (author), AOUATIF AMINE (author), ABDELJALIL EL OUARDIGHI (author), Driss Aboutajdine (author)
Issue Information :
Journal with serial issue number, year 2009
Abstract :
Gene expression data usually contain a large number of genes but a small number of samples. Feature selection for gene expression data aims at finding a set of genes that best discriminates biological samples of different types. In this paper, we propose a two-stage selection algorithm for genomic data by combining MRMR (Minimum Redundancy Maximum Relevance) and GA (Genetic Algorithm). In the first stage, MRMR is used to filter noisy and redundant genes in high-dimensional microarray data. In the second stage, the GA uses the classifier accuracy as a fitness function to select the highly discriminating genes. The proposed method is tested on five open datasets (NCI, Lymphoma, Lung, Leukemia and Colon) using Support Vector Machine and Naïve Bayes classifiers. The comparison of MRMR-GA with the MRMR filter and the GA wrapper shows that our method is able to find the smallest gene subset that gives the highest classification accuracy in leave-one-out cross-validation (LOOCV).
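The two-stage idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation; it is a minimal example under several assumptions: expression values are already discretized to three levels (-1, 0, 1), MRMR uses the difference form (relevance minus mean redundancy), a simple categorical Naive Bayes stands in for the classifiers, and the GA settings (population size, generations, mutation rate) are arbitrary. All function names (`mutual_info`, `mrmr`, `ga_select`, `make_toy_data`) are hypothetical, and the data are synthetic.

```python
import math
import random
from collections import Counter

LEVELS = 3  # assumed number of discretized expression levels (-1, 0, 1)

def mutual_info(a, b):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(c / n * math.log(c * n / (pa[u] * pb[v])) for (u, v), c in pab.items())

def mrmr(X, y, k):
    """Stage 1 (filter): greedily keep k genes by relevance minus mean redundancy."""
    cols = [list(col) for col in zip(*X)]
    relevance = [mutual_info(col, y) for col in cols]
    selected = [max(range(len(cols)), key=relevance.__getitem__)]
    while len(selected) < k:
        def score(j):
            redundancy = sum(mutual_info(cols[j], cols[s]) for s in selected)
            return relevance[j] - redundancy / len(selected)
        rest = [j for j in range(len(cols)) if j not in selected]
        selected.append(max(rest, key=score))
    return selected

def nb_predict(train_X, train_y, x, genes):
    """Categorical Naive Bayes with Laplace smoothing, restricted to `genes`."""
    best, best_lp = None, -math.inf
    for label, count in sorted(Counter(train_y).items()):
        rows = [r for r, t in zip(train_X, train_y) if t == label]
        lp = math.log(count / len(train_y))
        for g in genes:
            hits = sum(1 for r in rows if r[g] == x[g])
            lp += math.log((hits + 1) / (len(rows) + LEVELS))
        if lp > best_lp:
            best, best_lp = label, lp
    return best

def loocv_accuracy(X, y, genes):
    """Leave-one-out cross-validation accuracy using only the given genes."""
    correct = 0
    for i in range(len(X)):
        train_X, train_y = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        correct += nb_predict(train_X, train_y, X[i], genes) == y[i]
    return correct / len(X)

def ga_select(X, y, candidates, rng, pop_size=20, generations=15, p_mut=0.1):
    """Stage 2 (wrapper): GA over bit masks of the candidates; fitness is LOOCV accuracy."""
    cache = {}
    def fitness(mask):
        key = tuple(mask)
        if key not in cache:
            genes = [g for g, bit in zip(candidates, mask) if bit]
            # Ties in accuracy are broken in favour of fewer genes.
            cache[key] = (loocv_accuracy(X, y, genes), -len(genes)) if genes else (-1.0, 0)
        return cache[key]
    pop = [[rng.randint(0, 1) for _ in candidates] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        nxt = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(nxt) < pop_size:
            p1 = max(rng.sample(pop, 2), key=fitness)  # binary tournament
            p2 = max(rng.sample(pop, 2), key=fitness)
            cut = rng.randrange(1, len(candidates))    # one-point crossover
            child = [b ^ (rng.random() < p_mut) for b in p1[:cut] + p2[cut:]]
            nxt.append(child)
        pop = nxt
        best = max(pop, key=fitness)
    return [g for g, bit in zip(candidates, best) if bit], fitness(best)[0]

def make_toy_data(rng, n_samples=24, n_genes=30):
    """Synthetic data: gene 0 determines the class, gene 1 copies it, the rest is noise."""
    y = [i % 2 for i in range(n_samples)]
    X = []
    for label in y:
        row = [rng.choice((-1, 0, 1)) for _ in range(n_genes)]
        row[0] = 1 if label else -1
        row[1] = row[0]
        X.append(row)
    return X, y

rng = random.Random(0)
X, y = make_toy_data(rng)
candidates = mrmr(X, y, k=6)                    # stage 1: filter
genes, acc = ga_select(X, y, candidates, rng)   # stage 2: wrapper
print("MRMR candidates:", candidates)
print("GA-selected genes:", genes, "LOOCV accuracy:", acc)
```

Running the filter first keeps the GA search space small (here 6 candidate genes instead of 30), which is the motivation for combining the two stages; on real microarray data the discretization, classifier and GA parameters would need to be chosen to match the study.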
Keywords :
Genetic algorithm , MRMR , Support vector machine , Naïve Bayes classifier , LOOCV , feature selection
Journal title :
INFOCOMP Journal of Computer Science