• DocumentCode
    1863468
  • Title
    Hybrid feature selection method using gene expression data
  • Author
    Chuang, Li-Yeh ; Wu, Kuo-Chuan ; Yang, Cheng-Hong
  • Author_Institution
    Dept. of Chem. Eng., I-Shou Univ., Kaohsiung
  • fYear
    2008
  • fDate
    25-27 June 2008
  • Firstpage
    199
  • Lastpage
    204
  • Abstract
    Gene expression profiles, which represent the state of a cell at the molecular level, have great potential as a medical diagnosis tool. In cancer type classification, however, the available training data sets generally have a small sample size compared to the number of genes involved. This limitation poses a challenge to certain classification methodologies. Gene (feature) selection can extract the genes that influence classification accuracy, eliminate irrelevant genes, and improve both computational performance and classification accuracy. This paper presents a hybrid feature selection method, the Taguchi-genetic algorithm, to find an optimal feature subset; candidate feature subsets are evaluated using a K-nearest neighbor classifier with leave-one-out cross-validation based on Euclidean distance. Experimental results show that the method reduces the number of features effectively and obtains higher classification accuracy than other classification methods from the literature.
  • Keywords
    Taguchi methods; biology computing; feature extraction; genetic algorithms; genetics; pattern classification; Euclidean distance; K-nearest neighbor; Taguchi-genetic algorithm; classification accuracy; gene expression data; hybrid feature selection method; molecular level; optimal feature subset; Appraisal; Biomedical engineering; Cancer; DNA; Gene expression; Medical diagnosis; Production; Proteins; Training data
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Soft Computing in Industrial Applications, 2008. SMCia '08. IEEE Conference on
  • Conference_Location
    Muroran
  • Print_ISBN
    978-1-4244-3782-5
  • Electronic_ISBN
    978-4-9904-2590-6
  • Type
    conf
  • DOI
    10.1109/SMCIA.2008.5045960
  • Filename
    5045960
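
The abstract describes scoring a candidate gene subset with a K-nearest neighbor classifier under leave-one-out cross-validation and Euclidean distance. The following is a minimal sketch of that evaluation step only, not the authors' implementation: the function name `loocv_knn_accuracy`, the parameter `feature_idx`, and the toy expression values are all hypothetical, and the Taguchi-genetic search that would propose subsets is not shown.

```python
import math


def loocv_knn_accuracy(samples, labels, feature_idx, k=1):
    """Leave-one-out accuracy of k-NN restricted to the genes in feature_idx.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Keep only the selected genes (columns) for every sample.
    projected = [[row[j] for j in feature_idx] for row in samples]
    correct = 0
    for i, query in enumerate(projected):
        # Hold out sample i and rank all other samples by Euclidean distance.
        neighbors = sorted(
            (math.dist(query, other), labels[j])
            for j, other in enumerate(projected)
            if j != i
        )[:k]
        # Majority vote among the k nearest neighbors.
        votes = {}
        for _, label in neighbors:
            votes[label] = votes.get(label, 0) + 1
        predicted = max(votes, key=votes.get)
        if predicted == labels[i]:
            correct += 1
    return correct / len(samples)


# Toy data (made up): gene 0 separates the classes, gene 2 is misleading noise.
samples = [
    [1.0, 0.2, 5.0],
    [1.1, 0.9, 1.0],
    [0.9, 0.5, 9.0],
    [3.0, 0.3, 5.1],
    [3.2, 0.8, 1.2],
    [2.9, 0.4, 8.8],
]
labels = ["A", "A", "A", "B", "B", "B"]

informative = loocv_knn_accuracy(samples, labels, [0])  # subset {gene 0}
noisy = loocv_knn_accuracy(samples, labels, [2])        # subset {gene 2}
print(informative, noisy)
```

In a wrapper-style search, a score like this would serve as the fitness of each candidate subset, so subsets containing only uninformative genes are ranked below those containing discriminative ones.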