  • DocumentCode
    1153570
  • Title
    Multiple SVM-RFE for gene selection in cancer classification with expression data
  • Author
    Duan, Kai-Bo ; Rajapakse, Jagath C. ; Wang, Haiying ; Azuaje, Francisco
  • Author_Institution
    Bioinformatics Research Centre, Nanyang Technological University, Singapore
  • Volume
    4
  • Issue
    3
  • fYear
    2005
  • Firstpage
    228
  • Lastpage
    234
  • Abstract
    This paper proposes a new feature selection method that uses a backward elimination procedure similar to that implemented in support vector machine recursive feature elimination (SVM-RFE). Unlike the SVM-RFE method, at each step, the proposed approach computes the feature ranking score from a statistical analysis of weight vectors of multiple linear SVMs trained on subsamples of the original training data. We tested the proposed method on four gene expression datasets for cancer classification. The results show that the proposed feature selection method selects better gene subsets than the original SVM-RFE and improves the classification accuracy. A Gene Ontology-based similarity assessment indicates that the selected subsets are functionally diverse, further validating our gene selection method. This investigation also suggests that, for gene expression-based cancer classification, average test error from multiple partitions of training and test sets can be recommended as a reference of performance quality.
  • Keywords
    cancer; genetics; medical diagnostic computing; molecular biophysics; statistical analysis; support vector machines; cancer classification; feature ranking score; feature selection; gene expression; gene ontology-based similarity assessment; gene selection; multiple support vector machine recursive feature elimination; Biology computing; Educational technology; Ontologies; Support vector machine classification; Testing; Training data; gene ontology; semantic similarity; support vector machine recursive feature elimination (SVM-RFE); Algorithms; Artificial Intelligence; Databases, Protein; Diagnosis, Computer-Assisted; Gene Expression Profiling; Humans; Neoplasm Proteins; Neoplasms; Oligonucleotide Array Sequence Analysis; Pattern Recognition, Automated; Reproducibility of Results; Sensitivity and Specificity; Tumor Markers, Biological
  • fLanguage
    English
  • Journal_Title
    NanoBioscience, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1536-1241
  • Type
    jour
  • DOI
    10.1109/TNB.2005.853657
  • Filename
    1501840