• DocumentCode
    1784990
  • Title
    Incremental wrapper based gene selection with Markov blanket
  • Author
    Aiguo Wang ; Ning An ; Guilin Chen ; Jing Yang ; Lian Li ; Gil Alterovitz
  • Author_Institution
    Sch. of Comput. & Inf., Hefei Univ. of Technol., Hefei, China
  • fYear
    2014
  • fDate
    2-5 Nov. 2014
  • Firstpage
    74
  • Lastpage
    79
  • Abstract
    Gene selection plays a crucial role in the analysis of microarray data, which combine high dimensionality with small sample size. Among the various feature selection approaches, incremental wrapper-based feature subset selection (FSS) methods tend to obtain higher-quality feature subsets and better classification accuracy than filter methods, but they are much more time consuming because the interdependence and redundancy between features are evaluated in a wrapper manner. In this paper, we explore introducing the Markov blanket (MB) into the incremental wrapper-based FSS process. Rather than evaluating the quality of every feature ranked by a filter method, our proposal uses the MB to eliminate features that are redundant to the newly selected one during the wrapper evaluation process. This reduces the number of wrapper evaluations and enables us to select relevant features and eliminate redundant ones efficiently. To verify the effectiveness and efficiency of the proposed approach, we conduct experimental comparisons on six publicly available microarray data sets using two typical classifiers with different metrics, Naïve Bayes and 1-Nearest-Neighbor. The experimental results demonstrate that, compared with the same procedure without MB, our approach greatly speeds up the feature selection process, obtains more compact feature subsets, and achieves better classification accuracy on both two-category and multi-category problems.
  • Keywords
    Bayes methods; Internet; Markov processes; bioinformatics; biological techniques; data mining; feature extraction; feature selection; genomics; information filtering; lab-on-a-chip; 1-Nearest-Neighbor metrics; MB wrapper-based gene selection; Markov blanket; Naïve Bayes metrics; classification accuracy; compact feature subset; efficient feature elimination; feature interdependence; feature redundancy; feature selection approaches; feature selection process; feature selection speed; feature subset selection methods; filter methods; high quality feature subset; incremental wrapper-based FSS methods; incremental wrapper-based FSS process; incremental wrapper-based gene selection; microarray data analysis; microarray data dimensionality; microarray data sample size; multicategory feature selection problems; publicly available microarray data; ranked feature quality evaluation; redundant feature elimination; two-category feature selection problems; typical classifiers; wrapper evaluation process; wrapper number reduction; Accuracy; Classification algorithms; Filtering algorithms; Markov processes; Training; Tumors; Uncertainty; Markov Blanket; gene selection; microarray data; symmetric uncertainty; wrapper;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics and Biomedicine (BIBM), 2014 IEEE International Conference on
  • Conference_Location
    Belfast
  • Type
    conf
  • DOI
    10.1109/BIBM.2014.6999251
  • Filename
    6999251
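
The procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions the record does not confirm: features are assumed to be discretized, the filter ranking uses symmetric uncertainty (a term listed in the keywords), redundancy is tested with an approximate Markov blanket criterion of the kind used in FCBF-style filters, and the wrapper score is leave-one-out accuracy of a 1-nearest-neighbor classifier with Hamming distance. All function names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum(c / n * log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(xs, ys):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(xs), entropy(ys)
    if hx + hy == 0:
        return 0.0
    hxy = entropy(list(zip(xs, ys)))
    return 2.0 * (hx + hy - hxy) / (hx + hy)


def loo_1nn_accuracy(columns, labels):
    """Leave-one-out accuracy of 1-NN with Hamming distance (assumed wrapper)."""
    rows = list(zip(*columns))
    n = len(labels)
    correct = 0
    for i in range(n):
        nearest = min(
            (j for j in range(n) if j != i),
            key=lambda j: sum(a != b for a, b in zip(rows[i], rows[j])),
        )
        correct += labels[nearest] == labels[i]
    return correct / n


def incremental_wrapper_mb(features, labels):
    """features: dict mapping a feature name to its list of discrete values.

    Returns (selected names, wrapper accuracy, number of wrapper evaluations).
    """
    relevance = {f: symmetric_uncertainty(v, labels) for f, v in features.items()}
    # Filter step: rank relevant features by decreasing SU with the class.
    candidates = sorted(
        (f for f in features if relevance[f] > 0), key=lambda f: -relevance[f]
    )
    selected, best_acc, evaluations = [], 0.0, 0
    while candidates:
        f = candidates.pop(0)
        acc = loo_1nn_accuracy([features[g] for g in selected + [f]], labels)
        evaluations += 1
        if acc > best_acc:
            selected.append(f)
            best_acc = acc
            # Assumed approximate Markov blanket test: f (ranked at least as
            # high as g) makes g redundant when SU(f, g) >= SU(g, class).
            candidates = [
                g for g in candidates
                if symmetric_uncertainty(features[f], features[g]) < relevance[g]
            ]
    return selected, best_acc, evaluations


# Toy data (invented for illustration): one informative gene, an exact
# duplicate of it, and one gene that is independent of the class.
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_features = {
    "gene_a": [0, 0, 0, 0, 1, 1, 1, 1],
    "gene_a_copy": [0, 0, 0, 0, 1, 1, 1, 1],
    "gene_noise": [0, 1, 0, 1, 0, 1, 0, 1],
}
print(incremental_wrapper_mb(toy_features, toy_labels))
```

In this sketch the saving comes from the pruning step: each feature removed by the blanket test is one wrapper evaluation that never runs, which is the source of the speed-up the abstract reports.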