  • DocumentCode
    3673216
  • Title
    Hybrid feature selection methods for online biomedical publication classification
  • Author
    Long Ma;Yanqing Zhang;Raj Sunderraman;Peter T. Fox;Angela R. Laird;Jessica A. Turner;Matthew D. Turner
  • Author_Institution
    Department of Computer Science, Georgia State University, Atlanta, Georgia, USA
  • fYear
    2015
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    We review several feature selection methods (Recursive Feature Elimination, Select K Best, and Random Forests) as elements of a processing chain for feature selection in a text mining task. The task is a multi-label classification problem: assigning labels, that is, the metadata usually applied to published scientific papers by expert curators. In the formulation of this classification task, a feature space that is dramatically larger than the available training data occurs naturally and inevitably. We explore ways to reduce the dimension of the feature space and show that sequential feature selection substantially improves performance for this complex type of data.
  • Keywords
    "Radio frequency","Metadata","Training","Support vector machines","Training data","Vocabulary","Text mining"
  • Publisher
    ieee
  • Conference_Titel
    Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), 2015 IEEE Conference on
  • Type
    conf
  • DOI
    10.1109/CIBCB.2015.7300320
  • Filename
    7300320
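
The abstract describes feature selection as a processing chain in which one selector feeds the next. The sketch below illustrates that idea only; it is not the authors' pipeline. It assumes a single binary label, binary term-presence features, a mean-difference score standing in for Select K Best, and a small gradient-descent logistic model standing in for the estimator inside Recursive Feature Elimination. All function names and the toy data are hypothetical.

```python
import math


def univariate_top_k(X, y, k):
    """Stage 1 (filter): keep the k features whose class means differ most."""
    def score(j):
        pos = [row[j] for row, t in zip(X, y) if t == 1]
        neg = [row[j] for row, t in zip(X, y) if t == 0]
        return abs(sum(pos) / len(pos) - sum(neg) / len(neg))

    ranked = sorted(range(len(X[0])), key=score, reverse=True)
    return sorted(ranked[:k])


def fit_logistic(X, y, features, epochs=200, lr=0.5):
    """Fit logistic regression on the given feature subset by batch gradient descent."""
    w = {j: 0.0 for j in features}
    b = 0.0
    m = len(X)
    for _ in range(epochs):
        grad_w = {j: 0.0 for j in features}
        grad_b = 0.0
        for row, t in zip(X, y):
            z = b + sum(w[j] * row[j] for j in features)
            err = 1.0 / (1.0 + math.exp(-z)) - t
            for j in features:
                grad_w[j] += err * row[j]
            grad_b += err
        for j in features:
            w[j] -= lr * grad_w[j] / m
        b -= lr * grad_b / m
    return w


def recursive_elimination(X, y, features, n_keep):
    """Stage 2 (wrapper): refit, drop the smallest-weight feature, repeat."""
    features = list(features)
    while len(features) > n_keep:
        w = fit_logistic(X, y, features)
        weakest = min(features, key=lambda j: abs(w[j]))
        features.remove(weakest)
    return features


def chained_selection(X, y, k, n_keep):
    """Run the filter first, then the wrapper on the surviving features."""
    return recursive_elimination(X, y, univariate_top_k(X, y, k), n_keep)


# Toy data (made up): 6 documents, 5 term-presence features, one binary label.
X = [
    [1, 1, 0, 1, 0],
    [1, 1, 1, 1, 0],
    [1, 0, 0, 1, 1],
    [0, 0, 1, 1, 0],
    [0, 1, 0, 1, 1],
    [0, 0, 1, 1, 0],
]
y = [1, 1, 1, 0, 0, 0]

selected = chained_selection(X, y, k=3, n_keep=1)
print(selected)
```

Running the cheap filter first shrinks the feature set before the more expensive refit-and-drop loop, which is the practical motivation for chaining selectors when features far outnumber training examples. For the multi-label setting in the abstract, one option would be to run such a chain once per label, though the record does not say how the authors handled this.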