DocumentCode
3673216
Title
Hybrid feature selection methods for online biomedical publication classification
Author
Long Ma;Yanqing Zhang;Raj Sunderraman;Peter T. Fox;Angela R. Laird;Jessica A. Turner;Matthew D. Turner
Author_Institution
Department of Computer Science, Georgia State University, Atlanta, Georgia, USA
fYear
2015
Firstpage
1
Lastpage
8
Abstract
We review several feature selection methods (Recursive Feature Elimination, Select K Best, and Random Forests) as elements of a processing chain for feature selection in a text mining task. The task is a multi-label classification problem: assigning labels, a form of metadata that expert curators usually apply to published scientific papers. In this formulation, a feature space dramatically larger than the available training data arises naturally and inevitably. We explore ways to reduce the dimension of the feature space and show that sequential feature selection substantially improves performance for this complex type of data.
Keywords
"Radio frequency","Metadata","Training","Support vector machines","Training data","Vocabulary","Text mining"
Publisher
IEEE
Conference_Titel
Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), 2015 IEEE Conference on
Type
conf
DOI
10.1109/CIBCB.2015.7300320
Filename
7300320