DocumentCode
3714435
Title
Supporting HIV literature screening with data sampling and supervised learning
Author
Hayda Almeida;Marie-Jean Meurs;Leila Kosseim;Adrian Tsang
Author_Institution
Centre for Structural and Functional Genomics, Concordia University, Canada
fYear
2015
Firstpage
491
Lastpage
496
Abstract
This paper presents a supervised learning approach to support the screening of HIV literature. Manual screening of biomedical literature is an important step in the systematic review process. Researchers and curators face the demanding, time-consuming, and error-prone task of manually identifying the documents that must be included in a systematic review of a specific problem. We implemented a supervised learning approach that supports screening by automatically flagging, in a list retrieved by a literature database search, the documents likely to be selected. To overcome the main issues associated with automatic literature screening, we evaluated data sampling, feature combinations, and feature selection methods, generating a total of 105 classification models. The models yielding the best results combined the Logistic Model Trees classifier, a fairly balanced training set, and a feature combination of Bag-of-Words and MeSH terms. According to our results, the system correctly labels the great majority of relevant documents and could be used to support HIV systematic reviews, allowing researchers to assess a greater number of documents in less time.
Keywords
"Support vector machines","Niobium","Biological system modeling","Systematics","XML","Classification algorithms"
Publisher
IEEE
Conference_Titel
Bioinformatics and Biomedicine (BIBM), 2015 IEEE International Conference on
Type
conf
DOI
10.1109/BIBM.2015.7359733
Filename
7359733
Link To Document