DocumentCode :
2767062
Title :
A comparative study of text classification approaches for personalized retrieval in PubMed
Author :
Pitigala, Sachintha ; Li, Cen ; Seo, Suk
fYear :
2011
fDate :
12-15 Nov. 2011
Firstpage :
919
Lastpage :
921
Abstract :
Retrieving information relevant to one's needs from PubMed is becoming increasingly challenging due to its large volume and rapid growth. Traditional information search techniques based on keyword matching are insufficient for large databases such as PubMed. A personalized article retrieval system that is tailored to an individual researcher's specific interests and selects only highly relevant articles can be a helpful tool in the field of bioinformatics. The text classification methods developed in the text mining community have shown good results in differentiating relevant articles from irrelevant ones. This study compares two text classification methods, Naïve Bayes and Support Vector Machines, to assess how effectively each classifies full-text articles when only a small set of training data is available. The comparison results show that Naïve Bayes is a better choice than Support Vector Machines for building a personalized article retrieval system that can learn (train) from a small set of full-text articles.
Keywords :
Bayes methods; bioinformatics; classification; data mining; information retrieval; medical information systems; support vector machines; text analysis; very large databases; PubMed; bioinformatics; information search techniques; keyword matching; large databases; naïve Bayes; personalized article retrieval system; personalized retrieval; relevant articles; relevant information retrieval; support vector machines; text classification approaches; text classification methods; text mining community; Accuracy; Bioinformatics; Kernel; Support vector machine classification; Text categorization; Training;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Bioinformatics and Biomedicine Workshops (BIBMW), 2011 IEEE International Conference on
Conference_Location :
Atlanta, GA
Print_ISBN :
978-1-4577-1612-6
Type :
conf
DOI :
10.1109/BIBMW.2011.6112503
Filename :
6112503