DocumentCode
3256261
Title
Comparison of Two Methods for Finding Biomedical Categories in Medline
Author
Yeganova, Lana ; Kim, Won ; Comeau, Donald C. ; Wilbur, W. John
Author_Institution
Nat. Libr. of Med., Nat. Inst. of Health, Bethesda, MD, USA
Volume
2
fYear
2011
fDate
18-21 Dec. 2011
Firstpage
96
Lastpage
99
Abstract
In this paper we describe and compare two methods for automatically learning meaningful biomedical categories in Medline®. The first approach is a simple statistical method that uses part-of-speech and frequency information to extract a list of frequent headwords from noun phrases in Medline. The second method implements an alignment-based technique to learn frequent generic patterns that indicate a hyponymy/hypernymy relationship between a pair of noun phrases. We then apply these patterns to Medline to collect frequent hypernyms, which serve as potential biomedical categories. We study and compare these two alternative sets of terms to identify semantic categories in Medline. Our method is completely data-driven.
Keywords
document handling; information retrieval; learning (artificial intelligence); medical computing; statistical analysis; biomedical categories; frequency information; frequent headwords; frequent hypernyms; medical literature analysis and retrieval system online; medline biomedical categories; noun phrases; part-of-speech information; semantic categories; statistical method; two method comparison; Diseases; Feature extraction; Ontologies; Semantics; Statistical analysis; Unified modeling language; Vectors;
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Applications and Workshops (ICMLA), 2011 10th International Conference on
Conference_Location
Honolulu, HI
Print_ISBN
978-1-4577-2134-2
Type
conf
DOI
10.1109/ICMLA.2011.50
Filename
6147055