• DocumentCode
    3256261
  • Title
    Comparison of Two Methods for Finding Biomedical Categories in Medline
  • Author
    Yeganova, Lana ; Kim, Won ; Comeau, Donald C. ; Wilbur, W. John
  • Author_Institution
    Nat. Libr. of Med., Nat. Inst. of Health, Bethesda, MD, USA
  • Volume
    2
  • fYear
    2011
  • fDate
    18-21 Dec. 2011
  • Firstpage
    96
  • Lastpage
    99
  • Abstract
    In this paper we describe and compare two methods for automatically learning meaningful biomedical categories in Medline®. The first is a simple statistical method that uses part-of-speech and frequency information to extract a list of frequent headwords from noun phrases in Medline. The second implements an alignment-based technique to learn frequent generic patterns that indicate a hyponymy/hypernymy relationship between a pair of noun phrases; we then apply these patterns to Medline to collect frequent hypernyms, which serve as potential biomedical categories. We study and compare these two alternative sets of terms to identify semantic categories in Medline. Our approach is completely data-driven.
  • Keywords
    document handling; information retrieval; learning (artificial intelligence); medical computing; statistical analysis; biomedical categories; frequency information; frequent headwords; frequent hypernyms; medical literature analysis and retrieval system online; medline biomedical categories; noun phrases; part-of-speech information; semantic categories; statistical method; two method comparison; Diseases; Feature extraction; Ontologies; Semantics; Statistical analysis; Unified modeling language; Vectors;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Applications and Workshops (ICMLA), 2011 10th International Conference on
  • Conference_Location
    Honolulu, HI
  • Print_ISBN
    978-1-4577-2134-2
  • Type
    conf
  • DOI
    10.1109/ICMLA.2011.50
  • Filename
    6147055