• DocumentCode
    1554537
  • Title

    Using combinatory categorial grammar to extract biomedical information

  • Author

    Park, Jong C.

  • Author_Institution
    Dept. of Comput. Sci., Korea Adv. Inst. of Sci. & Technol., Yusong-Gu, South Korea
  • Volume
    16
  • Issue
    6
  • fYear
    2001
  • Firstpage
    62
  • Lastpage
    67
  • Abstract
    Extracting information from biology databases manually can be an overwhelming task. GenBank, the US National Institutes of Health database containing all publicly available DNA sequences, has more than 14 billion bases in 13 million genetic-sequence records. Medline, a literature database available through PubMed, has over 11 million journal citations. In a May 2001 search request for "cytokine" (regulatory proteins in the immune system), PubMed returned 296556 articles. Given the quantity and complexity of biomedical literature, demands for computational tools to extract specific information are increasing. The author reviews biomedical information extraction methods and presents research done by KAIST's natural language processing group on a system that shows encouraging performance using combinatory categorial grammar as a natural language grammar formalism.
  • Keywords
    bibliographic systems; category theory; grammars; information retrieval; medical information systems; natural languages; GenBank; KAIST; Medline; PubMed; bioinformatics; biology databases; biomedical information extraction; combinatory categorial grammar; computational tools; genetic-sequence records; literature database; natural language grammar formalism; natural language processing; publicly available DNA sequences; Amino acids; Biomedical measurements; DNA; Data mining; Databases; Electric shock; Muscles; Natural language processing; Natural languages; Proteins;
  • fLanguage
    English
  • Journal_Title
    Intelligent Systems, IEEE
  • Publisher
    IEEE
  • ISSN
    1541-1672
  • Type

    jour

  • DOI
    10.1109/5254.972092
  • Filename
    972092