DocumentCode
1554537
Title
Using combinatory categorial grammar to extract biomedical information
Author
Park, Jong C.
Author_Institution
Dept. of Comput. Sci., Korea Adv. Inst. of Sci. & Technol., Yusong-Gu, South Korea
Volume
16
Issue
6
fYear
2001
Firstpage
62
Lastpage
67
Abstract
Extracting information from biology databases manually can be an overwhelming task. GenBank, the US National Institutes of Health database containing all publicly available DNA sequences, has more than 14 billion bases in 13 million genetic-sequence records. Medline, a literature database available through PubMed, has over 11 million journal citations. In a May 2001 search request for "cytokine" (regulatory proteins in the immune system), PubMed returned 296,556 articles. Given the quantity and complexity of biomedical literature, demands for computational tools to extract specific information are increasing. The author reviews biomedical information extraction methods and presents research done by KAIST's natural language processing group on a system that shows encouraging performance using combinatory categorial grammar as a natural language grammar formalism.
Keywords
bibliographic systems; category theory; grammars; information retrieval; medical information systems; natural languages; GenBank; KAIST; Medline; PubMed; bioinformatics; biology databases; biomedical information extraction; combinatory categorial grammar; computational tools; genetic-sequence records; literature database; natural language grammar formalism; natural language processing; publicly available DNA sequences; Amino acids; Biomedical measurements; DNA; Data mining; Databases; Electric shock; Muscles; Natural language processing; Natural languages; Proteins;
fLanguage
English
Journal_Title
Intelligent Systems, IEEE
Publisher
IEEE
ISSN
1541-1672
Type
jour
DOI
10.1109/5254.972092
Filename
972092