DocumentCode :
3256084
Title :
Unsupervised Grammar Induction of Clinical Report Sublanguage
Author :
Kate, Rohit J.
Author_Institution :
Dept. of Health Inf. & Adm., Univ. of Wisconsin-Milwaukee, Milwaukee, WI, USA
Volume :
2
fYear :
2011
fDate :
18-21 Dec. 2011
Firstpage :
53
Lastpage :
58
Abstract :
Clinical reports are written in a subset of natural language that employs many domain-specific terms; such a language is known as a sublanguage of a scientific or technical domain. In this paper, we present a method that automatically induces a grammar for the sublanguage of a given genre of clinical reports from a corpus of unannotated reports. The method first identifies the semantic classes of the clinical terms used in the reports and then induces a grammar based on these semantic classes and part-of-speech tags. Experiments show that the induced grammar is able to parse novel sentences and obtains a reasonable accuracy.
Keywords :
grammars; medical information systems; natural language processing; unsupervised learning; clinical report sublanguage; clinical term; natural language; part-of-speech tags; semantic class; unsupervised grammar induction; Encoding; Grammar; Production; Semantics; Syntactics; Training; Unified modeling language; clinical reports; health informatics; natural language processing; unsupervised parsing;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Machine Learning and Applications and Workshops (ICMLA), 2011 10th International Conference on
Conference_Location :
Honolulu, HI
Print_ISBN :
978-1-4577-2134-2
Type :
conf
DOI :
10.1109/ICMLA.2011.150
Filename :
6147048