Regional Information Center for Science and Technology - An initial study of full parsing of clinical text using the Stanford Parser

DocumentCode :

2766024

Title :

An initial study of full parsing of clinical text using the Stanford Parser

Author :

Xu, Hua ; AbdelRahman, Samir ; Jiang, Min ; Fan, Jung-wei ; Huang, Yang

Author_Institution :

Dept. of Biomed. Inf., Vanderbilt Univ., Nashville, TN, USA

fYear :

2011

fDate :

12-15 Nov. 2011

Firstpage :

607

Lastpage :

614

Abstract :

Full parsing recognizes a sentence and generates its syntactic structure (a parse tree), which is useful for many natural language processing (NLP) applications. The Stanford Parser is one of the state-of-the-art parsers in the general English domain. However, there has been no formal evaluation of its performance on clinical text, which often contains ungrammatical structures. In this study, we randomly selected 50 sentences from the clinical corpus of the 2010 i2b2 NLP challenge and manually annotated them to create a gold standard of parse trees. Our evaluation showed that the original Stanford Parser achieved a bracketing F-measure (BF) of 77% on the gold standard. Moreover, we assessed the effect of part-of-speech (POS) tags on parsing, and our results showed that using manually corrected POS tags achieved a maximum BF of 81%. Furthermore, we analyzed the errors of the Stanford Parser and provided valuable insights for large-scale parse tree annotation of clinical text.
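
The bracketing F-measure mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common PARSEVAL-style definition (labeled constituent spans, POS-tag nodes excluded); the record does not say which scorer or settings the authors used, so this is not their evaluation code. The function names and the two example trees are hypothetical.

```python
from collections import Counter


def labeled_spans(tree_str):
    """Collect (label, start, end) for every phrase-level node in a
    Penn-style bracketed tree. POS-tag nodes (those that directly wrap
    a word) are skipped, as is common in PARSEVAL-style scoring."""
    tokens = tree_str.replace("(", " ( ").replace(")", " ) ").split()
    spans = Counter()
    stack = []  # each entry: [label, start word index, has child node?]
    pos = 0     # number of words consumed so far
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "(":
            if stack:
                stack[-1][2] = True  # parent now has a child node
            stack.append([tokens[i + 1], pos, False])
            i += 2  # skip the bracket and the label
        elif tok == ")":
            label, start, has_child = stack.pop()
            if has_child:  # phrase-level node, not a POS tag
                spans[(label, start, pos)] += 1
            i += 1
        else:
            pos += 1  # a word
            i += 1
    return spans


def bracketing_f_measure(gold_tree, predicted_tree):
    """Return (precision, recall, F-measure) over labeled brackets."""
    gold = labeled_spans(gold_tree)
    pred = labeled_spans(predicted_tree)
    matched = sum((gold & pred).values())  # multiset intersection
    n_gold = sum(gold.values())
    n_pred = sum(pred.values())
    precision = matched / n_pred if n_pred else 0.0
    recall = matched / n_gold if n_gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical example trees (not taken from the i2b2 corpus).
GOLD = "(S (NP (NN patient)) (VP (VBD denied) (NP (NN chest) (NN pain))))"
PRED = "(S (NP (NN patient)) (VP (VBD denied) (NP (NN chest))) (NP (NN pain)))"

if __name__ == "__main__":
    print(bracketing_f_measure(GOLD, PRED))
```

In this toy pair, the predicted tree shares two of its five brackets with the four gold brackets, so precision is 0.4 and recall is 0.5. Standard scorers may additionally ignore punctuation or the root bracket, which this sketch does not do.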

Keywords :

grammars; medical computing; natural language processing; text analysis; 2011; F-measure; Stanford parser; clinical corpus; clinical text parsing; natural language processing application; parse tree; part-of-speech tag; Gold; Guidelines; Manuals; Medical services; Natural language processing; Syntactics; Tagging;

fLanguage :

English

Publisher :

IEEE

Conference_Titel :

Bioinformatics and Biomedicine Workshops (BIBMW), 2011 IEEE International Conference on

Conference_Location :

Atlanta, GA

Print_ISBN :

978-1-4577-1612-6

Type :

conf

DOI :

10.1109/BIBMW.2011.6112438

Filename :

6112438

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2766024