DocumentCode :
182881
Title :
Exploiting word meaning for negation identification in electronic health records
Author :
Barbantan, Ioana ; Potolea, Rodica
Author_Institution :
Comput. Sci., Tech. Univ. of Cluj-Napoca, Cluj-Napoca, Romania
fYear :
2014
fDate :
22-24 May 2014
Firstpage :
1
Lastpage :
7
Abstract :
Topic extraction from Electronic Health Records is a sensitive step of the knowledge extraction process. Because negation can completely distort the meaning of a discourse, terms must be correctly distinguished from negated terms. Our work is an attempt at automated negation identification in unstructured health records. We analyzed a corpus of medical documents containing 5103 sentences and found that, while adverbs account for 3% of the words used in the corpus, negation covers almost 2%, which justifies an in-depth analysis of negation. The main contribution of the paper addresses a drawback of the negation identification approaches in the literature: they do not consider negation expressed through negation prefixes. In this paper we address the tasks of syntactic and morphologic negation identification. To identify morphologic negation we propose the PreNex algorithm, which breaks each term into a prefix and a root word and checks the validity of the root using additional available resources (WordNet). Syntactic negation identification relies on a pattern matching approach in which the negated concepts are identified based on a predefined list of negation identifiers. The results we obtained are promising and indicate a reliable negation identification approach for medical documents. We report a precision of 92.62% and a recall of 93.60% for morphologic negation identification, and an overall performance for morphologic and syntactic negation identification of 95.96% precision and 94.23% recall.
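The two steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' PreNex implementation: the prefix list, the cue list, the function names, and the small word set standing in for a WordNet lookup are all assumptions made for illustration, and the regular expression only captures the single word that follows a negation cue.

```python
import re

# Hypothetical prefix and cue lists; the paper's actual lists are not given here.
NEGATION_PREFIXES = ("non", "un", "in", "im", "ir", "il", "dis", "a")
NEGATION_CUES = ("no", "not", "without", "denies")

# Toy lexicon standing in for a WordNet validity check (assumption).
KNOWN_WORDS = {
    "specific", "remarkable", "regular", "continue",
    "symptomatic", "fever", "pain",
}


def morphologic_negation(term, lexicon=KNOWN_WORDS):
    """Split a term into (prefix, root) if the root is a valid word, else None."""
    word = term.lower()
    for prefix in NEGATION_PREFIXES:
        if word.startswith(prefix):
            # Drop the prefix and any hyphen joining it to the root.
            root = word[len(prefix):].lstrip("-")
            if root in lexicon:
                return prefix, root
    return None


# One pattern built from the predefined cue list: cue, whitespace, next word.
_CUE_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(c) for c in NEGATION_CUES) + r")\s+(\w+)",
    re.IGNORECASE,
)


def syntactic_negations(sentence):
    """Return the words that directly follow a negation cue in the sentence."""
    return [match.lower() for match in _CUE_PATTERN.findall(sentence)]


if __name__ == "__main__":
    print(morphologic_negation("unremarkable"))
    print(morphologic_negation("important"))
    print(syntactic_negations("Patient denies fever and has no pain"))
```

The root validity check is what keeps a word such as "important" from being read as a negation of "portant"; with a real lexical resource the same check would be a dictionary lookup instead of a set membership test.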
Keywords :
document handling; electronic health records; information retrieval; knowledge acquisition; pattern matching; PreNex algorithm; discourse meaning; electronic health records; knowledge extraction process; medical documents; morphologic negation identification; negation identification approach; pattern matching approach; syntactic negation identification; topic extraction; word meaning; Algorithm design and analysis; Compounds; Lead; Medical diagnostic imaging; Semantics; Surgery; Text mining; Electronic Health Records; Text Mining; WordNet; negation identification; prefix;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Automation, Quality and Testing, Robotics, 2014 IEEE International Conference on
Conference_Location :
Cluj-Napoca
Print_ISBN :
978-1-4799-3731-8
Type :
conf
DOI :
10.1109/AQTR.2014.6857880
Filename :
6857880