DocumentCode
2060952
Title
Knowledge Discovery and Data Mining of Free Text Radiology Reports
Author
Friedlin, Jeffrey ; Mahoui, Malika ; Jones, Josette ; Jamieson, Patrick
Author_Institution
Regenstrief Inst., Indiana Univ., Indianapolis, IN, USA
fYear
2011
fDate
26-29 July 2011
Firstpage
89
Lastpage
96
Abstract
Medical Knowledge Discovery and Data Mining (KDD) over text is a promising yet difficult technology for unlocking meaning and uncovering associations in vast clinical text repositories. We report our experience in developing a new text analytic system called MEDAT (Medical Exploratory Data Analysis over Text), which overcomes several problems in text mining. The MEDAT system employs an annotated semantic index with a large number of assertions (propositions). The semantic index captures complex assertions that encapsulate conceptual relationships, including their modifiers, at a granular level. The index represents semantically equivalent sentences with the same symbols, a necessary component for KDD semantic queries, including semantic Boolean and correlation queries. The graphical user interface enables users to perform complex semantic analysis of the Roentgen corpus, which consists of 594,000 de-identified radiology reports with 4.3 million sentences, without having to learn a programming language. The MEDAT architecture offers a novel framework for text mining in other medical domains.
Keywords
Boolean functions; brain; computational linguistics; data mining; graphical user interfaces; medical administrative data processing; medical computing; patient diagnosis; radiology; semantic networks; text analysis; KDD semantic query; MEDAT; Medical Exploratory Data Analysis over Text; Roentgen corpus; annotated semantic index; clinical text repository; complex assertion; complex semantic analysis; conceptual relationship; correlation query; data association; data mining; free text radiology reports; granular level; graphical user interface; knowledge discovery; modifier; semantic Boolean query; semantically equivalent sentences; text analytic system; text mining; Educational institutions; Heart; Indexes; Medical diagnostic imaging; Ontologies; Radiology; Semantics; Corpus Linguistics; Data Mining; Knowledge Discovery; Natural Language Processing; Semantic Annotation; Semantic Search; Text Analytics; Text Mining;
fLanguage
English
Publisher
IEEE
Conference_Titel
Healthcare Informatics, Imaging and Systems Biology (HISB), 2011 First IEEE International Conference on
Conference_Location
San Jose, CA
Print_ISBN
978-1-4577-0325-6
Electronic_ISBN
978-0-7695-4407-6
Type
conf
DOI
10.1109/HISB.2011.31
Filename
6061459