• DocumentCode
    2060952
  • Title
    Knowledge Discovery and Data Mining of Free Text Radiology Reports
  • Author
    Friedlin, Jeffrey ; Mahoui, Malika ; Jones, Josette ; Jamieson, Patrick
  • Author_Institution
    Regenstrief Inst., Indiana Univ., Indianapolis, IN, USA
  • fYear
    2011
  • fDate
    26-29 July 2011
  • Firstpage
    89
  • Lastpage
    96
  • Abstract
    Medical Knowledge Discovery and Data Mining (KDD) over text is a promising yet difficult technology for unlocking meaning and uncovering associations in vast clinical text repositories. We report our experience in developing a new text analytic system called MEDAT (Medical Exploratory Data Analysis over Text), which overcomes several problems in text mining. The MEDAT system employs an annotated semantic index with a large number of assertions (propositions). The semantic index captures complex assertions that encapsulate conceptual relationships, including their modifiers, at a granular level. The index represents semantically equivalent sentences with the same symbols, a necessary component for KDD semantic queries, including semantic Boolean and correlation queries. The graphical user interface enables users to perform complex semantic analysis of the Roentgen corpus, consisting of 594,000 de-identified radiology reports with 4.3 million sentences, without having to learn a programming language. The MEDAT architecture offers a novel framework for text mining in other medical domains.
  • Keywords
    Boolean functions; brain; computational linguistics; data mining; graphical user interfaces; medical administrative data processing; medical computing; patient diagnosis; radiology; semantic networks; text analysis; KDD semantic query; MEDAT; Medical Exploratory Data Analysis over Text; Roentgen corpus; annotated semantic index; clinical text repository; complex assertion; complex semantic analysis; conceptual relationship; correlation query; data association; data mining; free text radiology reports; granular level; graphical user interface; knowledge discovery; modifier; semantic Boolean query; semantically equivalent sentences; text analytic system; text mining; Educational institutions; Heart; Indexes; Medical diagnostic imaging; Ontologies; Radiology; Semantics; Corpus Linguistics; Data Mining; Knowledge Discovery; Natural Language Processing; Semantic Annotation; Semantic Search; Text Analytics; Text Mining;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Healthcare Informatics, Imaging and Systems Biology (HISB), 2011 First IEEE International Conference on
  • Conference_Location
    San Jose, CA
  • Print_ISBN
    978-1-4577-0325-6
  • Electronic_ISBN
    978-0-7695-4407-6
  • Type
    conf
  • DOI
    10.1109/HISB.2011.31
  • Filename
    6061459