Title :
Secure information extraction from clinical documents using SNOMED CT gazetteer and natural language processing
Author :
Hina, Saman ; Atwell, Eric ; Johnson, Owen
Author_Institution :
Sch. of Comput., Univ. of Leeds, Leeds, UK
Abstract :
Patient data is critical in the healthcare domain; it should be secure, consistent, and coded so that it can be transferred securely from one user to another. SNOMED CT (Systematized Nomenclature of Medicine - Clinical Terms) is a standardized reference terminology that consists of millions of SNOMED CT concepts, each with a SNOMED CT code. This paper describes the extraction of natural language concepts from free-text discharge summary reports and their mapping to SNOMED CT codes. To evaluate the extracted medical concepts, we selected a corpus of 300 discharge summaries provided by the University of Pittsburgh Medical Centre and compared it with the SNOMED CT concept file, a preprocessed and cleaned file listing SNOMED CT concepts. We present ongoing research on SNOMED CT concept extraction from discharge summaries using natural language processing, and introduce SNOMED CT core concepts as a gazetteer list for concept extraction. Of the 390023 concepts in the SNOMED CT core concepts gazetteer list, 21563 were found in the test set of discharge summaries.
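The gazetteer-based lookup described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the tokenizer, the longest-match strategy, the function names (`tokenize`, `build_gazetteer`, `extract_concepts`), and the placeholder codes such as `CODE-1` are all assumptions made for illustration, and a real system would load terms and codes from the cleaned SNOMED CT concept file.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_gazetteer(entries):
    """Map each term's token tuple to its code.

    `entries` is an iterable of (term, code) pairs. The codes used in the
    example below are hypothetical placeholders, not real SNOMED CT codes.
    """
    gazetteer = {}
    for term, code in entries:
        key = tuple(tokenize(term))
        if key:  # skip terms that contain no alphanumeric tokens
            gazetteer[key] = code
    return gazetteer


def extract_concepts(text, gazetteer):
    """Return (term, code) pairs found in the text, preferring the longest match."""
    tokens = tokenize(text)
    max_len = max((len(key) for key in gazetteer), default=0)
    matches = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in gazetteer:
                matches.append((" ".join(key), gazetteer[key]))
                i += n  # continue after the matched phrase
                break
        else:
            i += 1  # no phrase starts at this token
    return matches


# Hypothetical gazetteer entries and a made-up sentence, for illustration only.
sample_gazetteer = build_gazetteer([
    ("chest pain", "CODE-1"),
    ("pain", "CODE-2"),
    ("shortness of breath", "CODE-3"),
])
sample_matches = extract_concepts(
    "Patient admitted with chest pain and shortness of breath.",
    sample_gazetteer,
)
```

Preferring the longest match means "chest pain" is reported as one concept rather than as the shorter entry "pain"; whether the original work resolves overlapping terms this way is not stated in the abstract.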
Keywords :
document handling; health care; information retrieval; medical information systems; natural language processing; security of data; SNOMED CT gazetteer; University of Pittsburgh medical centre; clinical documents; clinical terms; healthcare domain; patient data; secure information extraction; systematized nomenclature of medicine; Data mining; Discharges; Logic gates; Medical diagnostic imaging; Medical services
Conference_Title :
Internet Technology and Secured Transactions (ICITST), 2010 International Conference for
Conference_Location :
London
Print_ISBN :
978-1-4244-8862-9
Electronic_ISBN :
978-0-9564263-6-9