Title :
A Textual Representation Scheme for Identifying Clinical Relationships in Patient Records
Author :
Rezarta Islamaj Doğan;Aurélie Névéol;Zhiyong Lu
Author_Institution :
National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD, USA
Abstract :
Identifying relationships between clinical concepts in patient records is a preliminary step for many important applications in medical informatics, ranging from quality of care to hypothesis generation. We describe an approach that facilitates the automatic recognition of relationships defined between two different concepts in text. Unlike the traditional bag-of-words representation, our approach represents a relationship with a scheme of five distinct context blocks based on the position of the concepts in the text. We applied this scheme to eight different relationships between medical problems, treatments, and tests, on a set of 349 patient records from the 4th i2b2 challenge. The context-block representation substantially outperformed the bag-of-words model (F-measure = 0.775 versus 0.402). The advantage of this representation scheme is that it preserves word position information, which may be critical in identifying certain relationships.
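The idea of position-based context blocks can be illustrated with a minimal sketch. The abstract does not define the five blocks, so the split used here (text before the first concept, the first concept, text between the concepts, the second concept, text after the second concept) is an assumption, and the names `context_blocks` and `block_features` are hypothetical, not the authors' implementation.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # token offsets, end exclusive


def context_blocks(tokens: List[str], first: Span, second: Span) -> Dict[str, List[str]]:
    """Split a tokenized sentence into five blocks around two concept spans.

    Assumed block layout (not confirmed by the abstract):
    before / concept_1 / between / concept_2 / after.
    """
    # Order the spans by position so concept_1 is the one that appears first.
    if first[0] > second[0]:
        first, second = second, first
    if first[1] > second[0]:
        raise ValueError("concept spans must not overlap")
    return {
        "before": tokens[:first[0]],
        "concept_1": tokens[first[0]:first[1]],
        "between": tokens[first[1]:second[0]],
        "concept_2": tokens[second[0]:second[1]],
        "after": tokens[second[1]:],
    }


def block_features(blocks: Dict[str, List[str]]) -> List[str]:
    """Prefix each token with its block name so word position is kept."""
    return [f"{name}={tok.lower()}" for name, toks in blocks.items() for tok in toks]


# Illustrative sentence (invented for this sketch, not taken from the i2b2 data).
tokens = "The patient was given aspirin for chest pain yesterday".split()
blocks = context_blocks(tokens, first=(4, 5), second=(6, 8))
features = block_features(blocks)
```

In a plain bag-of-words model the token "for" yields the same feature wherever it occurs; here it becomes `between=for`, so a classifier can distinguish a word that links the two concepts from the same word elsewhere in the sentence.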
Keywords :
"Context modeling","Machine learning","Semantics","Feature extraction","Bioinformatics","Diseases","Support vector machines"
Conference_Title :
Machine Learning and Applications (ICMLA), 2010 Ninth International Conference on
Print_ISBN :
978-1-4244-9211-4
DOI :
10.1109/ICMLA.2010.164