Title :
Semi-supervised feature learning from clinical text
Author :
Wang, Zhuoran ; Shawe-Taylor, John ; Shah, Anoop
Author_Institution :
Dept. of Comput. Sci., Univ. Coll. London, London, UK
Abstract :
This paper addresses the automated identification of clinical free-text records that contain useful information (e.g. symptoms, modifiers, diagnosis) about a given disease. We introduce a novel semi-supervised machine learning algorithm for this problem, which trains the set covering machine in a bootstrapping procedure. The proposed technique has two advantages: it finds the documents of interest more accurately than searching by diagnostic codes, and the features it learns can be used directly as a knowledge representation of the given topic, assisting either further machine learning algorithms or manual post-processing and analysis.
Keywords :
diseases; feature extraction; knowledge representation; learning (artificial intelligence); medical diagnostic computing; medical information systems; patient diagnosis; text analysis; bootstrapping; clinical free-text record; disease diagnosis; disease modifier; disease symptom; medical text processing; semisupervised feature learning; semisupervised machine learning algorithm; Cancer; Machine learning; Medical diagnostic imaging; Pain; Training; semi-supervised learning; set covering machine; sparse feature learning
Conference_Title :
Bioinformatics and Biomedicine (BIBM), 2010 IEEE International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
978-1-4244-8306-8
Electronic_ISBN :
978-1-4244-8307-5
DOI :
10.1109/BIBM.2010.5706610