Title :
Combining labeled and unlabeled data for biomedical event extraction
Author :
Jian Wang ; Qian Xu ; Hongfei Lin ; Zhihao Yang ; Yanpeng Li
Author_Institution :
Sch. of Comput. Sci. & Technol., Dalian Univ. of Technol., Dalian, China
Abstract :
In the biomedical event extraction domain, only a small amount of labeled data is available alongside a large pool of unlabeled data, and many supervised learning algorithms for bio-event extraction suffer from the resulting data sparseness. In this paper, we present a new approach to biomedical event extraction from scientific documents: a semi-supervised method that extracts features from unlabeled data, using the features of the labeled data as a reference. The strategy is evaluated in experiments using data from BioNLP2011 and PubMed. To the best of our knowledge, this is the first time that a combination of labeled and unlabeled data has been used for biomedical event extraction, and our experimental results demonstrate state-of-the-art performance on this task.
Keywords :
document handling; feature extraction; learning (artificial intelligence); medical computing; BioNLP2011; PubMed; biomedical event extraction domain; data sparseness; feature extraction; labeled data features; scientific documents; semisupervised approach; supervised learning algorithms; unlabeled data; Couplings; Data mining; Feature extraction; Proteins; Support vector machines; Training; Vectors; bio-event extraction; data sparseness; unlabeled data
Conference_Title :
Bioinformatics and Biomedicine Workshops (BIBMW), 2012 IEEE International Conference on
Conference_Location :
Philadelphia, PA
Print_ISBN :
978-1-4673-2746-6
Electronic_ISBN :
978-1-4673-2744-2
DOI :
10.1109/BIBMW.2012.6470206