Title :
Biomedical Named Entity Recognition with Tri-Training Learning
Author :
Cai, YueHong ; Cheng, Xianyi
Author_Institution :
Foreign Language Learning Center, Jiangsu Univ., Zhenjiang, China
Abstract :
To address the data scarcity problem, this paper presents a co-training-style method for biomedical named entity recognition. We propose a novel selection method for tri-training learning that uses three classifiers: CRFs, SVMs, and ME. In the tri-training process, we select newly labeled samples using a selection model that maximizes training utility, and compute agreement with an agreement scoring function. Experiments on the GENIA corpus show that the proposed tri-training approach exploits unlabeled data more effectively and more stably than Co-training and standard Tri-training, improving generalization ability.
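As a rough illustration of the labeling step that tri-training builds on, the sketch below shows the standard agreement rule only: an unlabeled sample is proposed as new training data for one classifier when the other two classifiers assign it the same label. It is not the selection model or agreement scoring function proposed in the paper, whose details the abstract does not give. The three toy token classifiers (`by_suffix`, `by_capital`, `by_digit`), the label names, and the function names are hypothetical stand-ins for the CRF, SVM, and ME models.

```python
from typing import Callable, List, Sequence, Tuple

# A classifier here is any function mapping a token to a label (assumption:
# real CRF/SVM/ME taggers would be wrapped to expose this interface).
Classifier = Callable[[str], str]


def agreed_pseudo_labels(
    peer_a: Classifier, peer_b: Classifier, unlabeled: Sequence[str]
) -> List[Tuple[str, str]]:
    """Return (token, label) pairs on which both peer classifiers agree."""
    picked = []
    for token in unlabeled:
        label = peer_a(token)
        if label == peer_b(token):
            picked.append((token, label))
    return picked


def tri_training_round(
    classifiers: Sequence[Classifier], unlabeled: Sequence[str]
) -> List[List[Tuple[str, str]]]:
    """For each of the three classifiers, collect the samples that the
    other two label identically (standard tri-training agreement rule)."""
    if len(classifiers) != 3:
        raise ValueError("tri-training uses exactly three classifiers")
    proposals = []
    for i in range(3):
        peers = [c for j, c in enumerate(classifiers) if j != i]
        proposals.append(agreed_pseudo_labels(peers[0], peers[1], unlabeled))
    return proposals


# Hypothetical toy classifiers, used only to make the sketch runnable.
def by_suffix(token: str) -> str:
    return "PROTEIN" if token.endswith("ase") else "O"


def by_capital(token: str) -> str:
    return "PROTEIN" if token[:1].isupper() else "O"


def by_digit(token: str) -> str:
    return "PROTEIN" if any(ch.isdigit() for ch in token) else "O"


tokens = ["kinase", "IL2", "the", "Binding"]
proposals = tri_training_round([by_suffix, by_capital, by_digit], tokens)
# proposals[i] holds the pseudo-labeled samples offered to classifier i.
```

In a full loop, each classifier would be retrained on its original labeled data plus its proposals, and the round would repeat until no classifier changes. A selection step such as the one described in the abstract would filter `proposals` further before retraining.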
Keywords :
biology computing; medical information systems; GENIA corpus; biomedical named entity recognition; co-training style method; data scarcity problem; scoring function; tri-training learning; Biomedical computing; Biomedical engineering; Computer science; Data engineering; Data mining; Entropy; Hidden Markov models; Labeling; Machine learning; Semisupervised learning;
Conference_Title :
Biomedical Engineering and Informatics, 2009. BMEI '09. 2nd International Conference on
Conference_Location :
Tianjin
Print_ISBN :
978-1-4244-4132-7
Electronic_ISBN :
978-1-4244-4134-1
DOI :
10.1109/BMEI.2009.5304799