DocumentCode :
2348502
Title :
Boosting performance of gene mention tagging system by classifiers ensemble
Author :
Li, Lishuang ; Sun, Jing ; Huang, Degen
Author_Institution :
Sch. of Comput. Sci. & Eng., Dalian Univ. of Technol., Dalian, China
fYear :
2010
fDate :
21-23 Aug. 2010
Firstpage :
1
Lastpage :
4
Abstract :
To improve on the tagging performance of single classifiers, a classifier ensemble experimental framework is presented for gene mention tagging. In the framework, six classifiers are built with four toolkits (CRF++, YamCha, Maximum Entropy (ME), and MALLET) using different training methods and feature sets, and are then combined with a two-layer stacking algorithm. The recognition results of the individual classifiers serve as input feature vectors for the second layer, which yields a stronger combined model. Experiments on the BioCreative II GM task corpus show that the classifier ensemble method is effective: our best combination method achieves an F-score of 88.09%, which outperforms most of the top-ranked Bio-NER systems in the BioCreative II GM challenge.
Keywords :
bioinformatics; data mining; maximum entropy methods; pattern classification; text analysis; Bio-NER systems; BioCreative II GM task; CRF++; F-score; MALLET; YamCha; classifiers ensemble; gene mention tagging system; input feature vectors; maximum entropy; two layer stacking algorithm; Biology; Educational institutions; Software; Classifiers Ensemble; Gene Mention Tagging; Named Entity Recognition; Text Mining;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Natural Language Processing and Knowledge Engineering (NLP-KE), 2010 International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4244-6896-6
Type :
conf
DOI :
10.1109/NLPKE.2010.5587822
Filename :
5587822
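The two-layer stacking idea in the abstract, where the labels produced by the base classifiers become the input features of a second-layer model, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names `train_meta` and `predict_meta`, the toy BIO labels, the use of three base classifiers instead of six, and the lookup-table meta-learner standing in for whatever second-layer learner the authors trained.

```python
from collections import Counter, defaultdict


def train_meta(base_preds, gold):
    """Fit a toy second-layer model (hypothetical, for illustration only).

    base_preds: one tuple per token, holding the label each base
                classifier assigned to that token.
    gold:       the gold label of each token.
    Returns a dict mapping each observed tuple of base labels to the
    gold label most often seen with it.
    """
    table = defaultdict(Counter)
    for feats, label in zip(base_preds, gold):
        table[tuple(feats)][label] += 1
    return {feats: counts.most_common(1)[0][0] for feats, counts in table.items()}


def predict_meta(model, feats):
    """Predict a label from the base classifiers' labels for one token."""
    feats = tuple(feats)
    if feats in model:
        return model[feats]
    # Unseen combination: fall back to a plain majority vote.
    return Counter(feats).most_common(1)[0][0]


# Toy held-out data: labels from three imagined base classifiers per token.
held_out_preds = [
    ("B", "B", "O"),
    ("I", "O", "I"),
    ("O", "O", "O"),
    ("B", "O", "O"),
    ("B", "O", "O"),
    ("O", "B", "O"),
]
held_out_gold = ["B", "I", "O", "B", "B", "O"]

model = train_meta(held_out_preds, held_out_gold)

# The second layer has learned to trust the first classifier here,
# even though a majority vote over ("B", "O", "O") would pick "O".
print(predict_meta(model, ("B", "O", "O")))
# A combination never seen in training falls back to majority voting.
print(predict_meta(model, ("I", "I", "O")))
```

The point of the sketch is the data flow rather than the learner: the second layer never sees the original token features, only the base classifiers' outputs, so it can learn which classifier to trust in which situation. A real system would replace the lookup table with a trained sequence model.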