Title :
Protein-Protein Interaction extraction based on ensemble kernel model and active learning strategy
Author :
Li, Lishuang ; Huang, Degen ; Wang, Min
Author_Institution :
Dalian Univ. of Technol., Dalian, China
Abstract :
Protein-Protein Interaction (PPI) extraction from the biomedical literature can quickly supply biomedical researchers with useful information. This paper presents a PPI extraction system based on an ensemble kernel model and active learning. First, the ensemble kernel, used within an SVM classifier, combines a lexical feature-based kernel and a path-based kernel. Experimental results with 10-fold cross-validation show that the F-scores of PPI extraction using the ensemble kernel model on the AIMED, IEPA and BCPPI corpora are 64.50%, 69.74% and 60.38%, respectively, which are better than those of the lexical feature-based kernel and the path-based kernel used separately. Because the SVM-based ensemble kernel model needs a large amount of labeled data, and labeling data manually is expensive, we integrate active learning into the ensemble kernel model. The active learning method uses an uncertainty-based sampling strategy. With active learning, the F-scores on the AIMED, IEPA and BCPPI corpora are 65.24%, 70.19% and 61.87%, respectively, which are better than those of the ensemble kernel model with passive learning, while the amount of labeled data is reduced by 20%, 30% and 30%, respectively.
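The two ideas named in the abstract, combining kernels and uncertainty-based sampling, can be illustrated with a minimal sketch. The abstract does not say how the two kernels are combined or how uncertainty is measured, so everything below is an assumption made for illustration: a convex combination with a hypothetical `weight` parameter, toy stand-ins for the lexical and path kernels, and uncertainty taken as the absolute SVM decision value (distance to the hyperplane). It is not the authors' implementation.

```python
from typing import Callable, Dict, List, Sequence

Example = Dict[str, object]
Kernel = Callable[[Example, Example], float]


def lexical_kernel(x: Example, y: Example) -> float:
    """Toy stand-in: number of tokens the two candidate pairs share."""
    return float(len(set(x["tokens"]) & set(y["tokens"])))


def path_kernel(x: Example, y: Example) -> float:
    """Toy stand-in: 1.0 if the dependency paths are identical, else 0.0."""
    return 1.0 if x["path"] == y["path"] else 0.0


def ensemble_kernel(k_a: Kernel, k_b: Kernel, weight: float = 0.5) -> Kernel:
    """Assumed combination: weight * k_a + (1 - weight) * k_b.

    A convex combination of two valid kernels is itself a valid kernel.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")

    def combined(x: Example, y: Example) -> float:
        return weight * k_a(x, y) + (1.0 - weight) * k_b(x, y)

    return combined


def select_most_uncertain(decision_values: Sequence[float],
                          batch_size: int) -> List[int]:
    """Indices of the unlabeled examples closest to the decision boundary.

    Assumes uncertainty is measured by |f(x)|, the absolute SVM output.
    """
    ranked = sorted(range(len(decision_values)),
                    key=lambda i: abs(decision_values[i]))
    return ranked[:batch_size]


# Hypothetical candidate protein pairs, for illustration only.
a = {"tokens": ["binds", "to"], "path": "nsubj-dobj"}
b = {"tokens": ["binds", "with"], "path": "nsubj-dobj"}

k = ensemble_kernel(lexical_kernel, path_kernel, weight=0.5)
similarity = k(a, b)
to_label = select_most_uncertain([1.2, -0.1, 0.4, -2.0], batch_size=2)
```

In an active learning loop of this kind, the examples returned by `select_most_uncertain` would be sent to an annotator, added to the training set, and the SVM retrained with the combined kernel; the stopping criterion and batch size are further choices the abstract does not specify.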
Keywords :
biology computing; learning (artificial intelligence); pattern classification; proteins; sampling methods; support vector machines; 10-fold cross-validation; F-score; SVM classifier; active learning strategy; biomedicine; ensemble kernel model; lexical feature-based kernel; passive learning; path-based kernel; protein-protein interaction extraction; uncertainty-based sampling strategy; Data models; Feature extraction; Kernel; Learning systems; Protein engineering; Training; Training data; Active Learning; Combined Kernel; PPI; SVM
Conference_Title :
Natural Language Processing and Knowledge Engineering (NLP-KE), 2011 7th International Conference on
Conference_Location :
Tokushima
Print_ISBN :
978-1-61284-729-0
DOI :
10.1109/NLPKE.2011.6138105