DocumentCode :
2179369
Title :
Active Learning Algorithm for Threshold of Decision Probability on Imbalanced Text Classification Based on Protein-Protein Interaction Documents
Author :
Xu, Guixian ; Niu, Zhendong ; Gao, Xu ; Cao, Yujuan ; Zhao, Yumin
Author_Institution :
Sch. of Comput. Sci., Beijing Inst. of Technol., Beijing, China
fYear :
2010
fDate :
9-10 Feb. 2010
Firstpage :
78
Lastpage :
82
Abstract :
The study of host-pathogen protein-protein interactions (PPIs) is essential to understanding the disease-causing mechanisms of human pathogens. A large number of scientific findings about PPIs are reported in the biomedical literature. Building a document classification system can accelerate the mining and curation of PPI knowledge. As imbalanced data sets become more common, handling the imbalanced classification problem is becoming a hot topic in the machine learning field. In this paper, we propose an Active Learning algorithm for Threshold of Decision Probability (ALTDP) to address the misclassification of the minority class on an imbalanced host-pathogen PPI data set. The results demonstrate that the proposed approach is effective at improving classification accuracy on the imbalanced data set.
Keywords :
data mining; learning (artificial intelligence); pattern classification; proteins; active learning algorithm; decision probability threshold; document classification system; imbalanced host pathogen PPIs data set; imbalanced text classification; protein-protein interaction documents; Acceleration; Classification tree analysis; Costs; Humans; Machine learning; Machine learning algorithms; Pathogens; Protein engineering; Sampling methods; Text categorization; imbalanced text classification; machine learning; protein-protein interaction
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Storage and Data Engineering (DSDE), 2010 International Conference on
Conference_Location :
Bangalore
Print_ISBN :
978-1-4244-5678-9
Type :
conf
DOI :
10.1109/DSDE.2010.28
Filename :
5452631