Title :
Ensemble Learning Based on Active Example Selection for Solving Imbalanced Data Problem in Biomedical Data
Author :
Lee, Min Su ; Oh, Sangyoon ; Zhang, Byoung-Tak
Author_Institution :
Sch. of Comput. Sci. & Eng., Seoul Nat. Univ., Seoul, South Korea
Abstract :
The imbalanced data problem is common in biomedical classification tasks. Since classifiers trained on imbalanced data are derived mostly from the majority class, their prediction performance is poor for the minority class. In this paper, we propose a novel ensemble learning method based on an active example selection algorithm to resolve the imbalanced data problem. To compensate for a possibly sub-optimal classifier, the proposed ensemble learning method aggregates classifiers built by the active example selection algorithm. We implement this ensemble learning method using incremental naive Bayes classifiers. Our empirical results show that the method greatly improves the performance of classification models trained on five real-world imbalanced biomedical data sets. The proposed ensemble learning method outperforms other approaches to the imbalanced data problem by 0.03 to 0.15 in terms of AUC.
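The abstract does not give the details of the selection algorithm, so the following Python sketch is only an illustration of the general idea, not the authors' implementation. It assumes binary features, a Bernoulli naive Bayes model with Laplace smoothing, a small class-balanced seed set, and an uncertainty-based selection rule (adding the pool examples whose predicted probability is closest to 0.5). All names (IncrementalBernoulliNB, train_with_active_selection, build_ensemble, ensemble_prob_pos) and all parameter values are hypothetical.

```python
import math
import random


class IncrementalBernoulliNB:
    """Naive Bayes over binary features; counts are updated one example at a time."""

    def __init__(self, n_features):
        self.class_count = [0, 0]
        self.feat_count = [[0] * n_features for _ in range(2)]

    def update(self, x, y):
        # Incremental step: only the counts for class y change.
        self.class_count[y] += 1
        for j, v in enumerate(x):
            self.feat_count[y][j] += v

    def prob_pos(self, x):
        # Posterior P(y = 1 | x) with Laplace smoothing on priors and features.
        total = sum(self.class_count)
        log_p = []
        for c in (0, 1):
            lp = math.log((self.class_count[c] + 1) / (total + 2))
            for j, v in enumerate(x):
                p = (self.feat_count[c][j] + 1) / (self.class_count[c] + 2)
                lp += math.log(p if v else 1 - p)
            log_p.append(lp)
        m = max(log_p)
        e0, e1 = math.exp(log_p[0] - m), math.exp(log_p[1] - m)
        return e1 / (e0 + e1)


def train_with_active_selection(X, y, rng, seed_per_class=2, rounds=3, batch=2):
    """Assumed selection rule: start balanced, then add the least certain examples."""
    pos = [i for i, t in enumerate(y) if t == 1]
    neg = [i for i, t in enumerate(y) if t == 0]
    chosen = (rng.sample(pos, min(seed_per_class, len(pos)))
              + rng.sample(neg, min(seed_per_class, len(neg))))
    model = IncrementalBernoulliNB(len(X[0]))
    for i in chosen:
        model.update(X[i], y[i])
    pool = set(range(len(y))) - set(chosen)
    for _ in range(rounds):
        if not pool:
            break
        # Rank the remaining pool by distance from the decision boundary.
        ranked = sorted(pool, key=lambda i: (abs(model.prob_pos(X[i]) - 0.5), i))
        for i in ranked[:batch]:
            model.update(X[i], y[i])
            pool.discard(i)
    return model


def build_ensemble(X, y, n_models=5, seed=0):
    # Each member starts from a different random balanced seed set.
    rng = random.Random(seed)
    return [train_with_active_selection(X, y, rng) for _ in range(n_models)]


def ensemble_prob_pos(models, x):
    # Aggregate by averaging the members' posterior probabilities.
    return sum(m.prob_pos(x) for m in models) / len(models)
```

Averaging several members is one simple way to reduce the dependence of a single classifier on its initial example set; the paper's actual aggregation rule may differ.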
Keywords :
Bayes methods; learning (artificial intelligence); medical computing; pattern classification; biomedical data classification; ensemble learning; example selection algorithm; imbalanced data problem; incremental naive Bayes classifier; prediction performance; trained classifier; Bioinformatics; Biomedical engineering; Computational efficiency; Computer science; Data engineering; Iterative algorithms; Learning systems; Machine learning algorithms; Sampling methods; Training data; Active example selection; Ensemble learning; Imbalanced data problem; Incremental naive Bayes;
Conference_Title :
Bioinformatics and Biomedicine, 2009. BIBM '09. IEEE International Conference on
Conference_Location :
Washington, DC
Print_ISBN :
978-0-7695-3885-3
DOI :
10.1109/BIBM.2009.44