DocumentCode :
3105728
Title :
Boosting for Learning Multiple Classes with Imbalanced Class Distribution
Author :
Sun, Yanmin ; Kamel, Mohamed S. ; Wang, Yang
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Waterloo, Waterloo, ON
fYear :
2006
fDate :
18-22 Dec. 2006
Firstpage :
592
Lastpage :
602
Abstract :
Imbalanced class distributions significantly limit the performance attainable by most standard classifier learning algorithms, which assume a relatively balanced class distribution and equal misclassification costs. This learning difficulty has attracted considerable research interest, and most efforts concentrate on bi-class problems. However, the bi-class case is not the only scenario where the class imbalance problem prevails, and reported solutions for bi-class applications are not applicable to multi-class problems. In this paper, we develop a cost-sensitive boosting algorithm to improve classification performance on imbalanced data involving multiple classes. One barrier to applying the cost-sensitive boosting algorithm to imbalanced data is that the cost matrix is often unavailable for a problem domain. To solve this problem, we apply a genetic algorithm to search for the optimal cost setup of each class. Empirical tests show that the proposed cost-sensitive boosting algorithm significantly improves classification performance on imbalanced data sets.
Keywords :
data mining; genetic algorithms; learning (artificial intelligence); pattern classification; boosting algorithm; classifier learning algorithm; cost-sensitive boosting algorithm; data classification; genetic algorithm; imbalanced class distribution; multiple classes imbalance learning; Boosting; Classification algorithms; Cost function; Data mining; Drives; Iterative algorithms; Software standards; Software systems; Sun; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining, 2006. ICDM '06. Sixth International Conference on
Conference_Location :
Hong Kong
ISSN :
1550-4786
Print_ISBN :
0-7695-2701-7
Type :
conf
DOI :
10.1109/ICDM.2006.29
Filename :
4053085