Title :
A strategy for classifying imbalanced data sets based on particle swarm optimization
Author :
Ceballes-Serrano, C.C. ; García-López, S. ; Jaramillo-Garzón, J.A. ; Castellanos-Domínguez, Y.G.
Author_Institution :
Signal Process. & Recognition Group, Univ. Nac. de Colombia, Medellín, Colombia
Abstract :
Learning from imbalanced data has attracted great interest in the machine learning community because such data arise in many practical applications and degrade the reliability of learning algorithms. A dataset is imbalanced if there is a great difference between the number of observations in each class. Classification methods that do not consider this phenomenon are prone to produce decision boundaries strongly biased towards the majority class. Today, ensemble methods like DataBoost-IM combine sampling strategies with Boosting and oversampling methods. However, when the input data are very noisy, the performance of these algorithms tends to degrade. This work presents a new method for dealing with imbalanced data, called SwarmBoost, that combines Boosting, oversampling, and subsampling based on optimization criteria to select samples. The results show that SwarmBoost performs better than DataBoost-IM and SMOTE on several databases.
Keywords :
learning (artificial intelligence); particle swarm optimisation; pattern classification; reliability theory; SwarmBoost method; assembly methods; classification methods; decision boundaries; imbalanced data set classification; learning algorithms; machine learning; oversampling methods; particle swarm optimization; sampling strategies; Boosting; Breast; Heart; Media; Single photon emission computed tomography; Vectors; DataBoost-IM; Smote; SwarmBoost and PSO;
Conference_Title :
Image, Signal Processing, and Artificial Vision (STSIVA), 2012 XVII Symposium of
Conference_Location :
Antioquia
Print_ISBN :
978-1-4673-2759-6
DOI :
10.1109/STSIVA.2012.6340585