Title :
Combination approach of SMOTE and biased-SVM for imbalanced datasets
Author_Institution :
Coll. of E-Bus., South China Univ. of Technol., Guangzhou
Abstract :
A new approach to constructing classifiers from imbalanced datasets is proposed by combining SMOTE (synthetic minority over-sampling technique) and Biased-SVM (biased support vector machine). A dataset is imbalanced if the classification categories are not approximately equally represented. Real-world datasets are often predominantly composed of "normal" examples with only a small percentage of "abnormal" or "interesting" examples, and the cost of misclassifying an abnormal (interesting) example as normal is often much higher than the cost of the reverse error. SMOTE over-sampling of the minority class is a known means of increasing a classifier's sensitivity to that class. This paper instead increases that sensitivity by applying SMOTE within the support vectors: two over-sampling algorithms are proposed in which each minority-class support vector is over-sampled using its k nearest neighbors, drawn either from the support vectors alone or from the entire minority class. Experimental results confirm that the proposed combination of SMOTE and biased-SVM achieves better classifier performance.
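Method_Sketch :
A minimal Python sketch of the combined idea described in the abstract, assuming a class-weighted ("biased") SVM from scikit-learn and a hand-rolled SMOTE-style interpolation step. It is not the authors' exact algorithm; the function name smote_support_vectors and parameters such as n_new_per_sv and minority_weight are illustrative. The sketch shows the variant in which neighbors are drawn from the entire minority class.

import numpy as np
from sklearn.svm import SVC
from sklearn.neighbors import NearestNeighbors

def smote_support_vectors(X, y, minority_label=1, k=5, n_new_per_sv=2,
                          C=1.0, minority_weight=10.0, random_state=0):
    rng = np.random.default_rng(random_state)

    # Biased SVM: heavier penalty on minority-class errors via class_weight.
    svm = SVC(kernel="rbf", C=C, class_weight={minority_label: minority_weight})
    svm.fit(X, y)

    # Minority-class support vectors found by the first model.
    sv = svm.support_vectors_[y[svm.support_] == minority_label]

    # Neighbors drawn from the entire minority class (the abstract's other
    # variant would restrict them to the support vectors themselves).
    minority = X[y == minority_label]
    nn = NearestNeighbors(n_neighbors=min(k + 1, len(minority))).fit(minority)

    synthetic = []
    for x in sv:
        _, idx = nn.kneighbors(x.reshape(1, -1))
        neighbors = idx[0][1:]                    # drop the point itself
        for _ in range(n_new_per_sv):
            neighbor = minority[rng.choice(neighbors)]
            gap = rng.random()                    # SMOTE interpolation factor
            synthetic.append(x + gap * (neighbor - x))

    # Retrain the biased SVM on the SMOTE-augmented data.
    X_aug = np.vstack([X, np.array(synthetic)])
    y_aug = np.concatenate([y, np.full(len(synthetic), minority_label)])
    final = SVC(kernel="rbf", C=C, class_weight={minority_label: minority_weight})
    final.fit(X_aug, y_aug)
    return final

Restricting interpolation to the neighborhood of support vectors concentrates the synthetic minority examples near the decision boundary, which is where the class-weighted SVM is most sensitive to additional minority-class evidence.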
Keywords :
data analysis; pattern classification; support vector machines; SMOTE; biased support vector machine; biased-SVM; classification categories; imbalanced datasets; k nearest neighbors; synthetic minority over-sampling technique; neural networks
Conference_Title :
2008 IEEE International Joint Conference on Neural Networks (IJCNN 2008) (IEEE World Congress on Computational Intelligence)
Conference_Location :
Hong Kong
Print_ISBN :
978-1-4244-1820-6
Electronic_ISBN :
1098-7576
DOI :
10.1109/IJCNN.2008.4633794