DocumentCode :
2771292
Title :
Improving SVM Classification on Imbalanced Data Sets in Distance Spaces
Author :
Koknar-Tezel, Suzan ; Latecki, Longin Jan
Author_Institution :
Dept. of Comput. Sci., St. Joseph's Univ., Philadelphia, PA, USA
fYear :
2009
fDate :
6-9 Dec. 2009
Firstpage :
259
Lastpage :
267
Abstract :
Imbalanced data sets present a particular challenge to the data mining community. Often, it is the rare event that is of interest, and the cost of misclassifying the rare event is higher than that of misclassifying the usual event. When the data is highly skewed toward the usual, it can be very difficult for a learning system to accurately detect the rare event. There have been many approaches in recent years for handling imbalanced data sets, from under-sampling the majority class to adding synthetic points to the minority class in feature space. Distances between time series are known to be non-Euclidean and nonmetric, since comparing time series requires warping in time. This fact makes it impossible to apply standard methods like SMOTE to insert synthetic data points in feature spaces. We present an innovative approach that augments the minority class by adding synthetic points in distance spaces. We then use Support Vector Machines for classification. Our experimental results on standard time series show that our synthetic points significantly improve the classification rate of the rare events, and in many cases also improve the overall accuracy of SVM.
Keywords :
data mining; learning systems; pattern classification; support vector machines; SMOTE; SVM classification; data mining community; distance spaces; imbalanced data sets; learning system; synthetic data points; time series; Computer science; Costs; Event detection; Petroleum; Sampling methods; Support vector machine classification; USA Councils
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Mining, 2009. ICDM '09. Ninth IEEE International Conference on
Conference_Location :
Miami, FL
ISSN :
1550-4786
Print_ISBN :
978-1-4244-5242-2
Electronic_ISBN :
1550-4786
Type :
conf
DOI :
10.1109/ICDM.2009.59
Filename :
5360251