DocumentCode
2775645
Title
Applying adaptive over-sampling technique based on data density and cost-sensitive SVM to imbalanced learning
Author
Wang, Senzhang ; Li, Zhoujun ; Chao, Wenhan ; Cao, Qinghua
Author_Institution
State Key Lab. of Software Dev. Environ., Beihang Univ., Beijing, China
fYear
2012
fDate
10-15 June 2012
Firstpage
1
Lastpage
8
Abstract
Resampling is a popular and effective technique for imbalanced learning. However, most resampling methods ignore data density information and may lead to overfitting. This paper proposes a novel adaptive over-sampling technique based on data density (ASMOBD). Compared with existing resampling algorithms, ASMOBD adaptively synthesizes a different number of new samples around each minority sample according to its level of learning difficulty. This makes the decision region more specific and can eliminate noise. In addition, two smoothing methods are proposed to avoid over-generalization. Cost-sensitive learning is also an effective technique for imbalanced learning. In this paper, ASMOBD and cost-sensitive SVM are combined. Experiments on 9 UCI datasets show that the proposed methods outperform various state-of-the-art approaches in terms of G-mean and area under the receiver operating characteristic curve (AUC).
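The abstract reports results in terms of G-mean. As a minimal sketch, assuming the common definition of G-mean for binary classification (the geometric mean of sensitivity and specificity), the metric can be computed as below. The function name `g_mean` and the label convention (1 for the minority class) are illustrative assumptions, not taken from the paper.

```python
import math


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity (assumed definition).

    `positive` marks the minority class; this convention is an assumption
    made for illustration only.
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    # Guard against empty classes to avoid division by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


# Toy labels: 2 minority samples, 4 majority samples.
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1]
score = g_mean(y_true, y_pred)
```

Unlike plain accuracy, this score drops to zero for a classifier that always predicts the majority class, which is why it is often preferred on imbalanced data.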
Keywords
data analysis; learning (artificial intelligence); sampling methods; smoothing methods; support vector machines; G-mean; UCI dataset; adaptive over-sampling technique; cost-sensitive SVM; cost-sensitive learning; data density information; decision region; imbalanced learning; learning difficulty; minority sample; over generalization; overfitting; receiver operation curve; resampling method; smoothing method; Algorithm design and analysis; Classification algorithms; Interpolation; Measurement; Noise; Smoothing methods; Support vector machines; Cost-sensitive SVM; imbalanced learning; over-sampling;
fLanguage
English
Publisher
IEEE
Conference_Titel
Neural Networks (IJCNN), The 2012 International Joint Conference on
Conference_Location
Brisbane, QLD
ISSN
2161-4393
Print_ISBN
978-1-4673-1488-6
Electronic_ISBN
2161-4393
Type
conf
DOI
10.1109/IJCNN.2012.6252696
Filename
6252696