DocumentCode :
2957106
Title :
ADASYN: Adaptive synthetic sampling approach for imbalanced learning
Author :
He, Haibo ; Bai, Yang ; Garcia, Edwardo A. ; Li, Shutao
Author_Institution :
Department of Electrical and Computer Engineering, Stevens Institute of Technology, Hoboken, NJ
fYear :
2008
fDate :
1-8 June 2008
Firstpage :
1322
Lastpage :
1328
Abstract :
This paper presents a novel adaptive synthetic (ADASYN) sampling approach for learning from imbalanced data sets. The essential idea of ADASYN is to use a weighted distribution for different minority class examples according to their level of difficulty in learning, where more synthetic data is generated for minority class examples that are harder to learn compared to those minority examples that are easier to learn. As a result, the ADASYN approach improves learning with respect to the data distributions in two ways: (1) reducing the bias introduced by the class imbalance, and (2) adaptively shifting the classification decision boundary toward the difficult examples. Simulation analyses on several machine learning data sets show the effectiveness of this method across five evaluation metrics.
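The weighting idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how "difficulty" is measured or how synthetic points are formed, so the sketch assumes that difficulty is the fraction of majority-class points among the k nearest neighbours of a minority example, and that new points are made by SMOTE-style linear interpolation toward a nearby minority example. The names `adasyn_weights`, `adasyn_sample`, `k`, `beta`, and `seed` are hypothetical and chosen for illustration only.

```python
import math
import random


def adasyn_weights(minority, majority, k=5):
    """Return one normalized difficulty weight per minority point.

    Assumption: difficulty = share of majority points among the
    k nearest neighbours (Euclidean distance) of each minority point.
    """
    ratios = []
    for i, x in enumerate(minority):
        # Distances to every other point, tagged 0 (minority) or 1 (majority).
        others = [(math.dist(x, p), 0) for j, p in enumerate(minority) if j != i]
        others += [(math.dist(x, p), 1) for p in majority]
        others.sort(key=lambda t: t[0])
        ratios.append(sum(label for _, label in others[:k]) / k)
    total = sum(ratios)
    if total == 0:
        # No minority point has majority neighbours: fall back to uniform weights.
        return [1.0 / len(minority)] * len(minority)
    return [r / total for r in ratios]


def adasyn_sample(minority, majority, k=5, beta=1.0, seed=0):
    """Generate synthetic minority points, more of them near harder examples.

    beta = 1.0 asks for enough points to balance the two classes;
    rounding means the final count can differ slightly from the target.
    """
    rng = random.Random(seed)
    weights = adasyn_weights(minority, majority, k)
    n_new = int(beta * (len(majority) - len(minority)))
    synthetic = []
    for i, (x, w) in enumerate(zip(minority, weights)):
        # Simplification: interpolate toward the k nearest minority neighbours.
        nbrs = sorted(
            (p for j, p in enumerate(minority) if j != i),
            key=lambda p: math.dist(x, p),
        )[:k]
        if not nbrs:
            continue
        for _ in range(round(w * n_new)):
            z = rng.choice(nbrs)
            lam = rng.random()  # interpolation factor in [0, 1)
            synthetic.append(tuple(a + lam * (b - a) for a, b in zip(x, z)))
    return synthetic
```

In this sketch a minority point surrounded by majority points receives a larger weight, so more synthetic points are placed near it, which is the adaptive behaviour the abstract describes.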
Keywords :
learning (artificial intelligence); pattern classification; sampling methods; statistical distributions; adaptive synthetic sampling approach; classification decision boundary; imbalanced data classification; imbalanced data set learning; weighted distribution; Bioinformatics; Boosting; Cancer; Data analysis; Data mining; Decision trees; Helium; Machine learning; Sampling methods; Space technology
fLanguage :
English
Publisher :
IEEE
Conference_Title :
IEEE International Joint Conference on Neural Networks (IJCNN 2008), IEEE World Congress on Computational Intelligence
Conference_Location :
Hong Kong
ISSN :
1098-7576
Print_ISBN :
978-1-4244-1820-6
Electronic_ISBN :
1098-7576
Type :
conf
DOI :
10.1109/IJCNN.2008.4633969
Filename :
4633969