Title :
ADASYN: Adaptive synthetic sampling approach for imbalanced learning
Author :
He, Haibo ; Bai, Yang ; Garcia, Edwardo A. ; Li, Shutao
Author_Institution :
Dept. of Electr. & Comput. Eng., Stevens Inst. of Technol., Hoboken, NJ
Abstract :
This paper presents a novel adaptive synthetic (ADASYN) sampling approach for learning from imbalanced data sets. The essential idea of ADASYN is to use a weighted distribution for different minority class examples according to their level of difficulty in learning, where more synthetic data is generated for minority class examples that are harder to learn compared to those minority examples that are easier to learn. As a result, the ADASYN approach improves learning with respect to the data distributions in two ways: (1) reducing the bias introduced by the class imbalance, and (2) adaptively shifting the classification decision boundary toward the difficult examples. Simulation analyses on several machine learning data sets show the effectiveness of this method across five evaluation metrics.
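The weighting idea in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. The abstract gives no formulas, so the following are assumptions taken from the commonly cited description of the algorithm: the difficulty measure (the fraction of majority-class points among each minority example's k nearest neighbours), the `beta` balance parameter, and the function names `adasyn_weights` and `adasyn_sample`.

```python
import math
import random


def adasyn_weights(X, y, minority=1, k=5):
    """Return (minority indices, normalized difficulty weights).

    Assumed difficulty measure: share of majority-class points among the
    k nearest neighbours of each minority example.
    """
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    if not minority_idx:
        raise ValueError("no minority examples found")
    ratios = []
    for i in minority_idx:
        # k nearest neighbours of X[i] among all other points (both classes)
        nearest = sorted((j for j in range(len(X)) if j != i),
                         key=lambda j: math.dist(X[i], X[j]))[:k]
        ratios.append(sum(1 for j in nearest if y[j] != minority) / k)
    total = sum(ratios)
    if total == 0:
        # No minority point has majority neighbours: fall back to uniform weights.
        return minority_idx, [1.0 / len(minority_idx)] * len(minority_idx)
    return minority_idx, [r / total for r in ratios]


def adasyn_sample(X, y, minority=1, k=5, beta=1.0, seed=0):
    """Generate synthetic minority points, more of them for harder examples."""
    rng = random.Random(seed)
    minority_idx, weights = adasyn_weights(X, y, minority, k)
    n_min = len(minority_idx)
    n_maj = len(X) - n_min
    # beta = 1.0 aims at a fully balanced set (assumed parameterization).
    total_new = int(round((n_maj - n_min) * beta))
    synthetic = []
    for i, w in zip(minority_idx, weights):
        # Interpolation partners come from the minority class only.
        partners = sorted((j for j in minority_idx if j != i),
                          key=lambda j: math.dist(X[i], X[j]))[:k]
        if not partners:
            continue
        # Per-example rounding means the counts may not sum exactly to total_new.
        for _ in range(int(round(w * total_new))):
            j = rng.choice(partners)
            lam = rng.random()
            synthetic.append([a + lam * (b - a) for a, b in zip(X[i], X[j])])
    return synthetic
```

Called on a feature list `X` and label list `y`, `adasyn_sample` returns only the new points, which the caller appends to the training set with the minority label. Minority examples surrounded by majority neighbours receive most of the synthetic points, which is the adaptive behaviour the abstract describes.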
Keywords :
learning (artificial intelligence); pattern classification; sampling methods; statistical distributions; adaptive synthetic sampling approach; classification decision boundary; imbalanced data classification; imbalanced data set learning; weighted distribution; Bioinformatics; Boosting; Cancer; Data analysis; Data mining; Decision trees; Helium; Machine learning; Sampling methods; Space technology
Conference_Title :
Neural Networks, 2008. IJCNN 2008. (IEEE World Congress on Computational Intelligence). IEEE International Joint Conference on
Conference_Location :
Hong Kong
Print_ISBN :
978-1-4244-1820-6
ISSN :
1098-7576
DOI :
10.1109/IJCNN.2008.4633969