• DocumentCode
    2957106
  • Title
    ADASYN: Adaptive synthetic sampling approach for imbalanced learning
  • Author
    He, Haibo ; Bai, Yang ; Garcia, Edwardo A. ; Li, Shutao
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Stevens Inst. of Technol., Hoboken, NJ
  • fYear
    2008
  • fDate
    1-8 June 2008
  • Firstpage
    1322
  • Lastpage
    1328
  • Abstract
    This paper presents a novel adaptive synthetic (ADASYN) sampling approach for learning from imbalanced data sets. The essential idea of ADASYN is to use a weighted distribution for different minority class examples according to their level of difficulty in learning, where more synthetic data is generated for minority class examples that are harder to learn compared to those minority examples that are easier to learn. As a result, the ADASYN approach improves learning with respect to the data distributions in two ways: (1) reducing the bias introduced by the class imbalance, and (2) adaptively shifting the classification decision boundary toward the difficult examples. Simulation analyses on several machine learning data sets show the effectiveness of this method across five evaluation metrics.
  • Keywords
    learning (artificial intelligence); pattern classification; sampling methods; statistical distributions; adaptive synthetic sampling approach; classification decision boundary; imbalanced data classification; imbalanced data set learning; weighted distribution; Bioinformatics; Boosting; Cancer; Data analysis; Data mining; Decision trees; Helium; Machine learning; Sampling methods; Space technology
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Neural Networks, 2008. IJCNN 2008. (IEEE World Congress on Computational Intelligence). IEEE International Joint Conference on
  • Conference_Location
    Hong Kong
  • ISSN
    1098-7576
  • Print_ISBN
    978-1-4244-1820-6
  • Electronic_ISBN
    1098-7576
  • Type
    conf
  • DOI
    10.1109/IJCNN.2008.4633969
  • Filename
    4633969
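
The weighting idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function name `adasyn` and the parameters `k`, `beta` and `seed` are hypothetical names chosen for this illustration. The sketch assumes two classes, Euclidean distance, a difficulty score equal to the fraction of majority-class points among the `k` nearest neighbours of each minority point, and interpolation toward nearest neighbours drawn from the minority class only.

```python
import math
import random


def adasyn(X, y, minority=1, k=5, beta=1.0, seed=0):
    """Return synthetic minority points for a two-class data set (sketch)."""
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    n_min = len(minority_idx)
    n_maj = len(y) - n_min
    # Total synthetic budget; beta=1.0 aims for a fully balanced data set.
    total = int(round((n_maj - n_min) * beta))
    if total <= 0 or n_min < 2:
        return []

    def nearest(i, candidates):
        # Up to k indices from `candidates` closest to X[i], excluding i itself.
        others = [j for j in candidates if j != i]
        others.sort(key=lambda j: math.dist(X[i], X[j]))
        return others[:k]

    # Difficulty of each minority point: share of majority-class points
    # among its k nearest neighbours in the whole data set.
    all_idx = range(len(y))
    ratios = [
        sum(y[j] != minority for j in nearest(i, all_idx)) / k
        for i in minority_idx
    ]
    ratio_sum = sum(ratios)
    if ratio_sum == 0:
        return []  # no minority point has a majority-class neighbour

    synthetic = []
    for i, r in zip(minority_idx, ratios):
        # Harder points (larger r) receive a larger share of the budget.
        # Rounding means the counts may not sum exactly to `total`.
        count = int(round(r / ratio_sum * total))
        neighbours = nearest(i, minority_idx)  # assumption: minority-only
        for _ in range(count):
            j = rng.choice(neighbours)
            lam = rng.random()  # interpolation factor in [0, 1)
            synthetic.append(
                tuple(a + lam * (b - a) for a, b in zip(X[i], X[j]))
            )
    return synthetic
```

With `beta=1.0` the sketch targets roughly as many synthetic points as the gap between the two class sizes. A minority point surrounded only by other minority points gets a difficulty score of zero and therefore produces no synthetic data, which is the adaptive behaviour the abstract describes.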