• DocumentCode
    3581358
  • Title
    An empirical experimental evaluation on imbalanced data sets with varied imbalance ratio
  • Author
    Imran, Mohammad ; Mahmood, Ali Mirza ; Abdul Moiz Qyser, Ahmed
  • Author_Institution
    Muffakham Jah Coll. of Eng. & Technol., Hyderabad, India
  • fYear
    2014
  • Firstpage
    1
  • Lastpage
    7
  • Abstract
    Class imbalance presents a problem when traditional classification algorithms are applied. In recent years, significant changes have been made in data classification. Classifying data becomes difficult when the data are unbalanced, and the class imbalance problem has developed into a significant data mining issue. Class imbalance arises when one class is rare compared to the other, a situation that occurs frequently in machine learning applications. Learning from unbalanced datasets is a relatively new area of machine learning with real-world applicability, since real-world datasets are often unbalanced in nature. Researchers have rigorously studied several techniques to alleviate the problem of class imbalance, including resampling algorithms, ensemble learning, and algorithmic modification, for transforming vast amounts of skewed data efficiently into information and knowledge representation. In this paper, we conduct an empirical study on imbalanced datasets. Experimental results and findings are reported using the area under the curve (AUC), precision, F-measure, TN-rate, and TP-rate evaluation metrics.
  • Keywords
    data mining; knowledge representation; learning (artificial intelligence); pattern classification; sampling methods; AUC; F-Measure; TN-rate evaluation metrics; TP-rate evaluation metrics; algorithmic modification; area under curve; class imbalance; classification algorithms; data classification; data mining issue; ensemble learning; imbalanced data sets; information representation; knowledge representation; machine learning applications; resampling algorithms; unbalanced learning; varied imbalance ratio; Accuracy; Algorithm design and analysis; Classification algorithms; Measurement; Niobium; Sampling methods; Support vector machines; Classification; Imbalance Ratio (IR); Skewed data; Unbalanced data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer and Communications Technologies (ICCCT), 2014 International Conference on
  • Type
    conf
  • DOI
    10.1109/ICCCT2.2014.7066742
  • Filename
    7066742