DocumentCode :
3581358
Title :
An empirical experimental evaluation on imbalanced data sets with varied imbalance ratio
Author :
Imran, Mohammad ; Mahmood, Ali Mirza ; Abdul Moiz Qyser, Ahmed
Author_Institution :
Muffakham Jah Coll. of Eng. & Technol., Hyderabad, India
fYear :
2014
Firstpage :
1
Lastpage :
7
Abstract :
Class imbalance presents a problem when traditional classification algorithms are applied. In recent years, data classification has undergone significant changes. Classifying data becomes difficult when the data are unbalanced, and the class imbalance problem has developed into a significant data mining issue. Class imbalance arises when one class is rare compared to the other, a situation that occurs frequently in machine learning applications. Learning from unbalanced datasets is a relatively new area of machine learning with real-world applicability, since real-world datasets are unbalanced in nature. Researchers have rigorously studied several techniques to alleviate the problem of class imbalance, including resampling algorithms, ensemble learning, and algorithmic modification, for transforming vast amounts of skewed data efficiently into information and knowledge representation. In this paper, we conduct an empirical study on imbalanced datasets. The experimental results support several findings, reported using the Area Under Curve (AUC), precision, F-measure, TN-rate, and TP-rate evaluation metrics.
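The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names, the toy labels, and the scores are assumptions made for illustration, the imbalance ratio is taken to be the majority-class count divided by the minority-class count, and AUC is computed by the rank-based (pairwise comparison) formulation.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Majority-class count divided by minority-class count (assumed definition)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary problem."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t != positive and p == positive:
            fp += 1
        elif t != positive and p != positive:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def basic_metrics(y_true, y_pred, positive=1):
    """TP-rate (recall), TN-rate (specificity), precision, and F-measure."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    tp_rate = tp / (tp + fn) if (tp + fn) else 0.0
    tn_rate = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + tp_rate
    f_measure = 2 * precision * tp_rate / denom if denom else 0.0
    return {"tp_rate": tp_rate, "tn_rate": tn_rate,
            "precision": precision, "f_measure": f_measure}


def auc(y_true, scores, positive=1):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 2 minority (positive) and 8 majority (negative) examples.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.4, 0.8, 0.3, 0.2, 0.1, 0.1, 0.05, 0.3, 0.2]

print(imbalance_ratio(y_true))        # 4.0
print(basic_metrics(y_true, y_pred))  # tp_rate 0.5, tn_rate 0.875, precision 0.5, f_measure 0.5
print(auc(y_true, scores))            # 0.9375
```

On this toy data, plain accuracy would be 0.8 even though only half of the minority class is recovered, which is one reason metrics such as TP-rate, F-measure, and AUC are commonly preferred for imbalanced problems.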
Keywords :
data mining; knowledge representation; learning (artificial intelligence); pattern classification; sampling methods; AUC; F-Measure; TN-rate evaluation metrics; TP-rate evaluation metrics; algorithmic modification; area under curve; class imbalance; classification algorithms; data classification; data mining issue; ensemble learning; imbalanced data sets; information representation; knowledge representation; machine learning applications; resampling algorithms; unbalanced learning; varied imbalance ratio; Accuracy; Algorithm design and analysis; Classification algorithms; Measurement; Niobium; Sampling methods; Support vector machines; Classification; Imbalance Ratio (IR); Skewed data; Unbalanced data;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer and Communications Technologies (ICCCT), 2014 International Conference on
Type :
conf
DOI :
10.1109/ICCCT2.2014.7066742
Filename :
7066742