DocumentCode :
2711796
Title :
Unbalanced data classification using extreme outlier elimination and sampling techniques for fraud detection
Author :
Padmaja, T. Maruthi ; Dhulipalla, Narendra ; Bapi, Raju S. ; Krishna, P. Radha
Author_Institution :
Inst. for Dev. & Res. in Banking Technol., Hyderabad
fYear :
2007
fDate :
18-21 Dec. 2007
Firstpage :
511
Lastpage :
516
Abstract :
Detecting fraud in a highly overlapped and imbalanced fraud dataset is a challenging task. To address this problem, we propose a new approach that combines extreme outlier elimination with a hybrid sampling technique. The k reverse nearest neighbors (kRNN) concept is used as a data cleaning method to eliminate extreme outliers in minority regions. The hybrid sampling technique, a combination of SMOTE to over-sample the minority data (fraud samples) and random under-sampling to under-sample the majority data (non-fraud samples), is used to improve fraud detection accuracy. The method was evaluated in terms of true positive rate and true negative rate on an insurance fraud dataset. We conducted experiments with the classifiers C4.5, naive Bayes, k-NN, and radial basis function networks, and compared the performance of our approach against a simple hybrid sampling technique. The results show that eliminating extreme outliers from the minority class yields high prediction rates for both the fraud and non-fraud classes.
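The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact outlier criterion, so the sketch assumes a minority sample is an "extreme outlier" when no other minority sample counts it among its own k nearest neighbors (an empty kRNN set). The function names, the toy data, and the parameter values (k, n_new, n_majority) are all hypothetical, and the SMOTE step is a simplified interpolation rather than a library call.

```python
import math
import random


def k_nearest(idx, points, k):
    """Indices of the k points closest to points[idx], excluding itself."""
    others = [j for j in range(len(points)) if j != idx]
    others.sort(key=lambda j: math.dist(points[idx], points[j]))
    return others[:k]


def remove_extreme_outliers(minority, k):
    """Drop minority points whose k-reverse-nearest-neighbor set is empty.

    Assumption (not from the paper): a point is an extreme outlier when no
    other minority point lists it among its own k nearest neighbors.
    """
    rnn_count = [0] * len(minority)
    for i in range(len(minority)):
        for j in k_nearest(i, minority, k):
            rnn_count[j] += 1
    return [p for p, c in zip(minority, rnn_count) if c > 0]


def smote(minority, n_new, k, rng):
    """SMOTE-style over-sampling: interpolate toward a random near neighbor."""
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        j = rng.choice(k_nearest(i, minority, k))
        gap = rng.random()  # position along the segment between the two points
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(minority[i], minority[j]))
        )
    return synthetic


def hybrid_sample(minority, majority, k=2, n_new=4, n_majority=6, seed=0):
    """Clean the minority class, over-sample it, and under-sample the majority."""
    rng = random.Random(seed)
    cleaned = remove_extreme_outliers(minority, k)
    oversampled = cleaned + smote(cleaned, n_new, k, rng)
    undersampled = rng.sample(majority, n_majority)  # random under-sampling
    return oversampled, undersampled


# Toy data: four clustered fraud points plus one isolated point at (5, 5).
fraud = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
non_fraud = [(float(x), float(y)) for x in range(2, 5) for y in range(2, 5)]
fraud_out, non_fraud_out = hybrid_sample(fraud, non_fraud)
```

On this toy data the isolated fraud point is discarded before over-sampling, so the synthetic fraud samples are generated only inside the dense minority cluster. The resampled sets would then be passed to a classifier such as those named in the abstract.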
Keywords :
fraud; pattern classification; sampling methods; security of data; data cleaning method; extreme outlier elimination; fraud detection; k reverse nearest neighbor; synthetic minority over-sampling technique; unbalanced data classification; Association rules; Banking; Cleaning; Computer crime; Costs; Data mining; Insurance; Nearest neighbor searches; Radial basis function networks; Sampling methods;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Advanced Computing and Communications, 2007. ADCOM 2007. International Conference on
Conference_Location :
Guwahati, Assam
Print_ISBN :
0-7695-3059-1
Type :
conf
DOI :
10.1109/ADCOM.2007.74
Filename :
4426020