Regional Information Center for Science and Technology - A comparison for handling imbalanced datasets

DocumentCode :

1793653

Title :

A comparison for handling imbalanced datasets

Author :

Syaripudin, Arif ; Khodra, Masayu Leylia

Author_Institution :

Sch. of Electr. Eng. & Inf., Inst. Teknol. Bandung, Bandung, Indonesia

fYear :

2014

fDate :

20-21 Aug. 2014

Firstpage :

293

Lastpage :

298

Abstract :

In many real-world cases, such as metal detection for security screening or disease diagnosis, imbalanced datasets are inevitable. Existing learning algorithms have limitations when faced with imbalanced datasets: prediction errors arise because the majority class dominates the minority class. Various techniques have been developed to address this problem. This paper compares techniques for handling imbalanced datasets based on resampling and ensembles. From a different standpoint, it also examines how much the number of instances, the number of attributes, the attribute data types, the number of target classes, and missing attribute values affect the classification results, with performance measured by F-measure. The experiments show that the number of attributes, the attribute data types, and the number of target classes do not affect the classification results, whereas missing attribute values do. In terms of F-measure, the best performer is the combination of SMOTE 500% and AdaBoostM1.
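The two ingredients the abstract names, SMOTE-style oversampling of the minority class and evaluation by F-measure, can be illustrated with a minimal sketch. This is not the authors' implementation; the function names `smote_like` and `f_measure`, the toy data, the neighbour count `k`, and the reading of the percentage as "five synthetic instances per minority instance" are all assumptions made for illustration.

```python
import random


def f_measure(y_true, y_pred, positive=1):
    """F-measure (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def smote_like(minority, percent=500, k=2, seed=0):
    """Create percent/100 synthetic points per minority point.

    Each synthetic point lies on the line segment between a minority
    point and one of its k nearest minority neighbours (hypothetical
    helper; a simplified version of the SMOTE idea).
    """
    rng = random.Random(seed)
    per_point = percent // 100
    synthetic = []
    for i, x in enumerate(minority):
        others = [p for j, p in enumerate(minority) if j != i]
        # Sort the other minority points by squared Euclidean distance.
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)))
        neighbours = others[:k]
        for _ in range(per_point):
            nb = rng.choice(neighbours)
            gap = rng.random()  # interpolation factor in [0, 1)
            synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic


# Toy minority class with two numeric attributes (made-up values).
minority = [(1.0, 2.0), (1.5, 2.5), (2.0, 2.0)]
synthetic = smote_like(minority, percent=500)

# Toy predictions: one true positive, one false positive, one false negative.
score = f_measure([1, 1, 0, 0], [1, 0, 0, 1])
```

With these toy inputs, three minority points yield fifteen synthetic points, and the F-measure is 0.5 because precision and recall are both 0.5. A boosting ensemble such as AdaBoost would then be trained on the original data plus the synthetic points; that step is omitted here.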

Keywords :

data handling; learning (artificial intelligence); pattern classification; AdaBoostM1; SMOTE 500%; attributes data types; f-measure; imbalanced dataset handling; learning algorithms; missing attribute values; performance analysis; prediction error; Conferences; Decision support systems; Error analysis; Informatics; Nickel; Training; ensembles; imbalanced dataset; resamples

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Advanced Informatics: Concept, Theory and Application (ICAICTA), 2014 International Conference of

Conference_Location :

Bandung

Print_ISBN :

978-1-4799-6984-5

Type :

conf

DOI :

10.1109/ICAICTA.2014.7005957

Filename :

7005957

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1793653