Regional Information Center for Science and Technology - The problem of classification in imbalanced data sets in knowledge discovery

DocumentCode :

2869754

Title :

The problem of classification in imbalanced data sets in knowledge discovery

Author :

Haifeng Sui ; Bingru Yang ; Yun Zhai ; Wu Qu ; Bing An

Author_Institution :

Sch. of Inf. Eng., Univ. of Sci. & Technol. Beijing, Beijing, China

Volume :

fYear :

2010

fDate :

22-24 Oct. 2010

Abstract :

Classification in imbalanced data sets has drawn increasing attention from researchers in the knowledge discovery and data mining fields. In such problems, almost all samples are labeled as one class, while far fewer samples are labeled as the other class, which is usually the more important one. Traditional classifiers that pursue high overall accuracy across the full range of samples are not well suited to imbalanced data sets, since they tend to be biased toward the majority class and pay less attention to the rare one. In the present work, we review the most important research lines on this topic and point out several directions for further investigation.
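
The abstract's point about overall accuracy can be illustrated with a small arithmetic sketch. It is not taken from the paper: the 990-to-10 class split, the always-predict-majority "classifier", and the helper names `accuracy` and `recall` are all assumptions made for illustration.

```python
# Hypothetical toy data (an assumption, not from the paper):
# 990 majority-class samples (label 0) and 10 minority-class samples (label 1).
y_true = [0] * 990 + [1] * 10

# A trivial "classifier" that always predicts the majority class.
y_pred = [0] * len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of truly positive samples that were predicted positive."""
    preds_for_positives = [p for t, p in zip(y_true, y_pred) if t == positive]
    if not preds_for_positives:
        return 0.0
    hits = sum(p == positive for p in preds_for_positives)
    return hits / len(preds_for_positives)


print(accuracy(y_true, y_pred))  # 0.99
print(recall(y_true, y_pred))    # 0.0
```

Under these assumptions the predictor scores 99% overall accuracy while finding none of the minority-class samples, which is why a per-class measure such as minority recall is often reported alongside accuracy for imbalanced problems.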

Keywords :

data mining; pattern classification; sampling methods; data set classification; imbalanced data set; knowledge discovery; sample labeling; Accuracy; Boosting; Classification algorithms; Prediction algorithms; Training; classification; ensemble; imbalanced data sets; sampling

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Computer Application and System Modeling (ICCASM), 2010 International Conference on

Conference_Location :

Taiyuan

Print_ISBN :

978-1-4244-7235-2

Electronic_ISBN :

978-1-4244-7237-6

Type :

conf

DOI :

10.1109/ICCASM.2010.5622948

Filename :

5622948

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2869754