DocumentCode :
3261577
Title :
Reducing Performance Bias for Unbalanced Text Mining
Author :
Zhuang, Ling ; Dai, Honghua
Author_Institution :
Sch. of Eng. & Inf. Technol., Deakin Univ., Burwood, Vic.
fYear :
2006
fDate :
Dec. 2006
Firstpage :
770
Lastpage :
774
Abstract :
In text categorization applications, class imbalance, an uneven data distribution in which one class is represented by far fewer instances than the others, is a commonly encountered problem. In such a situation, conventional classifiers tend to have a strong performance bias, which results in a high accuracy rate on the majority class but a very low rate on the minority classes. An extreme strategy for unbalanced learning is to discard the majority instances and apply one-class classification to the minority class. However, this can easily cause another type of bias, which increases the accuracy rate on the minority class at the expense of the majority class. This paper investigates approaches that reduce these two types of performance bias and improve the reliability of discovered classification rules. Experimental results show that the inexact field learning method and parameter-optimized one-class classifiers achieve more balanced performance than the standard approaches.
Keywords :
data mining; learning (artificial intelligence); pattern classification; text analysis; class imbalance; classification rules; data distribution; field learning; one-class classification; performance bias; text categorization; text mining; Conferences; Data mining; Text mining;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining Workshops, 2006. ICDM Workshops 2006. Sixth IEEE International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
0-7695-2702-7
Type :
conf
DOI :
10.1109/ICDMW.2006.139
Filename :
4063729