Title :
The Method of Text Categorization on Imbalanced Datasets
Author :
Xin-fu, Li ; Yan, Yu ; Peng, Yin
Author_Institution :
Coll. of Math. & Comput. Sci., Hebei Univ., Baoding
Abstract :
In practical applications, datasets are often imbalanced, and traditional approaches usually lead to a low recognition rate. To address this problem, this paper proposes over-sampling the minority class to increase its number of samples and balance the dataset, thereby improving the recognition rate of the minority class. The experiments show that this approach achieved satisfactory results.
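As a rough illustration of the over-sampling idea described in the abstract, the sketch below balances a toy labeled corpus by randomly duplicating documents from the smaller class. The abstract does not say how the paper generates the additional minority samples, so random duplication is an assumption here, and the function name `oversample_minority`, the `seed` parameter, and the toy documents are all hypothetical.

```python
import random
from collections import Counter


def oversample_minority(texts, labels, seed=0):
    """Randomly duplicate documents of smaller classes until every class
    has as many documents as the largest class.

    Assumption: plain random duplication; the paper's own over-sampling
    procedure is not described in the abstract.
    """
    rng = random.Random(seed)

    # Group documents by class label.
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)

    # Every class is grown to the size of the largest class.
    target = max(len(items) for items in by_class.values())

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        # Draw the missing documents with replacement from the class itself.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Toy corpus (hypothetical): four majority documents, one minority document.
texts = [
    "team wins the match",
    "coach praises the players",
    "final score announced",
    "league season opens",
    "bank raises interest rates",
]
labels = ["sports", "sports", "sports", "sports", "finance"]

balanced_texts, balanced_labels = oversample_minority(texts, labels)
counts = Counter(balanced_labels)
print(counts)
```

In a pipeline using the TF-IDF features and SVM classifier named in the keywords, a step like this would normally be applied to the training split only, so that duplicated documents do not leak into the evaluation data.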
Keywords :
classification; text analysis; imbalanced dataset; minority class recognition; text categorization; Application software; Artificial intelligence; Computer science; Educational institutions; Information retrieval; Machine learning; Mathematics; Seminars; Text recognition; SVM; TF-IDF
Conference_Title :
Communication Software and Networks, 2009. ICCSN '09. International Conference on
Conference_Location :
Macau
Print_ISBN :
978-0-7695-3522-7
DOI :
10.1109/ICCSN.2009.70