DocumentCode :
2119650
Title :
The Method of Text Categorization on Imbalanced Datasets
Author :
Xin-fu, Li ; Yan, Yu ; Peng, Yin
Author_Institution :
Coll. of Math. & Comput. Sci., Hebei Univ., Baoding
fYear :
2009
fDate :
27-28 Feb. 2009
Firstpage :
650
Lastpage :
653
Abstract :
In practical applications, datasets are usually imbalanced, and traditional approaches usually yield a low recognition rate on such data. To address this problem, this paper proposes over-sampling the minority class to increase its number of samples and balance the dataset, thereby improving the recognition rate of the minority class. Experiments show that this approach achieves satisfactory results.
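The abstract does not say how the over-sampling is carried out. As an illustration only, the following minimal sketch assumes the simplest variant: randomly duplicating minority-class documents until every class matches the size of the largest class. The function name `oversample_minority`, the sample texts, and the labels are hypothetical and are not taken from the paper.

```python
import random
from collections import Counter


def oversample_minority(texts, labels, seed=0):
    """Randomly duplicate documents so every class reaches the largest class size.

    Assumption: plain random over-sampling with replacement; the paper's
    actual procedure is not described in the abstract.
    """
    rng = random.Random(seed)

    # Group documents by their class label.
    by_class = {}
    for text, label in zip(texts, labels):
        by_class.setdefault(label, []).append(text)

    # Size of the largest (majority) class is the target for all classes.
    target = max(len(docs) for docs in by_class.values())

    out_texts, out_labels = [], []
    for label, docs in by_class.items():
        # Draw extra copies at random until the class reaches the target size.
        extra = [rng.choice(docs) for _ in range(target - len(docs))]
        for doc in docs + extra:
            out_texts.append(doc)
            out_labels.append(label)
    return out_texts, out_labels


# Hypothetical toy corpus: four majority-class documents, one minority-class document.
texts = [
    "stock prices rise",
    "markets fall",
    "bank profits up",
    "shares rally",
    "rare disease outbreak",
]
labels = ["finance", "finance", "finance", "finance", "health"]

balanced_texts, balanced_labels = oversample_minority(texts, labels)
print(Counter(balanced_labels))
```

In a pipeline such as the TF-IDF and SVM setup suggested by the keywords, this kind of resampling would normally be applied to the training split only, so that duplicated documents do not leak into the evaluation data.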
Keywords :
classification; text analysis; imbalanced dataset; minority class recognition; text categorization; Application software; Artificial intelligence; Computer science; Educational institutions; Information retrieval; Machine learning; Mathematics; Seminars; Text recognition; SVM; TF-IDF
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Communication Software and Networks, 2009. ICCSN '09. International Conference on
Conference_Location :
Macau
Print_ISBN :
978-0-7695-3522-7
Type :
conf
DOI :
10.1109/ICCSN.2009.70
Filename :
5076934