DocumentCode :
2329385
Title :
Using Weight-Retouching and Under-Sampling SVM Approaches for Text Categorization on Imbalanced Data
Author :
Wang He-Yong
Author_Institution :
Coll. of E-Bus., South China Univ. of Technol., Guangzhou, China
fYear :
2009
fDate :
23-24 May 2009
Firstpage :
1
Lastpage :
4
Abstract :
The growing volume of textual documents makes it increasingly difficult to manage text data and to retrieve useful information from document contents. Text categorization, an increasingly important and extensively studied field, is an important way to help resolve this problem. In this paper, we focus on the performance of the minority text class and attempt to improve its precision by using techniques for processing imbalanced data. Weight-retouching and under-sampling Support Vector Machine (SVM) approaches are considered. The results show that processing imbalanced text data with weight-retouching and under-sampling SVM improves the precision of the minority class without degrading global performance.
Keywords :
support vector machines; text analysis; document content; imbalanced text data processing; minority text class; support vector machine; text categorization; text data management; textual document; under-sampling SVM; weight-retouching; Appraisal; Computational efficiency; Content based retrieval; Content management; Data processing; Frequency conversion; Information retrieval; Support vector machine classification; Support vector machines; Text categorization;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
E-Business and Information System Security, 2009. EBISS '09. International Conference on
Conference_Location :
Wuhan
Print_ISBN :
978-1-4244-2909-7
Electronic_ISBN :
978-1-4244-2910-3
Type :
conf
DOI :
10.1109/EBISS.2009.5138143
Filename :
5138143