DocumentCode
2188387
Title
A Novel Hybrid System for Large-Scale Chinese Text Classification Problem
Author
Gao, Zhong ; Lu, Guanming ; Gu, Daquan
Author_Institution
Coll. of Telecommun. & Inf. Eng., Nanjing Univ. of Posts & Telecommun., Nanjing, China
fYear
2008
fDate
27-28 Dec. 2008
Firstpage
121
Lastpage
124
Abstract
Most Chinese text classification systems are based on the bag-of-words (BW) model, a valid probabilistic tool for text representation that can provide a better semantic architecture. However, the weakness in classification accuracy has yet to be overcome. The support vector machine (SVM) has become a popular classification tool and can be applied in this scheme, but the main disadvantages of SVM algorithms are their large memory requirements and long computation times on very large datasets. In this paper, we propose a hybrid system based on BW and a novel cascade SVM with feedback, which splits the problem into smaller subsets and trains a network to assign samples to the different subsets. On large-scale classification problems where multiple SVM classifiers are applied, the proposed parallel training algorithm speeds up SVM training and increases classification accuracy.
Keywords
classification; feedback; natural language processing; probability; support vector machines; text analysis; word processing; bag of words; feedback; large-scale Chinese text classification; probability tool; semantic architecture; support vector machine; text representation; Computer architecture; Educational institutions; Feedback; Large-scale systems; Machine learning; Natural language processing; Quadratic programming; Support vector machine classification; Support vector machines; Text categorization;
fLanguage
English
Publisher
IEEE
Conference_Titel
Frontier of Computer Science and Technology, 2008. FCST '08. Japan-China Joint Workshop on
Conference_Location
Nagasaki
Print_ISBN
978-1-4244-3418-3
Type
conf
DOI
10.1109/FCST.2008.29
Filename
4736518