DocumentCode
476201
Title
Chinese text categorization based on CCIPCA and SMO
Author
Li, Xin-fu ; He, Hai-bin ; Zhao, Lei-lei
Author_Institution
Coll. of Math. & Comput. Sci., Hebei Univ., Baoding
Volume
5
fYear
2008
fDate
12-15 July 2008
Firstpage
2514
Lastpage
2518
Abstract
The vector space model is commonly used to represent text for text categorization. Reducing the dimensionality of the feature space is a key problem in practical text classification. Classical decomposition algorithms cannot handle high-dimensional, large-scale text categorization problems. This paper presents an approach to improving text categorization performance that uses candid covariance-free incremental principal component analysis (CCIPCA) and the sequential minimal optimization (SMO) algorithm. Experimental results show that the proposed method is practicable and effective for Chinese text categorization.
Keywords
minimisation; natural language processing; principal component analysis; text analysis; Chinese text categorization; candid incremental principal component analysis; dimensionality reduction; sequential minimization optimization algorithm; text classification; Covariance matrix; Cybernetics; Feature extraction; Frequency; Indexing; Large-scale systems; Machine learning; Minimization methods; Principal component analysis; Text categorization; Candid incremental principal component analysis (CCIPCA); Dimension reduction; Sequential minimization optimization algorithm (SMO); Text categorization;
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Cybernetics, 2008 International Conference on
Conference_Location
Kunming
Print_ISBN
978-1-4244-2095-7
Electronic_ISBN
978-1-4244-2096-4
Type
conf
DOI
10.1109/ICMLC.2008.4620831
Filename
4620831