DocumentCode
3039307
Title
Design of Chinese Text Categorization Classifier Based on Attribute Bagging
Author
Zhang, Xiang ; Zhou, Mingquan ; Dong, Lili ; Ye, Na
Author_Institution
Coll. of Inf. Sci. & Technol., Northwest Univ., Xi'an, China
fYear
2009
fDate
24-26 July 2009
Firstpage
201
Lastpage
204
Abstract
To improve the precision and recall of a Chinese text classifier, this paper applies attribute bagging, an improved variant of the bagging algorithm. Documents are represented with the vector space model, and information gain is used for feature selection. Multiple training sets are obtained by re-sampling attributes, and kNN is used as the individual classifier. The final classification is determined by voting. Experiments show that attribute bagging achieves lower error and better performance than both bagging and kNN in Chinese text categorization.
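The pipeline described in the abstract (re-sample attributes, train one kNN per attribute subset, combine by voting) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Euclidean distance, sampling attributes without replacement, and the default parameter values are all assumptions made for illustration, and the information gain feature selection step is omitted.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, feature_idx, k=3):
    """Classify x with kNN, comparing only the attributes in feature_idx.

    Squared Euclidean distance is an assumption; the record does not say
    which similarity measure the paper uses.
    """
    def dist(a, b):
        return sum((a[j] - b[j]) ** 2 for j in feature_idx)

    # Indices of the k training documents closest to x on the chosen attributes.
    nearest = sorted(range(len(train_X)), key=lambda i: dist(train_X[i], x))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def attribute_bagging_predict(train_X, train_y, x,
                              n_classifiers=11, subset_size=2, k=3, seed=0):
    """Majority vote over kNN classifiers, each seeing a random attribute subset.

    n_classifiers, subset_size, k and seed are hypothetical defaults,
    not values taken from the paper.
    """
    rng = random.Random(seed)
    n_features = len(train_X[0])
    votes = []
    for _ in range(n_classifiers):
        # Re-sample attributes (here: without replacement) for this member.
        feature_idx = rng.sample(range(n_features), subset_size)
        votes.append(knn_predict(train_X, train_y, x, feature_idx, k))
    # The ensemble label is the one predicted by the most members.
    return Counter(votes).most_common(1)[0][0]
```

In this sketch each document is a list of numeric term weights from a vector space model, so a call looks like attribute_bagging_predict(train_X, train_y, new_doc_vector). Ordinary bagging would instead re-sample the training documents and keep every attribute.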
Keywords
support vector machines; text analysis; Chinese text categorization classifier; attribute bagging algorithm; information gain; multiple training set; resampling attributes; support vector machine; vector space model; Algorithm design and analysis; Bagging; Boosting; Control engineering; Educational institutions; Frequency; Information science; Machine learning; Space technology; Text categorization; Chinese text categorization; attribute bagging; information gain; vector space model;
fLanguage
English
Publisher
IEEE
Conference_Titel
Business Intelligence and Financial Engineering, 2009. BIFE '09. International Conference on
Conference_Location
Beijing
Print_ISBN
978-0-7695-3705-4
Type
conf
DOI
10.1109/BIFE.2009.55
Filename
5208903