DocumentCode :
3394074
Title :
A New Model for Chinese Short-text Classification Considering Feature Extension
Author :
Fan, Xinghua ; Hu, Hongge
Author_Institution :
Coll. of Comput. Sci. & Technol., Chongqing Univ. of Posts & Telecommun., Chongqing, China
Volume :
2
fYear :
2010
fDate :
23-24 Oct. 2010
Firstpage :
7
Lastpage :
11
Abstract :
This paper presents a new model for classifying Chinese short texts that have a weak concept signal. The model considers three key factors of feature extension that determine short-text classification performance. To determine these three factors, the paper studies three issues: (1) how to perform feature extension for short texts; (2) how different ways of feature extension affect short-text classification performance; and (3) how to control the degree of feature extension for short texts. In the classification stage, a short text is first extended by adding new features or modifying the weights of its initial features according to the relationship between non-feature terms and the feature extension mode; the degree of feature extension is controlled to improve the effect of the extension, and the extended short text is then classified with the new model. Experimental results show that the proposed model achieves higher classification performance than conventional classification methods.
Keywords :
pattern classification; text analysis; Chinese short text; classification performance; concept signal; feature extension; short text classification; Analytical models; Computational modeling; Feature extraction; Libraries; Noise; Testing; Training; Chinese short-text classification; feature extension; high-quality feature extension mode; new classification model
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Artificial Intelligence and Computational Intelligence (AICI), 2010 International Conference on
Conference_Location :
Sanya
Print_ISBN :
978-1-4244-8432-4
Type :
conf
DOI :
10.1109/AICI.2010.125
Filename :
5655287