DocumentCode :
3306718
Title :
Research on short text classification for web forum
Author :
Xiaochun He ; Conghui Zhu ; Tiejun Zhao
Author_Institution :
MOE-MS Key Lab. of Natural Language Process. & Speech, Harbin Inst. of Technol., Harbin, China
Volume :
2
fYear :
2011
fDate :
26-28 July 2011
Firstpage :
1052
Lastpage :
1056
Abstract :
The unique characteristics of short text make short text classification quite different from traditional long text processing. The feature space of short text is very sparse, which makes it notoriously difficult to extract sufficient and effective features. In this paper, aiming to classify short texts on web forums accurately, we introduce a novel short-text-processing method based on semantic extension that enriches the content of the original short text and thereby effectively addresses the feature sparsity problem. In addition, we put forward the concept of the Key-Pattern (KP) and propose a new text feature representation approach based on KP, which extracts phrases carrying strong semantic information as text features. Traditional classifier models are applied to predict each text's class, and experimental results show that the proposed method improves the accuracy and recall of short text classification.
Keywords :
Internet; feature extraction; pattern classification; text analysis; Web forum; classifier model; feature extraction; feature sparse problem; key-pattern concept; long text processing; semantic extension; short text classification; short-text-processing method; text feature representation approach; Classification algorithms; Feature extraction; Internet; Noise measurement; Semantics; Text categorization; Key-Pattern; Semantic extension; Short text classification; Text representation; Web forum;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Fuzzy Systems and Knowledge Discovery (FSKD), 2011 Eighth International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-61284-180-9
Type :
conf
DOI :
10.1109/FSKD.2011.6019652
Filename :
6019652