DocumentCode :
2550427
Title :
Text classification using multi-word features
Author :
Zhang, Wen ; Yoshida, Taketoshi ; Tang, Xijin
Author_Institution :
Japan Adv. Inst. of Sci. & Technol., Ishikawa
fYear :
2007
fDate :
7-10 Oct. 2007
Firstpage :
3519
Lastpage :
3524
Abstract :
We carried out a series of experiments on text classification using multi-word features. An automated method was proposed to extract multi-words from a text data set, and two strategies were developed to normalize the multi-words into two versions of multi-word features. After the texts were represented using each of these two feature versions, text classification was carried out on both to compare the effectiveness of the two strategies. The linear and nonlinear polynomial kernels of the support vector machine (SVM) were also compared on the text classification task.
Keywords :
feature extraction; pattern classification; text analysis; multi word feature extraction; nonlinear polynomial kernel; support vector machine; text classification; text dataset; Data mining; Data preprocessing; Feature extraction; Kernel; Logic; Ontologies; Personnel; Support vector machine classification; Support vector machines; Text categorization
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Systems, Man and Cybernetics, 2007. ISIC. IEEE International Conference on
Conference_Location :
Montreal, Que.
Print_ISBN :
978-1-4244-0990-7
Electronic_ISBN :
978-1-4244-0991-4
Type :
conf
DOI :
10.1109/ICSMC.2007.4414208
Filename :
4414208