DocumentCode :
605948
Title :
Optimization for Vietnamese text classification problem by reducing features set
Author :
Ha Nguyen Thi Thu ; Quynh Nguyen Huu ; Khanh Nguyen Thi Hong ; Hung Le Manh
Author_Institution :
Dept. of Comput. Sci., Vietnam Electr. Power Univ., Hanoi, Vietnam
fYear :
2012
fDate :
23-25 Oct. 2012
Firstpage :
209
Lastpage :
212
Abstract :
Vietnamese is a monosyllabic language, so word segmentation is relatively complex: splitting words on whitespace is inaccurate, while existing Vietnamese segmentation tools are not highly effective. In this paper, we propose a new method that uses only topic words in the calculation, both to increase the accuracy of the Vietnamese text classification system and to optimize the calculation process. The experimental results show that our method is more effective than the previously proposed approach, achieving higher accuracy and lower computational complexity.
Keywords :
classification; computational complexity; natural language processing; optimisation; text analysis; word processing; Vietnamese segmentation tools; Vietnamese text classification problem; Vietnamese text classification system accuracy calculation; calculation process optimization; computational complexity reduction; feature set reduction; split word; topic word; whitespaces; word segmentation; Vietnamese text classification; syllable language
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Science and Service Science and Data Mining (ISSDM), 2012 6th International Conference on New Trends in
Conference_Location :
Taipei
Print_ISBN :
978-1-4673-0876-2
Type :
conf
Filename :
6528628