DocumentCode :
130979
Title :
A novel unsupervised non-iterative approach to word segmentation
Author :
Hanshi Wang ; Haining Xu ; Lizhen Liu ; Wei Song ; Jingli Lu
Author_Institution :
Inf. & Eng. Coll., Capital Normal Univ., Beijing, China
fYear :
2014
fDate :
27-29 June 2014
Firstpage :
824
Lastpage :
827
Abstract :
Word segmentation is the crucial first step of natural language understanding (NLU) for Chinese corpora. Our earlier paper presented "Evaluation Selection and Adjustment" (ESA), an unsupervised approach to word segmentation. In this article, we present a novel non-iterative variation of ESA and compare it with other similar methods, over which it achieves better performance. We also analyze "Balancing" and compare it with "Standardizing", another algorithm, as solutions to the problem of how to evaluate words of different lengths and compare them with each other in statistical methods of word segmentation. The experimental results show that "Balancing" is more effective than "Standardizing" for the task.
Keywords :
feature selection; natural language processing; text analysis; word processing; Chinese corpora; ESA; NLU; evaluation selection and adjustment; natural language understanding; unsupervised noniterative approach; word segmentation; non-iterative; standardizing; unsupervised
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Software Engineering and Service Science (ICSESS), 2014 5th IEEE International Conference on
Conference_Location :
Beijing
ISSN :
2327-0586
Print_ISBN :
978-1-4799-3278-8
Type :
conf
DOI :
10.1109/ICSESS.2014.6933693
Filename :
6933693