Title :
A novel unsupervised non-iterative approach to word segmentation
Author :
Hanshi Wang ; Haining Xu ; Lizhen Liu ; Wei Song ; Jingli Lu
Author_Institution :
Inf. & Eng. Coll., Capital Normal Univ., Beijing, China
Abstract :
Word segmentation is the crucial first step of natural language understanding (NLU) for Chinese corpora. Our earlier paper presented "Evaluation Selection and Adjustment" (ESA), an unsupervised approach to word segmentation. In this article, we present a novel non-iterative variation of ESA and compare it with other similar methods, over which it achieves better performance. In addition, we analyze "Balancing" and compare it with "Standardizing", another algorithm, as solutions to the problem of how statistical word segmentation methods can evaluate words of different lengths and compare them with each other. The experimental results show that "Balancing" is more effective than "Standardizing" for this task.
Keywords :
feature selection; natural language processing; text analysis; word processing; Chinese corpora; ESA; NLU; evaluation selection and adjustment; natural language understanding; unsupervised noniterative approach; word segmentation; non-iterative; standardizing; unsupervised
Conference_Title :
Software Engineering and Service Science (ICSESS), 2014 5th IEEE International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4799-3278-8
DOI :
10.1109/ICSESS.2014.6933693