Title :
A Pragmatic Approach to Increase Accuracy of Chinese Word-Segmentation
Author :
Wenyu, Chen ; Biao, Chen ; Tao, Xiang ; Zhongquan, Zhang
Author_Institution :
Sch. of Comput. Sci. & Eng., Univ. of Electron. Sci. & Technol. of China, Chengdu, China
Abstract :
Chinese word segmentation is essential for understanding and processing Chinese natural language, and it is an important component of search engines, text retrieval, speech recognition, and automatic translation. It is challenging because Chinese text has no spaces or other physical markers of word boundaries, and it is often difficult even to define what constitutes a word in Chinese. No fully mature, practical Chinese word segmentation system is currently available, particularly with respect to segmentation accuracy. This article presents a pragmatic approach to Chinese word segmentation that aims to increase segmentation accuracy. We introduce a mechanism that combines a hybrid dictionary with a universal dictionary, design a practical data structure, describe the word segmentation algorithm, and give test results.
Keywords :
character recognition; dictionaries; image segmentation; language translation; natural language processing; word processing; Chinese natural language; automatic translation; chinese word segmentation; data structure; hybrid dictionary; search engineer; speech recognition; text retrieval; universal dictionary; Accuracy; Arrays; Computational modeling; Dictionaries; History; Indexes; Pragmatics; Chinese word segmentation; hybrid dictionary; search engineer; word-segmentation accuracy;
Conference_Title :
Information Technology and Applications (IFITA), 2010 International Forum on
Conference_Location :
Kunming
Print_ISBN :
978-1-4244-7621-3
Electronic_ISBN :
978-1-4244-7622-0
DOI :
10.1109/IFITA.2010.262