DocumentCode :
1954452
Title :
A Dictionary Mechanism for Chinese Word Segmentation Based on the Finite Automata
Author :
Yang, Wu ; Ren, Li-Yun ; Tang, Rong
Author_Institution :
Inf. & Educ. Technol. Center, Chongqing Univ. of Technol., Chongqing, China
fYear :
2010
fDate :
28-30 Dec. 2010
Firstpage :
39
Lastpage :
42
Abstract :
The dictionary mechanism is the basis of Chinese word segmentation, and its quality directly affects segmentation speed and efficiency. Existing dictionary mechanisms suffer from shortcomings such as wasted space, low efficiency, and difficult maintenance, so establishing an effective mechanism is an urgent problem for Chinese word segmentation. In this paper, the idea of the finite-state automaton is studied first; then a new dictionary mechanism is proposed to save space and improve the speed of Chinese word segmentation as much as possible; finally, the performance of various dictionary mechanisms is analyzed through theoretical study and experimental comparison. The results show that, compared with other mechanisms, the proposed dictionary mechanism based on the finite-state automaton improves both space complexity and time complexity.
Keywords :
dictionaries; finite state machines; linguistics; text analysis; word processing; Chinese word segmentation; dictionary mechanism; finite state automata; space complexity; time complexity; Algorithm design and analysis; Automata; Complexity theory; Dictionaries; Indexing; Presses; Chinese word segmentation; complexity; dictionary mechanism; finite-state automaton;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Asian Language Processing (IALP), 2010 International Conference on
Conference_Location :
Harbin
Print_ISBN :
978-1-4244-9063-9
Type :
conf
DOI :
10.1109/IALP.2010.52
Filename :
5681563