Title :
An Improved Chinese Segmentation Algorithm Based on Segmentation Dictionary
Author :
Niu, Yan ; Li, Lala
Author_Institution :
Comput. Sch., Hubei Univ. of Technol., Wuhan, China
Abstract :
Based on an analysis of the traditional forward maximum matching (FMM) word segmentation algorithm and its underlying principle, and drawing on word frequency statistics, we design a new dictionary structure and, on top of it, an improved forward maximum matching algorithm. Time complexity analysis and experiments indicate that the improved algorithm further increases segmentation efficiency.
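For orientation, the traditional forward maximum matching baseline that the abstract refers to can be sketched as follows. This is a minimal illustration of the generic technique, not the authors' improved algorithm or their dictionary structure; the function name `fmm_segment`, the `max_len` parameter, and the use of a plain Python set as the dictionary are assumptions made for this example.

```python
def fmm_segment(text, dictionary, max_len=None):
    """Greedy forward maximum matching over a set of known words.

    `dictionary` is assumed to be a plain set of strings; the paper's
    frequency-based dictionary structure is not reproduced here.
    """
    if max_len is None:
        # Longest dictionary entry bounds the matching window.
        max_len = max((len(w) for w in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink the window.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback token.
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


words = {"研究", "研究生", "生命", "的", "起源"}
print(fmm_segment("研究生命的起源", words))
```

Because the match is greedy from the left, the example picks "研究生" before it can consider "生命". That is the well-known ambiguity weakness of the baseline, and each position may cost up to `max_len` lookups.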
Keywords :
dictionaries; linguistics; natural language processing; Chinese segmentation algorithm; forward maximum matching word segmentation algorithm; segmentation dictionary; time complexity analysis; word frequency statistics; Algorithm design and analysis; Dictionaries; Frequency; Handicapped aids; Information analysis; Information processing; Machine assisted indexing; Natural languages; Statistical analysis; White spaces; Chinese information processing; Chinese word segmentation; FMM algorithm; two-word root;
Conference_Title :
Computer Technology and Development, 2009. ICCTD '09. International Conference on
Conference_Location :
Kota Kinabalu
Print_ISBN :
978-0-7695-3892-1
DOI :
10.1109/ICCTD.2009.125