DocumentCode :
2164555
Title :
A word-length adaptive method of Chinese dictionary construction
Author :
Zhan, Haisheng ; Yang, Liping ; Wang, Qihu
Author_Institution :
School of Computer, Xidian University, Xi'an, China
fYear :
2010
fDate :
4-6 Dec. 2010
Firstpage :
1811
Lastpage :
1814
Abstract :
The Chinese dictionary is critical to the storage and retrieval of text in a Chinese search engine. Based on the exclusive-OR (XOR) algorithm, the machine code of each Chinese character is combined with its stroke count to hash words of different lengths into the corresponding region of the hash space, according to statistics on the probability of a word occurring. As a result, the hash collision rate is reduced to 0.034% and search efficiency improves. The method can be used to construct large-scale dynamic dictionaries and in other natural language processing tasks, such as building Chinese corpora and designing Chinese input methods.
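The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that "machine code" means the GB2312 byte code of a character, that the stroke count is packed into the low bits of that code, and that each word length gets its own table. The names `STROKES`, `TABLE_SIZE`, `char_key`, and `word_hash`, the bit layout, and the table sizes are all hypothetical.

```python
# Hypothetical stroke-count table for a handful of characters.
# A real dictionary would need a complete table.
STROKES = {"中": 4, "国": 8, "人": 2, "大": 3, "学": 8}

# Hypothetical number of slots per word length. The abstract says the
# space is chosen from word-occurrence statistics; these values are
# placeholders, not figures from the paper.
TABLE_SIZE = {1: 64, 2: 1024, 3: 256, 4: 256}
DEFAULT_SIZE = 256  # used for any word length not listed above


def char_key(ch: str) -> int:
    """Combine a character's GB2312 code with its stroke count (assumed layout)."""
    code = int.from_bytes(ch.encode("gb2312"), "big")
    strokes = STROKES.get(ch, 0) & 0x1F  # keep 5 bits for the stroke count
    return (code << 5) | strokes


def word_hash(word: str) -> tuple[int, int]:
    """Return (word length, slot index) for a word.

    Character keys are folded together with XOR. Each key is shifted by
    its position so that reordered characters can map to different
    values; that shift is an assumption of this sketch.
    """
    acc = 0
    for i, ch in enumerate(word):
        acc ^= char_key(ch) << i
    size = TABLE_SIZE.get(len(word), DEFAULT_SIZE)
    return len(word), acc % size


if __name__ == "__main__":
    for w in ("人", "中国", "大学"):
        print(w, word_hash(w))
```

Selecting the table by word length first means that words of different lengths never collide with each other, which is one plausible reading of "word-length adaptive".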
Keywords :
Algorithm design and analysis; Dictionaries; Feature extraction; Information processing; Probability; Search engines; Vocabulary; Chinese Dictionary; Exclusive-OR Algorithm; Hash Function; Self-Adaptive of word length;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Information Science and Engineering (ICISE), 2010 2nd International Conference on
Conference_Location :
Hangzhou, China
Print_ISBN :
978-1-4244-7616-9
Type :
conf
DOI :
10.1109/ICISE.2010.5691889
Filename :
5691889