DocumentCode :
2039417
Title :
Research of segmentation of Chinese texts in Chinese search engine
Author :
Lixin, Zhou
Author_Institution :
Inst. of Comput. Technol., Acad. Sinica, Beijing, China
Volume :
4
fYear :
2001
fDate :
2001
Firstpage :
2627
Abstract :
Segmenting Chinese texts into Chinese words is a very difficult problem. This paper presents a framework for a Chinese Internet search engine and discusses the characteristics and difficulties of segmenting Chinese texts in Chinese search engines. It concludes that the correctness of Chinese segmentation is most important, puts forward tactics for processing the disambiguation of segmentation strings, new unknown words, and stop words, and presents methods that satisfy the consistency of Chinese segmentation.
Keywords :
Internet; search engines; text analysis; Chinese internet search engine; Chinese text segmentation; Chinese words; new unknown words; processing disambiguation; segmentation strings; stop words; Computers; Content based retrieval; Context modeling; Dictionaries; Indexing; Information retrieval; Internet; Natural languages; Search engines; Sorting;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Systems, Man, and Cybernetics, 2001 IEEE International Conference on
Conference_Location :
Tucson, AZ
ISSN :
1062-922X
Print_ISBN :
0-7803-7087-2
Type :
conf
DOI :
10.1109/ICSMC.2001.972960
Filename :
972960