Regional Information Center for Science and Technology (RICEST) - Research of automatic Chinese word segmentation

DocumentCode :

389283

Title :

Research of automatic Chinese word segmentation

Author :

Liu, Kai-Ying ; Zheng, Jia-heng

Author_Institution :

Comput. Sci. Dept., Shanxi Univ., China

Volume :

fYear :

2002

fDate :

2002

Firstpage :

805

Abstract :

Automatic Chinese word segmentation is a fundamental task in Chinese information processing. At present, ambiguous phrase segmentation and proper name recognition are two obstacles to the performance of Chinese word segmentation systems. We apply a corpus-based method to extract various language phenomena from real texts and combine a statistical model with rules for Chinese word segmentation. This increases segmentation precision by improving ambiguous phrase segmentation and unknown word recognition. Finally, we describe a Chinese word segmentation system developed at Shanxi University.
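
The abstract does not specify the statistical model or the rules, so the following is only a minimal sketch of how a statistical score can resolve an ambiguous phrase. It is not the authors' system. It assumes a unigram word model over a toy lexicon; the lexicon entries, their frequencies, the unknown-character penalty, and the function name `segment` are all hypothetical, and the rule-based component mentioned in the abstract is not shown.

```python
import math

# Hypothetical toy lexicon with made-up relative frequencies (not from the paper).
LEXICON = {
    "研究": 0.02,     # "research"
    "研究生": 0.005,  # "graduate student"
    "生命": 0.01,     # "life"
    "命": 0.001,      # "fate"
    "起源": 0.005,    # "origin"
}
# Assumed penalty for a single character that is not in the lexicon.
UNKNOWN_LOGP = math.log(1e-8)
MAX_WORD_LEN = max(len(w) for w in LEXICON)


def segment(text):
    """Return the segmentation with the highest unigram log-probability."""
    n = len(text)
    # best[i] = (best score of any segmentation of text[:i], start of its last word)
    best = [(-math.inf, 0)] * (n + 1)
    best[0] = (0.0, 0)
    for end in range(1, n + 1):
        for start in range(max(0, end - MAX_WORD_LEN), end):
            word = text[start:end]
            if word in LEXICON:
                logp = math.log(LEXICON[word])
            elif len(word) == 1:
                logp = UNKNOWN_LOGP  # fall back to a single unknown character
            else:
                continue  # multi-character strings outside the lexicon are not words
            score = best[start][0] + logp
            if score > best[end][0]:
                best[end] = (score, start)
    # Walk the back-pointers from the end of the text to recover the words.
    words = []
    pos = n
    while pos > 0:
        start = best[pos][1]
        words.append(text[start:pos])
        pos = start
    return words[::-1]


if __name__ == "__main__":
    # The string can be read as 研究/生命/起源 or as 研究生/命/起源;
    # the unigram scores decide between the two readings.
    print(" / ".join(segment("研究生命起源")))
```

With these assumed frequencies the model prefers 研究 / 生命 / 起源 over 研究生 / 命 / 起源. A combined system of the kind the abstract describes would additionally apply rules, for example for proper names, which this sketch leaves out.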

Keywords :

character recognition; inference mechanisms; knowledge based systems; learning (artificial intelligence); natural languages; text analysis; Chinese information processing; Shanxi University; ambiguous phrase segmentation; automatic Chinese word segmentation; corpus-based method; language phenomena; proper name recognition; statistical model; unknown word recognition; Character recognition; Computer science; Data mining; Information processing; Information retrieval; Large-scale systems; Modems; Natural languages; Text categorization; Text recognition;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Machine Learning and Cybernetics, 2002. Proceedings. 2002 International Conference on

Print_ISBN :

0-7803-7508-4

Type :

conf

DOI :

10.1109/ICMLC.2002.1174493

Filename :

1174493

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=389283