Title :
Resolution to Chinese combinational ambiguity combined corpus-based method with linguistics knowledge
Author :
Liu, JiangYang ; Liu, Ying
Author_Institution :
Dept. of Chinese Language & Literature, Tsinghua Univ., Beijing, China
Abstract :
Combinational ambiguity is a challenging issue in Chinese word segmentation because its disambiguation depends on contextual information. This paper collects contextual information for 28 typical combinational ambiguity strings and uses lexical, syntactic, and semantic knowledge, together with a large-scale corpus, to derive disambiguation rules for these strings. Testing these rules on the 1996 “People's Daily” corpus, we find that the average precision improves from 80.65% to 94.95%. The result shows that rule-based disambiguation is effective for these strings.
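The idea of context-dependent rules for a combinational ambiguity string can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the example string 才能 (one word meaning "talent", or two words 才 + 能 meaning "only then can"), the two toy context rules, and the names `RULES` and `segment_ambiguous` are hypothetical and are not taken from the paper's rule set.

```python
from typing import Callable, Dict, List, Optional

# A rule looks at the text to the left and right of the ambiguous string.
# It returns True (split into two words), False (keep as one word),
# or None (the rule does not apply to this context).
Rule = Callable[[str, str], Optional[bool]]


def split_after_zhiyou(left: str, right: str) -> Optional[bool]:
    # Hypothetical rule: "只有 ... 才 能 ..." ("only if ... can ...") -> split.
    return True if "只有" in left else None


def keep_after_you_or_de(left: str, right: str) -> Optional[bool]:
    # Hypothetical rule: preceded by 有 or 的, the string is a noun -> keep.
    return False if left.endswith(("有", "的")) else None


# Hypothetical rule table: ambiguous string -> ordered list of rules.
RULES: Dict[str, List[Rule]] = {
    "才能": [split_after_zhiyou, keep_after_you_or_de],
}


def segment_ambiguous(left: str, target: str, right: str,
                      default_split: bool = False) -> List[str]:
    """Segment a two-character ambiguous string using the first matching rule.

    Falls back to `default_split` when no rule applies (an assumption that
    stands in for a corpus-derived default).
    """
    decision: Optional[bool] = None
    for rule in RULES.get(target, []):
        decision = rule(left, right)
        if decision is not None:
            break
    if decision is None:
        decision = default_split
    # Assumes a two-character string whose split point is between characters.
    return list(target) if decision else [target]


print(segment_ambiguous("只有努力", "才能", "成功"))
print(segment_ambiguous("他很有", "才能", ""))
```

In this sketch the rules are tried in order and the first one that returns a decision wins; a real rule set would also need lexical, syntactic, and semantic features rather than plain substring checks.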
Keywords :
natural language processing; word processing; Chinese combinational ambiguity; Chinese word segmentation; combinational ambiguity strings; contextual information; corpus based method; linguistic knowledge; semantic knowledge; Computers; Context; Pragmatics; Presses; Semantics; Syntactics; Tagging; combinational ambiguity; rules
Conference_Title :
Fuzzy Systems and Knowledge Discovery (FSKD), 2010 Seventh International Conference on
Conference_Location :
Yantai, Shandong
Print_ISBN :
978-1-4244-5931-5
DOI :
10.1109/FSKD.2010.5569209