DocumentCode :
2026238
Title :
Resolution to Chinese combinational ambiguity combined corpus-based method with linguistics knowledge
Author :
Liu, JiangYang ; Liu, Ying
Author_Institution :
Dept. of Chinese Language & Literature, Tsinghua Univ., Beijing, China
Volume :
3
fYear :
2010
fDate :
10-12 Aug. 2010
Firstpage :
1469
Lastpage :
1473
Abstract :
Combinational ambiguity is a challenging issue in Chinese word segmentation because its disambiguation depends on contextual information. This paper collects contextual information for 28 typical combinational ambiguity strings and uses lexical, syntactic, and semantic knowledge, together with a large-scale corpus, to summarize rules for these strings. Testing these rules on the 1996 “People's Daily” corpus, we find that the average precision rate improves from 80.65% to 94.95%. The result shows that using rules is effective for disambiguation.
Keywords :
natural language processing; word processing; Chinese combinational ambiguity; Chinese word segmentation; combinational ambiguity strings; contextual information; corpus based method; linguistic knowledge; semantic knowledge; Computers; Context; Pragmatics; Presses; Semantics; Syntactics; Tagging; combinational ambiguity; linguistic knowledge; rules
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Fuzzy Systems and Knowledge Discovery (FSKD), 2010 Seventh International Conference on
Conference_Location :
Yantai, Shandong
Print_ISBN :
978-1-4244-5931-5
Type :
conf
DOI :
10.1109/FSKD.2010.5569209
Filename :
5569209