DocumentCode :
682670
Title :
Error feedback based lexical entity extraction for Chinese language modeling
Author :
Yi Liu ; Jing Hua ; Xiangang Li ; Xihong Wu
Author_Institution :
Speech & Hearing Res. Center, Peking Univ., Beijing, China
Volume :
03
fYear :
2013
fDate :
16-18 Dec. 2013
Firstpage :
1298
Lastpage :
1303
Abstract :
Chinese, unlike Western languages, has no standard definition of a word, so choosing a suitable lexicon plays an important role in Chinese language modeling. This paper proposes a novel method for constructing the lexicon automatically. Rather than depending on statistical measures of text features, the method is driven directly by error feedback from the target task, which in this paper is phoneme-to-grapheme conversion. The process consists of two iterative phases: Phase One selects individual words from a large manual lexicon, and Phase Two extracts compound words based on the result of Phase One. Experiments on phoneme-to-grapheme conversion show absolute reductions in character error rate of 1.09% for Phase One and 0.38% for Phase Two, compared with baseline lexicons of the same size generated by the conventional word-frequency method.
Keywords :
iterative methods; natural language processing; statistical analysis; Chinese language modeling; compound words; error feedback; iterative phases; lexical entity extraction; phoneme-to-grapheme conversion; statistical measurement; text features; western languages; Manuals; Training; lexical entity selection
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Image and Signal Processing (CISP), 2013 6th International Congress on
Conference_Location :
Hangzhou
Print_ISBN :
978-1-4799-2763-0
Type :
conf
DOI :
10.1109/CISP.2013.6743873
Filename :
6743873