Title :
Error-Driven Adaptive Language Modeling for Chinese Pinyin-to-Character Conversion
Author :
Huang, Jin Hu ; Powers, David
Author_Institution :
Sch. of Comput. Sci., Flinders Univ., Adelaide, SA, Australia
Abstract :
The performance of Chinese Pinyin-to-Character conversion degrades severely when the characteristics of the training and conversion data differ. Because natural language is highly variable and uncertain, it is impossible to build a complete, general language model that suits every task. Traditional adaptive MAP models mix task-independent data with task-dependent data using a mixture coefficient, but we can never predict what style of language users will have or what new domains will appear. This paper presents a statistical error-driven adaptive language modeling approach for Chinese Pinyin input systems. The model can be adapted incrementally whenever an error occurs during Pinyin-to-Character conversion. It significantly improves the Pinyin-to-Character conversion rate.
Keywords :
natural language processing; statistical analysis; Chinese Pinyin input system; Chinese Pinyin-to-character conversion; adaptive MAP model; conversion data; general language model; language user; mixture coefficient; natural language; statistical error-driven adaptive language modeling; task dependent data; task independent data; Adaptation models; Computational modeling; Context; Data models; Hidden Markov models; Smoothing methods; Training; Adaptive Learning; Chinese Language Processing; Pinyin-to-Character Conversion; Statistical Language Modeling;
Conference_Title :
Asian Language Processing (IALP), 2011 International Conference on
Conference_Location :
Penang
Print_ISBN :
978-1-4577-1733-8
DOI :
10.1109/IALP.2011.46