Title :
Integrating Pinyin to Improve Spelling Errors Detection for Chinese Language
Author :
Peng Jin ; Xingyuan Chen ; Zhaoyi Guo ; Pengyuan Liu
Author_Institution :
Lab. of Intell. Inf. Process., Leshan Normal Univ., Leshan, China
Abstract :
Most Chinese text is typed on a keyboard using one of two input methods, Pinyin or Wubi, with Pinyin being the more common. In this paper, we exploit this user habit to detect spelling errors automatically. We first train a Chinese character-form n-gram language model on a large-scale Chinese corpus in the traditional way. To improve this character-based model, we convert the whole corpus into Pinyin to obtain a Pinyin-based language model. Furthermore, tone is taken into account to obtain a third model. By integrating these three models, we improve the performance of the spelling error checking system. Experimental results demonstrate the effectiveness of our model.
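As an illustration only, the three-model idea in the abstract could be sketched as follows. The abstract does not say how the models are integrated, so the weighted sum of bigram log-probabilities, the add-one smoothing, the weights, and the threshold are all assumptions, as are the names `TONED`, `BigramModel`, `position_scores`, and `suspects` and the toy pronunciation table.

```python
import math
from collections import Counter

# Hypothetical toy lexicon (character -> toned Pinyin). A real system would
# need a full pronunciation dictionary; this table is an assumption.
TONED = {"我": "wo3", "们": "men5", "在": "zai4",
         "再": "zai4", "家": "jia1", "见": "jian4"}

def to_toned(text):
    """Map each character to its toned Pinyin syllable."""
    return [TONED[ch] for ch in text]

def to_plain(text):
    """Map each character to Pinyin with the trailing tone digit removed."""
    return [p[:-1] for p in to_toned(text)]

class BigramModel:
    """Bigram language model with add-one smoothing (assumed, not from the paper)."""

    def __init__(self, sequences):
        self.contexts = Counter()   # how often each token appears as a left context
        self.bigrams = Counter()    # how often each (prev, cur) pair appears
        vocab = set()
        for seq in sequences:
            padded = ["<s>"] + list(seq)
            vocab.update(padded)
            self.contexts.update(padded[:-1])
            self.bigrams.update(zip(padded, padded[1:]))
        self.vocab_size = len(vocab)

    def log_prob(self, prev, cur):
        num = self.bigrams[(prev, cur)] + 1
        den = self.contexts[prev] + self.vocab_size
        return math.log(num / den)

def build_models(corpus):
    """Train the character, Pinyin, and toned-Pinyin models on one corpus."""
    return (BigramModel(list(s) for s in corpus),
            BigramModel(to_plain(s) for s in corpus),
            BigramModel(to_toned(s) for s in corpus))

def position_scores(sentence, models, weights=(0.5, 0.25, 0.25)):
    """Weighted sum of the three models' bigram log-probabilities per position."""
    views = [list(sentence), to_plain(sentence), to_toned(sentence)]
    scores = []
    for i in range(len(sentence)):
        total = 0.0
        for model, view, w in zip(models, views, weights):
            prev = view[i - 1] if i > 0 else "<s>"
            total += w * model.log_prob(prev, view[i])
        scores.append(total)
    return scores

def suspects(sentence, models, threshold):
    """Return indices whose combined score falls below the threshold."""
    scores = position_scores(sentence, models)
    return [i for i, s in enumerate(scores) if s < threshold]
```

With a toy corpus such as `["我们在家", "我们再见", "我们在家"]`, the homophone substitution in "我们再家" lowers the character-level score at the final position while the two Pinyin views still match the corpus, which is the kind of signal a combined model could use. The threshold passed to `suspects` would have to be tuned on held-out data.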
Keywords :
computational linguistics; error detection; natural language processing; spelling aids; Chinese character form n-gram language model; Chinese language; Chinese texts; Pinyin based language model; Pinyin input method; Wubi input method; spelling error detection; Computational modeling; Conferences; Educational institutions; Information processing; Integrated circuit modeling; Keyboards; Pinyin; n-gram language model; spelling error;
Conference_Title :
Web Intelligence (WI) and Intelligent Agent Technologies (IAT), 2014 IEEE/WIC/ACM International Joint Conferences on
Conference_Location :
Warsaw
DOI :
10.1109/WI-IAT.2014.71