Title :
Chinese Spelling Errors Detection Based on CSLM
Author :
Zhaoyi Guo;Xingyuan Chen;Peng Jin;Si-Yuan Jing
Author_Institution :
Sch. of Comput. Sci., Leshan Normal Univ., Leshan, China
Abstract :
Spelling errors are common in electronic documents and can sometimes have serious consequences. Methods based on the n-gram language model are the most commonly used to address this problem. A CSLM (continuous space language model), which represents each word as a vector, differs from such traditional models. In this paper, we experiment with a specific CSLM, the CBOW (Continuous Bag-of-Words) model, to detect spelling errors. Since spelling errors in Chinese are usually treated as wrong characters rather than wrong words, we trained character vectors on a large Chinese corpus and then judged whether a Chinese character is correct by its probability of occurring in a given context. Experimental results show that the CSLM-based method outperforms the n-gram language model.
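The character-level decision described in the abstract can be illustrated with a minimal sketch. It assumes character vectors have already been trained; the tiny hand-written vectors, the window size, the threshold, and the function names `cbow_probability` and `detect_errors` are all hypothetical and chosen only for illustration. The paper's actual training setup and decision rule may differ.

```python
import math

# Hypothetical toy vectors standing in for trained CBOW parameters.
# IN_VECS are the context (input) vectors, OUT_VECS the prediction (output) vectors.
IN_VECS = {
    "天": [1.0, 0.0], "气": [1.0, 0.0], "晴": [1.0, 0.0],
    "汽": [0.0, 1.0], "车": [0.0, 1.0],
}
OUT_VECS = {ch: [2.0 * x for x in vec] for ch, vec in IN_VECS.items()}


def cbow_probability(target, context, in_vecs, out_vecs):
    """P(target | context) under a CBOW-style model with a full softmax."""
    dim = len(out_vecs[target])
    # Hidden layer: the average of the context characters' input vectors.
    hidden = [sum(in_vecs[c][i] for c in context) / len(context) for i in range(dim)]
    # Score every vocabulary character against the hidden vector.
    scores = {
        ch: sum(h * w for h, w in zip(hidden, vec)) for ch, vec in out_vecs.items()
    }
    # Numerically stable softmax over the whole vocabulary.
    max_score = max(scores.values())
    exp_scores = {ch: math.exp(s - max_score) for ch, s in scores.items()}
    return exp_scores[target] / sum(exp_scores.values())


def detect_errors(sentence, in_vecs, out_vecs, window=2, threshold=0.1):
    """Return positions whose character is improbable given its neighbours."""
    flagged = []
    for i, ch in enumerate(sentence):
        # Context: up to `window` characters on each side, excluding position i.
        context = list(sentence[max(0, i - window):i] + sentence[i + 1:i + 1 + window])
        if not context:
            continue
        if cbow_probability(ch, context, in_vecs, out_vecs) < threshold:
            flagged.append(i)
    return flagged


print(detect_errors("天汽晴", IN_VECS, OUT_VECS))  # 汽 written in place of 气
print(detect_errors("天气晴", IN_VECS, OUT_VECS))  # correctly spelled
```

With these toy vectors, only the character 汽 in the first sentence falls below the assumed threshold. In a realistic setting the vectors would come from a model trained on a large corpus, and the threshold would need to be tuned on held-out data.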
Keywords :
"Computational modeling","Mathematical model","Context","Training","Yttrium","Predictive models","Pragmatics"
Conference_Title :
Web Intelligence and Intelligent Agent Technology (WI-IAT), 2015 IEEE / WIC / ACM International Conference on
DOI :
10.1109/WI-IAT.2015.62