Title :
Context-Sensitive Arabic Spell Checker Using Context Words and N-Gram Language Models
Author :
Majed M. Al-Jefri; Sabri A. Mahmoud
Author_Institution :
Inf. &
Abstract :
This paper addresses real-word spell checking for Arabic using context words and n-gram language models. A corpus covering a range of Arabic topics is collected. Real-word errors are commonly addressed with a collection of confusion sets; twenty-eight confusion sets are used in our experiments. These sets were drawn from the words most commonly confused by non-native Arabic speakers and from words misrecognized by OCR. The probabilities of the context words for each confusion set are estimated using a window-based technique. N-gram language models are used to detect real-word errors and, once an error is found, to choose the best correction. A prototype context-sensitive spell checker that automatically detects and corrects real-word errors in Arabic text is implemented. The experimental results show promising correction accuracy.
Keywords :
"Context","Accuracy","Training","Probability","Context modeling","Optical character recognition software","Semantics"
Conference_Title :
2013 Taibah University International Conference on Advances in Information Technology for the Holy Quran and Its Sciences (32519)
DOI :
10.1109/NOORIC.2013.59
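Illustration :
The abstract names two techniques: window-based estimation of context-word probabilities for each confusion set, and n-gram language models for choosing a correction. The record gives no implementation details, so the following is only a minimal sketch of how such scoring could look, not the authors' method. Everything in it is an assumption made for illustration: the function names, the window size of 3, add-one smoothing, the use of bigrams, and the tiny English stand-in corpus with the hypothetical confusion set {"peace", "piece"} (used in place of Arabic data, which the record does not include).

```python
import math
from collections import Counter, defaultdict

def train_context_counts(sentences, confusion_set, window=3):
    """Count words seen within +/- `window` tokens of each confusion-set word."""
    context = defaultdict(Counter)
    totals = Counter({w: 0 for w in confusion_set})
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok in confusion_set:
                totals[tok] += 1
                lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
                for j in range(lo, hi):
                    if j != i:
                        context[tok][tokens[j]] += 1
    return context, totals

def context_score(cand, tokens, i, context, totals, vocab_size, window=3):
    """Smoothed log prior of `cand` plus smoothed log likelihood of its context."""
    score = math.log((totals[cand] + 1) / (sum(totals.values()) + len(totals)))
    denom = sum(context[cand].values()) + vocab_size
    lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
    for j in range(lo, hi):
        if j != i:
            score += math.log((context[cand][tokens[j]] + 1) / denom)
    return score

def train_bigrams(sentences):
    """Collect unigram and bigram counts with sentence boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams

def bigram_logprob(tokens, unigrams, bigrams, vocab_size):
    """Add-one-smoothed bigram log probability of a whole sentence."""
    padded = ["<s>"] + tokens + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab_size))
        for a, b in zip(padded, padded[1:])
    )

def correct(tokens, confusion_set, score_fn):
    """Replace each confusion-set word with its highest-scoring alternative."""
    out = list(tokens)
    for i, tok in enumerate(tokens):
        if tok in confusion_set:
            out[i] = max(sorted(confusion_set), key=lambda c: score_fn(c, out, i))
    return out

# Toy stand-in training data (hypothetical, English instead of Arabic).
corpus = [
    ["we", "signed", "a", "peace", "treaty"],
    ["the", "peace", "treaty", "ended", "the", "war"],
    ["a", "piece", "of", "cake"],
    ["she", "ate", "a", "piece", "of", "bread"],
]
confusion = {"peace", "piece"}
vocab_size = len({w for s in corpus for w in s})

context, totals = train_context_counts(corpus, confusion)
unigrams, bigrams = train_bigrams(corpus)

def by_context(cand, toks, i):
    return context_score(cand, toks, i, context, totals, vocab_size)

def by_bigram(cand, toks, i):
    return bigram_logprob(toks[:i] + [cand] + toks[i + 1:], unigrams, bigrams, vocab_size)

sentence = ["he", "ate", "a", "peace", "of", "cake"]
fixed_by_context = correct(sentence, confusion, by_context)
fixed_by_bigram = correct(sentence, confusion, by_bigram)
print(fixed_by_context)
print(fixed_by_bigram)
```

In this sketch both scorers treat every occurrence of a confusion-set word as a potential error and keep the word only if it outscores its alternatives. A real system would need a much larger corpus and a decision threshold to avoid over-correcting.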