DocumentCode :
3667288
Title :
A content-based method for Persian real-word spell checking
Author :
Mohammad Hossein Samani;Zeinab Rahimi;Sara Rahimi
Author_Institution :
Department of infrastructure security, Research Center for Developing Advanced Technologies (RCDAT), Tehran, Iran
fYear :
2015
fDate :
5/1/2015 12:00:00 AM
Firstpage :
1
Lastpage :
5
Abstract :
This paper presents a content-based method for real-word spell checking in the Persian language. In this method, real-word mistakes are classified into five categories and resolved using a content-based procedure. Each word that may cause a real-word error is listed in a candidate set, in the same entry as its similar words (so that potential mistakes share a single entry). In the next step, a content-word list is constructed for each word in the confusion set, based on frequent adjacent N-grams. Evaluations indicate that the proposed method not only provides promising performance and acceptable precision, but also outperforms a similar existing system in terms of precision and recall.
Keywords :
"Dictionaries","Simulated annealing","Semantics","Bayes methods","Training","Text processing"
Publisher :
IEEE
Conference_Title :
Information and Knowledge Technology (IKT), 2015 7th Conference on
Print_ISBN :
978-1-4673-7483-5
Type :
conf
DOI :
10.1109/IKT.2015.7288791
Filename :
7288791