DocumentCode
560933
Title
Application of document spelling checker for Bahasa Indonesia
Author
Aqsath, R.N. ; Kamayani, Mia ; Reinanda, Ridho ; Simbolon, Simon ; Soleh, Moch Yusup ; Purwarianti, Ayu
Author_Institution
Sch. of Electr. & Inf. Eng., Bandung Inst. of Technol., Bandung, Indonesia
fYear
2011
fDate
17-18 Dec. 2011
Firstpage
249
Lastpage
252
Abstract
A document spelling checker for Bahasa Indonesia is highly needed, yet no such application is currently available. Existing research on Indonesian spelling checking has not been developed into a complete document spelling checker. In this research, we compare several methods for Indonesian spelling checking, especially for word error detection, and analyze which methods are best suited to building an Indonesian document spelling checker application. The main idea is to employ a complete word list as the reference. The Indonesian document spelling checker consists of five main components: document preprocessing, word error detection, word error correction, word candidate ranking, and user feedback. Document preprocessing converts the document into a list of unique words, which are then analyzed by the spelling checker. In word error detection, binary search and hashing are used to speed up the lookup. In word error correction, forward reverse and a similarity measure score are employed. In candidate ranking, an HMM is used to select the best candidate for the correct word. Using 13,000 words as the lexicon resource and 10 documents as test documents, the experiments achieved 93.7% accuracy. The remaining errors are caused by words missing from the lexicon resource and by special repeated-word forms.
Keywords
document handling; hidden Markov models; natural language processing; Bahasa Indonesia; HMM; Indonesian document; Indonesian spelling checker; document preprocess; document spelling checker application; hidden markov model; word candidate ranking; word error correction; word error detection; Accuracy; Dictionaries; Forward error correction; Hidden Markov models; Vocabulary;
fLanguage
English
Publisher
IEEE
Conference_Titel
Advanced Computer Science and Information System (ICACSIS), 2011 International Conference on
Conference_Location
Jakarta
Print_ISBN
978-1-4577-1688-1
Type
conf
Filename
6140765
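Illustrative sketch
The first three components named in the abstract (document preprocessing, word error detection, and candidate generation by similarity score) can be sketched roughly as below. This is not the authors' implementation: the tiny lexicon, every function name, and the normalized edit-distance similarity are assumptions made for illustration, the paper's forward reverse step is not reproduced, and the HMM ranking is replaced by a plain sort on the similarity score.

```python
import re
from bisect import bisect_left

# Hypothetical toy lexicon; the paper reports a 13,000-word list.
LEXICON = sorted(["buku", "makan", "membaca", "saya", "sekolah"])
LEXICON_SET = set(LEXICON)  # hash-based lookup structure


def unique_words(text):
    """Preprocess: lowercase, tokenize, and deduplicate, keeping first-seen order."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return list(dict.fromkeys(tokens))


def in_lexicon_binary(word):
    """Error detection via binary search over the sorted lexicon."""
    i = bisect_left(LEXICON, word)
    return i < len(LEXICON) and LEXICON[i] == word


def in_lexicon_hash(word):
    """Error detection via hashing (set membership)."""
    return word in LEXICON_SET


def edit_distance(a, b):
    """Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def similarity(a, b):
    """Assumed similarity score: 1 minus the length-normalized edit distance."""
    return 1 - edit_distance(a, b) / max(len(a), len(b))


def suggest(word, k=3):
    """Return the k lexicon words most similar to `word` (stand-in for HMM ranking)."""
    return sorted(LEXICON, key=lambda w: (-similarity(word, w), w))[:k]


def check(text):
    """Map each unique out-of-lexicon word to its ranked correction candidates."""
    return {w: suggest(w) for w in unique_words(text) if not in_lexicon_hash(w)}
```

Calling `check("Saya membca buku buku")` flags only `membca`, because deduplication collapses the repeated `buku` and the other words are in the lexicon. Both lookup functions return the same answers; the set gives average constant-time membership tests, while binary search needs only the sorted list.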