• DocumentCode
    560933
  • Title
    Application of document spelling checker for Bahasa Indonesia
  • Author
    Aqsath, R.N. ; Kamayani, Mia ; Reinanda, Ridho ; Simbolon, Simon ; Soleh, Moch Yusup ; Purwarianti, Ayu
  • Author_Institution
    Sch. of Electr. & Inf. Eng., Bandung Inst. of Technol., Bandung, Indonesia
  • fYear
    2011
  • fDate
    17-18 Dec. 2011
  • Firstpage
    249
  • Lastpage
    252
  • Abstract
    A document spelling checker for Bahasa Indonesia is much needed, yet no such application is currently available, and existing research on Indonesian spelling checking has not been developed into a complete document spelling checker. In this research, we compare several methods used for Indonesian spelling checking, especially for word error detection, and analyze which methods are best suited to building an Indonesian document spelling checker application. The main idea is to use a complete word list as the reference. The Indonesian document spelling checker consists of five main components: document preprocessing, word error detection, word error correction, word candidate ranking, and user feedback. Document preprocessing converts the document into a list of unique words, which are then analyzed by the spelling checker. In word error detection, binary search and hashing are used to speed up the lookup. In word error correction, forward reverse and a similarity measure score are employed. In candidate ranking, an HMM is used to select the best correction candidate. Using 13,000 words as the lexicon resource and 10 test documents, the experiments achieved 93.7% accuracy. The remaining errors are caused by words missing from the lexicon resource and by special repetition word forms.
  • Keywords
    document handling; hidden Markov models; natural language processing; Bahasa Indonesia; HMM; Indonesian document; Indonesian spelling checker; document preprocess; document spelling checker application; hidden markov model; word candidate ranking; word error correction; word error detection; Accuracy; Dictionaries; Forward error correction; Hidden Markov models; Vocabulary;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advanced Computer Science and Information System (ICACSIS), 2011 International Conference on
  • Conference_Location
    Jakarta
  • Print_ISBN
    978-1-4577-1688-1
  • Type
    conf
  • Filename
    6140765
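
Illustration (not part of the record): the abstract describes preprocessing a document into a list of unique words and then detecting errors by looking each word up in a reference word list using binary search and hashing. The following is a minimal sketch of those two steps only, under stated assumptions: the five-word lexicon, the tokenization rule, and all function names are hypothetical and are not taken from the paper, whose 13,000-word lexicon is not reproduced here. Correction, HMM ranking, and user feedback are not shown.

```python
import bisect
import re

# Hypothetical toy lexicon (assumption); the paper's word list is far larger.
LEXICON = sorted(["buku", "ini", "membaca", "saya", "sedang"])
LEXICON_SET = set(LEXICON)  # hash-based structure for constant-time lookup


def unique_words(text):
    """Lowercase the text and return its unique alphabetic tokens in first-seen order."""
    seen = {}
    for token in re.findall(r"[a-z]+", text.lower()):
        seen.setdefault(token, None)
    return list(seen)


def in_lexicon_binary(word, lexicon=LEXICON):
    """Look the word up with binary search over the sorted word list."""
    i = bisect.bisect_left(lexicon, word)
    return i < len(lexicon) and lexicon[i] == word


def in_lexicon_hash(word, lexicon_set=LEXICON_SET):
    """Look the word up with a hash-based set membership test."""
    return word in lexicon_set


def detect_errors(text):
    """Return the unique words of the text that are absent from the lexicon."""
    return [w for w in unique_words(text) if not in_lexicon_hash(w)]
```

Both lookups give the same answer for any word; the sorted list suits a compact read-only lexicon, while the set trades memory for faster average lookup. Note that this simplified tokenizer splits hyphenated repetition forms such as "buku-buku" into separate tokens, which sidesteps rather than handles the repetition word forms the abstract names as an error source.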