• DocumentCode
    782242
  • Title
    A comparison of standard spell checking algorithms and a novel binary neural approach
  • Author
    Hodge, Victoria J.; Austin, Jim
  • Author_Institution
    Dept. of Comput. Sci., York Univ., UK
  • Volume
    15
  • Issue
    5
  • fYear
    2003
  • Firstpage
    1073
  • Lastpage
    1081
  • Abstract
    In this paper, we propose a simple, flexible, and efficient hybrid spell checking methodology based upon phonetic matching, supervised learning, and associative matching in the AURA neural system. We integrate Hamming distance and n-gram algorithms, which have high recall for typing errors, with a phonetic spell checking algorithm in a single novel architecture. Our approach is suitable for any spell checking application, though it is aimed toward isolated word error correction, particularly spell checking user queries in a search engine. We use a novel scoring scheme to integrate the words retrieved by each spelling approach and calculate an overall score for each matched word. From the overall scores, we can rank the possible matches. We evaluate our approach against several benchmark spell checking algorithms for recall accuracy. Our proposed hybrid methodology has the highest recall rate of the techniques evaluated and a low computational cost.
  • Keywords
    learning (artificial intelligence); neural nets; pattern matching; spelling aids; AURA neural system; Hamming Distance; associative matching; benchmark spellchecking algorithms; binary neural approach; isolated word error correction; n-gram algorithms; phonetic matching; phonetic spell-checking algorithm; recall accuracy; recall rate; scoring scheme; search engine; spell checking application; standard spell checking algorithms; supervised learning; user queries; Costs; Error correction; Hamming distance; Humans; Internet; Neural networks; Robustness; Search engines; Supervised learning;
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type
    jour
  • DOI
    10.1109/TKDE.2003.1232265
  • Filename
    1232265
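
The abstract describes combining Hamming distance, n-gram, and phonetic matching into one overall score used to rank candidate words. The sketch below illustrates that general idea only. It is not the AURA implementation, and the abstract does not give the paper's scoring scheme: the normalisation, the unweighted sum in `combined_score`, and the toy `crude_phonetic_key` function are all assumptions made here for illustration.

```python
def hamming_similarity(a, b):
    """Fraction of aligned positions where the two words agree."""
    if not a or not b:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def ngrams(word, n=3):
    """Set of character n-grams, padded with '#' so word edges count."""
    padded = "#" * (n - 1) + word + "#" * (n - 1)
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def ngram_similarity(a, b, n=3):
    """Jaccard overlap of the two words' n-gram sets."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    return len(ga & gb) / len(ga | gb)


def crude_phonetic_key(word):
    """Toy phonetic code (an assumption, not the paper's algorithm):
    keep the first letter, drop later vowels and h/w/y, and skip a
    consonant identical to the last kept letter."""
    word = word.lower()
    if not word:
        return ""
    key = [word[0]]
    for ch in word[1:]:
        if ch in "aeiouyhw":
            continue
        if ch != key[-1]:
            key.append(ch)
    return "".join(key)


def combined_score(query, candidate):
    """Hypothetical overall score: unweighted sum of the three signals."""
    phonetic = 1.0 if crude_phonetic_key(query) == crude_phonetic_key(candidate) else 0.0
    return (hamming_similarity(query, candidate)
            + ngram_similarity(query, candidate)
            + phonetic)


def rank_candidates(query, lexicon):
    """Return (word, score) pairs sorted from best to worst match."""
    scored = [(word, combined_score(query, word)) for word in lexicon]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for word, score in rank_candidates("recieve", ["receive", "recipe", "relieve", "deceive"]):
        print(f"{word}\t{score:.3f}")
```

Here each signal is scaled to the range 0 to 1 before summing so that no single matcher dominates. A real system would more plausibly tune per-matcher weights on labelled misspellings, which this sketch does not attempt.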