• DocumentCode
    3105392
  • Title

    Automatic Spelling Correction Rule Extraction and Application for Spoken-Style Korean Text

  • Author

    Byun, Jeunghyun ; Rim, Hae-Chang ; Park, So-Young

  • fYear
    2007
  • fDate
    22-24 Aug. 2007
  • Firstpage
    195
  • Lastpage
    199
  • Abstract
    Spoken-style text is now prevalent because much information, such as Short Message Service (SMS) messages, is written in spoken style. However, spoken-style text contains more spelling errors than traditional written-style text. In this paper, we propose a rule-based spelling correction system that automatically extracts spelling correction rules from a correction corpus and applies the extracted rules to spelling errors in input sentences. To preserve both high precision and high recall, we devise a candidate-elimination algorithm that determines the appropriate context size of each spelling correction rule based on rule accuracy. Experimental results show that the proposed system extracts 42,537 spelling correction rules and applies them to correct spelling errors in the test corpus, increasing precision from 31.08% to 79.04% on a per-message basis.
  • Keywords
    Cellular phones; Computer errors; Data mining; Error correction; Information technology; Internet; Laboratories; Natural language processing; Natural languages; System testing; Spelling correction; Spoken-style Korean Text; Candidate-elimination algorithm
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advanced Language Processing and Web Information Technology, 2007. ALPIT 2007. Sixth International Conference on
  • Conference_Location
    Luoyang, Henan, China
  • Print_ISBN
    978-0-7695-2930-1
  • Type

    conf

  • DOI
    10.1109/ALPIT.2007.102
  • Filename
    4460639