Large scale experiments on correction of confused words

Author

Huang, Jin Hu ; Powers, David

Author_Institution

Sch. of Inf. & Eng., Flinders Univ. of South Australia, Bedford Park, SA, Australia

fYear

2001

fDate

2001

Firstpage

77

Lastpage

82

Abstract

The paper describes a new approach to automatically learning contextual knowledge for spelling and grammar correction. We aim particularly to deal with cases where every word is in the dictionary, so it is not obvious that there is an error. Traditional approaches are dictionary based, or use elementary tagging or partial parsing of the sentence to obtain context knowledge. Our approach uses affix information and only the most frequent words to reduce the complexity, in terms of training time and running time, of context-sensitive spelling correction. We build large scale confused word sets based on keyboard adjacency and apply our new approach to learn the contextual knowledge needed to detect and correct them. We explore the performance of auto-correction under conditions where significance and probability are set by the user.
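
The abstract mentions building confused word sets from keyboard adjacency but does not say how. The sketch below is one possible reading, not the authors' procedure. It assumes a QWERTY layout, a coarse neighbour model (same row plus the nearest keys in the rows above and below), and single-letter substitution as the only error type. The names `QWERTY_ROWS`, `build_adjacency`, and `confused_words`, and the toy vocabulary, are hypothetical.

```python
# Assumed layout: the three letter rows of a QWERTY keyboard.
QWERTY_ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]


def build_adjacency(rows=QWERTY_ROWS):
    """Map each key to the set of keys assumed to be physically next to it."""
    adjacent = {}
    for r, row in enumerate(rows):
        for c, key in enumerate(row):
            neighbours = set()
            # Left and right neighbours in the same row.
            for dc in (-1, 1):
                if 0 <= c + dc < len(row):
                    neighbours.add(row[c + dc])
            # Nearest keys in the rows directly above and below.
            for dr in (-1, 1):
                if 0 <= r + dr < len(rows):
                    other = rows[r + dr]
                    for dc in (-1, 0, 1):
                        if 0 <= c + dc < len(other):
                            neighbours.add(other[c + dc])
            adjacent[key] = neighbours
    return adjacent


def confused_words(word, vocabulary, adjacency):
    """Return dictionary words reachable from `word` by one adjacent-key slip."""
    result = set()
    for i, ch in enumerate(word):
        for sub in adjacency.get(ch, ()):
            candidate = word[:i] + sub + word[i + 1:]
            if candidate in vocabulary:
                result.add(candidate)
    return result


# Toy vocabulary for illustration only; not data from the paper.
vocabulary = {"form", "from", "firm", "farm", "fork", "dorm"}
adjacency = build_adjacency()
confusions = confused_words("form", vocabulary, adjacency)
```

With this toy vocabulary, `confusions` contains the words that differ from "form" by one adjacent-key substitution. "from" is excluded because it is a transposition, and "farm" because "o" and "a" are not neighbouring keys. Every member is a valid dictionary word, so a dictionary lookup alone cannot flag the error, which is why the abstract turns to contextual knowledge.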

Keywords

grammars; linguistics; spelling aids; text analysis; affix information; auto-correction; automatic learning; confused word correction; context knowledge; context-sensitive spelling correction; contextual knowledge; elementary tagging; grammar correction; keyboard adjacency; large scale confused word sets; large scale experiments; most frequent words; partial parsing; probability; running time; spelling correction; training time; Dictionaries; Error analysis; Error correction; Frequency; Humans; Informatics; Keyboards; Knowledge engineering; Large-scale systems; Tagging

fLanguage

English

Publisher

IEEE

Conference_Title

Computer Science Conference, 2001. ACSC 2001. Proceedings. 24th Australasian

Conference_Location

Gold Coast, Qld.

ISSN

1530-0900

Print_ISBN

0-7695-0963-0

Type

conf

DOI

10.1109/ACSC.2001.906626

Filename

906626