DocumentCode :
3208073
Title :
Supervised and unsupervised automatic spelling correction algorithms
Author :
Van Delden, Sebastian ; Bracewell, David ; Gomez, Fernando
Author_Institution :
Dept. of Math. & Comput. Sci., South Carolina Univ., Spartanburg, SC, USA
fYear :
2004
fDate :
8-10 Nov. 2004
Firstpage :
530
Lastpage :
535
Abstract :
We present two algorithms for automatically improving the quality of texts which contain a large number of spelling errors. A supervised algorithm, which automatically corrects unknown words that are generated primarily from typing errors, is presented first. The second algorithm is an unsupervised approach to automatically correcting typing errors, individual words that have been split, multiple words which have been concatenated, and a combination of these errors. The algorithms have been developed and tested on a large source of real-world, human- and machine-generated spelling errors.
Keywords :
natural languages; text analysis; word processing; automatic spelling correction algorithms; spelling errors; supervised algorithm; unsupervised approach; Computer errors; Computer science; Concatenated codes; Databases; Error correction; Filtering algorithms; Humans; Information retrieval; NASA; Natural languages
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Information Reuse and Integration, 2004. IRI 2004. Proceedings of the 2004 IEEE International Conference on
Print_ISBN :
0-7803-8819-4
Type :
conf
DOI :
10.1109/IRI.2004.1431515
Filename :
1431515