DocumentCode :
3196697
Title :
Thai OCR error correction using genetic algorithm
Author :
Kruatrachue, Boontee ; Somguntar, Krich ; Siriboon, Kritawan
Author_Institution :
King Mongkut's Inst. of Technol., Bangkok, Thailand
fYear :
2002
fDate :
2002
Firstpage :
137
Lastpage :
141
Abstract :
This paper presents an efficient method for Thai OCR error correction based on a genetic algorithm (GA). The correction process starts by constructing a word graph from dictionary-based spell checking; the graph is then searched for the corrected sentence with the highest perplexity (using bi-gram and tri-gram language models) and word probability from the OCR. For a long sentence the search space is huge, and GA is used to keep the search tractable. A list of nodes, rather than a standard binary string, is used as the chromosome encoding to represent all possible paths in the graph. The performance of the suggested technique is evaluated and compared with full search on test sentences of different sizes, constructed from word graphs of 10 to 200 nodes.
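The idea of encoding a chromosome as a list of word-graph nodes can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy graph, the made-up OCR and bigram probabilities, the log-probability fitness, and the crossover and mutation operators. It only shows how a path-shaped chromosome stays valid under GA operators and how the result can be checked against a full search.

```python
import math
import random

# Hypothetical word graph: each node is (start, end, word, ocr_prob).
# A path is valid when it starts at position 0, each node begins where the
# previous one ended, and the last node ends at END.
NODES = [
    (0, 2, "ab", 0.6), (0, 1, "a", 0.3), (1, 2, "b", 0.5),
    (2, 4, "cd", 0.7), (2, 3, "c", 0.4), (3, 4, "d", 0.4),
    (4, 5, "e", 0.9),
]
END = 5

# Made-up bigram probabilities; unseen word pairs get a small floor value.
BIGRAM = {("ab", "cd"): 0.5, ("cd", "e"): 0.6, ("a", "b"): 0.2}
FLOOR = 0.05

def successors(pos):
    """Indices of nodes that start at the given position."""
    return [i for i, n in enumerate(NODES) if n[0] == pos]

def random_path(rng, pos=0):
    """Chromosome = list of node indices forming a path from pos to END."""
    path = []
    while pos != END:
        node = rng.choice(successors(pos))
        path.append(node)
        pos = NODES[node][1]
    return path

def is_valid(path):
    pos = 0
    for i in path:
        if NODES[i][0] != pos:
            return False
        pos = NODES[i][1]
    return pos == END

def fitness(path):
    """Assumed score: log OCR word probabilities plus log bigram probabilities."""
    score = sum(math.log(NODES[i][3]) for i in path)
    for a, b in zip(path, path[1:]):
        score += math.log(BIGRAM.get((NODES[a][2], NODES[b][2]), FLOOR))
    return score

def crossover(rng, p1, p2):
    """Cut both parents at a shared boundary so the child is still a path."""
    ends1 = {NODES[n][1]: k for k, n in enumerate(p1)}
    ends2 = {NODES[n][1]: k for k, n in enumerate(p2)}
    cut = rng.choice(sorted(set(ends1) & set(ends2)))  # END is always shared
    return p1[: ends1[cut] + 1] + p2[ends2[cut] + 1 :]

def mutate(rng, path):
    """Keep a random prefix and regrow the remainder of the path at random."""
    k = rng.randrange(len(path))
    return path[:k] + random_path(rng, NODES[path[k]][0])

def genetic_search(seed=0, pop_size=12, generations=20, mutation_rate=0.3):
    rng = random.Random(seed)
    pop = [random_path(rng) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # the better half survives unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            child = crossover(rng, *rng.sample(parents, 2))
            if rng.random() < mutation_rate:
                child = mutate(rng, child)
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

def full_search(pos=0):
    """Enumerate every path from pos to END (feasible only for small graphs)."""
    if pos == END:
        return [[]]
    return [[n] + rest for n in successors(pos)
            for rest in full_search(NODES[n][1])]

if __name__ == "__main__":
    best = max(full_search(), key=fitness)
    found = genetic_search()
    print([NODES[i][2] for i in best], [NODES[i][2] for i in found])
```

Because both operators splice paths only at shared boundary positions, every chromosome remains a valid path through the graph, which a plain binary string would not guarantee. On a graph this small the full search is trivial; the comparison matters only once the number of paths grows too large to enumerate.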
Keywords :
error correction; genetic algorithms; optical character recognition; protocols; Thai OCR error correction; chromosome encoding; dictionary; genetic algorithm; spell checking; word graph construction; word probability; Character generation; Character recognition; Dictionaries; Error correction; Genetic algorithms; Natural languages; Optical character recognition software; Read only memory; Testing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Cyber Worlds, 2002. Proceedings. First International Symposium on
Print_ISBN :
0-7695-1862-1
Type :
conf
DOI :
10.1109/CW.2002.1180870
Filename :
1180870