DocumentCode :
935440
Title :
Hybrid contextural text recognition with string matching
Author :
Sinha, R.M.K. ; Prasada, Birendra ; Houle, Gilles F. ; Sabourin, Michael
Author_Institution :
Indian Inst. of Technol., Kanpur, India
Volume :
15
Issue :
9
fYear :
1993
fDate :
9/1/1993
Firstpage :
915
Lastpage :
925
Abstract :
The hybrid contextural algorithm for reading real-life documents printed in varying fonts of any size is presented. Text is recognized progressively in three passes. The first pass generates character hypotheses, the second generates word hypotheses, and the third verifies the word hypotheses. During the first pass, isolated characters are recognized using a dynamic contour warping classifier. Transient statistical information is collected to accelerate the recognition process and to verify hypotheses in later processing. A transient dictionary consisting of high-confidence nondictionary words is constructed in this pass. During the second pass, word-level hypotheses are generated using hybrid contextual text processing. Nondictionary words are recognized using a modified Viterbi algorithm, a string matching algorithm utilizing n-grams, special handlers for touching characters, pragmatic handlers for numerals, punctuation, hyphens, and apostrophes, and a prefix/suffix handler. This processing usually generates several word hypotheses. During the third pass, word-level verification occurs.
Keywords :
document image processing; optical character recognition; character hypothesis; dynamic contour warping classifier; hybrid contextural algorithm; hypothesis verification; modified Viterbi algorithm; progressive recognition; real-life documents; string matching; text recognition; transient dictionary; transient statistical information; word hypothesis; Acceleration; Algorithm design and analysis; Character generation; Character recognition; Costs; Dictionaries; Hybrid power systems; Senior members; Text recognition; Viterbi algorithm;
fLanguage :
English
Journal_Title :
Pattern Analysis and Machine Intelligence, IEEE Transactions on
Publisher :
IEEE
ISSN :
0162-8828
Type :
jour
DOI :
10.1109/34.232077
Filename :
232077