DocumentCode :
3142751
Title :
Partitioning and searching dictionary for correction of optically read Devanagari character strings
Author :
Bansal, Veena ; Sinha, R.M.K.
Author_Institution :
Dept. of Comput. Sci. & Eng., Indian Inst. of Technol., Kanpur, India
fYear :
1999
fDate :
20-22 Sep 1999
Firstpage :
653
Lastpage :
656
Abstract :
This paper describes a correction method for optically read Devanagari character strings which uses a partitioned word dictionary. The word dictionary is partitioned to reduce the search space and to prevent a forced match to an incorrect word. The envelope information of a word, consisting of the number of top, lower, and core modifiers along with the number of core characters, forms the second-level partitioning feature for short-word partitions. The remaining words are further partitioned using tags. A tag is a string of fixed length associated with each partition. The search process uses a distance matrix to assign a penalty for a mismatch. An improvement of approximately 20% in recognition performance is obtained.
Keywords :
dictionaries; document image processing; optical character recognition; search problems; Devanagari character strings; OCR; correction method; distance matrix; partitioned word dictionary; performance; search space; tags; word partitioning; Computer errors; Computer science; Dictionaries; Frequency estimation; Keyboards; Natural languages; Space technology; Speech recognition; Strips; Text recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis and Recognition, 1999. ICDAR '99. Proceedings of the Fifth International Conference on
Conference_Location :
Bangalore
Print_ISBN :
0-7695-0318-7
Type :
conf
DOI :
10.1109/ICDAR.1999.791872
Filename :
791872