Title :
A shape based post processor for Gurmukhi OCR
Author :
Lehal, G.S. ; Singh, Chandan ; Lehal, Ritu
Author_Institution :
Dept. of Comput. Sci. & Eng., Thapar Inst. of Eng. & Technol., Patiala, India
fDate :
2001
Abstract :
A shape-based post-processing system for Gurmukhi script OCR has been developed. The Punjabi corpus is split into partitions based on the size and shape of each word. The post-processor combines statistical information on Punjabi syllable combinations, corpus look-up, and holistic recognition of the most commonly occurring words. On machine-printed images, the post-processing techniques improve the recognition rate by about 3 percentage points, from 94.35% to 97.34%.
Keywords :
character sets; document image processing; optical character recognition; Gurmukhi script; OCR; Punjabi language; corpora look up; holistic recognition; shape based post processing system; statistical information; syllable combination; Computer science; Dictionaries; Image recognition; Image segmentation; Natural languages; Optical character recognition software; Process design; Shape; Statistical analysis; Text recognition;
Conference_Title :
Document Analysis and Recognition, 2001. Proceedings. Sixth International Conference on
Conference_Location :
Seattle, WA
Print_ISBN :
0-7695-1263-1
DOI :
10.1109/ICDAR.2001.953957