Title :
Synthetic handwritten Odia numeral database: From shallow hundreds to comprehensive thousands
Author :
Kalyan S Dash;N B Puhan;Ganapati Panda
Author_Institution :
School of Electrical Sciences, Indian Institute of Technology, Bhubaneswar, India
Abstract :
A comprehensive database that contains all possible variations of handwriting is crucial for training and recognition. The primary challenge for an optical character recognizer (OCR) is that characters from different classes often bear structural resemblance, whereas images within a single class can differ considerably. Acquiring a database large enough to ensure robust training of the recognizer is a painstaking task. Recent research has therefore focused on creating, from a few handwriting samples, a comprehensive synthetic database that not only preserves naturalness but also provides much-needed pattern variability. In this paper, we propose a new approach to synthetic handwritten numeral generation for the Odia language using interclass deformation. We experimentally evaluate the generated databases using state-of-the-art recognition systems. The recognition results are compared on two benchmark databases (ISI Kolkata and IIT Bhubaneswar Odia numeral) as well as on two newly created synthetic databases. Each Odia numeral database is enlarged 20-fold using the proposed approach. The nonlinear pattern variance introduced by interclass deformation is shown to pose a greater challenge to conventional recognizers. We also experiment with a mixture of original and synthetic databases for training the OCR to achieve robustness and higher accuracy.
Keywords :
"Databases","Handwriting recognition","Training","Shape","Optical character recognition software","Robustness"
Conference_Title :
Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG), 2015 Fifth National Conference on
DOI :
10.1109/NCVPRIPG.2015.7490025