Title :
Discrimination of handwritten and machine printed text in scanned document images based on Rough Set theory
Author :
Narayan, S. ; Gowda, Sahana D.
Author_Institution :
Dept. of Comput. Sci. & Eng., BNM Inst. of Technol., Bangalore, India
Date :
Oct. 30 2012-Nov. 2 2012
Abstract :
Discriminating handwritten from machine printed text in a scanned document image is an important step, because the two kinds of text cannot be processed and recognized by a single OCR engine. This paper proposes a novel approach that discriminates machine printed from handwritten text using Rough Set theory. The uniform occurrence of characters in a word is taken as the main discriminating feature. Uniformity is derived from the transitions that occur where component structures overlay the null background. Rough sets are used to build knowledge of the uniformity among the characters in a word. Based on the equivalence relation between the defined rough set and the derived set, each word is identified as machine printed or handwritten. Extensive experiments were conducted on 400 locally generated samples and on samples from the IAM dataset.
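The abstract names two building blocks, background-to-foreground transitions and rough set approximations, without giving details. The sketch below illustrates both in general terms; it is not the authors' algorithm. The function names, the toy scan lines, and the choice of grouping scan lines by transition count are assumptions made purely for illustration.

```python
from collections import defaultdict

def count_transitions(row):
    """Count background-to-foreground (0 -> 1) changes along one scan line."""
    return sum(1 for prev, cur in zip(row, row[1:]) if prev == 0 and cur == 1)

def approximations(universe, key, target):
    """Return (lower, upper) rough-set approximations of `target`.

    Objects with the same key(obj) are indiscernible and form one
    equivalence class.
    """
    classes = defaultdict(set)
    for obj in universe:
        classes[key(obj)].add(obj)
    lower, upper = set(), set()
    for members in classes.values():
        if members <= target:
            lower |= members  # class lies entirely inside the target
        if members & target:
            upper |= members  # class overlaps the target
    return lower, upper

# Hypothetical scan lines of a binarized word image (1 = ink, 0 = background).
scan_lines = {
    "r0": [0, 1, 1, 0, 1, 0, 0, 1],
    "r1": [0, 1, 0, 0, 1, 1, 0, 1],
    "r2": [1, 1, 1, 0, 0, 0, 0, 0],
    "r3": [0, 0, 1, 1, 1, 0, 0, 0],
}
counts = {name: count_transitions(row) for name, row in scan_lines.items()}

# Assumed target set; scan lines with equal transition counts are indiscernible.
target = {"r0", "r2"}
lower, upper = approximations(scan_lines.keys(), counts.get, target)
```

When the lower and upper approximations coincide, the target set is exactly definable from the chosen attribute; when they differ, as they do for this toy data, the set is rough with respect to that attribute.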
Keywords :
document image processing; handwritten character recognition; image recognition; rough set theory; text analysis; word processing; IAM dataset; character uniformity; component structures; handwritten text processing; handwritten text recognition; handwritten word component; machine printed text processing; machine printed text recognition; machine printed word component; null background; scanned document images; Communications technology; Decision support systems; document image; transition
Conference_Title :
Information and Communication Technologies (WICT), 2012 World Congress on
Conference_Location :
Trivandrum
Print_ISBN :
978-1-4673-4806-5
DOI :
10.1109/WICT.2012.6409145