DocumentCode :
1622692
Title :
Script independent detection of bold words in multi font-size documents
Author :
Saikrishna, Pedamalli ; Ramakrishnan, A.G.
Author_Institution :
Dept. of Electr. Eng., Indian Inst. of Sci., Bangalore, India
fYear :
2013
Firstpage :
1
Lastpage :
4
Abstract :
A script-independent, font-size-independent scheme is proposed for detecting bold words in printed pages. In OCR applications such as making minor modifications to an existing printed form, it is desirable to reproduce the font size and characteristics such as bold and italics in the OCR-recognized document. In this morphological-opening-based detection of bold (MOBDoB) method, the binarized image is segmented into sub-images with uniform font sizes, using the word height information. A rough estimate of the stroke widths of the characters in each sub-image is obtained from the density. Each sub-image is then opened with a square structuring element whose size is determined by the respective stroke width. The union of all the opened sub-images is used to determine the locations of the bold words. Extracting all such words from the binarized image gives the final image. At least 98% of the bold words were detected across a total of 65 Tamil, Kannada and English pages, with a false alarm rate below 0.4%.
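The opening step described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The abstract does not give the density-based stroke-width formula or the font-size segmentation, so the sketch assumes a single sub-image and takes the normal stroke width and the word bounding boxes as inputs. The function names, the `normal_stroke_width + 1` rule for the structuring element size, and the toy page are all hypothetical.

```python
def erode(img, k):
    """Binary erosion: mark (r, c) if a k x k square with top-left corner (r, c) is all foreground."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h - k + 1):
        for c in range(w - k + 1):
            if all(img[r + i][c + j] for i in range(k) for j in range(k)):
                out[r][c] = 1
    return out


def dilate(img, k):
    """Binary dilation: paint a k x k square whose top-left corner is each foreground pixel."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if img[r][c]:
                for i in range(k):
                    for j in range(k):
                        if r + i < h and c + j < w:
                            out[r + i][c + j] = 1
    return out


def opening(img, k):
    """Morphological opening: keeps only foreground regions that can contain a k x k square."""
    return dilate(erode(img, k), k)


def bold_words(img, word_boxes, normal_stroke_width):
    """Return the boxes (top, left, bottom, right) that still contain pixels after opening."""
    k = normal_stroke_width + 1  # assumption: one pixel wider than a normal stroke
    opened = opening(img, k)
    return [
        (top, left, bottom, right)
        for (top, left, bottom, right) in word_boxes
        if any(opened[r][c] for r in range(top, bottom) for c in range(left, right))
    ]


# Toy page: a 1-pixel-wide stroke (normal) and a 2-pixel-wide stroke (bold).
page = [[0] * 12 for _ in range(6)]
for r in range(1, 5):
    page[r][1] = 1               # normal word: stroke width 1
    page[r][7] = page[r][8] = 1  # bold word: stroke width 2
boxes = [(0, 0, 6, 4), (0, 5, 6, 11)]

detected = bold_words(page, boxes, normal_stroke_width=1)
print(detected)  # [(0, 5, 6, 11)]
```

In the toy example the thin stroke cannot contain the 2 x 2 square and disappears, while the thicker stroke survives, so only the second box is reported. In the method as the abstract describes it, this would be repeated for each uniform-font-size sub-image, each with its own structuring element size, before the opened sub-images are combined.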
Keywords :
character sets; document image processing; image segmentation; mathematical morphology; optical character recognition; text detection; MOBDoB; OCR recognized document; binarized image segmentation; bold word detection; character stroke widths estimation; font size reproduction; morphological opening based detection of bold; multifont-size documents; script independent detection; square structuring element; word height information; Character recognition; Electronic mail; Image segmentation; Optical character recognition software; Text analysis; Text recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG), 2013 Fourth National Conference on
Conference_Location :
Jodhpur
Print_ISBN :
978-1-4799-1586-6
Type :
conf
DOI :
10.1109/NCVPRIPG.2013.6776180
Filename :
6776180