DocumentCode
255576
Title
Shirorekha extraction in Character Segmentation for printed devanagri text in Document Image Processing
Author
Shinde, A.B. ; Dandawate, Y.H.
Author_Institution
Dept. of Electron. Eng., Padmabhooshan Vasantraodada Patil Inst. of Technol., Budhgaon, India
fYear
2014
fDate
11-13 Dec. 2014
Firstpage
1
Lastpage
7
Abstract
Finding the structural layout, text line segmentation, word-level segmentation, and character-level segmentation are major steps in offline OCR systems for Devanagari script in document image processing. This paper presents a complete word and character segmentation method for machine-printed Devanagari text, including a novel character segmentation technique. Interline spacing and fused characters sometimes make line segmentation and character segmentation, respectively, difficult. We tested our method on documents written in Marathi. After the Shirorekha (header line) of the Devanagari text is removed, bounding boxes are used to surround the segmented characters. The results are encouraging, which we attribute to the morphological operations: applying basic morphological operations to the scanned document images gave much better results.
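The abstract does not give the details of the morphological operations, so the following is only a minimal sketch of the general idea (remove the header line, then box the remaining characters), not the authors' method. It assumes a binarized word image stored as nested lists (1 = ink, 0 = background) and substitutes simple projection profiles for the morphological operations; every function name and the `frac` threshold are hypothetical.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary word image: 1 = ink, 0 = background (assumed format)


def find_shirorekha_rows(img: Image, frac: float = 0.8) -> List[int]:
    """Rows whose ink count is at least `frac` of the densest row's count.

    Assumption: the header line is the densest horizontal band of the word.
    """
    counts = [sum(row) for row in img]
    peak = max(counts, default=0)
    if peak == 0:
        return []
    return [r for r, c in enumerate(counts) if c >= frac * peak]


def remove_rows(img: Image, rows: List[int]) -> Image:
    """Return a copy of the image with the given rows blanked out."""
    blank = set(rows)
    return [[0] * len(row) if r in blank else list(row)
            for r, row in enumerate(img)]


def character_boxes(img: Image) -> List[Tuple[int, int, int, int]]:
    """One inclusive (top, left, bottom, right) box per run of inked columns."""
    if not img:
        return []
    width = len(img[0])
    col_has_ink = [any(row[c] for row in img) for c in range(width)]
    boxes = []
    start = None
    for c in range(width + 1):
        inked = c < width and col_has_ink[c]
        if inked and start is None:
            start = c  # a new run of inked columns begins
        elif not inked and start is not None:
            # The run ended at column c - 1; find its vertical extent.
            rows = [r for r, row in enumerate(img) if any(row[start:c])]
            boxes.append((rows[0], start, rows[-1], c - 1))
            start = None
    return boxes


def segment_word(img: Image) -> List[Tuple[int, int, int, int]]:
    """Blank the header-line rows, then box the characters left behind."""
    return character_boxes(remove_rows(img, find_shirorekha_rows(img)))


# Toy word: row 0 is the header line joining two characters.
word = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 1, 0, 0, 0, 1, 0],
    [1, 1, 0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0, 1, 1],
]
print(segment_word(word))
```

With the header line left in place, the toy word would come out as a single box spanning the whole image, which is why the header is removed before the characters are boxed. This sketch does not handle fused characters or modifiers above the header line.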
Keywords
document image processing; feature extraction; image segmentation; text detection; Marathi scripts; Shirorekha extraction; bounding boxes; character segmentation system; document image processing; header line; machine printed Devanagari text; morphological operations; word segmentation system; Image resolution; Image segmentation; Optical imaging; Radio frequency; Character Segmentation; Devanagari Script; Line Segmentation; Structural Layout; Word Segmentation;
fLanguage
English
Publisher
IEEE
Conference_Titel
India Conference (INDICON), 2014 Annual IEEE
Conference_Location
Pune
Print_ISBN
978-1-4799-5362-2
Type
conf
DOI
10.1109/INDICON.2014.7030535
Filename
7030535