DocumentCode :
3033592
Title :
Character segmentation for multi lingual Indic and Roman scripts
Author :
Palrecha, Navanit ; Rai, Anvaya ; Kumar, Ajay ; Srivastava, Shikha ; Tyagi, Vipin
Author_Institution :
Centre for Dev. of Telematics, New Delhi, India
fYear :
2011
fDate :
4-6 March 2011
Firstpage :
45
Lastpage :
49
Abstract :
Character segmentation has long been a critical area of optical character recognition (OCR). In this paper, we present an algorithm for character segmentation of Indic and Roman scripts. Segmentation is difficult for Indic scripts because their characters are connected by the Shirorekha (headline), and the regions bounding two consecutive characters may overlap because of matraas. A horizontal projection profile is used to extract the Shirorekha, and a vertical projection profile is used to segment the characters. Results of the algorithm on scanned and facsimile documents in Devnagari script are shown.
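The two projection profiles named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `search_fraction` and `band_ratio` parameters, and the rule of splitting at empty columns are all assumptions made for illustration. The sketch takes a binarized image of one text line (1 = ink, 0 = background), and it does not handle matraas above the headline or characters whose bounding regions overlap.

```python
from typing import List, Tuple

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


def horizontal_profile(img: Image) -> List[int]:
    """Ink count per row."""
    return [sum(row) for row in img]


def vertical_profile(img: Image) -> List[int]:
    """Ink count per column (empty list for an empty image)."""
    return [sum(col) for col in zip(*img)] if img else []


def find_headline(img: Image, search_fraction: float = 0.5,
                  band_ratio: float = 0.8) -> Tuple[int, int]:
    """Return (top, bottom) rows of the headline band, bottom exclusive.

    Assumption: the headline is the densest row in the upper part of the
    line, widened to adjacent rows with at least band_ratio of its ink.
    """
    if not img:
        raise ValueError("empty image")
    profile = horizontal_profile(img)
    limit = max(1, int(len(profile) * search_fraction))
    peak = max(range(limit), key=lambda r: profile[r])
    if profile[peak] == 0:
        return 0, 0  # no ink in the search region
    threshold = band_ratio * profile[peak]
    top = peak
    while top > 0 and profile[top - 1] >= threshold:
        top -= 1
    bottom = peak + 1
    while bottom < len(profile) and profile[bottom] >= threshold:
        bottom += 1
    return top, bottom


def nonzero_runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, where profile > 0."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value > 0 and start is None:
            start = i
        elif value == 0 and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def segment_characters(img: Image):
    """Locate the headline, then split the rows below it at empty columns."""
    top, bottom = find_headline(img)
    body = img[bottom:]  # rows below the headline band
    return (top, bottom), nonzero_runs(vertical_profile(body))


if __name__ == "__main__":
    line = [
        [0, 0, 0, 0, 0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1, 1, 1, 1, 1],  # headline joining all characters
        [0, 1, 1, 0, 1, 1, 0, 1, 1],
        [0, 1, 0, 0, 1, 1, 0, 0, 1],
        [0, 1, 1, 0, 0, 1, 0, 1, 1],
    ]
    print(segment_characters(line))
```

Ignoring the headline rows before taking the vertical profile is what lets the empty columns between characters show up; with the headline included, every column would contain ink and no split points would be found.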
Keywords :
feature extraction; image segmentation; natural language processing; optical character recognition; Devnagari script; Shirorekha extraction; character segmentation; facsimile documents; multi lingual Indic scripts; multi lingual Roman scripts; Accuracy; Character recognition; Optical character recognition software; Pixel; Signal processing; Signal processing algorithms; Devnagari; Multilingual Scripts; Shirorekha
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Signal Processing and its Applications (CSPA), 2011 IEEE 7th International Colloquium on
Conference_Location :
Penang
Print_ISBN :
978-1-61284-414-5
Type :
conf
DOI :
10.1109/CSPA.2011.5759840
Filename :
5759840