Title :
An OCR system for printed Nasta'liq script: A segmentation based approach
Author :
Naz, Saeeda ; Umar, Arif Iqbal ; Bin Ahmed, Saad ; Shirazi, Syed Hamad ; Imran Razzak, M. ; Siddiqi, Imran
Author_Institution :
Dept. of Inf. Technol., Hazara Univ., Mansehra, Pakistan
Abstract :
Machine simulation of human reading has been a subject of intensive research for almost four decades. Although recent recognition methods and systems for Latin script show very promising results, automatic Urdu character recognition remains a challenging task because of the cursive nature of the script. This work introduces a robust approach based on statistical models that provides a solution for recognizing Urdu text in the Nasta'liq style. Contrary to classical approaches, which segment text into words, ligatures, or characters, we intend to employ implicit segmentation, in which text lines are recognized during segmentation. The developed system will be evaluated on standard Urdu text databases and compared with the state-of-the-art recognition techniques proposed to date.
Keywords :
image segmentation; optical character recognition; statistical analysis; Latin script; OCR system; Urdu character recognition; Urdu text database; Urdu text recognition; implicit segmentation; optical character recognition system; printed Nasta'liq script; segmentation based approach; statistical model; Character recognition; Feature extraction; Hidden Markov models; Image segmentation; Optical character recognition software; Optical imaging; Text recognition
Conference_Title :
Multi-Topic Conference (INMIC), 2014 IEEE 17th International
Print_ISBN :
978-1-4799-5754-5
DOI :
10.1109/INMIC.2014.7097347