DocumentCode :
2504053
Title :
A Novel Handwritten Urdu Word Spotting Based on Connected Components Analysis
Author :
Sagheer, Malik Waqas ; Nobile, Nicola ; He, Chun Lei ; Suen, Ching Y.
Author_Institution :
Comput. Sci. & Software Eng. Dept., Concordia Univ., Montreal, QC, Canada
fYear :
2010
fDate :
23-26 Aug. 2010
Firstpage :
2013
Lastpage :
2016
Abstract :
We propose a novel word spotting system for Urdu words within handwritten text lines. Spatial information about diacritics is integrated into the detection of the main connected components during candidate word generation. An Urdu word recognition system, which uses compound features and an SVM, is designed and applied to classify the candidate words. The verification/rejection process is based on the outputs of the Urdu word recognition system together with the image's global features. On CENPARMI's Urdu Database, the system achieved a 92.11% correct segmentation rate and a 50.75% word spotting precision rate while maintaining a 70.1% recall.
Keywords :
handwriting recognition; natural language processing; support vector machines; word processing; SVM; Urdu word recognition system; connected components analysis; diacritics; handwritten Urdu word spotting; handwritten text lines; Databases; Euclidean distance; Feature extraction; Handwriting recognition; Image segmentation; Support vector machines;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Pattern Recognition (ICPR), 2010 20th International Conference on
Conference_Location :
Istanbul
ISSN :
1051-4651
Print_ISBN :
978-1-4244-7542-1
Type :
conf
DOI :
10.1109/ICPR.2010.496
Filename :
5597256