DocumentCode
3135414
Title
Word Spotting Based Retrieval of Urdu Handwritten Documents
Author
Abidi, Abdessalem ; Jamil, Atif ; Siddiqi, Imran ; Khurshid, Kiran
Author_Institution
Nat. Univ. of Sci. & Technol., Islamabad, Pakistan
fYear
2012
fDate
18-20 Sept. 2012
Firstpage
331
Lastpage
336
Abstract
Urdu, one of the most popular languages adopted during different periods of history, has a valuable collection of handwritten scripts in state libraries across South Asia. Digitizing these collections can serve not only to preserve them but also to make them available to the general public. The absence of an Urdu OCR system, however, limits the concept of a digital Urdu library to scanning and manual search of documents. We present a word spotting based search method for Urdu handwritten text. The text is first segmented into partial words, and a set of features is computed from each partial word. The user queries the system with a word image. The partial words in the query image are then matched against those in the database, and the matched partial words are merged into complete words. Evaluated on 90 handwritten documents, the proposed method achieved encouraging precision and recall rates.
Keywords
digital libraries; document image processing; handwritten character recognition; information retrieval; natural languages; optical character recognition; Urdu OCR; Urdu handwritten document; Urdu handwritten text; digital Urdu library; handwritten script; partial word; precision rate; query image; recall rate; word image; word spotting based retrieval; Feature extraction; Handwriting recognition; Image segmentation; Indexing; Libraries; Vectors; Partial Words; Run length smoothing algorithm; Urdu handwritten text detection;
fLanguage
English
Publisher
IEEE
Conference_Titel
Frontiers in Handwriting Recognition (ICFHR), 2012 International Conference on
Conference_Location
Bari
Print_ISBN
978-1-4673-2262-1
Type
conf
DOI
10.1109/ICFHR.2012.289
Filename
6424415