DocumentCode
3134472
Title
Script Independent Word Spotting in Offline Handwritten Documents Based on Hidden Markov Models
Author
Wshah, S. ; Kumar, Girish ; Govindaraju, Vengatesan
fYear
2012
fDate
18-20 Sept. 2012
Firstpage
14
Lastpage
19
Abstract
Keyword spotting aims to retrieve all instances of a given keyword from a document in any language. In this paper, we propose a novel script-independent, line-based word spotting framework for offline handwritten documents based on Hidden Markov Models. The methodology simulates keywords in model space as a sequence of character models and uses filler models to better represent background or non-keyword text. We propose a two-stage spotting framework in which the candidate keywords are further pruned using character-based and lexicon-based background models. The system handles large vocabularies without the need for word or character segmentation. The system has been evaluated on several public datasets in different languages: IAM for English, AMA for Arabic and LAW for Devanagari. The system outperforms the modern line-based approach on the English, Arabic and Devanagari datasets.
Keywords
handwritten character recognition; hidden Markov models; AMA; Arabic; Devanagari; English; IAM; LAW; background model; hidden Markov model; keyword spotting; offline handwritten document; script independent line; script independent word spotting; vocabulary; word spotting framework; Computational modeling; Context; Context modeling; Feature extraction; Hidden Markov models; Testing; Training; Filler and Background Models; Handwriting Recognition; Hidden Markov Models; Script Independent; Spotting;
fLanguage
English
Publisher
IEEE
Conference_Titel
Frontiers in Handwriting Recognition (ICFHR), 2012 International Conference on
Conference_Location
Bari
Print_ISBN
978-1-4673-2262-1
Type
conf
DOI
10.1109/ICFHR.2012.264
Filename
6424364