• DocumentCode
    3134472
  • Title

    Script Independent Word Spotting in Offline Handwritten Documents Based on Hidden Markov Models

  • Author

    Wshah, S.; Kumar, Girish; Govindaraju, Vengatesan

  • fYear
    2012
  • fDate
    18-20 Sept. 2012
  • Firstpage
    14
  • Lastpage
    19
  • Abstract
    Keyword spotting aims to retrieve all instances of a given keyword from a document in any language. In this paper, we propose a novel script-independent, line-based word spotting framework for offline handwritten documents based on Hidden Markov Models. The methodology simulates the keywords in model space as a sequence of character models and uses filler models for better representation of background or non-keyword text. We propose a two-stage spotting framework in which the candidate keywords are further pruned using a character-based background model and a lexicon-based background model. The system handles a large vocabulary without the need for word or character segmentation. The system has been evaluated on several public datasets in different languages: IAM for English, AMA for Arabic, and LAW for Devanagari. The system outperforms the modern line-based approach on the English, Arabic, and Devanagari datasets.
  • Keywords
    handwritten character recognition; hidden Markov models; AMA; Arabic; Devanagari; English; IAM; LAW; background model; hidden Markov model; keyword spotting; offline handwritten document; script independent line; script independent word spotting; vocabulary; word spotting framework; Computational modeling; Context; Context modeling; Feature extraction; Hidden Markov models; Testing; Training; Filler and Background Models; Handwriting Recognition; Hidden Markov Models; Script Independent; Spotting
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Frontiers in Handwriting Recognition (ICFHR), 2012 International Conference on
  • Conference_Location
    Bari
  • Print_ISBN
    978-1-4673-2262-1
  • Type

    conf

  • DOI
    10.1109/ICFHR.2012.264
  • Filename
    6424364