• DocumentCode
    591985
  • Title

    A Coarse-to-Fine Approach for Handwritten Word Spotting in Large Scale Historical Documents Collection

  • Author

    Almazan, Jon ; Fernandez, Diego ; Fornes, Alicia ; Llados, Josep ; Valveny, Ernest

  • Author_Institution
    Dept. Cienc. de la Computacio, Univ. Autònoma de Barcelona, Barcelona, Spain
  • fYear
    2012
  • fDate
    18-20 Sept. 2012
  • Firstpage
    455
  • Lastpage
    460
  • Abstract
    In this paper we propose an approach for word spotting in handwritten document images. We state the problem from a focused retrieval perspective, i.e. locating instances of a query word in a large-scale dataset of digitized manuscripts. We combine two approaches, one based on word segmentation and the other segmentation-free. The first approach uses a hashing strategy to coarsely prune word images that are unlikely to be instances of the query word. This process is fast but has low precision due to the errors introduced in the segmentation step. The regions containing candidate words are sent to the second process, which is based on a state-of-the-art technique from the visual object detection field. This discriminative model represents the appearance of the query word and computes a similarity score. In this way we propose a coarse-to-fine approach that achieves a compromise between efficiency and accuracy. The model is validated on a collection of old handwritten manuscripts. We observe a substantial improvement in precision over the previously proposed method, with only a small increase in computational cost.
  • Keywords
    cryptography; document image processing; handwritten character recognition; image retrieval; image segmentation; object detection; query processing; coarse-to-fine approach; digitized manuscript; discriminative model; focused retrieval perspective; handwritten document image; handwritten manuscript; handwritten word spotting; hashing strategy; historical document collection; query word; similarity score; visual object detection; word segmentation; Accuracy; Computational modeling; Histograms; Image segmentation; Training; Vectors; Visualization; appearance models; historical documents; word indexation; word spotting;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Frontiers in Handwriting Recognition (ICFHR), 2012 International Conference on
  • Conference_Location
    Bari
  • Print_ISBN
    978-1-4673-2262-1
  • Type

    conf

  • DOI
    10.1109/ICFHR.2012.151
  • Filename
    6424435