Title :
A Comprehensive Representation Model for Handwriting Dedicated to Word Spotting
Author :
Wang, Peng ; Eglin, Veronique ; Garcia, Christophe ; Largeron, Christine ; McKenna, Antony
Author_Institution :
LIRIS, INSA-Lyon, Villeurbanne, France
Abstract :
In this paper, we propose an original representation model for handwritten document images. Most state-of-the-art handwriting representation models rely on a single kind of property in isolation: textural properties, selected dominant features (such as stroke orientation or gradient orientation), or structural properties. To avoid the drawbacks of describing handwriting from a single aspect, we design a comprehensive model that captures both morphological and topological information. After interest points (starting/ending points, branch points and high-curved points) are selected, an adapted version of the Shape Context (SC) descriptor built on these points is used to describe the contour of the text. To model the structural characteristics of the handwritten text, a graph is constructed from the interest points and the skeleton of the text. Using this graph, loops and specific strokes in the handwriting are detected and analyzed. Based on this model, a coarse-to-fine approach to word spotting is introduced. Without segmenting the text into words, a set of regions of interest is selected by comparing textural features (orientation, projection profile, upper and lower border projection) using dynamic time warping (DTW). The regions of interest and the queries are then represented by the proposed model. The final similarity measure is a weighted combination of the SC cost, loop difference, stroke analysis and texture comparison. The validation of the model shows the value of combining these different aspects of handwriting.
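The coarse stage and the final scoring described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the cue names (`sc`, `loop`, `stroke`, `texture`), the Euclidean local distance, and the example weights are all assumptions made for illustration, since the abstract does not specify them. Each sequence element stands for a per-column feature vector such as (orientation, projection profile, upper border, lower border).

```python
import math


def dtw_distance(a, b):
    """Textbook dynamic time warping between two sequences of feature vectors.

    Assumption: the local cost is the Euclidean distance between vectors.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b[j-1] is reused
                cost[i][j - 1],      # b advances, a[i-1] is reused
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


def combined_dissimilarity(costs, weights):
    """Weighted sum of per-cue costs (hypothetical cue names and weights)."""
    return sum(weights[name] * costs[name] for name in weights)


# Hypothetical usage: compare a query with one candidate region.
query = [(0.0,), (1.0,), (2.0,)]
region = [(0.0,), (1.0,), (1.0,), (2.0,)]
costs = {
    "sc": 2.0,        # placeholder Shape Context matching cost
    "loop": 4.0,      # placeholder loop difference
    "stroke": 1.0,    # placeholder stroke analysis cost
    "texture": dtw_distance(query, region),
}
weights = {"sc": 0.4, "loop": 0.2, "stroke": 0.2, "texture": 0.2}
score = combined_dissimilarity(costs, weights)
```

In a coarse-to-fine setup of this kind, the inexpensive DTW comparison would typically prune candidate regions first, so that the costlier descriptor and graph comparisons run only on the survivors.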
Keywords :
document image processing; feature extraction; handwriting recognition; image representation; DTW method; SC cost; SC descriptor; branch points; coarse-to-fine approach; comprehensive representation model; dynamic time warping; gradient orientation feature; handwriting document images; handwriting representation models; high-curved points; loop difference; lower border projection feature; morphological handwriting information; orientation feature; projection profile feature; selective dominant features; shape context descriptor; similarity measure; starting-ending points; stroke analysis; stroke orientation feature; textural property; texture comparison; topological handwriting information; upper border projection feature; word spotting; Context; Hidden Markov models; Shape; Skeleton; Text analysis; Writing; coarse-to-fine description; graph-based description; interest points; word segmentation-free method
Conference_Title :
Document Analysis and Recognition (ICDAR), 2013 12th International Conference on
Conference_Location :
Washington, DC
DOI :
10.1109/ICDAR.2013.97