Title :
Reducing Annotation Workload Using a Codebook Mapping and Its Evaluation in On-Line Handwriting
Author :
Li, Jinpeng ; Mouchere, Harold ; Viard-Gaudin, Christian
Author_Institution :
IRCCyN, Univ. de Nantes, Nantes, France
Abstract :
Training most existing recognition systems requires large datasets labeled at the symbol level. However, producing such ground-truth datasets is tedious work in which two repetitive tasks have to be chained: first, selecting a subset of strokes that belong to the same symbol; then, assigning a label to this stroke group. In this paper, we discuss a framework to reduce the human workload of labeling, at the symbol level, a large set of documents based on any graphical language. Hierarchical clustering is used to produce a codebook with one or several strokes per symbol, and this codebook is then mapped onto the raw handwritten data. The framework is evaluated on two different datasets.
Keywords :
document image processing; handwriting recognition; pattern clustering; annotation workload reduction; codebook mapping; documents; graphical language; ground-truth dataset; handwritten data; hierarchical clustering; label assignment; on-line handwriting evaluation; recognition system; stroke group; symbol level; symbol strokes; Databases; Humans; Labeling; Measurement; Prototypes; Training; Visualization; Hierarchical Clustering; Modified Hausdorff Distance; On-Line Handwriting; Symbol Annotation;
Conference_Title :
Frontiers in Handwriting Recognition (ICFHR), 2012 International Conference on
Conference_Location :
Bari
Print_ISBN :
978-1-4673-2262-1
DOI :
10.1109/ICFHR.2012.259