DocumentCode
153323
Title
A Combined System for Text Line Extraction and Handwriting Recognition in Historical Documents
Author
Fischer, Andreas ; Baechler, Micheal ; Garz, Angelika ; Liwicki, Marcus ; Ingold, Rolf
Author_Institution
Dept. of Electr. Eng., Polytech. Montreal, Montreal, QC, Canada
fYear
2014
fDate
7-10 April 2014
Firstpage
71
Lastpage
75
Abstract
Automated reading of historical handwriting is needed to search and browse ancient manuscripts in digital libraries based on their textual content. In this paper, we present a combined system for text localization and transcription in page images. It includes flexible learning-based methods for layout analysis and handwriting recognition, which were developed in the context of the Swiss research project HisDoc. A comprehensive experimental evaluation is provided for the medieval Parzival database, demonstrating a promising word recognition accuracy of 93.0% with a closed vocabulary. To harmonize the evaluation of the two document analysis tasks, we introduce a novel evaluation measure for text line extraction that takes substitution, deletion, and insertion errors into account.
Keywords
digital libraries; document image processing; feature extraction; handwriting recognition; Swiss research project HisDoc; ancient manuscript; automated reading; document analysis task; flexible learning-based method; historical document; layout analysis; medieval Parzival database; text line extraction; text localization; transcription; word recognition; Accuracy; Databases; Hidden Markov models; Layout; Text analysis; Text recognition
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis Systems (DAS), 2014 11th IAPR International Workshop on
Conference_Location
Tours
Print_ISBN
978-1-4799-3243-6
Type
conf
DOI
10.1109/DAS.2014.51
Filename
6830972