DocumentCode :
1635774
Title :
Word-Based Adaptive OCR for Historical Books
Author :
Kluzner, Vladimir ; Tzadok, Asaf ; Shimony, Yuval ; Walach, Eugene ; Antonacopoulos, Apostolos
Author_Institution :
Haifa Res. Lab., IBM Corp., Haifa, Israel
fYear :
2009
Firstpage :
501
Lastpage :
505
Abstract :
The aim of this work is to propose a new approach to the recognition of historical texts by providing an adaptive mechanism that automatically tunes itself to a specific book. The system is based on clustering together all the similar words in a book/text and simultaneously handling the entire class. The paper describes the architecture of such a system and new algorithms that have been developed for robust word image comparison (including registration, optical-flow-based distortion compensation, and adaptive binarization). Results for a large dataset are presented as well. A recognition improvement of over 23% is demonstrated.
Keywords :
electronic publishing; history; optical character recognition; pattern clustering; text analysis; word processing; historical book; image recognition; word-based adaptive OCR; Books; Character recognition; Engines; Optical character recognition software; Optical distortion; Optical sensors; Shape; Software libraries; Text analysis; Text recognition; adaptive OCR; document processing; historical texts; non-rigid registration; optical flow
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis and Recognition, 2009. ICDAR '09. 10th International Conference on
Conference_Location :
Barcelona
ISSN :
1520-5363
Print_ISBN :
978-1-4244-4500-4
Electronic_ISBN :
1520-5363
Type :
conf
DOI :
10.1109/ICDAR.2009.133
Filename :
5277611