DocumentCode
3022084
Title
Text extraction from gray scale historical document images using adaptive local connectivity map
Author
Shi, Zhixin ; Setlur, Srirangaraj ; Govindaraju, Venu
Author_Institution
Center of Excellence for Document Analysis & Recognition, New York State Univ., Buffalo, NY, USA
fYear
2005
fDate
29 Aug.-1 Sept. 2005
Firstpage
794
Abstract
This paper presents an algorithm that uses an adaptive local connectivity map to retrieve text lines from complex handwritten documents such as handwritten historical manuscripts. The algorithm is designed to solve the particularly complex problems seen in handwritten documents. These problems include fluctuating text lines, touching or crossing text lines, and low-quality images that do not lend themselves easily to binarization. The algorithm is based on connectivity features similar to local projection profiles, which can be extracted directly from gray scale images. The proposed technique is robust and has been tested on a set of complex historical handwritten documents such as Newton's and Galileo's manuscripts. Preliminary testing shows a successful location rate above 95% on the test set.
Keywords
feature extraction; handwritten character recognition; information retrieval; text analysis; visual databases; Galileo manuscript; Newton manuscript; adaptive local connectivity map; gray scale image; handwritten document; handwritten historical manuscript; historical document image; image quality; local projection profile; text extraction; text lines retrieval; Algorithm design and analysis; Design methodology; Iterative algorithms; Libraries; Partitioning algorithms; Robustness; Strips; Testing; Text analysis; Venus;
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis and Recognition, 2005. Proceedings. Eighth International Conference on
ISSN
1520-5263
Print_ISBN
0-7695-2420-6
Type
conf
DOI
10.1109/ICDAR.2005.229
Filename
1575654