• DocumentCode
    3022084
  • Title

    Text extraction from gray scale historical document images using adaptive local connectivity map

  • Author

    Shi, Zhixin ; Setlur, Srirangaraj ; Govindaraju, Venu

  • Author_Institution
    Center of Excellence for Document Analysis & Recognition, New York State Univ., Buffalo, NY, USA
  • fYear
    2005
  • fDate
    29 Aug.-1 Sept. 2005
  • Firstpage
    794
  • Abstract
    This paper presents an algorithm that uses an adaptive local connectivity map to retrieve text lines from complex handwritten documents such as handwritten historical manuscripts. The algorithm is designed to solve the particularly complex problems seen in handwritten documents. These problems include fluctuating text lines, touching or crossing text lines, and low-quality images that do not lend themselves easily to binarization. The algorithm is based on connectivity features similar to local projection profiles, which can be extracted directly from gray scale images. The proposed technique is robust and has been tested on a set of complex historical handwritten documents such as Newton's and Galileo's manuscripts. Preliminary testing shows a successful location rate above 95% for the test set.
  • Keywords
    feature extraction; handwritten character recognition; information retrieval; text analysis; visual databases; Galileo manuscript; Newton manuscript; adaptive local connectivity map; gray scale image; handwritten document; handwritten historical manuscript; historical document image; image quality; local projection profile; text extraction; text lines retrieval; Algorithm design and analysis; Design methodology; Iterative algorithms; Libraries; Partitioning algorithms; Robustness; Strips; Testing; Text analysis; Venus;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 2005. Proceedings. Eighth International Conference on
  • ISSN
    1520-5263
  • Print_ISBN
    0-7695-2420-6
  • Type

    conf

  • DOI
    10.1109/ICDAR.2005.229
  • Filename
    1575654