Title :
A robust text line detection in complex handwritten documents
Author :
Jakub Leszek Pach;Piotr Bilski
Author_Institution :
Institute of Radioelectronics, Warsaw University of Technology, Warsaw 00-665
Abstract :
In this paper, we present a modified method for detecting text lines in handwritten documents, based on the Block-Based Hough Transform. The algorithm has a practical application in manuscript author identification. The proposed technique consists of three steps: preprocessing, detection of potential text lines, and elimination of false ones. The first step covers image binarization, extraction of connected components, and selection of supporting connected components based on the local maxima in the vertical histogram strips. In the second step, we select the appropriate subset of connected components, supplemented by one-point components, and then use the block-based Hough transform to detect potential text lines. In the third step, we detect possible false alarms (lines detected incorrectly). The proposed method is applied to text line analysis in fifteenth-century Latin manuscripts. Our approach was found to be more effective than traditional ones, by twenty percent in the best cases.
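The detection step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and it is not the block-based variant: it is plain Hough voting over a set of points, assumed here to be representative points (for example centroids) of already extracted connected components. The function name `hough_text_lines` and the parameters `thetas_deg`, `rho_step`, and `min_votes` are hypothetical and chosen only for illustration.

```python
import numpy as np

def hough_text_lines(points, thetas_deg=np.arange(85, 96), rho_step=5.0, min_votes=3):
    """Greedy Hough voting for near-horizontal lines through (x, y) points.

    Assumption: `points` are representative points of connected components.
    Returns a list of (rho, theta_in_degrees, member_indices) tuples.
    """
    pts = np.asarray(points, dtype=float)
    thetas = np.deg2rad(thetas_deg)
    # Normal form of a line: rho = x*cos(theta) + y*sin(theta),
    # evaluated for every point/angle pair and quantised into rho bins.
    rho = pts[:, [0]] * np.cos(thetas) + pts[:, [1]] * np.sin(thetas)
    bins = np.round(rho / rho_step).astype(int)

    lines = []
    remaining = np.ones(len(pts), dtype=bool)
    while True:
        best_votes, best_bin, best_j = 0, None, None
        # Find the accumulator cell with the most votes among unassigned points.
        for j in range(len(thetas)):
            vals, counts = np.unique(bins[remaining, j], return_counts=True)
            if len(counts) and counts.max() > best_votes:
                best_votes, best_bin, best_j = counts.max(), vals[counts.argmax()], j
        if best_votes < min_votes:
            break  # weakly supported cells are not reported as lines
        members = remaining & (bins[:, best_j] == best_bin)
        lines.append((best_bin * rho_step, float(thetas_deg[best_j]),
                      np.flatnonzero(members).tolist()))
        remaining &= ~members  # each point supports at most one line
    return lines
```

In this sketch the `min_votes` threshold only discards weakly supported candidates; the false-line elimination step mentioned in the abstract is not modelled.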
Keywords :
"Transforms","Histograms","Feature extraction","Image segmentation","Arrays","Text analysis","Writing"
Conference_Title :
Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS), 2015 IEEE 8th International Conference on
Print_ISBN :
978-1-4673-8359-2
DOI :
10.1109/IDAACS.2015.7340742