DocumentCode :
1580365
Title :
Text line segmentation and word recognition in a system for general writer independent handwriting recognition
Author :
Marti, U.-V. ; Bunke, H.
Author_Institution :
Inst. fur Inf. und Angewandte Math., Bern Univ., Switzerland
fYear :
2001
fDate :
2001
Firstpage :
159
Lastpage :
163
Abstract :
We present a system for recognizing unconstrained English handwritten text based on a large vocabulary. We describe the three main components of the system, which are preprocessing, feature extraction and recognition. In the preprocessing phase, the handwritten texts are first segmented into lines. Then each line of text is normalized with respect to skew, slant, vertical position and width. After these steps, text lines are segmented into single words. For this purpose, distances between connected components are measured. Using a threshold, the distances are divided into distances within a word and distances between different words. A line of text is segmented at positions where the distances are larger than the chosen threshold. From each image representing a single word, a sequence of features is extracted. These features are input to a recognition procedure which is based on hidden Markov models. To investigate the stability of the segmentation algorithm, the threshold that separates intra- and inter-word distances from each other is varied. If the threshold is small, many errors are caused by over-segmentation, while for large thresholds under-segmentation errors occur. The best segmentation performance is 95.56% correctly segmented words, tested on 541 text lines containing 3899 words. Given a correct segmentation rate of 95.56%, a recognition rate of 73.45% on the word level is achieved.
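The threshold-based word segmentation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each connected component is reduced to its horizontal extent `(x_start, x_end)` and that the distance between components is the horizontal gap between them, whereas the abstract does not specify the distance measure. The names `Box` and `segment_words` are hypothetical.

```python
from typing import List, Tuple

# Assumption: a connected component is represented only by its
# horizontal extent (x_start, x_end) within the text line.
Box = Tuple[int, int]


def segment_words(components: List[Box], threshold: int) -> List[List[Box]]:
    """Group components into words, splitting where the gap exceeds threshold."""
    if not components:
        return []
    ordered = sorted(components)  # left to right by x_start
    words = [[ordered[0]]]
    right_edge = ordered[0][1]  # rightmost pixel column seen so far
    for box in ordered[1:]:
        gap = box[0] - right_edge  # assumed distance measure: horizontal gap
        if gap > threshold:
            words.append([box])    # inter-word distance: start a new word
        else:
            words[-1].append(box)  # intra-word distance: same word
        right_edge = max(right_edge, box[1])
    return words


line = [(0, 10), (12, 20), (35, 50), (52, 60), (80, 95)]
for t in (1, 10, 30):
    print(t, len(segment_words(line, t)))
```

On this made-up line, a very small threshold splits every component into its own word (over-segmentation), and a very large one merges the whole line into a single word (under-segmentation), which is the trade-off the abstract reports when the threshold is varied.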
Keywords :
document image processing; feature extraction; handwritten character recognition; hidden Markov models; image segmentation; optical character recognition; English handwritten text; feature extraction; hidden Markov models; large vocabulary; performance; text line segmentation; text preprocessing; text recognition; word recognition; writer independent handwriting recognition; Character recognition; Data mining; Error correction; Feature extraction; Handwriting recognition; Hidden Markov models; Image segmentation; Stability; Testing; Text recognition;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Document Analysis and Recognition, 2001. Proceedings. Sixth International Conference on
Conference_Location :
Seattle, WA
Print_ISBN :
0-7695-1263-1
Type :
conf
DOI :
10.1109/ICDAR.2001.953775
Filename :
953775