DocumentCode :
1063791
Title :
A survey of methods and strategies in character segmentation
Author :
Casey, Richard G. ; Lecolinet, Eric
Author_Institution :
IBM Almaden Res. Center, San Jose, CA, USA
Volume :
18
Issue :
7
fYear :
1996
fDate :
7/1/1996
Firstpage :
690
Lastpage :
706
Abstract :
Character segmentation has long been a critical area of the OCR process. The higher recognition rates for isolated characters vs. those obtained for words and connected character strings well illustrate this fact. A good part of recent progress in reading unconstrained printed and written text may be ascribed to more insightful handling of segmentation. This paper provides a review of these advances. The aim is to provide an appreciation for the range of techniques that have been developed, rather than to simply list sources. Segmentation methods are listed under four main headings. What may be termed the “classical” approach consists of methods that partition the input image into subimages, which are then classified. The operation of attempting to decompose the image into classifiable units is called “dissection.” The second class of methods avoids dissection, and segments the image either explicitly, by classification of prespecified windows, or implicitly by classification of subsets of spatial features collected from the image as a whole. The third strategy is a hybrid of the first two, employing dissection together with recombination rules to define potential segments, but using classification to select from the range of admissible segmentation possibilities offered by these subimages. Finally, holistic approaches that avoid segmentation by recognizing entire character strings as units are described.
Keywords :
hidden Markov models; image segmentation; optical character recognition; OCR process; character segmentation; connected character strings; dissection; holistic approaches; isolated characters; recognition rates; unconstrained printed; words; written text; Character recognition; Error analysis; Feature extraction; Hidden Markov models; Image analysis; Image recognition; Image segmentation; Optical character recognition software; Pattern recognition; Pipelines
fLanguage :
English
Journal_Title :
Pattern Analysis and Machine Intelligence, IEEE Transactions on
Publisher :
IEEE
ISSN :
0162-8828
Type :
jour
DOI :
10.1109/34.506792
Filename :
506792