DocumentCode :
1581791
Title :
Exploration of contextual constraints for character pre-classification
Author :
Ho, Tin Kam ; Nagy, George
fYear :
2001
fDate :
2001
Firstpage :
450
Lastpage :
454
Abstract :
We present strategies and results for identifying the symbol type (lower-case, upper-case, digit, or punctuation and special symbols) of every character in a text document using several kinds of information from neighboring characters. Assuming reasonable word and character segmentation for shape clustering, we designed several type recognition methods that depend on cluster n-grams, shape codes, and within-word context. On an ASCII test corpus of 925 articles that simulates perfect image-level processing, these methods achieve a substantial improvement over the default assignment of all characters to lower case.
Keywords :
image segmentation; optical character recognition; character preclassification; character segmentation; contextual constraints exploration; default assignment; image-level processing; shape clustering; symbol type; Character recognition; Engines; Frequency; Image segmentation; Optical character recognition software; Shape; Stability; Testing; Text recognition; Tin
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Document Analysis and Recognition, 2001. Proceedings. Sixth International Conference on
Conference_Location :
Seattle, WA
Print_ISBN :
0-7695-1263-1
Type :
conf
DOI :
10.1109/ICDAR.2001.953830
Filename :
953830