Title :
A multi-level pattern matching method for text image parsing
Author :
Prussak, Michal ; Hull, Jonathan J.
Author_Institution :
Dept. of Comput. Sci., State Univ. of New York, Buffalo, NY, USA
Abstract :
A multiple-level pattern matching approach to text image parsing is described. The parser assigns syntactic categories to words that occur in a two-dimensional image of a constrained block of text. Bottom-up information from an image segmentation routine, including the number of lines of text, the number of words in each line, and an estimate of the number of characters in each word, is initially used to compute a ranked list of categories for each word and a probability that each category is correct. The local support provided by horizontally adjacent words is used to refine the initial assignment and to assign a ranked list of categories to each line. The word and line category assignments are then probabilistically fit to patterns that describe allowable configurations of lines. The output is a ranked list of the best-fitting patterns and the probabilities that they are correct. A success rate of up to 96 percent is achieved in classifying lines of text in a test set of postal address images.
Keywords :
computerised pattern recognition; grammars; natural languages; postal services; image segmentation routine; multiple-level pattern matching approach; postal address images; syntactic categories; text image parsing; two-dimensional image; Automatic control; Automation; Cities and towns; Computer science; Dictionaries; Image recognition; Image segmentation; Pattern matching; Roads; System testing
Conference_Title :
Artificial Intelligence Applications, 1991. Proceedings., Seventh IEEE Conference on
Conference_Location :
Miami Beach, FL
Print_ISBN :
0-8186-2135-4
DOI :
10.1109/CAIA.1991.120867