Title :
Enforcing similarity constraints with integer programming for better scene text recognition
Author :
Smith, David L. ; Field, Jeff ; Learned-Miller, Erik
Author_Institution :
Dept. of Comput. Sci., Univ. of Massachusetts Amherst, Amherst, MA, USA
Abstract :
The recognition of text in everyday scenes is made difficult by viewing conditions, unusual fonts, and lack of linguistic context. Most methods integrate a priori appearance information and some sort of hard or soft constraint on the allowable strings. Weinman and Learned-Miller showed that the similarity among characters, as a supplement to the appearance of the characters with respect to a model, could be used to improve scene text recognition. In this work, we make further improvements to scene text recognition by taking a novel approach to the incorporation of similarity. In particular, we train a similarity expert that learns to classify each pair of characters as equivalent or not. After removing logical inconsistencies in an equivalence graph, we formulate the search for the maximum likelihood interpretation of a sign as an integer program. We incorporate the equivalence information as constraints in the integer program and build an optimization criterion out of appearance features and character bigrams. Finally, we take the optimal solution from the integer program, and compare all “nearby” solutions using a probability model for strings derived from search engine queries. We demonstrate word error reductions of more than 30% relative to previous methods on the same data set.
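As a rough illustration of the constrained search the abstract describes, the sketch below picks the highest-scoring string under per-position appearance scores and character bigram scores, subject to the requirement that positions judged equivalent receive the same label. It is not the authors' formulation: exhaustive enumeration stands in for an integer-program solver, and the function name, the toy scores, the three-letter alphabet, and the default bigram penalty are all assumptions made for the example.

```python
from itertools import product
import math

def best_constrained_string(appearance, bigram, same_pairs, alphabet,
                            unseen_bigram=-5.0):
    """Return (score, string) maximizing appearance + bigram log-scores,
    subject to equal-label constraints on the index pairs in same_pairs.
    Hypothetical helper: brute force replaces a real integer-program solver."""
    best_score, best_string = -math.inf, None
    for labels in product(alphabet, repeat=len(appearance)):
        # Equivalence constraints: positions judged "the same character"
        # must be assigned the same label.
        if any(labels[i] != labels[j] for i, j in same_pairs):
            continue
        # Objective: per-position appearance scores plus bigram scores.
        score = sum(appearance[k][c] for k, c in enumerate(labels))
        score += sum(bigram.get(pair, unseen_bigram)
                     for pair in zip(labels, labels[1:]))
        if score > best_score:
            best_score, best_string = score, "".join(labels)
    return best_score, best_string

# Toy data (assumed): the first glyph looks slightly more like "i" than "l".
alphabet = "ilo"
appearance = [
    {"i": -0.5, "l": -1.0, "o": -4.0},
    {"i": -4.0, "l": -4.0, "o": -0.2},
    {"i": -3.0, "l": -0.1, "o": -4.0},
]
bigram = {("l", "o"): -1.0, ("o", "l"): -1.0, ("i", "o"): -1.2}

# Without similarity constraints the appearance scores win.
unconstrained = best_constrained_string(appearance, bigram, [], alphabet)[1]
# Declaring glyphs 0 and 2 equivalent forces a consistent reading.
constrained = best_constrained_string(appearance, bigram, [(0, 2)], alphabet)[1]
```

With these toy scores the unconstrained search returns "iol", while the constraint that the first and last glyphs match yields "lol". Enumeration is exponential in the string length, which is why a real system would hand the same objective and constraints to a solver.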
Keywords :
feature extraction; integer programming; learning (artificial intelligence); maximum likelihood estimation; optical character recognition; probability; query processing; search engines; text analysis; appearance features; character bigrams; character classification; equivalence graph; equivalence information; maximum likelihood interpretation; optimization criterion; probability model; scene text recognition; search engine queries; similarity constraints; similarity expert; word error reductions; Accuracy; Character recognition; IP networks; Markov processes; Optimization; Text recognition; Training
Conference_Title :
Computer Vision and Pattern Recognition (CVPR), 2011 IEEE Conference on
Conference_Location :
Providence, RI
Print_ISBN :
978-1-4577-0394-2
DOI :
10.1109/CVPR.2011.5995700