DocumentCode
3057419
Title
A hidden Markov model for language syntax in text recognition
Author
Hull, Jonathan J.
Author_Institution
Dept. of Comput. Sci., State Univ. of New York, Buffalo, NY, USA
fYear
1992
fDate
30 Aug-3 Sep 1992
Firstpage
124
Lastpage
127
Abstract
The use of a hidden Markov model (HMM) for language syntax to improve the performance of a text recognition algorithm is proposed. Syntactic constraints are described by the transition probabilities between word classes. The confusion between the feature string for a word and the various syntactic classes is described probabilistically. A modification of the Viterbi algorithm is proposed that, given the feature strings for the words of a sentence, finds a fixed number of syntactic class sequences with the highest probabilities of occurrence. An experimental application of this approach is demonstrated with a word hypothesization algorithm that produces a number of guesses about the identity of each word in a running text. The use of first- and second-order transition probabilities is explored. Overall, the average number of words that can match a given image is reduced by between 65 and 80 percent.
Keywords
Markov processes; character recognition; grammars; Viterbi algorithm; hidden Markov model; language syntax; syntactic constraints; text recognition; word class transition probabilities; Algorithm design and analysis; Dictionaries; Hidden Markov models; Image analysis; Performance analysis; Shape; Text analysis
fLanguage
English
Publisher
IEEE
Conference_Titel
Pattern Recognition, 1992. Vol.II. Conference B: Pattern Recognition Methodology and Systems, Proceedings., 11th IAPR International Conference on
Conference_Location
The Hague
Print_ISBN
0-8186-2915-0
Type
conf
DOI
10.1109/ICPR.1992.201736
Filename
201736
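Illustration
The abstract describes a Viterbi variant that returns a fixed number of the most probable syntactic class sequences rather than only the single best one. The sketch below is one common way to do this (a k-best, or list, Viterbi that keeps up to k candidates per class at each word position) for a first-order model. It is not taken from the paper: the function name `k_best_viterbi`, the class labels, the feature symbols `f1` to `f3`, and every probability value are hypothetical and chosen only for illustration.

```python
import heapq
from math import log


def k_best_viterbi(obs, states, start_p, trans_p, emit_p, k):
    """Return up to k (log_prob, path) pairs, best first (first-order HMM)."""
    # beams[s] holds up to k (log_prob, path) candidates that end in state s.
    beams = {}
    for s in states:
        p = start_p.get(s, 0.0) * emit_p[s].get(obs[0], 0.0)
        beams[s] = [(log(p), (s,))] if p > 0 else []

    for o in obs[1:]:
        new_beams = {}
        for s in states:
            e = emit_p[s].get(o, 0.0)
            cands = []
            if e > 0:
                for prev in states:
                    t = trans_p[prev].get(s, 0.0)
                    if t > 0:
                        # Extend every surviving candidate of the previous state.
                        for lp, path in beams[prev]:
                            cands.append((lp + log(t) + log(e), path + (s,)))
            # Keeping the k best per state is enough to recover the k best overall.
            new_beams[s] = heapq.nlargest(k, cands)
        beams = new_beams

    finals = [c for s in states for c in beams[s]]
    return heapq.nlargest(k, finals)


# Hypothetical toy model: three word classes, three feature-string symbols.
states = ["DET", "NOUN", "VERB"]
start_p = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
trans_p = {
    "DET": {"NOUN": 0.9, "VERB": 0.1},
    "NOUN": {"VERB": 0.7, "NOUN": 0.2, "DET": 0.1},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
emit_p = {
    "DET": {"f1": 0.8, "f2": 0.1, "f3": 0.1},
    "NOUN": {"f1": 0.1, "f2": 0.6, "f3": 0.3},
    "VERB": {"f1": 0.1, "f2": 0.3, "f3": 0.6},
}

results = k_best_viterbi(["f1", "f2", "f3"], states, start_p, trans_p, emit_p, k=3)
for log_prob, path in results:
    print(round(log_prob, 4), " ".join(path))
```

In a setting like the one the abstract describes, the returned class sequences could then be used to discard word hypotheses whose class does not appear at the corresponding position. A second-order variant would condition each transition on the two preceding classes, which this sketch does not attempt.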