• DocumentCode
    3057419
  • Title

    A hidden Markov model for language syntax in text recognition

  • Author

    Hull, Jonathan J.

  • Author_Institution
    Dept. of Comput. Sci., State Univ. of New York, Buffalo, NY, USA
  • fYear
    1992
  • fDate
    30 Aug-3 Sep 1992
  • Firstpage
    124
  • Lastpage
    127
  • Abstract
    The use of a hidden Markov model (HMM) for language syntax to improve the performance of a text recognition algorithm is proposed. Syntactic constraints are described by the transition probabilities between word classes. The confusion between the feature string for a word and the various syntactic classes is also described probabilistically. A modification of the Viterbi algorithm is proposed that finds a fixed number of sequences of syntactic classes for a given sentence that have the highest probabilities of occurrence, given the feature strings for the words. An experimental application of this approach is demonstrated with a word hypothesization algorithm that produces a number of guesses about the identity of each word in a running text. The use of first- and second-order transition probabilities is explored. Overall, a reduction of between 65 and 80 percent in the average number of words that can match a given image is achieved.
  • Keywords
    Markov processes; character recognition; grammars; Viterbi algorithm; hidden Markov model; language syntax; syntactic constraints; text recognition; word class transition probabilities; Algorithm design and analysis; Dictionaries; Hidden Markov models; Image analysis; Performance analysis; Shape; Text analysis
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Pattern Recognition, 1992. Vol.II. Conference B: Pattern Recognition Methodology and Systems, Proceedings., 11th IAPR International Conference on
  • Conference_Location
    The Hague
  • Print_ISBN
    0-8186-2915-0
  • Type

    conf

  • DOI
    10.1109/ICPR.1992.201736
  • Filename
    201736
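
The abstract describes a Viterbi variant that returns a fixed number of the most probable syntactic class sequences rather than only the single best one. The paper's exact modification is not given in this record, so the following is only a minimal sketch of a standard N-best (list) Viterbi search over a first-order HMM; the second-order case is not covered. The function name `n_best_viterbi`, the class labels `DET` and `NOUN`, the feature strings `"a"` and `"b"`, and all probabilities are hypothetical values chosen for illustration, not data from the paper.

```python
import heapq
import math


def n_best_viterbi(observations, states, start_p, trans_p, emit_p, n=3):
    """Return up to n (log_prob, path) pairs, most probable first.

    States stand for syntactic classes; observations stand for the
    feature strings of the words. All names here are illustrative.
    """
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # beams[s] holds up to n (log_prob, path) candidates ending in state s.
    beams = {
        s: [(log(start_p[s]) + log(emit_p[s].get(observations[0], 0.0)), [s])]
        for s in states
    }
    for obs in observations[1:]:
        new_beams = {}
        for s in states:
            e = log(emit_p[s].get(obs, 0.0))
            # Extend every stored candidate of every predecessor state.
            candidates = [
                (lp + log(trans_p[prev].get(s, 0.0)) + e, path + [s])
                for prev in states
                for lp, path in beams[prev]
            ]
            # Keeping the n best per state is enough to recover the
            # n best complete sequences at the end.
            new_beams[s] = heapq.nlargest(n, candidates, key=lambda c: c[0])
        beams = new_beams

    finals = [c for s in states for c in beams[s] if c[0] > float("-inf")]
    return heapq.nlargest(n, finals, key=lambda c: c[0])


# Hypothetical toy model: two syntactic classes, two feature strings.
states = ["DET", "NOUN"]
start_p = {"DET": 0.6, "NOUN": 0.4}
trans_p = {
    "DET": {"DET": 0.1, "NOUN": 0.9},
    "NOUN": {"DET": 0.5, "NOUN": 0.5},
}
emit_p = {
    "DET": {"a": 0.8, "b": 0.2},
    "NOUN": {"a": 0.3, "b": 0.7},
}

results = n_best_viterbi(["a", "b"], states, start_p, trans_p, emit_p, n=2)
for log_prob, path in results:
    print(path, math.exp(log_prob))
```

Working in log space avoids underflow on long sentences, and storing `n` candidates per state at each step (instead of one, as in ordinary Viterbi) is what allows several alternative class sequences to survive to the end of the sentence.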