• DocumentCode
    963872
  • Title

    Offline recognition of unconstrained handwritten texts using HMMs and statistical language models

  • Author

    Bunke, Horst ; Bengio, Samy ; Vinciarelli, Alessandro

  • Volume
    26
  • Issue
    6
  • fYear
    2004
  • fDate
    6/1/2004
  • Firstpage
    709
  • Lastpage
    720
  • Abstract
    This paper presents a system for the offline recognition of large vocabulary unconstrained handwritten texts. The only assumption made about the data is that it is written in English. This allows the application of statistical language models in order to improve the performance of our system. Several experiments have been performed using both single and multiple writer data. Lexica of variable size (from 10,000 to 50,000 words) have been used. The use of language models is shown to improve the accuracy of the system (when the lexicon contains 50,000 words, the error rate is reduced by ∼50 percent for single writer data and by ∼25 percent for multiple writer data). Our approach is described in detail and compared with other methods presented in the literature to deal with the same problem. An experimental setup to correctly deal with unconstrained text recognition is proposed.
  • Keywords
    computational linguistics; handwritten character recognition; hidden Markov models; statistical analysis; English; large vocabulary unconstrained handwritten texts; multiple writer data; offline recognition; single writer data; statistical language models; unconstrained text recognition; Data mining; Dictionaries; Error analysis; Handwriting recognition; Hidden Markov models; Law; Legal factors; Natural languages; Text recognition; Vocabulary; N-grams; Offline cursive handwriting recognition; continuous density Hidden Markov Models; Algorithms; Artificial Intelligence; Automatic Data Processing; Biometry; Computer Graphics; Documentation; Handwriting; Image Enhancement; Image Interpretation, Computer-Assisted; Information Storage and Retrieval; Markov Chains; Models, Statistical; Numerical Analysis, Computer-Assisted; Pattern Recognition, Automated; Reproducibility of Results; Sensitivity and Specificity; Signal Processing, Computer-Assisted; Subtraction Technique; User-Computer Interface
  • fLanguage
    English
  • Journal_Title
    Pattern Analysis and Machine Intelligence, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0162-8828
  • Type

    jour

  • DOI
    10.1109/TPAMI.2004.14
  • Filename
    1288521