Postprocessing statistical language models for handwritten Chinese character recognizer

Author

Wong, Pak-Kwong ; Chan, Chorkin

Author_Institution

Dept. of Comput. Sci., Hong Kong Univ., Hong Kong

Volume

29

Issue

2

fYear

1999

fDate

4/1/1999 12:00:00 AM

Firstpage

286

Lastpage

291

Abstract

Two statistical language models are investigated for their effectiveness in improving the accuracy of a Chinese character recognizer. The baseline model is lexical-analytic in nature: it segments a sequence of character images by maximum matching of words, taking word binding forces into account. A bigram model over word classes is then investigated and compared with the baseline model in terms of the improvement in recognition rate of the image recognizer. On average, the baseline language model improves the recognition rate by about 7%, while the bigram model improves it by about 10%.
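As a rough illustration of the kind of bigram postprocessing the abstract describes, the sketch below rescans a lattice of recognizer candidates with class-bigram scores using Viterbi decoding. It is not the authors' implementation: the function name viterbi_rescore, the toy lattice, the class labels, and all probabilities are hypothetical, and for brevity each candidate is treated as a single-character token rather than a segmented word.

```python
import math


def viterbi_rescore(lattice, token_class, bigram_logp, unseen_logp=math.log(1e-6)):
    """Pick the best candidate path through a recognition lattice.

    lattice     : list of positions; each position is a list of
                  (token, recognizer_log_prob) candidates.
    token_class : dict mapping token -> class label (hypothetical classes).
    bigram_logp : dict mapping (prev_class, class) -> log probability.
    unseen_logp : assumed fallback score for class pairs not in bigram_logp.
    """
    if not lattice:
        return []

    # Each table maps token -> (best score so far, best previous token).
    prev = {tok: (lp, None) for tok, lp in lattice[0]}
    history = [prev]

    for candidates in lattice[1:]:
        cur = {}
        for tok, lp in candidates:
            best_score, best_prev = -math.inf, None
            for ptok, (pscore, _) in prev.items():
                pair = (token_class[ptok], token_class[tok])
                score = pscore + bigram_logp.get(pair, unseen_logp) + lp
                if score > best_score:
                    best_score, best_prev = score, ptok
            cur[tok] = (best_score, best_prev)
        history.append(cur)
        prev = cur

    # Backtrack from the best-scoring final candidate.
    tok = max(prev, key=lambda t: prev[t][0])
    path = [tok]
    for step in range(len(history) - 1, 0, -1):
        tok = history[step][tok][1]
        path.append(tok)
    path.reverse()
    return path


# Toy data, invented for illustration only.
lattice = [
    [("找", math.log(0.55)), ("我", math.log(0.45))],
    [("们", math.log(0.60)), ("扪", math.log(0.40))],
]
token_class = {"我": "PRON", "找": "VERB", "们": "SUFFIX", "扪": "VERB"}
bigram_logp = {
    ("PRON", "SUFFIX"): math.log(0.50),
    ("VERB", "SUFFIX"): math.log(0.01),
    ("PRON", "VERB"): math.log(0.30),
    ("VERB", "VERB"): math.log(0.05),
}

# Recognizer-only choice: the top-scoring candidate at each position.
top1 = [max(cands, key=lambda c: c[1])[0] for cands in lattice]
# Choice after combining recognizer scores with class-bigram scores.
best = viterbi_rescore(lattice, token_class, bigram_logp)
print(top1, best)
```

In this toy setup the recognizer alone prefers 找 in the first position, while the combined score prefers 我 because the assumed PRON-to-SUFFIX transition is far more likely than VERB-to-SUFFIX. A system closer to the abstract would operate on word hypotheses spanning several character positions and on class statistics estimated from a corpus.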

Keywords

character sets; handwritten character recognition; statistical analysis; Chinese character recognizer; baseline model; bigram statistics; character images; handwritten Chinese; statistical language models; Character recognition; Handwriting recognition; Image analysis; Image recognition; Image segmentation; Image sequence analysis; Lattices; Natural languages; Statistics; Testing

fLanguage

English

Journal_Title

Systems, Man, and Cybernetics, Part B: Cybernetics, IEEE Transactions on

Publisher

IEEE

ISSN

1083-4419

Type

jour

DOI

10.1109/3477.752802

Filename

752802