• DocumentCode
    419630
  • Title
    Influence of language models and candidate set size on contextual post-processing for Chinese script recognition
  • Author
    Li, Yuan-Xiang; Tan, Chew Lim
  • Author_Institution
    Sch. of Comput., Nat. Univ. of Singapore, Singapore
  • Volume
    2
  • fYear
    2004
  • fDate
    23-26 Aug. 2004
  • Firstpage
    537
  • Abstract
    In the Chinese language, a word consisting of one or more characters is a basic syntax-meaningful unit; however, each character in the word also has a definite meaning in itself. We compare the perplexities of four n-gram language models (character-based bigram, character-based trigram, word-based bigram and class-based bigram) and their influence on the performance of contextual post-processing of Chinese scripts in an offline handwritten Chinese character recognition system. We also demonstrate the influence of the candidate set size on the performance of contextual post-processing in detail, and indicate that the number of candidates should vary with each script.
  • Keywords
    handwritten character recognition; natural languages; text analysis; word processing; Chinese script recognition; basic syntax-meaningful unit; candidate set size; character-based bigram; character-based trigram; class-based bigram; contextual post-processing; language models; offline handwritten Chinese character recognition system; word-based bigram; Character recognition; Computational modeling; Context modeling; Handwriting recognition; Image recognition; Natural languages; Pattern recognition; Probability; Shape; Writing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Pattern Recognition, 2004. ICPR 2004. Proceedings of the 17th International Conference on
  • ISSN
    1051-4651
  • Print_ISBN
    0-7695-2128-2
  • Type
    conf
  • DOI
    10.1109/ICPR.2004.1334295
  • Filename
    1334295