DocumentCode
419630
Title
Influence of language models and candidate set size on contextual post-processing for Chinese script recognition
Author
Li, Yuan-Xiang ; Tan, Chew Lim
Author_Institution
Sch. of Comput., Nat. Univ. of Singapore, Singapore
Volume
2
fYear
2004
fDate
23-26 Aug. 2004
Firstpage
537
Abstract
In the Chinese language, a word consisting of one or more characters is a basic syntax-meaningful unit; however, each character in the word also has a definite meaning in itself. We compare the perplexities of four n-gram language models (character-based bigram, character-based trigram, word-based bigram and class-based bigram) and their influence on the performance of contextual post-processing of Chinese scripts in an offline handwritten Chinese character recognition system. We also demonstrate in detail the influence of the candidate set size on the performance of contextual post-processing, and indicate that the number of candidates should vary with each script.
Keywords
handwritten character recognition; natural languages; text analysis; word processing; Chinese script recognition; basic syntax-meaningful unit; candidate set size; character-based bigram; character-based trigram; class-based bigram; contextual post-processing; language models; offline handwritten Chinese character recognition system; word-based bigram; Character recognition; Computational modeling; Context modeling; Handwriting recognition; Image recognition; Natural languages; Pattern recognition; Probability; Shape; Writing
fLanguage
English
Publisher
IEEE
Conference_Titel
Pattern Recognition, 2004. ICPR 2004. Proceedings of the 17th International Conference on
ISSN
1051-4651
Print_ISBN
0-7695-2128-2
Type
conf
DOI
10.1109/ICPR.2004.1334295
Filename
1334295