DocumentCode :
397294
Title :
Probability table compression for handwritten character recognition
Author :
Cho, Sung-Jung ; Perrone, Michael ; Ratzlaff, Eugene
Author_Institution :
IBM T. J. Watson Res. Center, Yorktown Heights, NY, USA
fYear :
2003
fDate :
3-6 Aug. 2003
Firstpage :
173
Abstract :
This paper presents a new probability table memory compression method based on mixture models and its application to N-tuple recognizers and N-gram character language models. Joint probability tables are decomposed into lower-dimensional probability components and their mixtures. The maximum likelihood parameters of the mixture models are trained by the expectation maximization (EM) algorithm and quantized to one-byte integers. Probability elements that the mixture models do not estimate reliably are kept separately. Experimental results with on-line handwritten UNIPEN uppercase and lowercase characters show that the total memory size of an on-line scanning N-tuple recognizer is reduced from 12.3 MB to 0.66 MB, while the recognition rate drops from 91.64% to 91.13% for uppercase characters and from 88.44% to 87.31% for lowercase characters. The N-gram character language model was compressed from 73.6 MB to 0.58 MB with minimal reduction in performance.
Keywords :
data compression; handwriting recognition; handwritten character recognition; maximum likelihood estimation; probability; N-gram character language model; expectation maximization algorithm; joint probability table; maximum likelihood parameter; mixture coefficient table; mixture component table; mixture model; on-line handwritten UNIPEN lowercase character; on-line handwritten UNIPEN uppercase character; on-line scanning N-tuple recognizer; probability table memory compression method; Character recognition; Handwriting recognition; Text analysis
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Document Analysis and Recognition, 2003. Proceedings. Seventh International Conference on
Print_ISBN :
0-7695-1960-1
Type :
conf
DOI :
10.1109/ICDAR.2003.1227654
Filename :
1227654