DocumentCode :
2605161
Title :
Lossless Compression of Textual Images: A Study on Indic Script Documents
Author :
Garain, Utpal ; Chakraborty, M.P. ; Chanda, Bhabatosh
Author_Institution :
Indian Stat. Inst., Kolkata
Volume :
3
fYear :
0
fDate :
0-0 0
Firstpage :
806
Lastpage :
809
Abstract :
This paper presents a method for lossless compression of Indian language textual images. The study extends the previously developed pattern matching and substitution (PM&S)-based method for lossy compression of similar images. An efficient method for residue coding is proposed, and its performance is compared with CCITT Gr-IV and JBIG. A set of 20 text images in the two most popular Indic scripts, namely Devanagari (Hindi) and Bengali, is used in the experiment. The best results are achieved by the PM&S-based approach followed by LZW-based residue coding. This combined scheme gives a lossless compression ratio of about 37.9
Keywords :
data compression; document image processing; image coding; natural languages; pattern matching; text analysis; CCITT Gr-IV; LZW-based residue coding; Indian language textual images; Indic script documents; JBIG; lossless compression; pattern matching and substitution; Arithmetic; Decoding; Dictionaries; Image coding; Image reconstruction; Image storage; Libraries; Optical character recognition software; Pattern matching; Prototypes;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Pattern Recognition, 2006. ICPR 2006. 18th International Conference on
Conference_Location :
Hong Kong
ISSN :
1051-4651
Print_ISBN :
0-7695-2521-0
Type :
conf
DOI :
10.1109/ICPR.2006.776
Filename :
1699648