Title :
Universal Compression of Markov and Related Sources Over Arbitrary Alphabets
Author :
Dhulipala, A.K. ; Orlitsky, Alon
Author_Institution :
Electr. & Comput. Eng. Dept., Univ. of California San Diego, La Jolla, CA
Abstract :
Recent work has considered encoding a string by separately conveying its symbols and its pattern, the order in which the symbols appear. It was shown that the patterns of independent and identically distributed (i.i.d.) strings can be losslessly compressed with diminishing per-symbol redundancy. In this correspondence, the pattern redundancy of distributions with memory is considered. Close lower and upper bounds are established on the pattern redundancy of strings generated by hidden Markov models (HMMs) with a small number of states, showing in particular that their per-symbol pattern redundancy diminishes with increasing string length. The upper bounds are obtained by analyzing the growth rate of the number of multidimensional integer partitions, and the lower bounds by using Hayman's theorem.
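To make the notion of a pattern concrete, here is a minimal sketch (the function name and exact integer encoding are illustrative assumptions, not taken from the paper): each symbol is replaced by the index of its first appearance, so the pattern records only the order and repetition structure of symbols, independent of the underlying alphabet.

```python
def pattern(s):
    """Return the pattern of a sequence: each symbol is mapped to the
    order of its first appearance (an illustrative encoding; the exact
    convention is an assumption for this sketch)."""
    first_seen = {}          # symbol -> index of first appearance
    out = []
    for sym in s:
        if sym not in first_seen:
            first_seen[sym] = len(first_seen) + 1
        out.append(first_seen[sym])
    return out

# Two strings over different alphabets can share the same pattern,
# which is why patterns can be compressed independently of the symbols:
print(pattern("abracadabra"))  # -> [1, 2, 3, 1, 4, 1, 5, 1, 2, 3, 1]
print(pattern("obrocodobro"))  # -> [1, 2, 3, 1, 4, 1, 5, 1, 2, 3, 1]
```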
Keywords :
data compression; hidden Markov models; source coding; HMM; arbitrary alphabet; hidden Markov model; pattern redundancy; string encoding; universal compression; Algorithm design and analysis; Compression algorithms; Data communication; Entropy; Hidden Markov models; Information theory; Multidimensional systems; Source coding; Speech; Upper bound; Hidden Markov models (HMMs); integer partitions; large alphabets; multidimensional partitions; patterns; redundancy; universal compression;
Journal_Title :
Information Theory, IEEE Transactions on
DOI :
10.1109/TIT.2006.880013