DocumentCode
1412724
Title
Asymptotic average redundancy of Huffman (and other) block codes
Author
Szpankowski, Wojciech
Author_Institution
Dept. of Comput. Sci., Purdue Univ., West Lafayette, IN, USA
Volume
46
Issue
7
fYear
2000
fDate
11/1/2000 12:00:00 AM
Firstpage
2434
Lastpage
2443
Abstract
We study asymptotically the redundancy of Huffman (and other) codes. It has been known from the inception of the Huffman (1952) code that in the worst case its redundancy-defined as the excess of the code length over the optimal (ideal) code length-is not more than one. However, to the best of our knowledge no precise asymptotic results have been reported in literature thus far. We consider here a memoryless binary source generating a sequence of length n distributed as binomial (n, p) with p being the probability of emitting 0. Based on the results of Stubley (1994), we prove that for p<1/2 the average redundancy R¯nH of the Huffman code becomes as n→∞: R¯nH={(3/2-(1/ln2+o(1))=0.057304…, α irrational); (3/2-(1/M)(⟨βMn⟩-½)); (-(1/M(1-2-1M/))2-⟨nβM⟩M/); (+O(ρn), α=N/M rational); where α=log2 (1-p)/p and β=-log2(1-p), ρ<1, M, N are integers such that gcd (N, M)=1, and ⟨x⟩=x-[x] is the fractional part of x. The appearance of the fractal-like function ⟨βMn⟩ explains the erratic behavior of the Huffman redundancy, and its “resistance” to succumb to a precise analysis. As a side result, we prove that the average redundancy of the Shannon block code is as n→∞: R¯nS{(½+o(1), α irrational); (½-1/M (⟨Mnβ⟩-½)); (+O(ρn), α=N/M rational); where ρ<1. Finally, we derive the redundancy of the Golomb (1966) code (for the geometric distribution) which can be viewed as a special case of the Huffman and Shannon codes, Golomb´s code redundancy has only oscillating behavior (i.e., there is not convergent mode). These findings are obtained by analytic methods such as theory of distribution of sequences modulo 1 and Fourier series
Keywords
Fourier series; Huffman codes; binary sequences; binomial distribution; block codes; memoryless systems; redundancy; Fourier series; Golomb code; Huffman code; Huffman redundancy; Shannon block code; Shannon code; asymptotic average redundancy; binary sequence; binomial distribution; block codes; code length; distribution theory; fractal-like function; geometric distribution; lossless coding; memoryless binary source; oscillating behavior; probability; sequence length; Binary sequences; Block codes; Computer science; Entropy; Fourier series; Fractals; Huffman coding; Information theory; Postal services; Source coding;
fLanguage
English
Journal_Title
Information Theory, IEEE Transactions on
Publisher
ieee
ISSN
0018-9448
Type
jour
DOI
10.1109/18.887856
Filename
887856
Link To Document