Author :
Milidiù, Ruy Luiz ; Pessoa, Artur Alves ; Laber, Eduardo Sany
Author_Institution :
Dept. de Inf., PUC-Rio, Rio de Janeiro, Brazil
Abstract :
The minimum redundancy prefix code problem is to determine, for a given list W=[w1,...,wn] of n positive symbol weights, a list L=[l1,...,ln] of n corresponding integer codeword lengths such that Σi=1n2 -li⩽1 and Σi=1nwil i is minimized. Let us consider the case where W is already sorted. In this case, the output list L can be represented by a list M=[m1,...,mH], where m(l1), for l=1,...,H, denotes the multiplicity of the codeword length l in L and H is the length of the greatest codeword. Fortunately, H is proved to be O(min{log(1/(p1)),n}), where p1 is the smallest symbol probability, given by w1/Σi=1n wi. We present the F-LazyHuff and the E-LazyHuff algorithms. F-LazyHuff runs in O(n) time but requires O(min{H2,n}) additional space. On the other hand, E-LazyHuff runs in O(nlog(n/H)) time, requiring only O(H) additional space. Finally, since our two algorithms have the advantage of not writing at the input buffer during the code calculation, we discuss some applications where this feature is very useful
Keywords :
codes; computational complexity; data compression; minimisation; probability; redundancy; E-LazyHuff algorithm; F-LazyHuff algorithm; integer codeword lengths; list representation; minimum redundancy prefix codes; running time; space-economical algorithms; symbol probability; Binary trees; Data compression; Writing;