Title :
In-place length-restricted prefix coding
Author :
Milidiu, Ruy Luiz ; Pessoa, Artur Alves ; Laber, Eduardo Sany
Author_Institution :
Dept. de Inf., PUC, Rio de Janeiro, Brazil
Abstract :
Huffman codes, combined with word-based models, are considered efficient compression schemes for full-text retrieval systems. The decoding rate for these schemes can be substantially improved if the maximum length of the codewords is not greater then the machine word size L. However, if the vocabulary is large, simple methods for generating optimal length-restricted codes are either too slow or require a significantly large amount of memory. We present an in-place, simple and fast implementation for the BRCI (Build, Remove, Condense and Insert) algorithm, an approximative method for length-restricted coding. It overwrites a sorted input list of n weights with the corresponding codeword lengths in O(n) time. In addition, the worst-case compression loss introduced by BRCI codes with respect to unrestricted Huffman codes is proved to be negligible for all practical values of both L and n
Keywords :
Huffman codes; computational complexity; data compression; decoding; full-text databases; sorting; vocabulary; BRCI; Build Remove Condense Insert algorithm; Huffman codes; codewords; compression schemes; decoding; full-text retrieval systems; in-place length-restricted prefix coding; sorted input list; vocabulary; word-based models; Databases; Decoding; Delay; Encoding; Information retrieval; Polynomials; Upper bound; Vocabulary;
Conference_Titel :
String Processing and Information Retrieval: A South American Symposium, 1998. Proceedings
Conference_Location :
Santa Cruz de La Sierra
Print_ISBN :
0-8186-8664-2
DOI :
10.1109/SPIRE.1998.712982