DocumentCode :
1355705
Title :
Universal lossless compression via multilevel pattern matching
Author :
Kieffer, John C. ; Yang, En-Hui ; Nelson, Gregory J. ; Cosman, Pamela
Author_Institution :
Dept. of Electr. & Comput. Eng., Minnesota Univ., Minneapolis, MN, USA
Volume :
46
Issue :
4
fYear :
2000
fDate :
7/1/2000 12:00:00 AM
Firstpage :
1227
Lastpage :
1245
Abstract :
A universal lossless data compression code called the multilevel pattern matching code (MPM code) is introduced. In processing a finite-alphabet data string of length n, the MPM code operates at O(log log n) levels sequentially. At each level, the MPM code detects matching patterns in the input data string (substrings of the data appearing in two or more nonoverlapping positions). The matching patterns detected at each level are of a fixed length which decreases by a constant factor from level to level, until this fixed length becomes one at the final level. The MPM code represents information about the matching patterns at each level as a string of tokens, with each token string encoded by an arithmetic encoder. From the concatenated encoded token strings, the decoder can reconstruct the data string via several rounds of parallel substitutions. A O(1/log n) maximal redundancy/sample upper bound is established for the MPM code with respect to any class of finite state sources of uniformly bounded complexity. We also show that the MPM code is of linear complexity in terms of time and space requirements. The results of some MPM code compression experiments are reported
Keywords :
arithmetic codes; computational complexity; concatenated codes; data compression; pattern matching; MPM code; MPM code compression experiments; arithmetic encoder; concatenated encoded token strings; data reconstruction; finite state sources; finite-alphabet data string length; input data string; linear complexity; maximal redundancy/sample upper bound; multilevel pattern matching; parallel substitutions; substrings; uniformly bounded complexity; universal lossless data compression code; Arithmetic; Concatenated codes; Councils; Data compression; Decoding; Entropy; Pattern matching; Performance loss; Source coding; Upper bound;
fLanguage :
English
Journal_Title :
Information Theory, IEEE Transactions on
Publisher :
ieee
ISSN :
0018-9448
Type :
jour
DOI :
10.1109/18.850665
Filename :
850665
Link To Document :
بازگشت