DocumentCode :
3061698
Title :
Sorted sliding window compression
Author :
Gräf, Ulrich
Author_Institution :
Inst. of Theor. Comput. Sci., Tech. Univ. Darmstadt, Germany
fYear :
1999
fDate :
29-31 Mar 1999
Firstpage :
527
Abstract :
Summary form only given. Sorted sliding window compression (SSWC) uses a new model (sorted sliding window model, SSWM) to encode strings efficiently, which appear again while encoding a symbol sequence. The SSWM holds statistics of all strings up to a certain length k in a sliding window of size n. The compression program can use the SSWM to determine if the string of the next symbols are already contained in the sliding window and returns the length of match. SSWM gives directly statistics (borders of subinterval in an interval) for use in entropy encoding methods like arithmetic coding or dense coding. For a given number in an interval and string length the SSWM gives back the corresponding string which is used in decompressing. After an encoding (decoding) step the model is updated with the just encoded (decoded) characters. The model sorts all string starting points in the sliding window lexicographically. A simple way to implement the SSWM is by exhaustive search in the sliding window. An implementation with a B-tree together with special binary searches is used here. SSWC is a simple compression scheme, which uses this new model to evaluate its properties. It looks on the next characters to encode and determines the longest match with the SSWM. If the match is smaller than 2, the character is encoded. Otherwise the length and the subinterval of the string are encoded. The length values are encoded together with the single characters by using the same adaptive frequency model. Additionally some rules are used to reduce the matching length if the code length gets worse. Encoding of frequencies and intervals is done with dense coding
Keywords :
adaptive decoding; arithmetic codes; binary sequences; data compression; entropy codes; search problems; sorting; statistical analysis; string matching; tree data structures; variable length codes; B-tree; adaptive frequency model; arithmetic coding; binary searches; code length; decoding; dense coding; entropy encoding; exhaustive search; matching length; model update; sorted sliding window compression; statistics; symbol sequence encoding; Arithmetic; Computer science; Documentation; Encoding; Entropy; Frequency; Statistics; Usability;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Data Compression Conference, 1999. Proceedings. DCC '99
Conference_Location :
Snowbird, UT
ISSN :
1068-0314
Print_ISBN :
0-7695-0096-X
Type :
conf
DOI :
10.1109/DCC.1999.785684
Filename :
785684
Link To Document :
بازگشت