Title :
On optimality of variants of the block sorting compression
Author :
Sadakane, Kunihiko
Author_Institution :
Dept. of Inf. Sci., Tokyo Univ., Japan
fDate :
30 Mar-1 Apr 1998
Abstract :
Summary form only given. Block sorting uses the Burrows-Wheeler transformation (BWT) which permutes an input string. The permutation is defined by the lexicographic order of contexts of symbols. If we assume that symbol probability is defined by preceding k symbols called context, symbols whose contexts are the same are collected in consecutive regions after the BWT. Sadakane (1997) proposed a variant of the block sorting and it is asymptotically optimal for any finite-order Markov source if permutation of symbols whose contexts are the same is random. However, the variant encodes 1 symbols as a block and therefore it is not practical because 1 is large. We propose two compression schemes not using blocks but encoding symbols one by one by using arithmetic codes. The move-to-front transformation is not used. The former encodes symbols by different codes defined by symbol frequencies in contexts. It is asymptotically optimal for k-th order Markov sources. However, it is available only if the order k of the source is already known. The latter divides the permuted string into many parts and encodes symbols using different arithmetic codes by the parts. Each part, has symbols whose contexts are the same. If the permutation is random, the scheme is asymptotically optimal for any finite-order Markov source. The permutation in the BWT is not completely random. However, we conjecture that the permuted string is memoryless and our schemes work
Keywords :
Markov processes; arithmetic codes; combinatorial mathematics; data compression; BWT; Burrows-Wheeler transformation; arithmetic codes; block sorting compression; compression schemes; context; finite-order Markov source; input string; lexicographic order; memoryless string; permutation; permuted string; symbol probability; variants optimality; Sorting;
Conference_Titel :
Data Compression Conference, 1998. DCC '98. Proceedings
Conference_Location :
Snowbird, UT
Print_ISBN :
0-8186-8406-2
DOI :
10.1109/DCC.1998.672312