DocumentCode :
1938492
Title :
On the performance of BWT sorting algorithms
Author :
Seward, Julian
Author_Institution :
Microsoft Res., Cambridge, UK
fYear :
2000
fDate :
2000
Firstpage :
173
Lastpage :
182
Abstract :
In recent years lossless text compression based on the Burrows-Wheeler transform (BWT) has grown popular. The expensive activity during compression is sorting of all the rotations of the block of data to compress. Burrows and Wheeler (1994) describe an efficient implementation of rotation sorting but give little analysis of its performance. This paper addresses that need. We present the base algorithm using a more formal framework, describe two important optimisations not present in the original paper, and measure performance of the variants on a set of 14 files. For completeness, a tuned implementation of Sadakane´s (1998) sorting algorithm was also tested. Merely measuring running times gives poor insight into the finer aspects of performance on contemporary machine architectures. We report measurements of instruction counts and cache misses for the algorithms giving a clearer picture of which variants perform well, and why
Keywords :
cache storage; data compression; sorting; text analysis; BWT sorting algorithms; Burrows-Wheeler transform; Sadakane sorting algorithm; cache misses; instruction counts; lossless text compression; optimisation; performance; rotation sorting; Sorting;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Data Compression Conference, 2000. Proceedings. DCC 2000
Conference_Location :
Snowbird, UT
ISSN :
1068-0314
Print_ISBN :
0-7695-0592-9
Type :
conf
DOI :
10.1109/DCC.2000.838157
Filename :
838157
Link To Document :
بازگشت