DocumentCode
1938492
Title
On the performance of BWT sorting algorithms
Author
Seward, Julian
Author_Institution
Microsoft Res., Cambridge, UK
fYear
2000
fDate
2000
Firstpage
173
Lastpage
182
Abstract
In recent years lossless text compression based on the Burrows-Wheeler transform (BWT) has grown popular. The expensive activity during compression is sorting of all the rotations of the block of data to compress. Burrows and Wheeler (1994) describe an efficient implementation of rotation sorting but give little analysis of its performance. This paper addresses that need. We present the base algorithm using a more formal framework, describe two important optimisations not present in the original paper, and measure performance of the variants on a set of 14 files. For completeness, a tuned implementation of Sadakane´s (1998) sorting algorithm was also tested. Merely measuring running times gives poor insight into the finer aspects of performance on contemporary machine architectures. We report measurements of instruction counts and cache misses for the algorithms giving a clearer picture of which variants perform well, and why
Keywords
cache storage; data compression; sorting; text analysis; BWT sorting algorithms; Burrows-Wheeler transform; Sadakane sorting algorithm; cache misses; instruction counts; lossless text compression; optimisation; performance; rotation sorting; Sorting;
fLanguage
English
Publisher
ieee
Conference_Titel
Data Compression Conference, 2000. Proceedings. DCC 2000
Conference_Location
Snowbird, UT
ISSN
1068-0314
Print_ISBN
0-7695-0592-9
Type
conf
DOI
10.1109/DCC.2000.838157
Filename
838157
Link To Document