• DocumentCode
    1938492
  • Title

    On the performance of BWT sorting algorithms

  • Author

    Seward, Julian

  • Author_Institution
    Microsoft Res., Cambridge, UK
  • fYear
    2000
  • fDate
    2000
  • Firstpage
    173
  • Lastpage
    182
  • Abstract
    In recent years lossless text compression based on the Burrows-Wheeler transform (BWT) has grown popular. The expensive activity during compression is sorting of all the rotations of the block of data to compress. Burrows and Wheeler (1994) describe an efficient implementation of rotation sorting but give little analysis of its performance. This paper addresses that need. We present the base algorithm using a more formal framework, describe two important optimisations not present in the original paper, and measure performance of the variants on a set of 14 files. For completeness, a tuned implementation of Sadakane´s (1998) sorting algorithm was also tested. Merely measuring running times gives poor insight into the finer aspects of performance on contemporary machine architectures. We report measurements of instruction counts and cache misses for the algorithms giving a clearer picture of which variants perform well, and why
  • Keywords
    cache storage; data compression; sorting; text analysis; BWT sorting algorithms; Burrows-Wheeler transform; Sadakane sorting algorithm; cache misses; instruction counts; lossless text compression; optimisation; performance; rotation sorting; Sorting;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Data Compression Conference, 2000. Proceedings. DCC 2000
  • Conference_Location
    Snowbird, UT
  • ISSN
    1068-0314
  • Print_ISBN
    0-7695-0592-9
  • Type

    conf

  • DOI
    10.1109/DCC.2000.838157
  • Filename
    838157