DocumentCode :
3227074
Title :
Suffix Array for Large Alphabet
Author :
Sestak, R. ; Lansky, J. ; Zemlicka, Michal
Author_Institution :
Charles Univ., Prague
fYear :
2008
fDate :
25-27 March 2008
Firstpage :
543
Lastpage :
543
Abstract :
The Burrows-Wheeler Transform (BWT) is the main component of block compression, which offers a good balance of speed and compression ratio. Suffix arrays are used in the coding phase of BWT, and we focus on constructing them for alphabets larger than 256 symbols. This work was motivated by the XBW software project, an application for compressing large XML files using word- and syllable-based BWT. The role of BWT is to reorder the input before other algorithms are applied. We describe and implement three families of algorithms for encoding. Finally, we present the algorithm by Kärkkäinen and Sanders for constructing suffix arrays in linear time.
Keywords :
XML; data compression; text analysis; transforms; Burrows-Wheeler transform; XBW software project; XML files compression; alphabet coding; block compression; suffix array; syllable-based BWT; textual files; word-based BWT; Application software; Arithmetic; Data compression; Encoding; Mathematics; Phased arrays; Physics; Sorting; Testing; XML; Burrows-Wheeler transform; suffix array sorting; text compression; word-based compression;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Compression Conference, 2008. DCC 2008
Conference_Location :
Snowbird, UT
ISSN :
1068-0314
Print_ISBN :
978-0-7695-3121-2
Type :
conf
DOI :
10.1109/DCC.2008.22
Filename :
4483370