Title :
Fast Insertion and Deletion in Compressed Texts
Author :
Böttcher, Stefan ; Bültmann, Alexander ; Hartel, Rita ; Schlussler, Jonathan
Author_Institution :
EIM - Electr. Eng., Comput. Sci. & Math., Univ. of Paderborn, Paderborn, Germany
Abstract :
Text compression techniques like bzip2 lack the possibility to delete the nth word or to insert text be-fore the nth word of compressed texts without prior decompression of the compressed texts. We present a text compression technique that supports fast insertion into and deletion from compressed texts without full decompression of the compressed text. Our approach combines Indexed Reversible Transformation (IRT) [1], Run-Length-Encoding (RLE), and the Wavelet Tree (WT). For a reasonable size of inserted or deleted texts (more details are given in [2]), our approach is faster than modifying uncompressed text preceded by a decompression step and followed by a compression step.
Keywords :
data compression; word processing; compressed text deletion; compressed text fast insertion; indexed reversible transformation; run length encoding; text compression technique; wavelet tree; Data compression; Decoding; Electrical engineering; Indexes; Merging; Wavelet transforms; BWT; RLE; burrows wheeler transformation; deletion; insertion; text compression; wavelet trees;
Conference_Titel :
Data Compression Conference (DCC), 2012
Conference_Location :
Snowbird, UT
Print_ISBN :
978-1-4673-0715-4
DOI :
10.1109/DCC.2012.50