Title :
Deplump for Streaming Data
Author :
Bartlett, Nicholas ; Wood, Frank
Author_Institution :
Dept. of Stat., Columbia Univ., New York, NY, USA
Abstract :
We present a general-purpose, lossless compressor for streaming data. This compressor is based on the deplump probabilistic compressor for batch data. Approximations to the inference procedure used in the probabilistic model underpinning deplump are introduced that yield the computational asymptotics necessary for stream compression. We demonstrate the performance of this streaming deplump variant relative to the batch compressor on a benchmark corpus and find that it performs equivalently well despite these approximations. We also explore the performance of the streaming variant on corpora that are too large to be compressed by batch deplump and demonstrate excellent compression performance.
Keywords :
data compression; probability; batch compressor; batch data; deplump probabilistic compressor; inference procedure; probabilistic model underpinning deplump; stream compression; streaming data lossless compressor; streaming deplump variant relative; Approximation algorithms; Approximation methods; Complexity theory; Computational modeling; Context; Inference algorithms; Vegetation; Bayesian; Non-parametric; sequence memoizer;
Conference_Title :
Data Compression Conference (DCC), 2011
Conference_Location :
Snowbird, UT
Print_ISBN :
978-1-61284-279-0
DOI :
10.1109/DCC.2011.43