DocumentCode :
3127014
Title :
Biff (Bloom filter) codes: Fast error correction for large data sets
Author :
Mitzenmacher, Michael ; Varghese, George
fYear :
2012
fDate :
1-6 July 2012
Firstpage :
483
Lastpage :
487
Abstract :
Large data sets are increasingly common in cloud and virtualized environments. For example, transfers of multiple gigabytes are commonplace, as are replicated blocks of such sizes. There is a need for fast error-correction or data reconciliation in such settings even when the expected number of errors is small. Motivated by such cloud reconciliation problems, we consider error-correction schemes designed for large data, after explaining why previous approaches appear unsuitable. We introduce Biff codes, which are based on Bloom filters and are designed for large data. For Biff codes with a message of length L and E errors, the encoding time is O(L), decoding time is O(L + E) and the space overhead is O(E). Biff codes are low-density parity-check codes; they are similar to Tornado codes, but are designed for errors instead of erasures. Further, Biff codes are designed to be very simple, removing any explicit graph structures and based entirely on hash tables. We derive Biff codes by a simple reduction from a set reconciliation algorithm for a recently developed data structure, invertible Bloom lookup tables. While the underlying theory is extremely simple, what makes this code especially attractive is the ease with which it can be implemented and the speed of decoding. We present results from a prototype implementation that decodes messages of 1 million words with thousands of errors in well under a second.
Keywords :
decoding; encoding; error correction codes; parity check codes; Biff codes; bloom filter; cloud reconciliation problems; data reconciliation; decoding speed; decoding time; encoding time; error-correction schemes; fast error correction; hash tables; invertible Bloom lookup tables; large data sets; low-density parity-check codes; multiple gigabytes transfers; set reconciliation algorithm; space overhead; virtualized environments; Decoding; Encoding; Error analysis; Reed-Solomon codes; Standards; Tornadoes;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Information Theory Proceedings (ISIT), 2012 IEEE International Symposium on
Conference_Location :
Cambridge, MA
ISSN :
2157-8095
Print_ISBN :
978-1-4673-2580-6
Electronic_ISBN :
2157-8095
Type :
conf
DOI :
10.1109/ISIT.2012.6284236
Filename :
6284236
Link To Document :
بازگشت