Title :
A Multi-partitioning Approach to Building Fast and Accurate Counting Bloom Filters
Author :
Kun Huang ; Jie Zhang ; Dafang Zhang ; Gaogang Xie ; Salamatian, Kave ; Liu, Alex X. ; Wei Li
Author_Institution :
Inst. of Comput. Technol., Beijing, China
Abstract :
Bloom filters are space-efficient data structures for fast set membership queries. Counting Bloom Filters (CBFs) extend Bloom filters by allowing insertions and deletions to support dynamic sets. The performance of CBFs is critical for various applications and systems. This paper presents a novel approach to building a fast and accurate data structure called Multiple-Partitioned Counting Bloom Filter (MPCBF) that addresses large-scale data processing challenges. MPCBF is based on two ideas: reducing the number of memory accesses from k (for k hash functions) in the standard CBF to only one memory access in the basic MPCBF-1 case, and a hierarchical structure to improve the false positive rate. We also generalize MPCBF-1 to MPCBF-g to accommodate up to g memory accesses. Our simulation and implementation in MapReduce show that MPCBF outperforms the standard CBF in terms of speed and accuracy. Compared to CBF, at the same memory consumption, MPCBF significantly reduces the false positive rate by an order of magnitude, with a reduction of processing overhead by up to 85.9%.
Keywords :
data structures; query processing; MPCBF; fast set membership queries; hash functions; hierarchical structure; large-scale data processing; multiple-partitioned counting Bloom filter; space-efficient data structures; Bandwidth; Data structures; Information filtering; Memory management; Radiation detectors; Standards; Vectors; Bloom filter; hashing; hierarchical structure; mapreduce; packet processing;
Conference_Titel :
Parallel & Distributed Processing (IPDPS), 2013 IEEE 27th International Symposium on
Conference_Location :
Boston, MA
Print_ISBN :
978-1-4673-6066-1
DOI :
10.1109/IPDPS.2013.51