• DocumentCode
    1255780
  • Title

    Locality-Sensitive Bloom Filter for Approximate Membership Query

  • Author

    Hua, Yu ; Xiao, Bin ; Veeravalli, Bharadwaj ; Feng, Dan

  • Author_Institution
    Wuhan Nat. Lab. for Optoelectron., Huazhong Univ. of Sci. & Technol., Wuhan, China
  • Volume
    61
  • Issue
    6
  • fYear
    2012
  • fDate
    6/1/2012 12:00:00 AM
  • Firstpage
    817
  • Lastpage
    830
  • Abstract
    In many network applications, Bloom filters are used to support exact-matching membership query for their randomized space-efficient data structure with a small probability of false answers. In this paper, we extend the standard Bloom filter to Locality-Sensitive Bloom Filter (LSBF) to provide Approximate Membership Query (AMQ) service. We achieve this by replacing uniform and independent hash functions with locality-sensitive hash functions. Such replacement makes the storage in LSBF to be locality sensitive. Meanwhile, LSBF is space efficient and query responsive by employing the Bloom filter design. In the design of the LSBF structure, we propose a bit vector to reduce False Positives (FP). The bit vector can verify multiple attributes belonging to one member. We also use an active overflowed scheme to significantly decrease False Negatives (FN). Rigorous theoretical analysis (e.g., on FP, FN, and space overhead) shows that the design of LSBF is space compact and can provide accurate response to approximate membership queries. We have implemented LSBF in a real distributed system to perform extensive experiments using real-world traces. Experimental results show that LSBF, compared with a baseline approach and other state-of-the-art work in the literature (SmartStore and LSB-tree), takes less time to respond AMQ and consumes much less storage space.
  • Keywords
    data structures; query processing; approximate membership query; data structure; distributed system; exact matching membership query; false positives; hash functions; locality sensitive bloom filter; query responsive; Accuracy; Arrays; Euclidean distance; Extraterrestrial measurements; Geometry; Probabilistic logic; Approximate membership query; bloom filters; locality sensitive hashing.;
  • fLanguage
    English
  • Journal_Title
    Computers, IEEE Transactions on
  • Publisher
    ieee
  • ISSN
    0018-9340
  • Type

    jour

  • DOI
    10.1109/TC.2011.108
  • Filename
    5928322