• DocumentCode
    2518908
  • Title

    BloomStore: Bloom-Filter based memory-efficient key-value store for indexing of data deduplication on flash

  • Author

    Lu, Guanlin ; Nam, Young Jin ; Du, David H C

  • Author_Institution
    EMC2, Santa Clara, CA, USA
  • fYear
    2012
  • fDate
    16-20 April 2012
  • Firstpage
    1
  • Lastpage
    11
  • Abstract
    Due to its better scalability, Key-Value (KV) store has superseded traditional relational databases for many applications, such as data deduplication, on-line multi-player gaming, and Internet services like Amazon and Facebook. The KV store efficiently supports two operations (key lookup and KV pair insertion) through an index structure that maps keys to their associated values. The KV store is also commonly used to implement the chunk index in data deduplication, where a chunk ID (SHA1 value computed based on the chunk´s content) is a key and its associative chunk metadata (e.g., physical storage location, stream ID) is the value. For a deduplication system, typically the number of chunks is too large to store the KV store solely in RAM. Thus, the KV store maintains a large (hash-table based) index structure in RAM to index all KV pairs stored on secondary storage. Hence, its available RAM space limits the maximum number of KV pairs that can be stored. Moving the index data structure from RAM to flash can possibly overcome the space limitation. In this paper, we propose efficient KV store on flash with a Bloom Filter based index structure called BloomStore. The unique features of the BloomStore include (1) no index structure is required to be stored in RAM so that a small RAM space can support a large number of KV pairs and (2) both index structure and KV pairs are stored compactly on flash memory to improve its performance. Compared with the state-of-the-art KV store designs, the BloomStore achieves a significantly better key lookup performance and roughly the same insertion performance with multiple times less RAM usage based on our experiments with deduplication workloads.
  • Keywords
    flash memories; indexing; meta data; storage management; table lookup; BloomStore; KV pair insertion; RAM; associative chunk metadata; bloom-filter based memory-efficient key-value store; data deduplication; flash memory; hash-table based index structure; indexing; key lookup; Ash; Buffer storage; Indexing; Random access memory; Scalability; Throughput;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Mass Storage Systems and Technologies (MSST), 2012 IEEE 28th Symposium on
  • Conference_Location
    San Diego, CA
  • ISSN
    2160-195X
  • Print_ISBN
    978-1-4673-1745-0
  • Electronic_ISBN
    2160-195X
  • Type

    conf

  • DOI
    10.1109/MSST.2012.6232390
  • Filename
    6232390