Title :
SHHC: A Scalable Hybrid Hash Cluster for Cloud Backup Services in Data Centers
Author :
Xu, Lei ; Hu, Jian ; Mkandawire, Stephen ; Jiang, Hong
Author_Institution :
Univ. of Nebraska-Lincoln, Lincoln, NE, USA
Abstract :
Data deduplication techniques are ideal solutions for reducing both bandwidth and storage space requirements for cloud backup services in data centers. Current data deduplication solutions rely on comparing fingerprints (hash values) of data chunks to identify redundant data, and they store the fingerprints on a centralized server. This approach limits the overall throughput and concurrency performance in large-scale systems. Furthermore, the slow seek time associated with hard disks degrades the performance of hash lookup operations, which are mainly random I/Os. In this paper, we present a scalable hybrid hash cluster (SHHC) to maintain a low-latency distributed hash table for storing data fingerprints. Each hybrid node in the cluster is composed of RAM and solid-state drives (SSDs) to take advantage of the fast random access inherent in SSDs. This distributed approach makes the system scalable, balances the load on the hash store, and significantly reduces the latency of the hash lookup process.
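The idea in the abstract can be illustrated with a minimal sketch; it is not the authors' implementation. The abstract does not specify the fingerprint function, how fingerprints are routed to nodes, or how RAM and SSD cooperate, so the SHA-1 fingerprints, modulo routing, and LRU RAM cache used here are all assumptions, as are the names `HybridNode`, `HashCluster`, and `backup`. An in-memory set stands in for the SSD-resident table.

```python
import hashlib
from collections import OrderedDict


class HybridNode:
    """One hash node: a small RAM cache in front of a larger fingerprint store.

    Assumption: an in-memory set stands in for the SSD-resident hash table.
    """

    def __init__(self, ram_capacity=2):
        self.ram = OrderedDict()  # LRU cache of recently seen fingerprints (assumed policy)
        self.ssd = set()          # stand-in for the SSD tier
        self.ram_capacity = ram_capacity

    def lookup_or_insert(self, fp):
        """Return True if fp is already stored; otherwise store it and return False."""
        if fp in self.ram:
            self.ram.move_to_end(fp)  # refresh LRU position on a RAM hit
            return True
        found = fp in self.ssd        # RAM miss: fall back to the SSD tier
        if not found:
            self.ssd.add(fp)
        self._cache(fp)
        return found

    def _cache(self, fp):
        """Place fp in the RAM cache, evicting the least recently used entry if full."""
        self.ram[fp] = True
        self.ram.move_to_end(fp)
        if len(self.ram) > self.ram_capacity:
            self.ram.popitem(last=False)


class HashCluster:
    """Spreads fingerprints over several hybrid nodes instead of one central server."""

    def __init__(self, num_nodes=4):
        self.nodes = [HybridNode() for _ in range(num_nodes)]

    def node_for(self, fp):
        # Assumed routing rule: first 8 bytes of the fingerprint modulo the node count.
        return int.from_bytes(fp[:8], "big") % len(self.nodes)

    def is_duplicate(self, chunk):
        fp = hashlib.sha1(chunk).digest()  # assumed fingerprint function
        return self.nodes[self.node_for(fp)].lookup_or_insert(fp)


def backup(cluster, chunks):
    """Return only the chunks whose fingerprints the cluster has not seen before."""
    return [c for c in chunks if not cluster.is_duplicate(c)]
```

Calling `backup(HashCluster(), chunks)` on a list of byte strings returns only the chunks that would need to be uploaded. Because each fingerprint maps to exactly one node, lookups for different fingerprints can proceed on different nodes, which is the load-spreading property the abstract attributes to the distributed design.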
Keywords :
cloud computing; computer centres; pattern clustering; table lookup; RAM; centralized server; cloud backup services; concurrency performance; data centers; data chunks; data deduplication techniques; data fingerprints; hash lookup operations; hash values; low-latency distributed hash table; scalable hybrid hash cluster; solid state drives; Cloud computing; Indexes; Peer to peer computing; Random access memory; Scalability; Servers; Throughput; Cloud Backup; Deduplication;
Conference_Title :
Distributed Computing Systems Workshops (ICDCSW), 2011 31st International Conference on
Conference_Location :
Minneapolis, MN
Print_ISBN :
978-1-4577-0384-3
Electronic_ISBN :
1545-0678
DOI :
10.1109/ICDCSW.2011.31