DocumentCode :
656169
Title :
Hysteresis Re-chunking Based Metadata Harnessing Deduplication of Disk Images
Author :
Bing Zhou ; Jiangtao Wen
Author_Institution :
Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
fYear :
2013
fDate :
1-4 Oct. 2013
Firstpage :
389
Lastpage :
398
Abstract :
Metadata-related overhead can significantly impact the performance of data deduplication systems, including the real duplication elimination ratio and the deduplication throughput. The amount of metadata produced is mainly determined by the chunking mechanism for the input data stream. In this paper, we propose a metadata harnessing deduplication (MHD) algorithm utilizing a duplication-distribution-based hysteresis re-chunking strategy. MHD harnesses the metadata by dynamically merging multiple non-duplicate chunks into one big chunk represented by one hash value while dividing big chunks straddling duplicate and non-duplicate data regions into small chunks represented with multiple hashes. Experimental results show that the proposed algorithm achieves a lower metadata overhead and a higher deduplication throughput for a given duplication elimination ratio, as compared with other state-of-the-art algorithms such as the Bimodal, Sub Chunk and Sparse Indexing algorithms.
Keywords :
meta data; storage management; MHD algorithm; bimodal algorithm; data storage system; deduplication throughput; disk images; duplication-distribution-based hysteresis re-chunking strategy; hysteresis re-chunking based metadata harnessing deduplication system; input data stream; metadata-related overhead; real duplication elimination ratio; sparse indexing algorithms; subchunk algorithm; Algorithm design and analysis; Hysteresis; Indexes; Magnetohydrodynamics; Merging; Random access memory; Throughput; Data Deduplication; Metadata Harnessing;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel Processing (ICPP), 2013 42nd International Conference on
Conference_Location :
Lyon
ISSN :
0190-3918
Type :
conf
DOI :
10.1109/ICPP.2013.48
Filename :
6687372