DocumentCode :
1679350
Title :
Leap-based Content Defined Chunking — Theory and Implementation
Author :
Chuanshuai Yu ; Chengwei Zhang ; Yiping Mao ; Fulu Li
fYear :
2015
Firstpage :
1
Lastpage :
12
Abstract :
Content Defined Chunking (CDC) is an important component of data deduplication that affects both the deduplication ratio and deduplication performance. The sliding-window-based CDC algorithm and its variants have been the most popular CDC algorithms for the last 15 years. However, their performance is limited in certain application scenarios because they must slide over the input byte by byte. The authors present a leap-based CDC algorithm that significantly improves deduplication performance without compromising the deduplication ratio. Compared to the sliding-window-based CDC algorithm, the new algorithm delivers up to a two-fold improvement in performance.
Keywords :
data compression; storage management; data deduplication; deduplication performance; deduplication ratio; leap-based content defined chunking; sliding-window-based CDC algorithm; two-fold improvement; Algorithm design and analysis; Approximation algorithms; Complexity theory; Fingerprint recognition; Gaussian distribution; Indexes; Memory management; content defined chunking; deduplication; judgment function; secondary condition
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Mass Storage Systems and Technologies (MSST), 2015 31st Symposium on
Conference_Location :
Santa Clara, CA
Type :
conf
DOI :
10.1109/MSST.2015.7208290
Filename :
7208290