DocumentCode
2137097
Title
DAM: A DataOwnership-Aware Multi-layered De-duplication Scheme
Author
Tan, Yujuan ; Feng, Dan ; Yan, Zhichao ; Zhou, Guohui
Author_Institution
Wuhan Nat. Lab. for Optoelectron., Huazhong Univ. of Sci. & Technol., Wuhan, China
fYear
2010
fDate
15-17 July 2010
Firstpage
403
Lastpage
411
Abstract
Beyond the storage savings brought by chunk-level de-duplication in backup and archiving systems, a prominent challenge facing this technology is how to identify duplicate chunks efficiently and effectively. Most of the chunk fingerprints used to identify individual chunks are stored on disk because main memory capacity is limited. Checking for a chunk fingerprint match on disk for every input chunk is known to be a severe performance bottleneck for the backup process. On the other hand, our intuitions and analyses of real backup data both indicate that duplicate chunks tend to concentrate strongly according to data ownership. Motivated by this observation, and to avoid or alleviate this backup performance bottleneck, we propose DAM, a data-ownership-aware multi-layered de-duplication scheme that exploits the data chunks' ownership and uses a tri-layered de-duplication approach to narrow the search space for duplicate chunks and thereby reduce total disk accesses. Our experimental results with real-world datasets show that DAM reduces disk accesses by an average of 60.8% and shortens de-duplication time by an average of 46.3%.
Keywords
data compression; storage management; DAM; archiving systems; chunk fingerprints; chunk-level deduplication; data ownership-aware multilayered deduplication scheme; disk access; main memory capacity; Containers; Data structures; Electronic mail; Laboratories; Noise measurement; Redundancy; Servers; backup; de-duplication; disk accesses;
fLanguage
English
Publisher
IEEE
Conference_Titel
Networking, Architecture and Storage (NAS), 2010 IEEE Fifth International Conference on
Conference_Location
Macau
Print_ISBN
978-1-4244-8133-0
Type
conf
DOI
10.1109/NAS.2010.57
Filename
5575707