Title :
PeerDedupe: Insights into the Peer-Assisted Sampling Deduplication
Author :
Xing, Yuanjian ; Li, Zhenhua ; Dai, Yafei
Author_Institution :
Dept. of Comput. Sci. & Technol., Peking Univ., Beijing, China
Abstract :
As the rapid growth of digital data leads to a worldwide storage crisis, data deduplication plays an increasingly prominent role in data storage. Motivated by the problems of mainstream server-side deduplication schemes, there has recently been a trend toward introducing peer-assisted methods into deduplication systems. However, this topic remains poorly understood and lacks thorough research. In this paper, we conduct an in-depth, quantitative investigation of peer-assisted deduplication. Through measurements, we observe that inter-peer duplication accounts for a large proportion of the total duplication and exhibits strong peer locality. Based on these observations, we propose PeerDedupe, a novel peer-assisted sampling deduplication approach. Experiments show that PeerDedupe can remove over 98% of the duplication with each peer coordinating with no more than 5 other peers, and that it requires much less server RAM than existing approaches.
Keywords :
data compression; peer-to-peer computing; random-access storage; sampling methods; storage management; PeerDedupe; RAM usage; data deduplication; digital data; inter-peer duplication account; mainstream server-side deduplication; peer-assisted sampling deduplication; Accuracy; Estimation; Greedy algorithms; Redundancy; Sampling methods; Servers; Weibull distribution
Conference_Title :
Peer-to-Peer Computing (P2P), 2010 IEEE Tenth International Conference on
Conference_Location :
Delft
Print_ISBN :
978-1-4244-7140-9
Electronic_ISBN :
978-1-4244-7139-3
DOI :
10.1109/P2P.2010.5570004