DocumentCode :
2903847
Title :
Improved Deduplication through Parallel Binning
Author :
Zhike Zhang ; Bhagwat, D. ; Litwin, W. ; Long, Derek ; Schwarz, S. J. Thomas
Author_Institution :
Univ. of California, Santa Cruz, Santa Cruz, CA, USA
fYear :
2012
fDate :
1-3 Dec. 2012
Firstpage :
130
Lastpage :
141
Abstract :
Many modern storage systems use deduplication to compress data by avoiding storing the same data twice. Deduplication needs to use data stored in the past, but accessing information about all stored data can cause a severe bottleneck. Similarity-based deduplication accesses only information on past data that is likely to be similar and thus more likely to yield good deduplication. We present an adaptive deduplication strategy that extends Extreme Binning, and we investigate theoretically and experimentally the effects of the additional bin accesses.
Keywords :
data compression; parallel processing; adaptive deduplication strategy; extreme binning; improved deduplication; parallel binning; Companies; Data structures; Feature extraction; Indexes; Probability; Random access memory; Search engines
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
2012 IEEE 31st International Performance Computing and Communications Conference (IPCCC)
Conference_Location :
Austin, TX
ISSN :
1097-2641
Print_ISBN :
978-1-4673-4881-2
Type :
conf
DOI :
10.1109/PCCC.2012.6407746
Filename :
6407746