DocumentCode
1610954
Title
A Fast Duplicate Chunk Identifying Method Based on Hierarchical Indexing Structure
Author
Can Wang ; Zhi-guang Qin ; Lei Yang ; Juan Wang
Author_Institution
Sch. of Comput. Sci. & Eng., Univ. of Electron. Sci. & Technol. of China, Chengdu, China
fYear
2012
Firstpage
624
Lastpage
627
Abstract
To solve the disk bottleneck problem of deduplication systems without depending on data locality, a fast duplicate chunk identifying method based on a hierarchical indexing structure is proposed. In this method, the traditional flat indexing structure is divided vertically into two layers, and only a handful of the most representative indices, selected according to Broder's theorem, are kept in RAM. Experimental results on real data that lack locality indicate that the deduplication performance of this method reaches 87.05% of the optimal value with a far smaller RAM requirement than current methods.
Keywords
indexing; random-access storage; Broder theorem; RAM; deduplication system; disk bottleneck problem; duplicate chunk identifying method; flat indexing structure; hierarchical indexing structure; representative indices; Educational institutions; Feature extraction; Random access memory; Throughput; Writing; data locality; deduplication; disk bottleneck
fLanguage
English
Publisher
IEEE
Conference_Titel
Industrial Control and Electronics Engineering (ICICEE), 2012 International Conference on
Conference_Location
Xi'an
Print_ISBN
978-1-4673-1450-3
Type
conf
DOI
10.1109/ICICEE.2012.169
Filename
6322458