Title :
MassStore: A low bandwidth, high De-duplication efficiency network backup system
Author :
Du, Jiayang ; Yu, Hongliang ; Zheng, Weimin
Author_Institution :
Dept. of Comput. Sci. & Technol., Tsinghua Univ., Beijing, China
Abstract :
De-duplication technology is widely used in disk-based backup systems to save disk space and reduce backup traffic over the Internet. Unfortunately, de-duplication based backup systems often suffer from a metadata indexing bottleneck that greatly reduces backup efficiency and throughput. Existing approaches usually exploit the similarity or locality of the backup data flow to accelerate metadata indexing. In this paper, we design and implement MassStore, a de-duplication based network backup system that uses a two-stage locality sensitive hash algorithm. The algorithm combines the data similarity within a data flow's chunk set with the locality between different chunk sets to accelerate metadata indexing and thereby improve de-duplication efficiency. Experimental results on real-world data sets show that MassStore not only reduced backup storage by an average of 88.5%, but also reduced network bandwidth and RAM usage.
Keywords :
Internet; back-up procedures; data flow analysis; disc storage; meta data; random-access storage; storage management; MassStore; RAM usage; backup data flow locality; backup data flow similarity; backup efficiency; backup storage; backup throughput; backup traffic; data flow chunk set; de-duplication based network backup system; de-duplication efficiency; de-duplication technology; disk space; disk-based backup system; metadata indexing; network bandwidth; two-stage locality sensitive hash algorithm; Acceleration; Bandwidth; Indexing; Protocols; Servers; de-duplication; locality; similarity
Conference_Title :
Systems and Informatics (ICSAI), 2012 International Conference on
Conference_Location :
Yantai
Print_ISBN :
978-1-4673-0198-5
DOI :
10.1109/ICSAI.2012.6223150