Title :
On the implementation of Zigzag codes for distributed storage system
Author :
Lijia Lu;Hui Li;Jun Chen;Bing Zhu;Weijuan Yin
Author_Institution :
Institute of Big Data Technology, Shenzhen Graduate School, Peking University, Shenzhen, China
Abstract :
Erasure codes such as Reed-Solomon (RS) codes are widely used to improve data reliability in distributed storage systems. Although erasure codes greatly reduce storage overhead compared to replication schemes, repairing a failed node remains very costly in terms of network bandwidth. To address this problem, we employ the Zigzag code, an MDS array code with the optimal repair property, in a practical system. Specifically, we first build a general system on Hadoop to evaluate the encoding, decoding, and repair performance of different codes, and then implement Zigzag codes on this system. The experimental results show that the measured behavior of the Zigzag codes agrees with the theoretical findings and that the codes have certain advantages. Compared to current HDFS modules that use RS codes, our Zigzag-based HDFS implementation shows a significant reduction in repair disk I/O and repair bandwidth at the same computational complexity.
Keywords :
"Big data","Conferences"
Conference_Title :
Big Data (Big Data), 2015 IEEE International Conference on
DOI :
10.1109/BigData.2015.7363951