DocumentCode :
655328
Title :
A Proposal for Improving Data Deduplication with Dual Side Fixed Size Chunking Algorithm
Author :
Krishnaprasad, P.K. ; Narayamparambil, Biju Abraham
Author_Institution :
Dept. of Comput. Sci. & Eng., Rajagiri Sch. of Eng. & Technol., Kochi, India
fYear :
2013
fDate :
29-31 Aug. 2013
Firstpage :
13
Lastpage :
16
Abstract :
Deduplication is a data reduction technique that breaks streams of data into very granular components, stores only the first instance of each data item on the destination media, and adds every other matching occurrence to an index. Hash values are computed to identify matching data items. Fixed size chunking (FSC) is a deduplication algorithm that breaks the data into fixed-size chunks, or blocks, starting from the beginning of the file. Its main disadvantage is that if new data is added at the front or in the middle of a file, the remaining chunks shift from their initial positions. The shifted chunks yield new hash values, which lowers the deduplication ratio. This drawback can be overcome by calculating hash values of chunks from the beginning as well as from the end of the file and storing both sets of values in the metadata table. A new algorithm, 'Dual Side Fixed Size Chunking', is proposed to achieve a higher deduplication ratio than existing FSC. Without resorting to computationally expensive variable-size chunking or content-defined chunking, this algorithm can be used effectively for video or audio files to achieve a better deduplication ratio. This data reduction provides network bandwidth savings and the ability to store more data on a given amount of disk or cloud storage. Reduced storage requirements result in lower storage management and energy costs.
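The abstract does not give the algorithm's details, so the following is only a minimal sketch of the idea it describes: chunk the data from both ends, hash each chunk, and keep both sets of hashes in one index. The function names, the SHA-256 hash, the 4-byte chunk size, and the sample data are all assumptions made for illustration, not the authors' implementation.

```python
import hashlib

CHUNK_SIZE = 4  # assumed toy size; real systems use kilobyte-scale chunks


def front_chunks(data, size=CHUNK_SIZE):
    """Plain FSC: fixed-size chunks aligned to the start of the data."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def back_chunks(data, size=CHUNK_SIZE):
    """Fixed-size chunks aligned to the end of the data."""
    chunks = []
    end = len(data)
    while end > 0:
        start = max(0, end - size)
        chunks.append(data[start:end])
        end = start
    return chunks


def digest(chunk):
    return hashlib.sha256(chunk).hexdigest()


def count_hits(index, chunks):
    """Number of chunks whose hash is already present in the index."""
    return sum(1 for c in chunks if digest(c) in index)


# Store the original file: hashes from both sides go into one index
# (standing in for the "metadata table" mentioned in the abstract).
original = bytes(range(32))
index = {digest(c) for c in front_chunks(original)}
index |= {digest(c) for c in back_chunks(original)}

# Three new bytes at the front shift every front-aligned chunk boundary.
modified = b"XYZ" + original

front_hits = count_hits(index, front_chunks(modified))
back_hits = count_hits(index, back_chunks(modified))
print(front_hits, back_hits)  # 0 8
```

In this toy case, front-aligned chunking finds no duplicates after the insertion, while the end-aligned pass still recognises all eight original chunks. An insertion in the middle of a file would presumably leave front-aligned matches before the change and end-aligned matches after it.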
Keywords :
data reduction; storage management; FSC; audio files; cloud storage; data deduplication algorithm; data items; data reduction technique; destination media; dual side fixed size chunking algorithm; energy costs; metadata; video files; Algorithm design and analysis; Bandwidth; Cloud computing; Educational institutions; Electronic mail; Power capacitors; Servers; Cloud Storage; DeDuplication; DeDuplication Algorithm; Fixed Block Hashing; Fixed Size Chunking
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Advances in Computing and Communications (ICACC), 2013 Third International Conference on
Conference_Location :
Cochin
Type :
conf
DOI :
10.1109/ICACC.2013.10
Filename :
6686327