• DocumentCode
    655328
  • Title
    A Proposal for Improving Data Deduplication with Dual Side Fixed Size Chunking Algorithm
  • Author
    Krishnaprasad, P.K. ; Narayamparambil, Biju Abraham
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Rajagiri Sch. of Eng. & Technol., Kochi, India
  • fYear
    2013
  • fDate
    29-31 Aug. 2013
  • Firstpage
    13
  • Lastpage
    16
  • Abstract
    DeDuplication is a data reduction technique that breaks streams of data down into very granular components, stores only the first instance of each data item on the destination media, and records all other similar occurrences in an index. Hash values are computed to identify similar data items. Fixed size chunking (FSC) is a DeDuplication algorithm that breaks the data into fixed size chunks or blocks, starting from the beginning of the file. The main disadvantage of this technique is that, if new data is added at the front or in the middle of a file, the remaining chunks are shifted from their initial positions. This yields new hash values for the resulting chunks and therefore a lower DeDuplication ratio. This drawback can be overcome by calculating hash values of chunks from the beginning as well as from the end of the file and storing both sets of values in a metadata table. A new algorithm, 'Dual Side Fixed Size Chunking', is proposed to achieve a higher DeDuplication ratio than existing FSC. Without resorting to computationally expensive variable size chunking or content defined chunking, this algorithm can be used effectively on video or audio files to achieve a better DeDuplication ratio. This data reduction provides network bandwidth savings and the ability to store more data on a given amount of disk or cloud storage. Reduced storage requirements result in lower storage management and energy costs.
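The abstract describes hashing fixed size chunks from both ends of a file but gives no implementation details, so the following is only a minimal Python sketch of that idea, not the authors' algorithm. The chunk size, the use of SHA-1, the in-memory sets standing in for the metadata table, and every function name are assumptions made for illustration.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical, tiny on purpose so the example is easy to follow


def _digest(chunk: bytes) -> str:
    """Hash one chunk (SHA-1 is an assumption; any strong hash would do)."""
    return hashlib.sha1(chunk).hexdigest()


def front_chunks(data: bytes, size: int = CHUNK_SIZE) -> list:
    """Classic FSC: fixed size chunks aligned to the beginning of the file."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def back_chunks(data: bytes, size: int = CHUNK_SIZE) -> list:
    """Fixed size chunks aligned to the end of the file, last chunk first."""
    chunks = []
    end = len(data)
    while end > 0:
        start = max(0, end - size)
        chunks.append(data[start:end])
        end = start
    return chunks


def matched_bytes(old: bytes, new: bytes, size: int = CHUNK_SIZE) -> tuple:
    """Return (bytes matched from the front, bytes matched from the back).

    The two sets below stand in for the metadata table that holds both
    the front-aligned and the end-aligned hashes of the stored file.
    """
    front_table = {_digest(c) for c in front_chunks(old, size)}
    back_table = {_digest(c) for c in back_chunks(old, size)}

    # Scan forward until the first chunk that is not in the front table.
    pos = 0
    while pos + size <= len(new) and _digest(new[pos:pos + size]) in front_table:
        pos += size

    # Scan backward until the first miss, without crossing the forward scan.
    end = len(new)
    while end - size >= pos and _digest(new[end - size:end]) in back_table:
        end -= size

    return pos, len(new) - end


old = b"AAAABBBBCCCCDDDD"
prefix_insert = b"xy" + old                       # data added at the front
middle_insert = b"AAAABBBB" + b"xy" + b"CCCCDDDD"  # data added in the middle

print(matched_bytes(old, prefix_insert))
print(matched_bytes(old, middle_insert))
```

In this toy case, a two-byte insertion at the front shifts every front-aligned chunk, so the forward scan matches nothing, while the end-aligned hashes still cover all 16 original bytes. With an insertion in the middle, the forward scan covers the bytes before the insertion and the backward scan covers the bytes after it.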
  • Keywords
    data reduction; storage management; FSC; audio files; cloud storage; data deduplication algorithm; data items; data reduction technique; destination media; dual side fixed size chunking algorithm; energy costs; metadata; video files; Algorithm design and analysis; Bandwidth; Cloud computing; Educational institutions; Electronic mail; Power capacitors; Servers; Cloud Storage; DeDuplication; DeDuplication Algorithm; Fixed Block Hashing; Fixed Size Chunking
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advances in Computing and Communications (ICACC), 2013 Third International Conference on
  • Conference_Location
    Cochin
  • Type
    conf
  • DOI
    10.1109/ICACC.2013.10
  • Filename
    6686327