• DocumentCode
    2743192
  • Title

    An approximation to the greedy algorithm for differential compression of very large files

  • Author

    Agarwal, Ramesh C. ; Amalapurapu, Suchitra ; Jain, Shaili

  • Author_Institution
    IBM Almaden Res. Center, San Jose, CA, USA
  • fYear
    2004
  • fDate
    23-25 March 2004
  • Firstpage
    523
  • Abstract
    This paper presents a new differential compression algorithm that combines the hash value and suffix array technique. In this algorithm, hash values for every block of the reference file is computed. Next, suffix arrays on these block hash values are computed. This algorithm finds the longest matches for every offset of the version file. This algorithm depends upon the utilization of three new data structures, the block hash table, the quick index array, and the pointer array, which improves the run-time of the algorithm, and compress very large files.
  • Keywords
    algorithm theory; block codes; data compression; data structures; file organisation; algorithm run-time; block hash table; data structure; differential compression algorithm; greedy algorithm; hash value; large file compression; pointer array; quick index array; reference file computation; suffix array technique; version file matching; Compression algorithms; Computational biology; Data compression; Data structures; Encoding; Greedy algorithms; Runtime;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Data Compression Conference, 2004. Proceedings. DCC 2004
  • ISSN
    1068-0314
  • Print_ISBN
    0-7695-2082-0
  • Type

    conf

  • DOI
    10.1109/DCC.2004.1281499
  • Filename
    1281499