• DocumentCode
    3343492
  • Title
    Algorithms for Low-Latency Remote File Synchronization
  • Author
    Yan, Hao; Irmak, U.; Suel, Torsten
  • Author_Institution
    Polytech. Univ., Brooklyn
  • fYear
    2008
  • fDate
    13-18 April 2008
  • Abstract
    The remote file synchronization problem is how to update an outdated version of a file located on one machine to the current version located on another machine with a minimal amount of network communication. It arises in many scenarios including Web site mirroring, file system backup and replication, and Web access over slow links. A widely used open-source tool called rsync uses a single round of messages to solve this problem (plus an initial round for exchanging meta information). While research has shown that significant additional savings in bandwidth are possible by using multiple rounds, such approaches are often not desirable due to network latencies, increased protocol complexity, and higher I/O and CPU overheads at the endpoints. In this paper, we study single-round synchronization techniques that achieve savings in bandwidth consumption while preserving many of the advantages of the rsync approach. In particular, we propose a new and simple algorithm for file synchronization based on set reconciliation techniques. We then show how to integrate sampling techniques into our approach in order to adaptively select the most suitable algorithm and parameter setting for a given data set. Experimental results on several data sets show that the resulting protocol gives significant benefits over rsync, particularly on data sets with high degrees of redundancy between the versions.
  • Keywords
    client-server systems; peer-to-peer computing; sampling methods; synchronisation; Web site mirroring; client-server system; file sharing; file system backup; file system replication; low-latency remote file synchronization problem; network communication; rsync open-source tool; sampling technique; set reconciliation technique; single-round synchronization technique; Access protocols; Bandwidth; Communications Society; Computational Intelligence Society; Costs; Delay; Educational institutions; File systems; Open source software; Sampling methods
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    INFOCOM 2008. The 27th Conference on Computer Communications. IEEE
  • Conference_Location
    Phoenix, AZ
  • ISSN
    0743-166X
  • Print_ISBN
    978-1-4244-2025-4
  • Type
    conf
  • DOI
    10.1109/INFOCOM.2008.40
  • Filename
    4509635