  • DocumentCode
    3238649
  • Title
    Access-pattern and bandwidth aware file replication algorithm in a grid environment
  • Author
    Sato, Hitoshi ; Matsuoka, Satoshi ; Endo, Toshio ; Maruyama, Naoya
  • Author_Institution
    Tokyo Inst. of Technol., Tokyo
  • fYear
    2008
  • fDate
    Sept. 29 2008-Oct. 1 2008
  • Firstpage
    250
  • Lastpage
    257
  • Abstract
    Replication in grid file systems can significantly improve the I/O performance of data-intensive grid applications, but creating and placing replicas manually would be impractical in a real grid environment, where an application may access thousands to millions of files. The number and location of replicas should be determined automatically with regard to application access patterns and network throughput, so as to achieve high application throughput while minimizing replica space overhead; previous studies, however, have focused on limited parameter spaces in their algorithmic optimizations. We propose an automated replication algorithm that allows most I/O accesses to be performed within a given time threshold while simultaneously minimizing the space overhead of replication. Our algorithm models the replication problem as a combinatorial optimization problem in which the constraints are derived from the given access time threshold and various system parameters, and the objective is to minimize file replication costs. We solve the optimization problem by dynamically monitoring and estimating inter-node link throughputs and file access patterns of running applications. Our simulation-based studies suggest that the proposed algorithm can achieve higher performance than simple techniques, such as ones that always or never create replicas, while keeping storage usage very low. The results also indicate that the proposed automated algorithm can perform comparably with manual replica placement.
  • Keywords
    combinatorial mathematics; file organisation; grid computing; optimisation; bandwidth aware file replication algorithm; combinatorial optimization problem; data-intensive grid applications; grid environment; grid file systems; Bandwidth; Constraint optimization; Cost function; Data security; Degradation; File systems; Grid computing; Monitoring; Processor scheduling; Throughput;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Grid Computing, 2008 9th IEEE/ACM International Conference on
  • Conference_Location
    Tsukuba
  • Print_ISBN
    978-1-4244-2578-5
  • Electronic_ISBN
    978-1-4244-2579-2
  • Type
    conf
  • DOI
    10.1109/GRID.2008.4662806
  • Filename
    4662806