DocumentCode
3238649
Title
Access-pattern and bandwidth aware file replication algorithm in a grid environment
Author
Sato, Hitoshi ; Matsuoka, Satoshi ; Endo, Toshio ; Maruyama, Naoya
Author_Institution
Tokyo Inst. of Technol., Tokyo
fYear
2008
fDate
Sept. 29 2008-Oct. 1 2008
Firstpage
250
Lastpage
257
Abstract
Replication in grid file systems can significantly improve the I/O performance of data-intensive grid applications, but creating and placing replicas manually would be impractical in a real grid environment involving thousands to millions of files accessed per application. Although where and how many replicas to create should be decided automatically with regard to application access patterns and network throughput, thereby achieving high application throughput and minimizing replica space overhead, previous studies have focused on limited parameter spaces in their algorithmic optimizations. We propose an automated replication algorithm that allows most I/O accesses to be performed within a given time threshold, while simultaneously minimizing the space overhead of replication. Our algorithm models the replication problem as a combinatorial optimization problem, where the constraints are derived from the given access time threshold and various system parameters, and the objective function is to minimize file replication costs. We solve the optimization problem by dynamically monitoring and estimating inter-node link throughputs and file access patterns of running applications. Our simulation-based studies suggest that the proposed algorithm can achieve higher performance than simple techniques, such as ones that always or never create replicas, while keeping storage usage very low. The results also indicate that the proposed automated algorithm can perform comparably with manual replica placement.
Keywords
combinatorial mathematics; file organisation; grid computing; optimisation; bandwidth aware file replication algorithm; combinatorial optimization problem; data-intensive grid applications; grid environment; grid file systems; Bandwidth; Constraint optimization; Cost function; Data security; Degradation; File systems; Grid computing; Monitoring; Processor scheduling; Throughput;
fLanguage
English
Publisher
IEEE
Conference_Titel
Grid Computing, 2008 9th IEEE/ACM International Conference on
Conference_Location
Tsukuba
Print_ISBN
978-1-4244-2578-5
Electronic_ISBN
978-1-4244-2579-2
Type
conf
DOI
10.1109/GRID.2008.4662806
Filename
4662806