Title :
Improving performance via computational replication on a large-scale computational grid
Author :
Li, Yaohang ; Mascagni, Michael
Author_Institution :
Dept. of Comput. Sci., Florida State Univ., Tallahassee, FL, USA
Abstract :
High-performance computing on a large-scale computational grid is complicated by the heterogeneous computational capabilities of the nodes, node unavailability, and unreliable network connectivity. Replicating computation on multiple nodes can significantly improve performance by reducing task completion time in a grid's dynamic environment. We develop an analytical model to determine the number of task replicas needed to meet performance goals in different computational grid configurations. Furthermore, taking advantage of the statistical nature of Monte Carlo computations, we extend the computational replication technique to an N-out-of-M scheduling strategy for grid-based Monte Carlo applications, which can potentially form a large category of grid-computing applications. In addition, we establish a corresponding model for the N-out-of-M scheduling mechanism. Simulations are used to validate the computational replication models. Our preliminary results show that the models are effective in predicting the number of replicas required to achieve a short task completion time with a given high probability.
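The kind of question the abstract poses (how many replicas are needed to finish on time with a given probability) can be illustrated with a minimal sketch. This is not the authors' analytical model, which the abstract does not spell out. It assumes, purely for illustration, that each replica or subtask independently completes by the deadline with the same probability p; the function names are hypothetical.

```python
import math


def replicas_needed(p: float, target: float) -> int:
    """Smallest r such that at least one of r replicas finishes in time.

    Assumption (not from the paper): replicas are independent and each
    finishes by the deadline with probability p, so
    P(at least one finishes) = 1 - (1 - p)**r.
    """
    if not (0.0 < p <= 1.0) or not (0.0 <= target < 1.0):
        raise ValueError("need 0 < p <= 1 and 0 <= target < 1")
    r = 1
    while 1.0 - (1.0 - p) ** r < target:
        r += 1
    return r


def n_out_of_m_success(n: int, m: int, p: float) -> float:
    """Probability that at least n of m independent subtasks finish in time.

    Under the same independence assumption this is a binomial tail sum.
    """
    return sum(
        math.comb(m, k) * p**k * (1.0 - p) ** (m - k) for k in range(n, m + 1)
    )


def subtasks_needed(n: int, p: float, target: float) -> int:
    """Smallest m >= n such that n-out-of-m succeeds with probability >= target."""
    if not (0.0 < p <= 1.0) or not (0.0 <= target < 1.0):
        raise ValueError("need 0 < p <= 1 and 0 <= target < 1")
    m = n
    while n_out_of_m_success(n, m, p) < target:
        m += 1
    return m
```

For example, under these assumptions `replicas_needed(0.5, 0.99)` asks how many replicas of a task are needed when each has an even chance of finishing on time, and `subtasks_needed(n, p, target)` asks how many Monte Carlo subtasks to schedule when any n completed results suffice.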
Keywords :
Monte Carlo methods; grid computing; processor scheduling; replica techniques; Monte Carlo application; N-out-of-M scheduling; computational replication technique; large-scale systems
Conference_Title :
Cluster Computing and the Grid, 2003. Proceedings. CCGrid 2003. 3rd IEEE/ACM International Symposium on
Print_ISBN :
0-7695-1919-9
DOI :
10.1109/CCGRID.2003.1199399