Title :
Static strategies for worksharing with unrecoverable interruptions
Author :
Benoit, A. ; Robert, Y. ; Rosenberg, A.L. ; Vivien, F.
Author_Institution :
ENS Lyon, Lyon, France
Abstract :
One has a large workload that is "divisible" (its constituent work's granularity can be adjusted arbitrarily), and one has access to p remote computers that can assist in computing the workload. The problem is that the remote computers are subject to interruptions of known likelihood that kill all work in progress. One wishes to orchestrate sharing the workload with the remote computers in a way that maximizes the expected amount of work completed. Strategies for achieving this goal, by balancing the desire to checkpoint often, in order to decrease the amount of vulnerable work at any point, vs. the desire to avoid the context-switching required to checkpoint, are studied. Strategies are devised that provably maximize the expected amount of work when there is only one remote computer (the case p = 1). Results suggest the intractability of such maximization for higher values of p, which motivates the development of heuristic approaches. Heuristics are developed that replicate work on several remote computers, in the hope of thereby decreasing the impact of work-killing interruptions. The quality of these heuristics is assessed through exhaustive simulations.
Keywords :
scheduling; workstation clusters; remote computer; static strategies; unrecoverable interruptions; work-killing interruptions; worksharing; Algorithm design and analysis; Assembly; Computational modeling; Concurrent computing; Costs; DNA; Grid computing; Hardware; Processor scheduling; Uncertainty
Conference_Title :
Parallel & Distributed Processing, 2009. IPDPS 2009. IEEE International Symposium on
Conference_Location :
Rome
Print_ISBN :
978-1-4244-3751-1
Electronic_ISBN :
1530-2075
DOI :
10.1109/IPDPS.2009.5161044