Title :
A DHT-Based Infrastructure for Sharing Checkpoints in Desktop Grid Computing
Author :
Domingues, Patricio ; Araujo, Filipe ; Silva, Luis Moura
Author_Institution :
Polytechnic Institute of Leiria, Portugal
Abstract :
In this paper we present Chkpt2Chkpt, a desktop grid system that aims to reduce the turnaround times of applications by replicating checkpoints. We target desktop computing projects whose applications are composed of long-running independent tasks, executed on hundreds or thousands of computers spread over the Internet. While these applications typically perform local checkpointing to deal with failures, we propose replicating those checkpoints in remote locations to make them available to other worker nodes. The main idea is to organize the worker nodes of a desktop grid into a peer-to-peer (P2P) Distributed Hash Table (DHT). Worker nodes can take advantage of this P2P network to keep track of, share, and manage checkpoint files, and to reclaim the space they occupy. We validated our system through simulation, and we show that remotely storing replicas of checkpoints can considerably reduce the turnaround times of tasks compared to the traditional approach, in which nodes manage their own checkpoints locally. These results lead us to conclude that P2P techniques appear to be quite helpful in wide-scale desktop grid environments.
Keywords :
Application software; Central Processing Unit; Checkpointing; Computer applications; Distributed computing; Grid computing; Internet; Network address translation; Peer to peer computing; Technology management
Conference_Title :
e-Science and Grid Computing, 2006. e-Science '06. Second IEEE International Conference on
Conference_Location :
Amsterdam, The Netherlands
Print_ISBN :
0-7695-2734-5
DOI :
10.1109/E-SCIENCE.2006.261160