DocumentCode :
1854084
Title :
LBG-SQUARE: Fault-Tolerant, Locality-Aware Co-Allocation in P2P Grids
Author :
Dethier, Gérard ; Briquet, Cyril ; Marchot, Pierre ; de Marneffe, P.-A.
Author_Institution :
Dept. of Electr. Eng. & Comput. Sci., Univ. of Liege, Liege
fYear :
2008
fDate :
1-4 Dec. 2008
Firstpage :
252
Lastpage :
258
Abstract :
In this paper, the deployment and execution of iterative stencil applications on a P2P grid middleware are investigated. So-called iterative stencil applications are composed of sets of heavily-communicating, long-running tasks. They thus require co-allocation of multiple reliable resources for extended periods of time. P2P grids are totally decentralized and provide on-demand, transparent access to edge resources, e.g. Internet-connected, non-dedicated desktop computers. A P2P grid has the potential to provide access to a large number of resources at a fraction of the cost of a dedicated cluster. However, edge resources are heterogeneous in performance and intrinsically unreliable: task execution failures are common due to resource preemption or resource failure. Furthermore, P2P grid schedulers usually target sets of independent computational tasks, i.e. so-called Bag of Tasks applications. It is therefore not trivial to deploy and run an iterative stencil application on a P2P grid. Checkpointing is a common fault-tolerance mechanism in high performance distributed computing, often based on a centralized architecture. Locality-aware co-allocation in P2P grids has been recently investigated. Checkpointing and locality-aware co-allocation have yet to be integrated in P2P grids. We propose to provide co-allocation through an existing middleware-level Bag of Tasks scheduling mechanism. We also introduce a layer of fault-tolerance for the iterative stencils that relies on a scalable, application-level, P2P checkpointing mechanism. Finally, LBG-SQUARE is described. This software results from the combination of a specific iterative stencil application (a computational fluid dynamics simulation software called LaBoGrid) with a P2P grid middleware (Lightweight Bartering Grid).
Keywords :
fault tolerant computing; grid computing; iterative methods; middleware; peer-to-peer computing; workstation clusters; Bags of Tasks; LBG-SQUARE; LaBoGrid; Lightweight Bartering Grid; P2P grid middleware; checkpointing mechanism; computational fluid dynamics simulation software; dedicated cluster; dedicated desktop computers; fault-tolerance mechanism; high performance distributed computing; iterative stencil applications; locality-aware coallocation; Application software; Checkpointing; Computer architecture; Costs; Distributed computing; Fault tolerance; Grid computing; Internet; Middleware; Processor scheduling; Checkpointing; Co-Allocation; Fault-tolerance; P2P Grid;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel and Distributed Computing, Applications and Technologies, 2008. PDCAT 2008. Ninth International Conference on
Conference_Location :
Otago
Print_ISBN :
978-0-7695-3443-5
Type :
conf
DOI :
10.1109/PDCAT.2008.24
Filename :
4710988