DocumentCode :
2285855
Title :
Fault Tolerance Mechanisms for SLA-aware Resource Management
Author :
Hovestadt, Matthias
Author_Institution :
Paderborn Center for Parallel Comput., Paderborn Univ.
Volume :
2
fYear :
2005
fDate :
22-22 July 2005
Firstpage :
458
Lastpage :
462
Abstract :
Future grid systems will demand properties such as runtime responsibility, predictability, and a guaranteed level of service quality. In this context, service level agreements (SLAs) are of central importance. Many ongoing research projects already focus on realizing the required mechanisms at the grid middleware layer. However, concentrating only on grid middleware is not enough. The underlying resource management systems must also provide an increased QoS level, since they provide their resources to grid environments. The EU-funded project HPC4U aims to realize an SLA-aware resource management system. It allows the grid user to negotiate SLAs and ensures adherence to agreed SLAs by means of application-transparent checkpointing, snapshotting, and migration.
Keywords :
checkpointing; contracts; grid computing; middleware; quality of service; resource allocation; software fault tolerance; QoS; SLA; application-transparent checkpointing; fault tolerance mechanism; grid middleware layer; grid system; resource management system; service level agreement; service quality level; Business; Checkpointing; Context-aware services; Fault tolerance; Grid computing; Middleware; Parallel processing; Quality of service; Resource management; Runtime;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel and Distributed Systems, 2005. Proceedings. 11th International Conference on
Conference_Location :
Fukuoka
ISSN :
1521-9097
Print_ISBN :
0-7695-2281-5
Type :
conf
DOI :
10.1109/ICPADS.2005.155
Filename :
1524350