DocumentCode :
2622953
Title :
Grid Unit: A Self-Managing Building Block for Grid System
Author :
Zhan, Jianfeng ; Wang, Lei ; Zou, Ming ; Wang, Hui ; Gao, Shuang ; Ding, Yulei
Author_Institution :
Chinese Acad. of Sci., Beijing
fYear :
2007
fDate :
3-6 Dec. 2007
Firstpage :
303
Lastpage :
310
Abstract :
Grid system software is inherently complex and hard to build and maintain. In this paper, we propose a self-managing building block, the grid unit, which facilitates constructing grid systems with higher availability and lower management overhead. We present an agent organization as an autonomic management framework, and propose a self-recovering protocol to eliminate most of the tough jobs from system administrators' routines. The system has been deployed since 2004 on Dawning 4000A, the biggest node of the China grid system. We have done extensive experiments to evaluate the grid unit, and the collected log data show that the availability of a grid parallel process management service, built on the basis of the grid unit, reaches 99.997%.
Keywords :
grid computing; parallel processing; software management; China; Dawning 4000A; agent organization; autonomic management framework; grid parallel process management service; grid system software; grid unit; self-managing building block; self-recovering protocol; Application software; Availability; Computer architecture; Concurrent computing; Distributed computing; Laboratories; Large-scale systems; Operating systems; Protocols; Resource management;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel and Distributed Computing, Applications and Technologies, 2007. PDCAT '07. Eighth International Conference on
Conference_Location :
Adelaide, SA
Print_ISBN :
0-7695-3049-4
Type :
conf
DOI :
10.1109/PDCAT.2007.43
Filename :
4420184