DocumentCode :
2356877
Title :
Phoenix: making data-intensive grid applications fault-tolerant
Author :
Kola, George ; Kosar, Tevfik ; Livny, Miron
Author_Institution :
Comput. Sci. Dept., Wisconsin Univ., Madison, WI, USA
fYear :
2004
fDate :
8 Nov. 2004
Firstpage :
251
Lastpage :
258
Abstract :
A major hurdle facing data intensive grid applications is the appropriate handling of failures that occur in the grid-environment. Implementing the fault-tolerance transparently at the grid-middleware level would make different data intensive applications fault-tolerant without each having to pay a separate cost and reduce the time to grid-based solution for many scientific problems. We analyzed the failures encountered by four real-life production data intensive applications: NCSA image processing pipeline, WCER video processing pipeline, US-CMS pipeline and BMRB BLAST pipeline. Taking the result of the analysis into account, we have designed and implemented Phoenix, a transparent middleware-level fault-tolerance layer that detects failures early, classifies failures into transient and permanent, and appropriately handles the transient failures. We applied our fault-tolerance layer to a prototype of the NCSA image processing pipeline, considerably improved its failure handling, and report on the insights gained in the process.
Keywords :
fault tolerant computing; grid computing; image processing; middleware; pipeline processing; system recovery; BMRB BLAST pipeline; NCSA image processing pipeline; Phoenix; US-CMS pipeline; WCER video processing pipeline; data-intensive grid application; failure detection; failure handling; fault-tolerant; grid-middleware level; real-life production; scientific problem; Costs; Failure analysis; Fault detection; Fault tolerance; Image analysis; Image processing; Pipelines; Production; Prototypes; Transient analysis;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Grid Computing, 2004. Proceedings. Fifth IEEE/ACM International Workshop on
ISSN :
1550-5510
Print_ISBN :
0-7695-2256-4
Type :
conf
DOI :
10.1109/GRID.2004.51
Filename :
1382838