DocumentCode :
1753482
Title :
On demand check pointing for grid application reliability using communicating process model
Author :
Baghavathi Priya, S. ; Subramaniam, Chandrasekaran ; Ravichandran, T.
Author_Institution :
Jawaharlal Nehru Technol. Univ., Hyderabad, India
fYear :
2011
fDate :
13-16 Feb. 2011
Firstpage :
393
Lastpage :
398
Abstract :
This work proposes an on-demand asynchronous check pointing technique for the fault recovery of a grid application under a communicating process approach. The processes are formally modelled in LOTOS, where the features of each process are declared in terms of the rollbacks and replicas it is permitted when accepting the tasks assigned by the scheduler. A process that tends to become faulty at run time is detected by the check pointing mechanism through the Task Dependency Graph (TDG), and the respective worst-case execution time and deadline parameters are used to decide schedulability. The Asynchronous Check Pointing On Demand (ACP-OD) approach enhances grid application reliability by providing the needed fault-tolerant services. Concurrent tasks are scheduled with the proposed Concurrent Task Scheduling Algorithm (CTSA), which recovers from faulty states using replication or rollback techniques. Check pointing and replication mechanisms are used together, and synchronization between the communicating processes is needed to improve the efficiency of the check pointing mechanism. The model is tested with a number of rollback variables, treating the application as a Stochastic Activity Network (SAN) in Mobius.
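The schedulability decision and the choice between rollback and replication described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's CTSA or its LOTOS model: the `Task` fields, the `finish_times`, `schedulable` and `recovery_action` helpers, and the decision rule (roll back if re-running the work since the last checkpoint still meets the deadline, otherwise fall back to a replica) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass
class Task:
    name: str
    wcet: float                 # worst-case execution time
    deadline: float             # absolute deadline
    deps: list = field(default_factory=list)  # predecessor names in the TDG
    rollbacks_left: int = 1     # hypothetical: rollbacks still permitted
    replicas_left: int = 1      # hypothetical: replicas still permitted


def finish_times(tasks):
    """Worst-case finish time of each task, assuming a task starts as soon
    as all of its predecessors in the dependency graph have finished."""
    graph = {t.name: set(t.deps) for t in tasks.values()}
    finish = {}
    for name in TopologicalSorter(graph).static_order():
        task = tasks[name]
        start = max((finish[d] for d in task.deps), default=0.0)
        finish[name] = start + task.wcet
    return finish


def schedulable(tasks):
    """True if every task meets its deadline in the worst case."""
    finish = finish_times(tasks)
    return all(finish[n] <= t.deadline for n, t in tasks.items())


def recovery_action(task, start, fault_time, last_checkpoint):
    """Assumed rule: roll back if the work remaining after the last
    checkpoint still fits before the deadline, else use a replica."""
    remaining = task.wcet - (last_checkpoint - start)
    if task.rollbacks_left > 0 and fault_time + remaining <= task.deadline:
        return "rollback"
    if task.replicas_left > 0:
        return "replicate"
    return "abort"


# Hypothetical three-task dependency graph: b and c both depend on a.
tasks = {
    "a": Task("a", wcet=2, deadline=3),
    "b": Task("b", wcet=3, deadline=6, deps=["a"]),
    "c": Task("c", wcet=1, deadline=4, deps=["a"]),
}
```

Calling `schedulable(tasks)` checks the whole graph against the deadlines, while `recovery_action` is evaluated on demand, only when a fault is detected in a particular task.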
Keywords :
grid computing; scheduling; software fault tolerance; LOTOS model; Mobius; asynchronous check pointing on demand approach; asynchronous check pointing technique; communicating process model; concurrent task scheduling algorithm; demand check pointing technique; fault recovery; fault tolerant service; grid application; replication technique; rollback technique; stochastic activity network; task dependency graph; Computational modeling; Fault tolerance; Fault tolerant systems; Information services; Schedules; Synchronization; Check pointing; Process; Reliability; Replication; Rollback;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Advanced Communication Technology (ICACT), 2011 13th International Conference on
Conference_Location :
Seoul
ISSN :
1738-9445
Print_ISBN :
978-1-4244-8830-8
Type :
conf
Filename :
5745839