DocumentCode :
1221298
Title :
Nonblocking checkpointing for optimistic parallel simulation: description and an implementation
Author :
Quaglia, Francesco ; Santoro, Andrea
Author_Institution :
Dipartimento di Informatica & Sistemistica, Univ. di Roma "La Sapienza", Italy
Volume :
14
Issue :
6
fYear :
2003
fDate :
6/1/2003 12:00:00 AM
Firstpage :
593
Lastpage :
610
Abstract :
Describes a nonblocking checkpointing mode in support of optimistic parallel discrete event simulation. This mode allows real concurrency between state saving and other simulation-specific operations (e.g., event list update, event execution), with the aim of removing the cost of recording state information from the completion time of the parallel simulation application. We present an implementation of a C library supporting nonblocking checkpointing on a Myrinet-based cluster, which demonstrates the practical viability of this checkpointing mode on standard off-the-shelf hardware. Using the results of an empirical study on classical parameterized synthetic benchmarks, we show that, except for applications with minimal state granularity, nonblocking checkpointing improves the speed of the parallel execution compared to commonly adopted, optimized checkpointing methods based on the classical blocking mode. A performance study of a personal communication system (PCS) simulation is additionally reported to point out the benefits of nonblocking checkpointing for a real-world application.
Keywords :
concurrency control; data integrity; data structures; discrete event simulation; message passing; system recovery; workstation clusters; C library; DMA; completion time; concurrency; discrete event simulation; minimal state granularity; myrinet based cluster; nonblocking checkpointing; optimistic parallel simulation; optimistic synchronization; performance optimization; personal communication system; state saving; Central Processing Unit; Checkpointing; Circuit simulation; Concurrent computing; Context modeling; Costs; Discrete event simulation; Hardware; Libraries; Personal communication networks;
fLanguage :
English
Journal_Title :
Parallel and Distributed Systems, IEEE Transactions on
Publisher :
IEEE
ISSN :
1045-9219
Type :
jour
DOI :
10.1109/TPDS.2003.1206506
Filename :
1206506