Title :
Performance benefits of optimism in fossil collection
Author :
Young, C.H. ; Abu-Ghazaleh, N.B. ; Radhakrishnan, R. ; Wilsey, P.A.
Author_Institution :
Dept. of Electr. & Comput. Eng. Sci., Cincinnati Univ., OH, USA
Abstract :
Each process in a time-warp parallel simulation requires a time-varying set of state and event histories to be retained for recovering from erroneous computations. Erroneous computation is discovered when straggler messages arrive with time-stamps in the process's past. The traditional method of determining the set of histories to retain has been through the estimation of a global virtual time (GVT) for the distributed simulation. A distributed GVT calculation requires an estimation of the global progress of the simulation during a real-time interval. Optimistic fossil collection (OFC) predicts a bound for the needed histories using local information, or previously collected information, that enables the process to continue. In most cases, OFC requires less communication overhead and less memory usage, and estimates the set of committed events faster. These benefits come at the cost of a possible penalty of having to recover from a state history that was incorrectly fossil-collected (an OFC fault). Sufficiently lightweight checkpointing and recovery techniques compensate for this possibility while yielding good performance. In this paper, the requirements of an OFC-based simulator are detailed, along with results from an OFC simulation; the algorithm has been implemented in the WARPED time-warp parallel discrete-event simulator. Performance statistics are given comparing the execution time and required memory usage of each logical process for different fossil collection methods.
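The difference between GVT-bounded and optimistic fossil collection described in the abstract can be illustrated with a minimal Python sketch. It is a toy model under stated assumptions, not the WARPED implementation: the class `LogicalProcess`, its methods, and the bound values 15 and 25 are all hypothetical, and recovery after an OFC fault is not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalProcess:
    """Toy logical process holding (timestamp, state) checkpoints (hypothetical)."""
    history: list = field(default_factory=list)  # kept in timestamp order

    def save_state(self, timestamp, state):
        self.history.append((timestamp, state))

    def fossil_collect(self, bound):
        """Drop checkpoints older than `bound`, keeping the latest one before it."""
        older = [h for h in self.history if h[0] < bound]
        newer = [h for h in self.history if h[0] >= bound]
        self.history = older[-1:] + newer

    def rollback(self, straggler_time):
        """Return the checkpoint to restore, or None if it was already collected."""
        candidates = [h for h in self.history if h[0] < straggler_time]
        if not candidates:
            return None  # an OFC fault in the abstract's terminology
        self.history = candidates
        return candidates[-1]


def run(bound, straggler_time):
    """Save four checkpoints, fossil-collect at `bound`, then receive a straggler."""
    lp = LogicalProcess()
    for t in (0, 10, 20, 30):
        lp.save_state(t, f"state@{t}")
    lp.fossil_collect(bound)
    return lp.rollback(straggler_time)


# Assumed true GVT of 15: a straggler at 18 can still be recovered.
gvt_result = run(bound=15, straggler_time=18)
# Assumed over-optimistic local prediction of 25: the same straggler hits a fault.
ofc_result = run(bound=25, straggler_time=18)
```

With a bound that never exceeds the timestamp of any future straggler, `rollback` always finds a checkpoint; a locally predicted bound avoids the global calculation but can overshoot, which is the case that the lightweight checkpointing and recovery techniques mentioned in the abstract are meant to handle.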
Keywords :
parallel programming; software performance evaluation; system recovery; time warp simulation; OFC fault; WARPED discrete-event simulator; committed events; communication overhead; distributed simulation; erroneous computation recovery; event histories; execution time; global virtual time; lightweight checkpointing techniques; lightweight recovery techniques; local information; logical processes; memory usage; optimistic fossil collection; performance statistics; real-time interval; state histories; straggler messages; time-stamps; time-warp parallel simulation; Checkpointing; Clocks; Discrete event simulation; History; Kernel; Protocols; Scheduling; Statistics; Synchronization
Conference_Title :
Systems Sciences, 1999. HICSS-32. Proceedings of the 32nd Annual Hawaii International Conference on
Conference_Location :
Maui, HI, USA
Print_ISBN :
0-7695-0001-3
DOI :
10.1109/HICSS.1999.773082