Title :
FixD : Fault Detection, Bug Reporting, and Recoverability for Distributed Applications
Author :
Cristian Tapus;David A. Noblet
Author_Institution :
California Institute of Technology, crt@cs.caltech.edu
fDate :
3/1/2007 12:00:00 AM
Abstract :
Model checking, logging, debugging, and checkpointing/recovery are useful tools for identifying bugs in small sequential programs. Applying these techniques directly to distributed applications, however, has been less effective, mostly owing to the high degree of concurrency in this context. This paper presents the design of a hybrid tool, FixD, that attempts to address the deficiencies of these techniques when applied to distributed systems through a novel composition of several of them. The authors first identify and describe the four abstract components that comprise the FixD tool, then conclude with a proposal for how existing tools can be used to implement these components.
Keywords :
"Fault detection","Debugging","Application software","Fault diagnosis","Computer bugs","Concurrent computing","Software tools","Safety","State-space methods","Scheduling"
Conference_Titel :
Parallel and Distributed Processing Symposium, 2007. IPDPS 2007. IEEE International
Print_ISBN :
1-4244-0909-8
DOI :
10.1109/IPDPS.2007.370413