Title :
Collaborative fault diagnosis in grids through automated tests
Author :
Duarte, Alexandre ; Brasileiro, Francisco ; Cirne, Walfredo ; Filho, José Alencar
Author_Institution :
Univ. Fed. de Campina Grande, Brazil
Abstract :
Grids have the potential to revolutionize computing by providing ubiquitous, on-demand access to computational services and resources. However, grid systems are extremely large, complex, and prone to failures. A survey we conducted reveals that fault diagnosis is still a major problem for grid users. When a failure appears on the user's screen, it is very difficult for the user to identify whether the problem lies in the application, somewhere in the grid middleware, or even lower, in the fabric that comprises the grid. To overcome this problem, we argue that current grid platforms must be augmented with a collaborative diagnosis mechanism. We propose that this mechanism use automated tests to identify the root cause of a failure and suggest the appropriate fix. We also present a Java-based implementation of the proposed mechanism, which provides a simple and flexible framework that eases the development and maintenance of the automated tests.
Keywords :
Java; automatic testing; computational complexity; fault diagnosis; grid computing; middleware; ubiquitous computing; Java-based implementation; automated tests; collaborative fault diagnosis; computational services; grid middleware; grid systems; Collaboration; Computer crashes; Fabrics; Parallel processing; Pervasive computing
Conference_Title :
20th International Conference on Advanced Information Networking and Applications (AINA 2006)
Print_ISBN :
0-7695-2466-4
DOI :
10.1109/AINA.2006.127