Title :
NIRVANA: A Non-intrusive Black-Box Monitoring Framework for Rack-Level Fault Detection
Author :
Claudio Ciccotelli;Leonardo Aniello;Federico Lombardi;Luca Montanari;Leonardo Querzoni;Roberto Baldoni
Author_Institution :
Luca Montanari, Leonardo Querzoni, Roberto Baldoni Sapienza Univ. of Rome, Rome, Italy
Abstract :
Many organizations today still manage mid-sized or large in-house data centers that require very expensive maintenance efforts, including fault detection. Common monitoring frameworks used to quickly detect faults are complex to deploy and maintain, expensive, and intrusive, as they require the installation of probes on the monitored hardware/software to collect raw data. Such intrusiveness can be problematic, as it imposes installation and management overhead and may interfere with security/privacy policies. In this paper we introduce NIRVANA, a novel monitoring system for fault detection that works at rack level and is (i) non-intrusive, i.e., it does not require the installation of software probes on the hosts to be monitored, and (ii) black-box, i.e., agnostic with respect to the monitored applications. At the core of our solution lies the observation that aggregated features that can be monitored at rack level in a non-intrusive and black-box way show predictable behaviors while the system works in both fault-free and faulty states; it is therefore possible to detect and identify faults by monitoring and analyzing any perturbations to these behaviors. An extensive experimental evaluation shows that non-intrusiveness does not significantly hamper the fault detection capabilities of the monitoring system, thus validating our approach.
Keywords :
"Monitoring","Probes","Fault detection","Software","Power demand","Computer architecture","Organizations"
Conference_Title :
Dependable Computing (PRDC), 2015 IEEE 21st Pacific Rim International Symposium on
DOI :
10.1109/PRDC.2015.22