Title :
CLUE: System trace analytics for cloud service performance diagnosis
Author :
Hui Zhang ; Junghwan Rhee ; Nipun Arora ; Sahan Gamage ; Guofei Jiang ; K. Yoshihira ; Dongyan Xu
Author_Institution :
Dept. of Autonomic Management, NEC Laboratories America, Princeton, NJ, USA
Abstract :
In this paper, we present CLUE, a system event analytics tool for black-box performance diagnosis in production cloud computing systems. CLUE provides a unified and extensible means of profiling the transactional behavior of services, and builds structured data called event sketches. CLUE further offers a set of analytic tools for summarizing and analyzing event sketches by integrating data mining and statistical analysis. CLUE has been developed at NEC as an internal tool and applied to diagnose a diverse set of real performance problems in multi-tiered IT applications running on multi-core servers of major platforms, including Linux (Red Hat, Fedora), Unix (HP-UX), and Windows (Windows Server 2008). We evaluate our framework on real-world IT systems and show how it enables visibility into, and effective diagnosis of, service system performance problems.
Keywords :
cloud computing; data mining; statistical analysis; CLUE; Fedora; Linux; Redhat; HP-UX; Unix; Windows Server 2008; black-box performance diagnosis; cloud service performance diagnosis; event sketches; multicore servers; multitiered IT applications; production cloud computing systems; profiling service transactional behaviors; real performance problems; system event analytics tool; system trace analytics; Context; Kernel; Message systems; Servers; Synchronization; data analytics; data centers; performance diagnostics; system troubleshooting
Conference_Title :
Network Operations and Management Symposium (NOMS), 2014 IEEE
Conference_Location :
Krakow
DOI :
10.1109/NOMS.2014.6838348