DocumentCode
2302010
Title
ELT: Efficient Log-based Troubleshooting System for Cloud Computing Infrastructures
Author
Kc, Kamal ; Gu, Xiaohui
Author_Institution
Dept. of Comput. Sci., North Carolina State Univ., Raleigh, NC, USA
fYear
2011
fDate
4-7 Oct. 2011
Firstpage
11
Lastpage
20
Abstract
We present an Efficient Log-based Troubleshooting (ELT) system for cloud computing infrastructures. ELT adopts a novel hybrid log mining approach that combines coarse-grained and fine-grained log features to achieve both high accuracy and low overhead. Moreover, ELT can automatically extract key log messages and perform invariant checking to greatly simplify the troubleshooting task for the system administrator. We have implemented a prototype of the ELT system and conducted an extensive experimental study using real management console logs of a production cloud system and a Hadoop cluster. Our experimental results show that ELT can achieve more efficient and powerful troubleshooting support than existing schemes. More importantly, ELT can find software bugs that cannot be detected by current cloud system management practice.
Keywords
cloud computing; program debugging; system monitoring; ELT; Hadoop cluster; cloud computing infrastructures; cloud system management practice; coarse-grained log features; efficient log-based troubleshooting system; fine-grained log features; hybrid log mining approach; invariant checking; production cloud system; software bugs; Algorithm design and analysis; Cloud computing; Clustering algorithms; Feature extraction; Production systems; Runtime
fLanguage
English
Publisher
IEEE
Conference_Titel
Reliable Distributed Systems (SRDS), 2011 30th IEEE Symposium on
Conference_Location
Madrid
ISSN
1060-9857
Print_ISBN
978-1-4577-1349-1
Type
conf
DOI
10.1109/SRDS.2011.11
Filename
6076757