• DocumentCode
    2302010
  • Title

    ELT: Efficient Log-based Troubleshooting System for Cloud Computing Infrastructures

  • Author

    Kc, Kamal ; Gu, Xiaohui

  • Author_Institution
    Dept. of Comput. Sci., North Carolina State Univ., Raleigh, NC, USA
  • fYear
    2011
  • fDate
    4-7 Oct. 2011
  • Firstpage
    11
  • Lastpage
    20
  • Abstract
    We present an Efficient Log-based Troubleshooting (ELT) system for cloud computing infrastructures. ELT adopts a novel hybrid log mining approach that combines coarse-grained and fine-grained log features to achieve both high accuracy and low overhead. Moreover, ELT can automatically extract key log messages and perform invariant checking to greatly simplify the troubleshooting task for the system administrator. We have implemented a prototype of the ELT system and conducted an extensive experimental study using real management console logs of a production cloud system and a Hadoop cluster. Our experimental results show that ELT can achieve more efficient and powerful troubleshooting support than existing schemes. More importantly, ELT can find software bugs that cannot be detected by current cloud system management practice.
  • Keywords
    cloud computing; program debugging; system monitoring; ELT; Hadoop cluster; cloud computing infrastructures; cloud system management practice; coarse-grained log features; efficient log-based troubleshooting system; fine-grained log features; hybrid log mining approach; invariant checking; production cloud system; software bugs; Algorithm design and analysis; Cloud computing; Clustering algorithms; Feature extraction; Production systems; Runtime;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Reliable Distributed Systems (SRDS), 2011 30th IEEE Symposium on
  • Conference_Location
    Madrid
  • ISSN
    1060-9857
  • Print_ISBN
    978-1-4577-1349-1
  • Type

    conf

  • DOI
    10.1109/SRDS.2011.11
  • Filename
    6076757