• DocumentCode
    8941
  • Title

    Toward Fine-Grained, Unsupervised, Scalable Performance Diagnosis for Production Cloud Computing Systems

  • Author

    Haibo Mi ; Huaimin Wang ; Yangfan Zhou ; Michael R. Lyu ; Hua Cai

  • Author_Institution
    National Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha, China
  • Volume
    24
  • Issue
    6
  • fYear
    2013
  • fDate
    June 2013
  • Firstpage
    1245
  • Lastpage
    1255
  • Abstract
    Performance diagnosis is labor-intensive in production cloud computing systems. Such systems typically face many real-world challenges that existing diagnosis techniques for distributed systems cannot effectively solve. An efficient, unsupervised diagnosis tool for locating fine-grained performance anomalies is still lacking in production cloud computing systems. This paper proposes CloudDiag to bridge this gap. Combining a statistical technique and a fast matrix recovery algorithm, CloudDiag can efficiently pinpoint fine-grained causes of performance problems without requiring any domain-specific knowledge of the target system. CloudDiag has been applied in a practical production cloud computing system to diagnose performance problems. We demonstrate the effectiveness of CloudDiag in three real-world case studies.
  • Keywords
    cloud computing; matrix algebra; performance evaluation; statistical analysis; CloudDiag system; fast matrix recovery algorithm; fine-grained performance; performance diagnosis technique; production cloud computing system; statistical technique; unsupervised diagnosis tool; Clocks; Data collection; Electronic mail; Production; Synchronization; Time factors; performance diagnosis; request tracing
  • fLanguage
    English
  • Journal_Title
    Parallel and Distributed Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1045-9219
  • Type

    jour

  • DOI
    10.1109/TPDS.2013.21
  • Filename
    6410318