• DocumentCode
    558733
  • Title

    Semi-automated data center hotspot diagnosis

  • Author

    McIntosh, S. ; Kephart, J.O. ; Lenchner, J. ; Feridun, M. ; Nidd, M. ; Tanner, A. ; Yang, B. ; Barabasi, I.

  • Author_Institution
    Thomas J. Watson Res. Center, IBM, Yorktown Heights, NY, USA
  • fYear
    2011
  • fDate
    24-28 Oct. 2011
  • Firstpage
    1
  • Lastpage
    7
  • Abstract
    An increasingly important requirement for energy-efficient data center operation is to diagnose and fix thermal anomalies that sometimes occur due to excessive workload or equipment failures. Today, the task of diagnosing thermal anomalies entails expert but tedious analysis of data collected manually from disparate management systems. Our ultimate goal is to substantially reduce the time, tedium and expertise required to diagnose thermal hotspots by developing a system that generates accurate diagnoses automatically. We describe a substantial step towards this goal: a loosely-coupled, semi-automated thermal diagnosis system that integrates IT and facilities data, uses simple heuristics to highlight the most likely culprits, and provides a graphical interface that enables an administrator to narrow the list further by exploring data correlations. Among the challenges addressed by our solution are coping with heterogeneous data types and data access methods, and detecting and managing erroneous sensor readings.
  • Keywords
    computer centres; data analysis; data access methods; data correlations; disparate management systems; energy-efficient data center operation; equipment failures; erroneous sensor reading detection; erroneous sensor reading management; excessive workload; graphical interface; heterogeneous data types; loosely-coupled semi-automated thermal diagnosis system; semi-automated data center hotspot diagnosis; thermal anomalies; thermal hotspots; Blades; Cooling; Monitoring; Servers; Temperature measurement; Temperature sensors; Tiles; Energy Management; Green products; Linked Data; Semantic Web; Systems Management Data Integration
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Network and Service Management (CNSM), 2011 7th International Conference on
  • Conference_Location
    Paris
  • Print_ISBN
    978-1-4577-1588-4
  • Electronic_ISBN
    978-3-901882-44-9
  • Type

    conf

  • Filename
    6104023