• DocumentCode
    1996529
  • Title

    Energy Consumption Models and Predictions for Large-Scale Systems

  • Author

    Samak, Taghrid ; Morin, Christine ; Bailey, D.

  • Author_Institution
    Lawrence Berkeley Nat. Lab. (LBNL), Berkeley, CA, USA
  • fYear
    2013
  • fDate
    20-24 May 2013
  • Firstpage
    899
  • Lastpage
    906
  • Abstract
    Responsible, efficient and well-planned power consumption is becoming a necessity for monetary returns and scalability of computing infrastructures. While there are numerous sources from which power data can be obtained, analyzing this data is an intrinsically hard task. In this paper, we propose a data analysis pipeline that can handle the large-scale collection of energy consumption logs, apply sophisticated modeling to enable accurate prediction, and evaluate the efficiency of the analysis approach. We present the analysis of a power consumption data set collected over a 6-month period from two clusters of the Grid'5000 experimentation platform used in production. To solve the large data challenge, we used Hadoop with Pig data processing to generate a summary of the data that provides basic statistical aggregations over different time scales. The aggregate data is then analyzed as a time series using sophisticated modeling methods with the R statistical software. Energy models from such a large dataset can help in understanding the evolution of consumption patterns, predicting future energy trends, and providing a basis for generalizing the energy models to similar large-scale systems.
  • Keywords
    data analysis; large-scale systems; power aware computing; power consumption; public domain software; Grid'5000 experimentation platform; Hadoop; Pig data processing; R statistical software; aggregate data analysis; computing infrastructure scalability; data analysis pipeline; energy consumption logs; energy consumption models; monetary returns; power consumption data set; power data; sophisticated modeling methods; statistical aggregations; time series; Analytical models; Autoregressive processes; Computational modeling; Correlation; Data models; Energy consumption; Predictive models; Energy model; Grid'5000; distributed systems;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2013 IEEE 27th International
  • Conference_Location
    Cambridge, MA
  • Print_ISBN
    978-0-7695-4979-8
  • Type

    conf

  • DOI
    10.1109/IPDPSW.2013.228
  • Filename
    6650971