• DocumentCode
    70040
  • Title
    Ensemble Learning for Large-Scale Workload Prediction
  • Author
    Singh, Navab ; Rao, Smitha
  • Author_Institution
    Int. Inst. of Inf. Technol. - Bangalore (IIIT-B), Bangalore, India
  • Volume
    2
  • Issue
    2
  • fYear
    2014
  • fDate
    Jun-14
  • Firstpage
    149
  • Lastpage
    165
  • Abstract
    Increasing energy costs of large-scale server systems have led to a demand for innovative methods for optimizing resource utilization in these systems. Such methods aim to reduce server energy consumption, cooling requirements, carbon footprint, and so on, thereby leading to improved holistic sustainability of the overall server infrastructure. At the core of many of these methods lie reliable workload-prediction techniques that help identify the servers, time intervals, and other parameters needed for building sustainability solutions based on techniques like virtualization and server consolidation for server systems. Many workload prediction methods have been proposed in the recent past, but unfortunately they do not deal adequately with the issues that arise specifically in large-scale server systems, viz., extensive nonstationarity of server workloads, and massive online streaming data. In this paper, we fill this gap by proposing two online ensemble learning methods for workload prediction, which address these issues in large-scale server systems. The proposed algorithms are motivated by the weighted majority and simulatable experts approaches, which we extend and adapt to the large-scale workload prediction problem. We demonstrate the effectiveness of our algorithms using real and synthetic data sets, and show that using the proposed algorithms, the workloads of 91% of servers in a real data center can be predicted with accuracy > 89%, whereas using baseline approaches, the workloads of only 13%-24% of the servers can be predicted with similar accuracy.
  • Keywords
    learning (artificial intelligence); network servers; power aware computing; resource allocation; data center; holistic sustainability; large-scale server systems; large-scale workload prediction; massive online streaming data; online ensemble learning methods; resource utilization optimization; server workload extensive nonstationarity; Approximation algorithms; Computational modeling; Data models; Energy consumption; Large-scale systems; Prediction algorithms; Predictive models; Server workload prediction; ensemble-based learning; machine learning; sustainable computing; sustainable server systems;
  • fLanguage
    English
  • Journal_Title
    Emerging Topics in Computing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    2168-6750
  • Type
    jour
  • DOI
    10.1109/TETC.2014.2310455
  • Filename
    6784514