• DocumentCode
    3700247
  • Title
    A feature generation framework for Google trace analysis
  • Author
    Zi-Wei Fan;Pei-Jie Huang;Pei-Sen Huang;Lin-Xiao Chen;Yu-Qing Xiao;Ming-Xiang Huo;Yu Liang
  • Author_Institution
    College of Mathematics and Informatics, South China Agricultural University, Guangzhou 510642, China
  • Volume
    1
  • fYear
    2015
  • fDate
    7/1/2015 12:00:00 AM
  • Firstpage
    229
  • Lastpage
    234
  • Abstract
    Analysis of cloud workloads using data mining is critical for improving resource management. Unfortunately, there is no systematic approach to support feature generation in comprehensive workload analysis. In this paper, we propose a function-based, generalized feature generation method for analyzing a one-month trace of a Google data center, covering over 40 million task events across about 12,000 machines. The feature set is generated by a set of constructor functions that preserve interpretability and can be partly generated automatically. These are either calculation functions that can be applied automatically, such as aggregate, synthesis, and statistical operators, or reduction operators that rely on prior information. The proposed method was evaluated experimentally on host load prediction using the Naive Bayes algorithm. Experimental results show that our method performs well.
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics (ICMLC), 2015 International Conference on
  • Type
    conf
  • DOI
    10.1109/ICMLC.2015.7340927
  • Filename
    7340927