• DocumentCode
    1757766
  • Title

    An Effective Gray-Box Identification Procedure for Multicore Thermal Modeling

  • Author

    Beneventi, Francesco ; Bartolini, Andrea ; Tilli, Andrea ; Benini, Luca

  • Author_Institution
    Dept. of Electron., Comput. Sci. & Syst., Univ. of Bologna, Bologna, Italy
  • Volume
    63
  • Issue
    5
  • fYear
    2014
  • fDate
    May 2014
  • Firstpage
    1097
  • Lastpage
    1110
  • Abstract
    Aggressive thermal management is a critical feature for high-end computing platforms, as worst-case thermal budgeting is becoming unaffordable. Reactive thermal management, which sets temperature thresholds to trigger thermal capping actions, is too “near-sighted,” and it may lead to severe performance degradation and thermal overshoots. More aggressive proactive thermal management techniques minimize the performance penalty with smooth optimal control. These techniques require knowledge of thermal models, which have to be accurate enough to make the controls effective, yet simple enough to keep their complexity limited. In practice, these models are not provided by manufacturers, and in most cases, they strongly depend on the deployment environment. Hence, procedures to automatically derive thermal models in the field are needed. In this paper, we propose a gray-box procedure to learn a compact and physically consistent model for multicore chips. We leverage the physical consistency of the proposed model to tame the model complexity and to cope with large quantization noise in measurements. We exploit Output Error structures along with Levenberg-Marquardt and Least Squares optimization algorithms. We tackle the problem in a real-life context: we developed a complete infrastructure for model building and thermal data collection in the Linux environment, and we tested it on an Intel Nehalem-based server CPU.
  • Keywords
    Linux; computational complexity; least mean squares methods; microprocessor chips; multiprocessing systems; optimal control; optimisation; performance evaluation; power aware computing; Intel Nehalem-based server CPU; Levenberg-Marquardt algorithms; Linux environment; effective gray-box identification procedure; high-end computing platforms; least squares optimization algorithms; model building; model complexity; multicore chips; multicore thermal modeling; output error structures; performance degradation; proactive thermal managements; quantization noise; reactive thermal management; smooth optimal control; thermal data collection; thermal management; thermal overshoots; trigger thermal capping actions; worst-case thermal budgeting; Computational modeling; Heating; Mathematical model; Multicore processing; Temperature sensors; Thermal control; gray box; multicore; power model; system identification; thermal model
  • fLanguage
    English
  • Journal_Title
    Computers, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9340
  • Type

    jour

  • DOI
    10.1109/TC.2012.293
  • Filename
    6381401