• DocumentCode
    3135270
  • Title
    Robust prediction of critical temperatures in multi-core chips with limited sensory data
  • Author
    Ankireddi, Sai
  • Author_Institution
    Package & Assembly Eng., Intersil Corp., Milpitas, CA, USA
  • fYear
    2011
  • fDate
    20-24 March 2011
  • Firstpage
    216
  • Lastpage
    221
  • Abstract
    Current generations of high-performance microprocessors feature multiple cores and micro-cores, each supporting multiple threads implemented in hardware. Such designs routinely feature billions of transistors, and chip layout teams are frequently hard-pressed to place and route all the functional blocks and sub-blocks that go into the design. An additional complexity arises because system engineers would like to have each micro-core's temperature monitored for silicon reliability and system performance reasons, which translates into a requirement that each core preferably be outfitted with a thermal sensor that is routed out to the external world. Since die real estate is already at a premium and sensor macros can often be large, CPU design teams frequently shy away from placing and routing one sensor per micro-core. The practical implication is that there is no means to monitor how hot any given micro-core is getting during field operation, which can compound risk significantly from the standpoints of silicon reliability (GoX, TDDB), chip electrical performance (timing, clock skew, jitter) and system performance (real-time benchmarks, field performance, data coherency, etc.). In this study, a multi-core processor chip with a wide range of core-to-core power variability is considered. A finite number of sensor locations, which are known to be thermally sub-optimal, are assumed to be available for placement and routing. Using sensory data from these “poor” locations and an offline training algorithm, temperatures of all key core locations are determined using a causal, linear least-squares error basis. The resulting formulation is tested for prediction integrity using a large-sample Monte Carlo analysis, and the temperature predictions are found to be robust. The technique developed is general enough to be applied across any microprocessor product family. The study concludes with suggested techniques to maintain prediction robustness in the presence of measurement errors, diode part-to-part variation and other inaccuracies. The approach proposed here can circumvent the limitations on placing and routing multiple diodes in real-estate-constrained multi-core microprocessor and ASIC applications.
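    The offline-training idea in the abstract can be illustrated with a minimal sketch. It is not the paper's formulation: the thermal resistance matrices, ambient temperature, power range, core and sensor counts, and all function names below are hypothetical values chosen for illustration. The sketch fits a linear map from two sensor readings to four core temperatures by least squares (normal equations), then checks the prediction error on fresh random power states in a Monte Carlo loop.

```python
import random

N_CORES, N_SENSORS = 4, 2
AMBIENT = 45.0  # deg C (assumed)
# Hypothetical thermal resistances in K/W: row = location, column = core power.
R_CORE = [[1.0, 0.3, 0.2, 0.1],
          [0.3, 1.0, 0.1, 0.2],
          [0.2, 0.1, 1.0, 0.3],
          [0.1, 0.2, 0.3, 1.0]]
R_SENSOR = [[0.5, 0.5, 0.2, 0.2],   # sensor sits between cores 0 and 1
            [0.2, 0.2, 0.5, 0.5]]   # sensor sits between cores 2 and 3


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def sample(rng):
    """Draw one random power state; return (sensor readings, core temperatures)."""
    power = [rng.uniform(5.0, 25.0) for _ in range(N_CORES)]  # watts (assumed range)
    sensors = [AMBIENT + t for t in matvec(R_SENSOR, power)]
    cores = [AMBIENT + t for t in matvec(R_CORE, power)]
    return sensors, cores


def solve(a, b):
    """Solve a @ x = b (b may have several columns) by Gauss-Jordan elimination."""
    n = len(a)
    m = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))  # partial pivoting
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def train(samples):
    """Least-squares fit of cores ~= beta^T [1, sensors] via the normal equations."""
    k = N_SENSORS + 1  # intercept plus one coefficient per sensor
    xtx = [[0.0] * k for _ in range(k)]
    xty = [[0.0] * N_CORES for _ in range(k)]
    for sensors, cores in samples:
        x = [1.0] + sensors
        for i in range(k):
            for j in range(k):
                xtx[i][j] += x[i] * x[j]
            for c in range(N_CORES):
                xty[i][c] += x[i] * cores[c]
    return solve(xtx, xty)  # k rows, one column per core


def predict(beta, sensors):
    """Estimate all core temperatures from the current sensor readings only."""
    x = [1.0] + sensors
    return [sum(x[i] * beta[i][c] for i in range(len(x))) for c in range(N_CORES)]


rng = random.Random(0)                               # fixed seed for repeatability
beta = train([sample(rng) for _ in range(2000)])     # offline training set
test_set = [sample(rng) for _ in range(2000)]        # Monte Carlo evaluation set
errors = [p - t for s, c in test_set for p, t in zip(predict(beta, s), c)]
rms_error = (sum(e * e for e in errors) / len(errors)) ** 0.5
print(f"RMS prediction error over {len(test_set)} trials: {rms_error:.2f} K")
```

    With fewer sensors than cores the map cannot be exact, so the residual error in this toy model reflects the core-to-core power differences that the two sensors cannot distinguish. Measurement noise or diode variation could be explored by perturbing the sensor readings inside `sample`.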
  • Keywords
    Monte Carlo methods; application specific integrated circuits; clocks; elemental semiconductors; least squares approximations; microprocessor chips; multiprocessing systems; semiconductor device reliability; silicon; temperature measurement; temperature sensors; timing jitter; ASIC applications; CPU design; Monte Carlo analysis; chip electrical performance; chip layout; clock skew; jitter; least-squares error; measurement errors; microcores temperature monitoring; microprocessors; multicore chips; multiple diodes; sensory data; silicon reliability; temperature predictions; thermal sensor; timing; Histograms; Microprocessors; Multicore processing; Temperature distribution; Temperature measurement; Temperature sensors; CPU; Multi-core; diodes; least-squares; monitoring; prediction; temperature;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Semiconductor Thermal Measurement and Management Symposium (SEMI-THERM), 2011 27th Annual IEEE
  • Conference_Location
    San Jose, CA
  • ISSN
    1065-2221
  • Print_ISBN
    978-1-61284-740-5
  • Type
    conf
  • DOI
    10.1109/STHERM.2011.5767203
  • Filename
    5767203