DocumentCode
3135270
Title
Robust prediction of critical temperatures in multi-core chips with limited sensory data
Author
Ankireddi, Sai
Author_Institution
Package & Assembly Eng., Intersil Corp., Milpitas, CA, USA
fYear
2011
fDate
20-24 March 2011
Firstpage
216
Lastpage
221
Abstract
Current generations of high-performance microprocessors feature multiple cores and micro-cores, each supporting multiple threads implemented in hardware. Such designs routinely feature billions of transistors, and chip layout teams are frequently hard pressed to place and route all the functional blocks and sub-blocks that go into the design. An additional complexity arises because system engineers would like each micro-core's temperature monitored for silicon reliability and system performance reasons, which means that each core should preferably be outfitted with a thermal sensor that is routed out to the external world. Since die real estate is already at a premium and sensor macros can often be large, CPU design teams frequently shy away from placing and routing one sensor per micro-core. The practical implication is that there is no means to monitor how hot any given micro-core is getting during field operation, which can compound risk significantly from the standpoints of silicon reliability (GoX, TDDB), chip electrical performance (timing, clock skew, jitter) and system performance (real-time benchmarks, field performance, data coherency, etc.). In this study, a multi-core processor chip with a wide range of core-to-core power variability is considered. A finite number of sensor locations, which are known to be thermally sub-optimal, are assumed to be available for placement and routing. Using sensory data from these “poor” locations and an offline training algorithm, temperatures of all key core locations are determined using a causal, linear least-squares error basis. The resulting formulation is tested for prediction integrity using a large-sample Monte Carlo analysis, and the temperature predictions are found to be robust. The technique developed is general enough to be applied across any microprocessor product family.
The study concludes with suggested techniques to maintain prediction robustness in the presence of measurement errors, diode part-to-part variation and other inaccuracies. The approach proposed here can circumvent the limitations on placing and routing multiple diodes in real-estate constrained multi-core microprocessor and ASIC applications.
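The workflow the abstract describes (offline training on sensor data, a linear least-squares predictor, then Monte Carlo testing) could be sketched roughly as follows. This is a minimal illustration and not the paper's formulation: the thermal model in `sample_chip`, its coefficients, the sensor positions, the noise level and the sample counts are all hypothetical, and the fit uses plain normal equations with an intercept term.

```python
import random

def solve_linear(a, b):
    """Solve a @ x = b for a square matrix a (Gaussian elimination, partial pivoting)."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def fit_least_squares(sensor_rows, target):
    """Weights w (last entry is the intercept) minimizing sum((w . [s, 1] - t)^2)."""
    xs = [row + [1.0] for row in sensor_rows]
    k = len(xs[0])
    xtx = [[sum(x[i] * x[j] for x in xs) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * t for x, t in zip(xs, target)) for i in range(k)]
    return solve_linear(xtx, xty)

def predict(weights, sensor_row):
    """Predict one core temperature from the current sensor readings only."""
    return sum(w * s for w, s in zip(weights, sensor_row + [1.0]))

def sample_chip(rng, n_cores=4):
    """Hypothetical thermal model: ambient plus a weighted sum of core powers."""
    power = [rng.uniform(1.0, 10.0) for _ in range(n_cores)]  # assumed power range, W
    cores = [45.0 + sum(power[j] * (2.0 if i == j else 0.5 / (1 + abs(i - j)))
                        for j in range(n_cores))
             for i in range(n_cores)]
    # Two sensors at assumed "poor" locations between cores, with assumed read noise.
    sensors = [45.0 + sum(power[j] * 0.4 / (1 + abs(pos - j)) for j in range(n_cores))
               + rng.gauss(0.0, 0.1)
               for pos in (0.5, 2.5)]
    return sensors, cores

rng = random.Random(0)

# Offline training: fit one linear predictor per core from sensor readings.
train = [sample_chip(rng) for _ in range(500)]
models = [fit_least_squares([s for s, _ in train], [c[i] for _, c in train])
          for i in range(4)]

# Monte Carlo check: absolute prediction error on fresh random power maps.
trials = [sample_chip(rng) for _ in range(2000)]
errors = [abs(predict(models[i], s) - c[i]) for s, c in trials for i in range(4)]
print("mean abs error:", sum(errors) / len(errors))
print("worst abs error:", max(errors))
```

With fewer sensors than cores, the predictor cannot be exact in this toy model, so the printed errors show how much information the assumed sensor locations carry. Raising the noise passed to `rng.gauss` gives a rough feel for sensitivity to measurement error.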
Keywords
Monte Carlo methods; application specific integrated circuits; clocks; elemental semiconductors; least squares approximations; microprocessor chips; multiprocessing systems; semiconductor device reliability; silicon; temperature measurement; temperature sensors; timing jitter; ASIC applications; CPU design; Monte Carlo analysis; chip electrical performance; chip layout; clock skew; jitter; least-squares error; measurement errors; microcores temperature monitoring; microprocessors; multicore chips; multiple diodes; sensory data; silicon reliability; temperature predictions; thermal sensor; timing; Histograms; Microprocessors; Multicore processing; Temperature distribution; Temperature measurement; Temperature sensors; CPU; Multi-core; diodes; least-squares; monitoring; prediction; temperature;
fLanguage
English
Publisher
IEEE
Conference_Titel
Semiconductor Thermal Measurement and Management Symposium (SEMI-THERM), 2011 27th Annual IEEE
Conference_Location
San Jose, CA
ISSN
1065-2221
Print_ISBN
978-1-61284-740-5
Type
conf
DOI
10.1109/STHERM.2011.5767203
Filename
5767203