DocumentCode :
726424
Title :
Task scheduling strategies to mitigate hardware variability in embedded shared memory clusters
Author :
Rahimi, Abbas ; Cesarini, Daniele ; Marongiu, Andrea ; Gupta, Rajesh K. ; Benini, Luca
Author_Institution :
UC San Diego, San Diego, CA, USA
fYear :
2015
fDate :
8-12 June 2015
Firstpage :
1
Lastpage :
6
Abstract :
Manufacturing and environmental variations cause timing errors that are typically avoided by conservative design guardbands or corrected by circuit-level error detection and correction. These measures incur energy and performance penalties. This paper considers methods to reduce this cost by expanding the scope of variability mitigation through the software stack. In particular, we propose workload deployment methods that reduce the likelihood of timing errors in shared memory clusters of processor cores. These and other methods are incorporated in a runtime layer in the OpenMP framework that enables parsimonious countermeasures against timing errors induced by hardware variability. The runtime system “introspectively” monitors the cost of task execution on the various cores and transparently associates descriptive metadata with the tasks. Using this characterized metadata, we propose several policies that improve the choice of cores within the cluster when scheduling tasks, according to measured hardware variability and system workload. We devise efficient task scheduling strategies for simultaneous management of variability and workload by exploiting centralized and distributed approaches to workload distribution. Both schedulers surpass current state-of-the-art approaches: on average, the distributed scheduler achieves a 30% energy and 17% performance improvement, and the centralized scheduler achieves a 17% energy and 4% performance improvement.
Keywords :
embedded systems; scheduling; shared memory systems; OpenMP framework; circuit level error correction; circuit level error detection; cost reduction; embedded shared memory clusters; energy penalty; hardware variability mitigation; performance penalty; software stack; task scheduling strategy; workload deployment methods; workload distribution; Error analysis; Hardware; Instruction sets; Radiation detectors; Timing;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Design Automation Conference (DAC), 2015 52nd ACM/EDAC/IEEE
Conference_Location :
San Francisco, CA
Type :
conf
DOI :
10.1145/2744769.2744915
Filename :
7167338