DocumentCode
3688819
Title
Evaluating and exploiting impacts of dynamic power management schemes on system reliability
Author
Liangzhen Lai;Vikas Chandra;Puneet Gupta
Author_Institution
Electrical Engineering Department, UCLA, 90095, United States
fYear
2015
Firstpage
39
Lastpage
48
Abstract
Hardware reliability has been a major concern for nano-scale computing systems. Different hardware design choices, application workloads and software management schemes can jointly affect the system's resilience. In this paper, we first develop a hardware evaluation platform based on an embedded/mobile development board and a standard Linux kernel. We demonstrate the use of our platform to evaluate the system's power and radiation-induced soft error rate in the presence of system power management schemes, with different application workloads and various hardware design configurations. We also propose system/cloud-based virtual sensing to capture varying ambient conditions for reliability evaluation. New reliability management policies are proposed and implemented in the Linux kernel to exploit the flexibility in different existing power management schemes. We demonstrate that our policies can achieve the system reliability target under varying application workloads and ambient conditions. Experiments show that our policies are efficient, incurring less than 3% additional power overhead compared to the optimal schemes characterized offline.
Keywords
"Hardware","Kernel","Linux","Software reliability","Switches"
Publisher
IEEE
Conference_Titel
Compilers, Architecture and Synthesis for Embedded Systems (CASES), 2015 International Conference on
Type
conf
DOI
10.1109/CASES.2015.7324544
Filename
7324544