Title :
Evaluating Power-Monitoring Capabilities on IBM Blue Gene/P and Blue Gene/Q
Author :
Yoshii, Kazutomo ; Iskra, Kamil ; Gupta, Rinku ; Beckman, Pete ; Vishwanath, Venkatram ; Yu, Chenjie ; Coghlan, Susan
Author_Institution :
Math. & Comput. Sci. Div., Argonne Nat. Lab., Argonne, IL, USA
Abstract :
Power consumption is becoming a critical factor as we continue our quest toward exascale computing. Yet the actual power utilization of a complete system is an insufficiently studied research area. Estimating the power consumption of a large-scale system is a nontrivial task because a large number of components are involved and because power requirements are affected by the (unpredictable) workloads. Clearly needed is a power-monitoring infrastructure that can provide timely and accurate feedback to system developers and application writers so that they can optimize the use of this precious resource. Many existing large-scale installations do feature power-monitoring sensors; however, those are part of environmental- and health-monitoring subsystems and were not designed with application-level power consumption measurements in mind. In this paper, we evaluate the existing power monitoring of IBM Blue Gene systems, with the goal of understanding what capabilities are available and how they fare with respect to spatial and temporal resolution, accuracy, latency, and other characteristics. We find that with a careful choice of dedicated microbenchmarks, we can obtain meaningful power consumption data even on Blue Gene/P, where the interval between available data points is measured in minutes. We next evaluate the monitoring subsystem on Blue Gene/Q and are able to study the power characteristics of its FPU and memory subsystems. We find the monitoring subsystem capable of providing second-scale resolution of power data, conveniently separated between node components, with a latency of seven seconds. This represents a significant improvement in power-monitoring infrastructure, and we hope future systems will enable real-time power measurement in order to better understand application behavior at a finer granularity.
Keywords :
computer power supplies; parallel machines; power consumption; system monitoring; FPU; IBM Blue Gene systems; IBM Blue Gene/P; IBM Blue Gene/Q; accurate feedback; application behavior; application level power consumption measurements; application writers; environmental monitoring subsystems; exascale computing; feature power-monitoring sensors; health monitoring subsystems; hope future systems; large scale system; large-scale installations; memory subsystems; power characteristics; power consumption data; power monitoring infrastructure; power requirements; power utilization; power-monitoring capabilities; power-monitoring infrastructure; real-time power measurement; second-scale resolution; spatial resolution; temporal resolution; Instruction sets; Memory management; Monitoring; Power demand; Power measurement; Stress; Blue Gene/P; Blue Gene/Q; HPC; Microbenchmarks; Power monitoring and profiling;
Conference_Title :
Cluster Computing (CLUSTER), 2012 IEEE International Conference on
Conference_Location :
Beijing
Print_ISBN :
978-1-4673-2422-9
DOI :
10.1109/CLUSTER.2012.62