DocumentCode
2310717
Title
Blue Gene/Q resource management architecture
Author
Budnik, Tom ; Knudson, Brant ; Megerian, Mark ; Miller, Sam ; Mundy, Mike ; Stockdell, Will
Author_Institution
Syst. & Technol. Group, IBM, Rochester, MN, USA
fYear
2010
fDate
15 Nov. 2010
Firstpage
1
Lastpage
5
Abstract
As supercomputers scale to a million processor cores and beyond, the underlying resource management architecture needs to provide a flexible mechanism to manage the wide variety of workloads executing on the machine. In this paper we describe the novel approach of the Blue Gene/Q (BG/Q) supercomputer in addressing these workload requirements by providing resource management services that support both the high-performance computing (HPC) and high-throughput computing (HTC) paradigms. We explore how the resource management implementations of the prior-generation Blue Gene systems (BG/L and BG/P) evolved and led us to develop services on BG/Q that focus on scalability, flexibility, and efficiency. We also give an overview of the main components of the BG/Q resource management architecture and how they interact with one another. We introduce BG/Q concepts for partitioning I/O and compute resources that provide I/O resiliency while also enabling faster block (partition) boot times. New features, such as the ability to run a mix of HTC and HPC workloads on the same block, are explained, and the advantages of this type of environment are examined. Just as many-task computing (MTC) [1] aims to combine elements of HTC and HPC, the focus of BG/Q has been to unify the two models in a flexible manner so that hybrid workloads having both HTC and HPC characteristics are managed simultaneously.
Keywords
multiprocessing systems; parallel architectures; parallel machines; resource allocation; Blue Gene/Q resource management architecture; Blue Gene/Q supercomputer; high-performance computing; high-throughput computing; many-task computing; processor cores
fLanguage
English
Publisher
IEEE
Conference_Titel
Many-Task Computing on Grids and Supercomputers (MTAGS), 2010 IEEE Workshop on
Conference_Location
New Orleans, LA
Print_ISBN
978-1-4244-9704-1
Electronic_ISBN
978-1-4244-9705-8
Type
conf
DOI
10.1109/MTAGS.2010.5699434
Filename
5699434