DocumentCode
598571
Title
Characterizing output bottlenecks in a supercomputer
Author
Xie, Bing ; Chase, John ; Dillow, David ; Drokin, O. ; Klasky, Scott ; Oral, Sarp ; Podhorszki, Norbert
Author_Institution
Duke Univ., Durham, NC, USA
fYear
2012
fDate
10-16 Nov. 2012
Firstpage
1
Lastpage
11
Abstract
Supercomputer I/O loads are often dominated by writes. HPC (High Performance Computing) file systems are designed to absorb these bursty outputs at high bandwidth through massive parallelism. However, the delivered write bandwidth often falls well below the peak. This paper characterizes the data absorption behavior of a center-wide shared Lustre parallel file system on the Jaguar supercomputer. We use a statistical methodology to address the challenges of accurately measuring a shared machine under production load and to obtain the distribution of bandwidth across samples of compute nodes, storage targets, and time intervals. We observe and quantify limitations from competing traffic, contention on storage servers and I/O routers, concurrency limitations in the client compute node operating systems, and the impact of variance (stragglers) on coupled output such as striping. We then examine the implications of our results for application performance and the design of I/O middleware systems on shared supercomputers.
Keywords
concurrency control; file organisation; input-output programs; middleware; parallel machines; HPC file systems; I/O middleware systems; I/O routers; Jaguar supercomputer; bandwidth distribution; center-wide shared Lustre parallel file system; client compute node operating systems; concurrency limitations; data absorption behavior; high performance computing; output bottlenecks; production load; shared supercomputers; statistical methodology; storage servers; Absorption; Aggregates; Bandwidth; Benchmark testing; Pipelines; Servers; Supercomputers
fLanguage
English
Publisher
IEEE
Conference_Titel
High Performance Computing, Networking, Storage and Analysis (SC), 2012 International Conference for
Conference_Location
Salt Lake City, UT
ISSN
2167-4329
Print_ISBN
978-1-4673-0805-2
Type
conf
DOI
10.1109/SC.2012.28
Filename
6468446