DocumentCode
166603
Title
It takes a village: Monitoring the Blue Waters supercomputer
Author
Semeraro, B.D. ; Sisneros, Robert ; Fullop, Joshi ; Bauer, Gregory H.
Author_Institution
Nat. Center for Supercomput. Applic., Univ. of Illinois at Urbana-Champaign, Urbana, IL, USA
fYear
2014
fDate
22-26 Sept. 2014
Firstpage
392
Lastpage
399
Abstract
The performance of science applications on modern HPC equipment depends on many factors. Architectural features, individual hardware characteristics, and scheduler traits all affect how a particular application performs, not only in isolation but also when run in concert with other user applications. Being able to correlate system events and conditions at particular times can give insight into the causes of good or bad performance. Unfortunately, the information we seek is not necessarily in a readily accessible form. The problem at hand is how to enable efficient querying of the raw data and flexible graphical representation of the results. Web applications that access an underlying database provide this sort of functionality for many science applications quite well. Our data access scenario is not very different. The data collected for a large HPC environment are complex and grow in size over time. This aspect differs from applications that deal with more static data. It is the dynamic nature of the data that makes the problem interesting. In this work, we present our approach to the analysis and visualization of HPC system performance data, based on database access and web-based graphical presentation. We discuss the details of how the data are collected and processed from raw logs into the database, how queries are formulated, and how the data are graphically displayed. This process includes dynamic formulation of the queries. Finally, we discuss how the system is used to analyze system performance.
Keywords
computerised monitoring; data visualisation; parallel architectures; parallel machines; performance evaluation; processor scheduling; query formulation; Blue Waters supercomputer monitoring; HPC environment; HPC system performance data analysis; HPC system performance data visualization; Web based graphical presentation; architectural features; data access; database access; hardware characteristics; modern HPC equipment; queries dynamic formulation; scheduler traits; system events; system performance analysis; Buildings; Data collection; Data visualization; Databases; Measurement; Monitoring; System performance;
fLanguage
English
Publisher
IEEE
Conference_Titel
Cluster Computing (CLUSTER), 2014 IEEE International Conference on
Conference_Location
Madrid
Type
conf
DOI
10.1109/CLUSTER.2014.6968671
Filename
6968671