Title :
Open XDMoD: A Tool for the Comprehensive Management of High-Performance Computing Resources
Author :
Palmer, Jeffrey T. ; Gallo, Steven M. ; Furlani, Thomas R. ; Jones, Matthew D. ; DeLeon, Robert L. ; White, Joseph P. ; Simakov, Nikolay ; Patra, Abani K. ; Sperhac, Jeanette ; Yearke, Thomas ; Rathsam, Ryan ; Innus, Martins ; Cornelius, Cynthia D. ; Brow
Author_Institution :
State Univ. of New York, Buffalo, NY, USA
Abstract :
Open XDMoD is an open source tool designed to facilitate the management of high-performance computing (HPC) systems. The Open XDMoD portal provides a rich set of analysis and charting tools that let users quickly display a wide variety of job accounting metrics over any desired timeframe. Two additional tools, which provide quality-of-service metrics and job-level performance data, have been developed and integrated with Open XDMoD to extend its functionality. These tools, combined in an integrated package through Open XDMoD, enable the comprehensive management of HPC resources, allowing HPC center personnel to ensure that the resource is operating efficiently and to determine what applications are running, how efficiently they're running, and what resources they're consuming, all of which are important to optimizing the HPC system.
Keywords :
parallel processing; portals; public domain software; quality of service; resource allocation; HPC resource management; Open XDMoD portal; XD metrics on demand; high-performance computing systems; job accounting metrics; job-level performance data; open source tool; quality-of-service metrics; Data warehouses; Measurement; Open source hardware; Open source software; HPC; HPC metrics; Open XDMoD; SUPReMM; TACC_Stats; XDMoD; application kernels; high-performance computing; scientific computing
Journal_Title :
Computing in Science & Engineering
DOI :
10.1109/MCSE.2015.68