Title :
The need for new monitoring and management technologies in large scale computing systems
Author :
Buchholz, Jochen ; Volk, Eugen
Author_Institution :
HLRS High Performance Comput. Center Stuttgart, Stuttgart, Germany
Abstract :
Administrators of high performance computing (HPC) resources currently face new challenges caused by several changes in how the resources are used: a rapidly growing user community with application-level needs, interdisciplinary usage, and, as a result, new functional requirements such as storage at specific storage providers. The increasing complexity of system administration as a whole needs technical support. In this paper we explain why the administration of HPC resources differs slightly from that of other resources and show the consequences when administration tools disregard these differences. After exposing the limitations and deficiencies of these tools, we describe the upcoming needs from the HPC providers' perspective in comparison with the currently available features. To solve the problems addressed in a generic way, we present a possible solution: a hierarchically structured monitoring and management framework.
Keywords :
mainframes; resource allocation; system monitoring; systems analysis; HPC resource management; hierarchically structured monitoring; high performance computing; large scale computing system; management technology; system administration;
Conference_Title :
eChallenges, 2010
Conference_Location :
Warsaw
Print_ISBN :
978-1-4244-8390-7