Title : 
Extendable framework for monitoring heterogeneous multi-accelerator HPC cluster
         
        
            Author : 
Deepika, H.V. ; Mangala, N. ; Babu, N. Sarat Chandra
         
        
            Author_Institution : 
Hybrid Comput. Group, Centre for Dev. of Adv. Comput., Bangalore, India
         
        
        
        
        
        
            Abstract : 
The superior performance-to-power ratio of accelerators is motivating new cluster architectures with varied accelerator combinations. Monitoring ensures the normal functioning of a cluster by detecting service degradations and enabling prompt rectification. This paper describes a modular and extendable monitoring framework for heterogeneous multi-accelerator clusters, intended to be useful for future HPC systems. The framework supports third-party software plugins that provide different functional features. A monitoring tool built on this framework monitors CPU, GPGPU and FPGA accelerators, network, storage, user jobs and other relevant services of a heterogeneous cluster; the tool is also capable of automatic rectification to a certain extent.
         
        
            Keywords : 
field programmable gate arrays; graphics processing units; parallel processing; CPU accelerators; FPGA accelerators; GPGPU accelerators; HPC systems; accelerator combinations; accelerators power ratio; central processing unit; cluster monitoring; extendable monitoring framework; general-purpose graphics processing unit; heterogeneous multiaccelerator HPC cluster; high performance computing; service degradations; third party software plugins; Computer architecture; Databases; Monitoring; Probes; Servers; FPGA; GPU; Many core; accelerator; cluster; heterogeneous; plugin
         
        
        
        
            Conference_Title : 
2014 International Conference on Computing for Sustainable Global Development (INDIACom)
         
        
            Conference_Location : 
New Delhi
         
        
            Print_ISBN : 
978-93-80544-10-6
         
        
        
            DOI : 
10.1109/IndiaCom.2014.6828136