Title :
Design and performance measurement of a high-performance computing cluster
Author :
George, Kenny ; Venugopal, Vinaya
Author_Institution :
Comput. Eng. Program, California State Univ., Fullerton, CA, USA
Abstract :
Graphics processing units (GPUs) are specialized hardware accelerators that can be utilized for computations needing high parallelism and high memory bandwidth. Propelled by their attractive Flops/$ ratio and their ability to outperform a CPU cluster of equivalent cost, large-scale GPU clusters are gaining popularity in the high-performance computing (HPC) community. However, the design challenges associated with setting up an efficient HPC cluster and developing applications for it include: a) data movement and locality on the hardware accelerators; b) task mapping and allocation; and c) setting up a well-balanced system. In this paper, we present our experience setting up a GPU cluster for HPC applications, particularly signal processing for digital wideband receivers. We describe the architecture and the hardware and software platforms of the proposed cluster. The proposed GPU cluster, implementing a 1.25 GHz digital wideband receiver, was compared with an HPC-based predecessor receiver system. The adaptability of the GPU cluster was further demonstrated by using it for a multiple-receiver implementation that demanded higher data processing capability and throughput.
Keywords :
Fourier transforms; UHF devices; broadband networks; field programmable gate arrays; graphics processing units; microwave receivers; parallel processing; pattern clustering; task analysis; CPU cluster; Flops-$ ratio; HPC applications; HPC community; HPC-based predecessor receiver system; application development process; data locality; data movement; data processing capability; digital wideband receiver; digital wideband receivers; graphics processor units; hardware accelerators; hardware platform; high-performance computing cluster; large-scale GPU clusters; memory bandwidth; multiple receiver implementation; parallelism bandwidth; performance measurement; signal processing; software platform; task allocation; task mapping; Bandwidth; Data acquisition; Graphics processing unit; Random access memory; Receivers; Servers; Throughput; FPGA; GPU; HPC; cluster; fast Fourier transform; performance measurement; wideband receiver;
Conference_Title :
Instrumentation and Measurement Technology Conference (I2MTC), 2012 IEEE International
Conference_Location :
Graz
Print_ISBN :
978-1-4577-1773-4
DOI :
10.1109/I2MTC.2012.6229359