DocumentCode :
2121933
Title :
Design space explorations for streaming accelerators using Streaming Architectural Simulator
Author :
Shafiq, Muhammad ; Pericas, Miquel ; Navarro, Nacho ; Ayguade, Eduard
Author_Institution :
Centre of Excellence in Sci. & Adv. Technol. (CESAT), Islamabad, Pakistan
fYear :
2013
fDate :
15-19 Jan. 2013
Firstpage :
169
Lastpage :
178
Abstract :
In recent years, streaming accelerators such as GPUs have emerged as an effective step towards parallel computing. The wish-list for these devices spans from support for thousands of small cores to a nature very close to general-purpose computing. This makes the design space for future accelerators containing thousands of parallel streaming cores very vast, which complicates choosing the right architectural configuration for next-generation devices. However, accurate design space exploration tools developed for massively parallel architectures can ease this task. The main objectives of this work are twofold. (i) We present a complete environment of a trace-driven simulator named SArcs (Streaming Architectural Simulator) for streaming accelerators. (ii) We use our simulation tool-chain for design space explorations of GPU-like streaming architectures. Our design space explorations for different architectural aspects of a GPU-like device are made with reference to a baseline established for NVIDIA's Fermi architecture (GPU Tesla C2050). The explored aspects include the performance effects of variations in the configurations of Streaming Multiprocessors, Global Memory Bandwidth, Channels between SMs down to the Memory Hierarchy, and Cache Hierarchy. The explorations are performed using application kernels from Vector Reduction, 2D-Convolution, Matrix-Matrix Multiplication and 3D-Stencil. Results show that the configurations of the computational resources for the current Fermi GPU device can deliver higher performance with further improvement in the global memory bandwidth for the same device.
Keywords :
graphics processing units; multiprocessing systems; parallel architectures; 2D-convolution; 3D-stencil; Fermi GPU device; GPU Tesla C2050; GPU like streaming architectures; NVIDIA Fermi architecture; SArcs; Streaming Architectural Simulator; architectural aspects; cache hierarchy; computational resources; design space exploration tools; design space explorations; general purpose computing; massively parallel architectures; matrix-matrix multiplication; memory hierarchy; next generation devices; parallel computing; parallel streaming cores; simulation tool-chain; streaming accelerators; streaming architectural simulator; streaming multiprocessors global memory bandwidth; trace driven simulator; vector reduction; Graphics processing units; Instruction sets; Optical character recognition software; Space exploration;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Applied Sciences and Technology (IBCAST), 2013 10th International Bhurban Conference on
Conference_Location :
Islamabad
Print_ISBN :
978-1-4673-4425-8
Type :
conf
DOI :
10.1109/IBCAST.2013.6512151
Filename :
6512151