DocumentCode
3000443
Title
Deriving a Methodology for Code Deployment on Multi-Core Platforms via Iterative Manual Optimizations
Author
McCool, Stuart ; Milligan, Peter ; Sage, Paul
Author_Institution
Sch. of Electron., Electr. Eng. & Comput. Sci., Queen's Univ. of Belfast, Belfast, UK
fYear
2012
fDate
21-25 May 2012
Firstpage
1406
Lastpage
1415
Abstract
In recent years, there has been what can only be described as an explosion in the types of processing devices one can expect to find within a given computer system. These include the multi-core CPU, the General Purpose Graphics Processing Unit (GPGPU) and the Accelerated Processing Unit (APU), to name but a few. The widespread uptake of these systems presents would-be users with at least two problems. Firstly, each device exposes a complex underlying architecture which must be appreciated in order to attain optimal performance. This is coupled with the fact that a single system can support an arbitrary number of such devices. Consequently, fully leveraging the performance capabilities of such a system must come at a cost -- increasingly prolonged development times. Adhering to a methodology will have the significant industrial impact of reducing these development times. This paper describes the continued formulation of such a novel methodology. Two real-world scientific programs are optimized for execution on the CUDA platform. Double precision accuracy and optimized speedups (which include PCI-E transfer times) of 15x and 17x are achieved.
Keywords
multiprocessing systems; parallel architectures; APU; CUDA platform; GPGPU; PCI-E transfer; accelerated processing unit; code deployment; general purpose graphics processing unit; iterative manual optimizations; multicore CPU; multicore platforms; Acceleration; Graphics processing unit; Guidelines; Kernel; Performance evaluation; Registers; GPGPU; Racah; Slater; artificial intelligence; characterization; heterogeneous computing; methodology; toolset;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), 2012 IEEE 26th International
Conference_Location
Shanghai
Print_ISBN
978-1-4673-0974-5
Type
conf
DOI
10.1109/IPDPSW.2012.178
Filename
6270808