Title : 
Estimating the WCET of GPU-Accelerated Applications Using Hybrid Analysis
         
        
            Author : 
Betts, Alexander ; Donaldson, Alastair
         
        
            Author_Institution : 
Dept. of Comput., Imperial Coll. London, London, UK
         
        
        
        
        
        
            Abstract : 
The massive parallelism offered by Graphics Processing Units (GPUs) is now routinely exploited to accelerate computationally intensive tasks in a wide variety of application domains. Efficient GPU programming in languages such as CUDA and OpenCL requires careful application of hand optimisations to exploit parallelism and locality while minimising synchronisation. The effectiveness of such optimisations can be highly dependent on workload and the structure of input data, making it difficult to assess performance in general by testing alone. To address this, we study the problem of estimating the Worst-Case Execution Time (WCET) of GPU-accelerated applications. We propose the use of hybrid WCET analysis whereby execution times of small program segments are deduced from traces of execution and a calculation backend derived from the Control Flow Graph (CFG) produces a WCET estimate. Standard techniques that construct a CFG from a binary cannot be applied directly to GPU code because they miss implicit execution paths that arise due to the way branches are implemented in hardware; we present a solution using standard compiler analysis. We further describe how to extend the basic hybrid WCET analysis of sequential code so that concurrent timing effects in the GPU execution model are incorporated. We have implemented our analysis as a tool built on top of the GPGPU-sim open source simulator. We evaluate our tool using a set of benchmarks drawn from the CUDA SDK: results show that effective modelling of concurrency is key to reducing pessimism in the WCET calculation.
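
The general idea of hybrid WCET analysis mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' tool: the abstract does not describe the calculation backend in detail, so the sketch assumes an acyclic CFG, takes the maximum observed time of each basic block from a set of traces, and computes the longest entry-to-exit path. The function names, the diamond-shaped CFG, and the cycle counts are all hypothetical, and the sketch ignores loops, the implicit GPU branch paths, and the concurrency effects that the paper addresses.

```python
from collections import defaultdict


def max_block_times(traces):
    """Return the largest observed time per basic block across all traces.

    Each trace is a list of (block_name, cycles) pairs (hypothetical format).
    """
    worst = defaultdict(int)
    for trace in traces:
        for block, cycles in trace:
            worst[block] = max(worst[block], cycles)
    return dict(worst)


def wcet_estimate(cfg, entry, exit_, block_time):
    """Longest entry-to-exit path in an acyclic CFG, weighted by block time.

    Assumes the CFG has no loops; a real backend would also need loop bounds.
    """
    memo = {}

    def longest_from(node):
        if node in memo:
            return memo[node]
        if node == exit_:
            best = block_time[node]
        else:
            best = block_time[node] + max(longest_from(s) for s in cfg[node])
        memo[node] = best
        return best

    return longest_from(entry)


# Hypothetical diamond-shaped CFG: A -> {B, C} -> D
cfg = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}

# Hypothetical measurements in cycles; each trace follows one path.
traces = [
    [("A", 4), ("B", 10), ("D", 3)],
    [("A", 5), ("C", 7), ("D", 2)],
]

times = max_block_times(traces)
estimate = wcet_estimate(cfg, "A", "D", times)
print(estimate)
```

In this example the two traces take 17 and 14 cycles end to end, while the estimate is 18, because the per-block maxima are combined along the longest structural path. The result is only as trustworthy as the coverage of the traces, since the block times are measured rather than derived statically.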
         
        
            Keywords : 
concurrency control; flow graphs; graphics processing units; minimisation; parallel architectures; program compilers; CFG; CUDA SDK; GPGPU-sim open source simulator; GPU programming; GPU-accelerated application; OpenCL; benchmark; compiler analysis; concurrency modelling; control flow graph; hand optimisation; hybrid WCET analysis; hybrid analysis; synchronisation minimisation; worst-case execution time estimation; Analytical models; Hardware; Instruction sets; Instruments; Programming; Standards
         
        
        
        
            Conference_Title : 
Real-Time Systems (ECRTS), 2013 25th Euromicro Conference on
         
        
            Conference_Location : 
Paris
         
        
        
            DOI : 
10.1109/ECRTS.2013.29