Title :
Accelerating inclusion-based pointer analysis on heterogeneous CPU-GPU systems
Author :
Yu Su ; Ding Ye ; Jingling Xue
Author_Institution :
Sch. of Comput. Sci. & Eng., Univ. of New South Wales, Sydney, NSW, Australia
Abstract :
This paper describes the first implementation of Andersen´s inclusion-based pointer analysis for C programs on a heterogeneous CPU-GPU system, where both its CPU and GPU cores are used. As an important graph algorithm, Andersen´s analysis is difficult to parallelise because it makes extensive modifications to the structure of the underlying graph, in a way that is highly input-dependent and statically hard to analyse. Existing parallel solutions run on either the CPU or GPU but not both, rendering the underlying computational resources underutilised and the ratios of CPU-only over GPU-only speedups for certain programs (i.e., graphs) unpredictable. We observe that a naive parallel solution of Andersen´s analysis on a CPU-GPU system suffers from poor performance due to workload imbalance. We introduce a solution that is centered around a new dynamic workload distribution scheme. The novelty lies in prioritising the distribution of different types of workloads, i.e., graph-rewriting rules in Andersen´s analysis to CPU or GPU according to the degrees of the processing unit´s suitability for processing them. This scheme is effective when combined with synchronisation-free execution of tasks (i.e., graph-rewriting rules) and difference propagation of points-to information between the CPU and GPU. For a set of seven C benchmarks evaluated, our CPU-GPU solution outperforms (on average) (1) the CPU-only solution by 50.6%, (2) the GPU-only solution by 78.5%, and (3) an oracle solution that behaves as the faster of (1) and (2) on every benchmark by 34.6%.
Keywords :
C language; graph theory; graphics processing units; program diagnostics; rewriting systems; Andersen inclusion-based pointer analysis; C programs; dynamic workload distribution scheme; graph algorithm; graph-rewriting rules; heterogeneous CPU-GPU system; synchronisation-free execution; workload imbalance; Australia; Benchmark testing; Graphics processing units; Semantics; Weaving;
Conference_Titel :
High Performance Computing (HiPC), 2013 20th International Conference on
Conference_Location :
Bangalore
DOI :
10.1109/HiPC.2013.6799110