Title :
Autotuning GPU-Accelerated QAP Solvers for Power and Performance
Author :
Abhilash Chaparala;Clara Novoa;Apan Qasem
Author_Institution :
Dept. of Comput. Sci., Texas State Univ., San Marcos, TX, USA
Abstract :
The Quadratic Assignment Problem (QAP) is an important combinatorial optimization problem with applications in many areas including operations research and chip design. QAP is known to be NP-hard and requires heuristic approaches for most real data sets. Although many algorithms have been proposed for solving QAPs, few have attempted to exploit the fine-grain data parallelism available on accelerator architectures and none have considered aspects of energy efficiency. In this paper, we present GPU-accelerated implementations of two QAP algorithms. We parameterize each algorithmic variant along several dimensions including thread granularity, occupancy, shared memory usage and register pressure. We develop an autotuning framework that explores this space of execution parameters and yields CUDA binaries that are efficient in terms of both performance and power consumption. We demonstrate the effectiveness of our tuning framework on an Nvidia Tesla K20c. On a series of experiments on the well-known QAPLIB data sets, our autotuned solutions run at least an order-of-magnitude faster than baseline implementations. The experimental results also provide key insight into the performance characteristics of accelerated QAP solvers. In particular, the results reveal that both algorithmic choice and the shape of the input data sets are key factors in finding an efficient implementation.
Keywords :
"Instruction sets","Optimization","Graphics processing units","Tuning","Sparse matrices","Registers","Kernel"
Conference_Titel :
High Performance Computing and Communications (HPCC), 2015 IEEE 7th International Symposium on Cyberspace Safety and Security (CSS), 2015 IEEE 12th International Conferen on Embedded Software and Systems (ICESS), 2015 IEEE 17th International Conference on
DOI :
10.1109/HPCC-CSS-ICESS.2015.121