Title :
A high-performance and energy-efficient CT reconstruction algorithm for multi-terabyte datasets
Author :
Jimenez, Edward S. ; Orr, Laurel J. ; Thompson, Kyle R. ; Park, Rae-Hong
Author_Institution :
Sandia Nat. Labs., Albuquerque, NM, USA
Date :
Oct. 27 2013-Nov. 2 2013
Abstract :
Much work has been done on implementing GPU-based Computed Tomography reconstruction algorithms for medical applications, showing tremendous improvement in computational performance. While many of these reconstruction algorithms could also be applied to industrial-scale datasets, the performance gains may be modest to non-existent due to a combination of algorithmic, hardware, and scalability limitations. Previous work presented an irregular dynamic approach to GPU-Reconstruction kernel execution for industrial-scale reconstructions that dramatically improved voxel processing throughput. However, the improved kernel execution magnified other system bottlenecks, such as host memory bandwidth and storage read/write bandwidth, thus hindering performance gains. This paper presents a multi-GPU-based reconstruction algorithm capable of efficiently reconstructing large volumes (between 64 gigavoxels and 1 teravoxel) not only faster than traditional CPU- and GPU-based reconstruction algorithms but also while consuming significantly less energy. The reconstruction algorithm exploits the irregular kernel approach from previous work together with a modularized MIMD-like environment, heterogeneous parallelism, and macro- and micro-scale dynamic task allocation. The result is a portable and flexible reconstruction algorithm capable of executing on a wide range of architectures, including mobile computers, workstations, supercomputers, and modestly sized heterogeneous or homogeneous clusters with any number of graphics processors.
Keywords :
computerised tomography; graphics processing units; image reconstruction; medical image processing; GPU-Reconstruction kernel execution; computational performance; computed tomography; graphics processing unit; heterogeneous clusters; heterogeneous parallelism; high-performance energy-efficient CT reconstruction algorithm; homogeneous clusters; host memory bandwidth; macroscale dynamic task allocation; medical applications; microscale dynamic task allocation; mobile computers; modularized MIMD-like environment; multiterabyte datasets; storage read/write bandwidth; supercomputers; voxel processing throughput; workstations; Computed tomography; Graphics processing units; Image reconstruction; Kernel; Performance evaluation; Reconstruction algorithms; Scalability;
Conference_Title :
Nuclear Science Symposium and Medical Imaging Conference (NSS/MIC), 2013 IEEE
Conference_Location :
Seoul
Print_ISBN :
978-1-4799-0533-1
DOI :
10.1109/NSSMIC.2013.6829451