Title :
High Precision Integer Multiplication with a GPU
Author :
Emmart, Niall ; Weems, Charles
Author_Institution :
Comput. Sci. Dept., Univ. of Massachusetts, Amherst, MA, USA
Abstract :
We have improved our prior implementation of Strassen's algorithm for high performance multiplication of very large integers on a general purpose graphics processor (GPU). A combination of algorithmic and implementation optimizations results in a factor of 2.3 speed improvement over our previous work, running on an NVIDIA 295. We have also reoptimized the implementation for an NVIDIA 480, from which we obtain a factor of up to 10 speedup in comparison with a Core i7 processor of the same technology generation. This paper discusses how we adapted the algorithm to operate within the limitations of the GPU and how we dealt with other issues encountered in the implementation process, and reports performance results for multiplications ranging from 255K bits to 24.512M bits in size.
Keywords :
computer graphic equipment; coprocessors; fast Fourier transforms; general purpose computers; matrix multiplication; multiprocessing systems; optimisation; Core i7 processor; GPU; NVIDIA 295; NVIDIA 480; Strassen algorithm; general purpose graphics processor; high precision integer multiplication; optimization; speed improvement; technology generation; very large integer; Buffer storage; Graphics processing unit; Indexes; Instruction sets; Kernel; Layout; Registers;
Conference_Title :
Parallel and Distributed Processing Workshops and PhD Forum (IPDPSW), 2011 IEEE International Symposium on
Conference_Location :
Shanghai
Print_ISBN :
978-1-61284-425-1
Electronic_ISBN :
1530-2075
DOI :
10.1109/IPDPS.2011.336