Title :
Performance Evaluation of Instruction Set Extensions for Long Integer Modular Arithmetic on a SPARC V8 Processor
Author :
Grossschadl, J. ; Tillich, Stefan ; Szekely, Alexander
Author_Institution :
Inst. for Appl. Inf. Process. & Commun., Graz Univ. of Technol., Graz, Austria
Abstract :
Many important algorithms for public-key cryptography rely on computation-intensive arithmetic operations like modular exponentiation on very long integers, typically in the range of 512 and 2048 bits. Modular exponentiation is generally realized through a sequence of modular multiplications and spends the majority of execution time in simple inner loops. Speeding up these performance-critical inner loop operations with custom instructions has, therefore, a significant impact on the total execution time of public-key cryptosystems. In this paper we analyze the performance of instruction set extensions for long integer arithmetic on a SPARC V8 processor. We discuss various implementation options and optimization opportunities for both modular multiplication and exponentiation. In particular, we introduce a partial loop unrolling (PLU) technique for modular multiplication which allows to achieve large performance gains at the cost of a moderate increase in code size, while maintaining the full flexibility of a "rolled-loop" implementation. In addition, we study window methods for modular exponentiation and analyze their impact on performance and memory requirements. Our experimental results, obtained with an FPGA prototype of the LEON-2 SPARC V8 core, show that a full 1024-bit modular exponentiation can be performed in about 12.5 ldr 106 clock cycles, which is a reasonable value for embedded devices like smart cards or sensor nodes.
Keywords :
field programmable gate arrays; instruction sets; microprocessor chips; public key cryptography; FPGA prototype; SPARC V8 processor; computation-intensive arithmetic operations; instruction set extensions; integer arithmetic; integer modular arithmetic; modular multiplications; partial loop unrolling; performance evaluation; public-key cryptography; Arithmetic; Clocks; Costs; Field programmable gate arrays; Intelligent sensors; Performance analysis; Performance gain; Prototypes; Public key cryptography; Smart cards;
Conference_Titel :
Digital System Design Architectures, Methods and Tools, 2007. DSD 2007. 10th Euromicro Conference on
Conference_Location :
Lubeck
Print_ISBN :
978-0-7695-2978-3
DOI :
10.1109/DSD.2007.4341542