Title :
Evaluating instruction set extensions for fast arithmetic on binary finite fields
Author :
Fiskiran, A. Murat ; Lee, Ruby B.
Author_Institution :
Dept. of Electr. Eng., Princeton Univ., NJ, USA
Abstract :
Binary finite fields GF(2n) are very commonly used in cryptography, particularly in public-key algorithms such as elliptic curve cryptography (ECC). On word-oriented programmable processors, field elements are generally represented as polynomials with coefficients from [0, 1]. Key arithmetic operations on these polynomials, such as squaring and multiplication, are not supported by integer-oriented processor architectures. Instead, these are implemented in software, causing a very large fraction of the cryptography execution time to be dominated by a few elementary operations. For example, more than 90% of the execution time of 163-bit ECC may be consumed by two simple field operations: squaring and multiplication. A few processor architectures have been proposed recently that include instructions for binary field arithmetic. However, these have only considered processors with small wordsizes and in-order, single-issue execution. The first contribution of this paper is to validate these new arithmetic instructions for processors with wider wordsizes and multiple-issue (e.g. superscalar) execution. We also consider the effects of varying the number of functional units and load/store pipes. We demonstrate that the combination of microarchitecture and new instructions provides speedups up to 22.4x for ECC point multiplication. Second, we show that if a bit-level reverse instruction is included in the instruction set, the size of the multiplier can be reduced by half without significant performance degradation. Third, we compare the benefits of superscalar execution with wordsize scaling. The latter has been used in recent processor architectures such as PLX and PAX as a new way to extract parallelism. We show that 2x wordsize scaling provides 70% better performance than 2-way superscalar execution. Finally, we suggest a low-cost method, which we call multi-word result execution, to realize some of the benefits of wordsize scaling in existing processors with fixed wordsizes.
Keywords :
digital arithmetic; instruction sets; parallel architectures; public key cryptography; arithmetic operations; binary field arithmetic; binary finite fields; bit-level reverse instruction; elliptic curve cryptography; instruction set extensions; integer-oriented processor architectures; load-store pipes; multiword result execution; public-key cryptography; superscalar execution; word-oriented programmable processors; wordsize scaling; Acceleration; Arithmetic; Computer architecture; Degradation; Elliptic curve cryptography; Embedded system; Galois fields; Microarchitecture; Polynomials; Public key cryptography;
Conference_Titel :
Application-Specific Systems, Architectures and Processors, 2004. Proceedings. 15th IEEE International Conference on
Print_ISBN :
0-7695-2226-2
DOI :
10.1109/ASAP.2004.1342464