Title :
Advanced SIMD: Extending the reach of contemporary SIMD architectures
Author :
Boettcher, M. ; Al-Hashimi, B.M. ; Eyole, Mbou ; Gabrielli, G. ; Reid, Alastair
Author_Institution :
Univ. of Southampton, Southampton, UK
Abstract :
SIMD extensions have gained widespread acceptance in modern microprocessors as a way to exploit data-level parallelism in general-purpose cores. Popular SIMD architectures (e.g., Intel SSE/AVX) have evolved by adding support for wider registers and datapaths, and advanced features like indexed memory accesses, per-lane predication and inter-lane instructions, at the cost of additional silicon area and design complexity. This paper evaluates the performance impact of such advanced features on a set of workloads considered hard to vectorize for traditional SIMD architectures. Their sensitivity to the most relevant design parameters (e.g., register/datapath width and L1 data cache configuration) is quantified and discussed. We developed an ARMv7 NEON-based ISA extension (ARGON), augmented a cycle-accurate simulation framework for it, and derived a set of benchmarks from the Berkeley dwarfs. Our analyses demonstrate how ARGON can, depending on the structure of an algorithm, achieve speedups of 1.5x to 16x.
Keywords :
instruction sets; microcontrollers; multiprocessing systems; parallel architectures; ARGON; ARMv7 NEON based ISA extension; Berkeley dwarfs; advanced SIMD; contemporary SIMD architectures; cycle accurate simulation framework; data-level parallelism; design complexity; general-purpose cores; indexed memory accesses; inter-lane instructions; microprocessors; per-lane predication; silicon area; single instruction multiple data instruction set extensions; Argon; Benchmark testing; Computer architecture; Program processors; Registers; Timing; Vectors
Conference_Title :
Design, Automation and Test in Europe Conference and Exhibition (DATE), 2014
Conference_Location :
Dresden
DOI :
10.7873/DATE.2014.037