Title :
Low-power vectorial VLIW architecture for maximum parallelism exploitation of dynamic programming algorithms
Author :
Cruz, Miguel ; Tomas, Pedro ; Roma, Nuno
Author_Institution :
Inst. Super. Tecnico, Univ. de Lisboa, Lisbon, Portugal
Abstract :
Dynamic programming algorithms are widely used in many areas to divide a complex problem into several simpler sub-problems with many dependencies. Typical approaches exploit data-level parallelism by relying on specialized vector instructions. However, the fully parallelizable scheme is often incompatible with the memory organization of general-purpose processors, leading to suboptimal parallelism and worse performance. The proposed architecture exploits both data- and instruction-level parallelism by statically scheduling a bundle of instructions to several different vector execution units. This achieves better performance than vector-only architectures, with lower hardware requirements and thus lower power consumption. Performance and energy-efficiency metrics were used to benchmark the proposed architecture against a dual-issue, out-of-order ARM Cortex-A9 and a dedicated ASIP architecture. In a fair comparison where all processors compute 16 dynamic programming cells in parallel, results show that the proposed architecture achieves 3.24× and 2.35× better performance-energy efficiency than the ARM Cortex-A9 and the dedicated ASIP, respectively, and a performance improvement of 2.54× and 5.01× over the ARM Cortex-A9 and the dedicated ASIP, respectively.
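As a rough illustration of the dependency pattern the abstract alludes to (not taken from the paper), the sketch below fills an edit-distance table one anti-diagonal at a time. Levenshtein distance is assumed here purely as a familiar example of a dynamic programming algorithm, and the function name is hypothetical. Each cell depends only on its left, upper, and upper-left neighbours, so all cells on the same anti-diagonal are mutually independent and could, in principle, be computed by parallel vector lanes.

```python
def edit_distance_wavefront(a: str, b: str) -> int:
    """Levenshtein distance, filled anti-diagonal by anti-diagonal.

    Illustrative only: the algorithm choice and this function name are
    assumptions, not the method described in the paper.
    """
    n, m = len(a), len(b)
    # table[i][j] holds the distance between a[:i] and b[:j].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        table[i][0] = i  # delete i characters
    for j in range(m + 1):
        table[0][j] = j  # insert j characters

    # Interior cells with i + j == d form one anti-diagonal.
    for d in range(2, n + m + 1):
        i_lo = max(1, d - m)
        i_hi = min(n, d - 1)
        # Iterations of this inner loop do not depend on each other:
        # they read only cells from anti-diagonals d - 1 and d - 2.
        for i in range(i_lo, i_hi + 1):
            j = d - i
            cost = 0 if a[i - 1] == b[j - 1] else 1
            table[i][j] = min(
                table[i - 1][j] + 1,         # deletion
                table[i][j - 1] + 1,         # insertion
                table[i - 1][j - 1] + cost,  # match or substitution
            )
    return table[n][m]
```

In a row-major table, the cells of one anti-diagonal are not contiguous in memory, which is one plausible reading of the mismatch with general-purpose memory organization that the abstract mentions.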
Keywords :
dynamic programming; mathematics computing; parallel processing; data level parallelism; dynamic programming cells; energy efficiency metrics; fully-parallelizable scheme; general purpose processors; instruction level parallelism; maximum parallelism exploitation; memory organization; performance improvement; performance-energy efficiency; power consumption; specialized vector instructions; vector execution units; vectorial VLIW architecture; Memory management; Parallel processing; Random access memory; Registers; VLIW; Vectors; Data Level Parallelism; Dynamic Programming; Instruction Level Parallelism; Multiple Instruction Multiple Data Architecture; VLIW; low-power
Conference_Title :
High Performance Computing & Simulation (HPCS), 2014 International Conference on
Conference_Location :
Bologna
Print_ISBN :
978-1-4799-5312-7
DOI :
10.1109/HPCSim.2014.6903673