DocumentCode :
2534948
Title :
Exploiting vector parallelism in software pipelined loops
Author :
Larsen, Samuel ; Rabbah, Rodric ; Amarasinghe, Saman
Author_Institution :
Comput. Sci. & Artificial Intelligence Lab., Massachusetts Inst. of Technol., Cambridge, MA
fYear :
2005
fDate :
16-16 Nov. 2005
Lastpage :
129
Abstract :
An emerging trend in processor design is the addition of short vector instructions to general-purpose and embedded ISAs. Frequently, these extensions are employed using traditional vectorization technology first developed for supercomputers. In contrast, scalar hardware is typically targeted using ILP techniques such as software pipelining. This paper presents a novel approach for exploiting vector parallelism in software pipelined loops. The proposed methodology (i) lowers the burden on the scalar resources by offloading computation to the vector functional units, (ii) explicitly manages communication of operands between scalar and vector instructions, (iii) naturally handles misaligned vector memory operations, and (iv) partially (or fully) inhibits the optimization when vectorization will decrease performance. Our approach results in better resource utilization and allows for software pipelining with shorter initiation intervals. The proposed optimization is applied in the compiler backend, where vectorization decisions are more amenable to cost analysis. This is unique in that traditional vectorization optimizations are usually carried out at the statement level. Although our technique most naturally complements statically scheduled machines, we believe it is applicable to any architecture that tightly integrates support for instruction and data level parallelism. We evaluate our methodology using nine SPEC FP benchmarks. In comparison to software pipelining, our approach achieves a maximum speedup of 1.38×, with an average of 1.11×.
Keywords :
parallel processing; pipeline processing; program control structures; vector processor systems; SPEC FP benchmarks; compiler backend; processor design; resource utilization; scalar hardware; software pipelined loops; vector memory operations; vector parallelism; Communication system operations and management; Computer aided instruction; Hardware; Instruction sets; Memory management; Parallel processing; Pipeline processing; Process design; Resource management; Supercomputers;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Microarchitecture, 2005. MICRO-38. Proceedings. 38th Annual IEEE/ACM International Symposium on
Conference_Location :
Barcelona
Print_ISBN :
0-7695-2440-0
Type :
conf
DOI :
10.1109/MICRO.2005.20
Filename :
1540953