Title :
Effective usage of vector registers in decoupled vector architectures
Author :
Villa, Luis ; Espasa, Roger ; Valero, Mateo
Author_Institution :
Dept. d'Arquitectura de Comput., Univ. Politecnica de Catalunya, Barcelona, Spain
Abstract :
The paper presents a study of the impact of reducing the vector register size in a decoupled vector architecture. In traditional in-order vector architectures, long vector registers have typically been the norm. The authors present data showing that, even for highly vectorizable codes, only a small fraction of the elements of a long vector register are actually used. They also show that reducing the register size in a traditional vector architecture, in an attempt to reduce hardware cost and maximize register utilization, results in severe performance degradation. However, when they combine the decoupling technique with the vector register reduction, the resulting architecture tolerates the register size cuts very well. They simulate a selection of Perfect Club and Specfp92 programs using a trace-driven approach and compare the execution time of a conventional vector architecture with that of a decoupled vector architecture using different register sizes. Halving the register size and using decoupling provides speedups between 1.04 and 1.49 over a traditional in-order vector machine. Even when the register length is reduced to 1/4 of the original size (and in some cases, to 1/8), the performance of the decoupled machine is better than that of a conventional vector model. Moreover, they observe that the resulting decoupled machine with short registers tolerates long memory latencies very well.
Keywords :
computational complexity; performance evaluation; vector processor systems; virtual machines; Perfect Club programs; Specfp92 programs; decoupled vector architectures; decoupling technique; execution time; hardware cost reduction; in-order vector architectures; long memory latencies; maximized register utilization; simulation; trace driven approach; vector register size reduction; vector registers; Computer architecture; Costs; Degradation; Delay; Engines; Hardware; Out of order; Registers; Space technology; Vector processors;
Conference_Title :
Parallel and Distributed Processing, 1998. PDP '98. Proceedings of the Sixth Euromicro Workshop on
Conference_Location :
Madrid
Print_ISBN :
0-8186-8332-5
DOI :
10.1109/EMPDP.1998.647238