Title :
A three-parameter fast Givens QR algorithm for superscalar processors
Author :
Carrig, James J., Jr.; Meyer, Gerard G. L.
Author_Institution :
Dept. of Electr. & Comput. Eng., Johns Hopkins Univ., Baltimore, MD, USA
Abstract :
We present a three-parameter fast Givens QR algorithm that exploits parallelism to improve performance on superscalar processors. We provide a selection of parameter values for which the new algorithm reduces to the standard algorithm, but show that non-standard values minimize the number of cache misses, memory references, and pipeline stalls. Using a tractable model of a superscalar machine architecture, we derive rules for estimating the optimal combination of parameter values. Applying these rules, we observe speedups over the standard algorithm of 2.4 on the Intel Pentium Pro system, 2.0 on a single thin POWER2 processor of the IBM SP2, 1.6 on a single wide POWER2 processor of the IBM SP2, and 4.2 on a single R8000 processor of the SGI POWER Challenge XL.
Keywords :
Kalman filters; eigenvalues and eigenfunctions; least squares approximations; parallel processing; performance evaluation; signal processing; IBM SP2; Intel Pentium Pro system; POWER2 processor; SGI POWER Challenge XL; cache misses; fast Givens QR algorithm; memory references; parallelism; parameter values; performance improvement; pipeline stalls; single R8000 processor; superscalar machine architecture; superscalar processors; tractable model; Algorithm design and analysis; High performance computing; Laboratories; Least squares methods; Libraries; Matrix decomposition; Parallel processing; Pipelines; Power system modeling; Signal processing algorithms
Conference_Title :
Proceedings of the 1996 International Conference on Parallel Processing, Vol. 3: Software
Conference_Location :
Ithaca, NY
Print_ISBN :
0-8186-7623-X
DOI :
10.1109/ICPP.1996.537375