• DocumentCode
    2978692
  • Title
    A three-parameter fast Givens QR algorithm for superscalar processors
  • Author
    Carrig, James J., Jr.; Meyer, Gerard G. L.
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Johns Hopkins Univ., Baltimore, MD, USA
  • Volume
    2
  • fYear
    1996
  • fDate
    12-16 Aug 1996
  • Firstpage
    11
  • Abstract
    We present a three-parameter fast Givens QR algorithm that exploits parallelism to improve performance on superscalar processors. We provide a selection of parameter values for which the new algorithm reduces to the standard algorithm, but show that non-standard values minimize the number of cache misses, memory references, and pipeline stalls. Using a tractable model of a superscalar machine architecture, we derive rules for estimating the optimal combination of parameter values. Applying these rules, we observe a speedup over the standard algorithm of 2.4 on the Intel Pentium Pro system, 2.0 on a single thin POWER2 processor of the IBM SP2, 1.6 on a single wide POWER2 processor of the IBM SP2, and 4.2 on a single R8000 processor of the SGI POWER Challenge XL.
  • Keywords
    Kalman filters; eigenvalues and eigenfunctions; least squares approximations; parallel processing; performance evaluation; signal processing; IBM SP2; Intel Pentium Pro system; POWER2 processor; SGI POWER Challenge XL; cache misses; fast Givens QR algorithm; memory references; parallelism; parameter values; performance improvement; pipeline stalls; single R8000 processor; superscalar machine architecture; superscalar processors; tractable model; Algorithm design and analysis; High performance computing; Laboratories; Least squares methods; Libraries; Matrix decomposition; Parallel processing; Pipelines; Power system modeling; Signal processing algorithms
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Processing, 1996. Vol.3. Software., Proceedings of the 1996 International Conference on
  • Conference_Location
    Ithaca, NY
  • ISSN
    0190-3918
  • Print_ISBN
    0-8186-7623-X
  • Type
    conf
  • DOI
    10.1109/ICPP.1996.537375
  • Filename
    537375