• DocumentCode
    1492571
  • Title

    A comparative analysis of cache designs for vector processing

  • Author

    Sun, Tong ; Yang, Qing

  • Author_Institution
    Dept. of Electr. & Comput. Eng., Rhode Island Univ., Kingston, RI, USA
  • Volume
    48
  • Issue
    3
  • fYear
    1999
  • fDate
    3/1/1999 12:00:00 AM
  • Firstpage
    331
  • Lastpage
    344
  • Abstract
    This paper presents an experimental study on cache memory designs for vector computers. We use an execution-driven simulator to evaluate the vector cache performance of a set of application programs from the Perfect Club and SPEC92 benchmark suites. Our simulation results uncover a few important facts that were previously unknown. First, the prime-mapped cache that we recently proposed shows great performance potential in a vector processing environment. Because of its conflict-free property, the prime-mapped cache performs significantly better than conventional cache designs for all applications considered. Second, performance results on the benchmarks indicate that data locality in vector processing does exist, although the effects of line size, associativity, replacement algorithm, and prefetching scheme on cache performance are very different from what has been commonly believed. A medium-size vector cache (e.g., 128 Kbytes) eliminates the necessity for a large number of interleaved memory banks in vector computers. Our experiments show that a vector computer that has a medium-size prime-mapped cache with a small cache line size and a limited amount of prefetching provides significant speedup over conventional vector computers without a cache. The performance results reported in this paper can also provide guidance to general-purpose computer designers seeking to enhance cache performance for numerical applications.
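    The conflict-free property mentioned in the abstract can be illustrated with a small sketch. The abstract does not describe the mapping itself, so the code below assumes that "prime-mapped" means the cache line index is computed modulo a prime number of lines rather than a power of two. The line counts (8192 and 8191), the function names, and the stride values are hypothetical choices for illustration, not figures from the paper.

```python
def distinct_lines(num_lines, stride, count):
    """Count the distinct cache lines touched by `count` accesses at a fixed stride.

    Addresses are expressed in units of cache lines, and the line index is
    assumed to be `address % num_lines` (an illustrative model only).
    """
    return len({(i * stride) % num_lines for i in range(count)})


# Hypothetical cache sizes: a power-of-two line count and the nearby
# Mersenne prime 2**13 - 1.
POW2_LINES = 8192
PRIME_LINES = 8191


def compare(stride, count=4096):
    """Return (lines used with power-of-two mapping, lines used with prime mapping)."""
    return (
        distinct_lines(POW2_LINES, stride, count),
        distinct_lines(PRIME_LINES, stride, count),
    )


if __name__ == "__main__":
    for stride in (1, 512, 8191):
        print(stride, compare(stride))
```

    Under this model, a stride of 512 lines folds 4096 accesses onto only 16 lines in the power-of-two cache, while the prime line count spreads them over 4096 distinct lines. A stride equal to a multiple of the prime itself still maps every access to the same line, so the property holds for strides that are not multiples of the line count.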
  • Keywords
    cache storage; digital simulation; memory architecture; performance evaluation; vector processor systems; Perfect Club; SPEC92 benchmark suites; application programs; associativity; benchmarks; cache designs; comparative analysis; conflict-free property; execution-driven simulator; prefetching scheme; prime-mapped cache; replacement algorithm; simulation results; vector processing; Application software; Assembly; Bandwidth; Cache memory; Computational modeling; Computer Society; Prefetching; Process design; Registers; Sun;
  • fLanguage
    English
  • Journal_Title
    Computers, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9340
  • Type

    jour

  • DOI
    10.1109/12.754999
  • Filename
    754999