• DocumentCode
    1829512
  • Title
    A performance study of software and hardware data prefetching schemes
  • Author
    Chen, Tien-Fu ; Baer, Jean-Loup
  • Author_Institution
    Dept. of Comput. Sci. & Inf. Eng., Nat. Chung Cheng Univ., Chiayi, Taiwan
  • fYear
    1994
  • fDate
    18-21 Apr 1994
  • Firstpage
    223
  • Lastpage
    232
  • Abstract
    Prefetching, i.e., exploiting the overlap of processor computations with data accesses, is one of several approaches for tolerating memory latencies. Prefetching can be hardware-based, software-directed, or a combination of both. Hardware-based prefetching, which requires a support unit connected to the cache, can handle prefetches dynamically at run-time without compiler intervention. Software-directed approaches rely on compiler technology to insert explicit prefetch instructions. Mowry et al.'s software scheme (1991, 1992) and the authors' hardware approach (1991) are two representative schemes. In this paper, the authors evaluate approximations to these two schemes in the context of a shared-memory multiprocessor environment. Their qualitative comparisons indicate that both schemes are able to reduce cache misses in the domain of linear array references. When complex data access patterns are considered, the software approach has compile-time information to perform sophisticated prefetching, whereas the hardware scheme has the advantage of manipulating dynamic information. The performance results from an instruction-level simulation of four benchmarks confirm these observations. Simulations show that the hardware scheme introduces more memory traffic into the network and that the software scheme introduces a non-negligible instruction execution overhead. An approach combining software and hardware schemes is proposed; it shows promise in reducing the memory latency with the least overhead.
  • Keywords
    memory architecture; performance evaluation; shared memory systems; storage management; benchmarks; data prefetching schemes; hardware-based; memory latencies; memory latency; overhead; performance study; shared-memory multiprocessor; software-directed; Computer science; Data engineering; Delay; Hardware; Manipulator dynamics; Prefetching; Runtime; Software performance; Telecommunication traffic; Traffic control
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Architecture, 1994., Proceedings the 21st Annual International Symposium on
  • Conference_Location
    Chicago, IL
  • Print_ISBN
    0-8186-5510-0
  • Type
    conf
  • DOI
    10.1109/ISCA.1994.288147
  • Filename
    288147