• DocumentCode
    1705082
  • Title

    Improving the data cache performance of multiprocessor operating systems

  • Author

    Xia, Chun ; Torrellas, Josep

  • Author_Institution
    Center for Supercomput. Res. & Dev., Illinois Univ., Urbana, IL, USA
  • fYear
    1996
  • Firstpage
    85
  • Lastpage
    94
  • Abstract
    Bus-based shared-memory multiprocessors with coherent caches have recently become very popular. To achieve high performance, these systems rely on increasingly sophisticated cache hierarchies. However, while these machines often run loads with substantial operating system activity, performance measurements have consistently indicated that the operating system uses the data cache hierarchy poorly. In this paper, we address the issue of how to eliminate most of the data cache misses in a multiprocessor operating system while still using off-the-shelf processors. We use a performance monitor to examine traces of a 4-processor machine running four system-intensive loads under UNIX. Based on our observations, we propose hardware and software support that targets block operations, coherence activity, and cache conflicts. For block operations, simple cache bypassing or prefetching schemes are undesirable. Instead, it is best to use a DMA-like scheme that pipelines the data transfer in the bus without involving the processor. Coherence misses are handled with data privatization and relocation, and the use of updates for a small core of shared variables. Finally, the remaining miss hot spots are handled with data prefetching. Overall, our simulations show that all these optimizations combined eliminate or hide 75% of the operating system data misses in 32-Kbyte primary caches. Furthermore, they speed up the operating system by 19%.
  • Keywords
    cache storage; performance evaluation; shared memory systems; 4-processor machine; DMA-like scheme; coherence activity; coherent caches; data cache performance; data transfer; multiprocessor operating systems; performance measurements; performance monitor; shared-memory multiprocessors; Computer applications; Condition monitoring; Contracts; Hardware; Operating systems; Pipelines; Prefetching; Privatization; Research and development; Stress;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High-Performance Computer Architecture, 1996. Proceedings., Second International Symposium on
  • Conference_Location
    San Jose, CA
  • Print_ISBN
    0-8186-7237-4
  • Type

    conf

  • DOI
    10.1109/HPCA.1996.501176
  • Filename
    501176