  • DocumentCode
    1659118
  • Title
    Active memory techniques for ccNUMA multiprocessors
  • Author
    Kim, Daehyun ; Chaudhuri, Mainak ; Heinrich, Mark
  • Author_Institution
    Comput. Syst. Lab., Cornell Univ., Ithaca, NY, USA
  • fYear
    2003
  • Abstract
    Our recent work on uniprocessor and single-node multiprocessor (SMP) active memory systems uses address remapping techniques in conjunction with extended cache coherence protocols to improve access locality in processor caches. We extend our previous work in this paper and introduce the novel concept of multi-node active memory systems. We present the design of multi-node active memory cache coherence protocols to help reduce remote memory latency and improve scalability of matrix transpose and parallel reduction on distributed shared memory (DSM) multiprocessors. We evaluate our design on seven applications through execution-driven simulation on small and medium-scale multiprocessors. On a 32-processor system, an active-memory optimized matrix transpose attains speedup from 1.53 to 2.01 while parallel reduction achieves speedup from 1.19 to 2.81 over normal parallel executions.
  • Keywords
    cache storage; delays; distributed shared memory systems; matrix algebra; parallel programming; performance evaluation; protocols; DSM multiprocessors; cache coherence protocols; ccNUMA multiprocessors; distributed shared memory; execution-driven simulation; matrix transpose; multi-node active memory systems; parallel reduction; remote memory latency; scalability; speedup; Access protocols; Computer architecture; Control systems; Delay; Hardware; Laboratories; Network interfaces; Prefetching; Scalability; Scattering;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel and Distributed Processing Symposium, 2003. Proceedings. International
  • ISSN
    1530-2075
  • Print_ISBN
    0-7695-1926-1
  • Type
    conf
  • DOI
    10.1109/IPDPS.2003.1213085
  • Filename
    1213085