• DocumentCode
    3759159
  • Title

    Integrating 3D Resistive Memory Cache into GPGPU for Energy-Efficient Data Processing

  • Author

    Jie Zhang;David Donofrio;John Shalf;Myoungsoo Jung

  • fYear
    2015
  • Firstpage
    496
  • Lastpage
    497
  • Abstract
    General-purpose graphics processing units (GPGPUs) have become a promising solution for processing massive data by taking advantage of multithreading. Thanks to thread-level parallelism, GPU-accelerated applications improve overall system performance by up to 40 times compared with a CPU-only architecture. However, data-intensive GPU applications often generate a large number of irregular data accesses, which results in cache thrashing and contention problems. Cache thrashing in turn can introduce a large number of off-chip memory accesses, which not only waste tremendous energy moving data between the on-chip cache and off-chip global memory, but also significantly limit system performance because many load/store instructions stall. In this work, we redesign the shared last-level cache (LLC) of GPU devices by introducing non-volatile memory (NVM), which can address the cache thrashing issues with low energy consumption. Specifically, we investigate two architectural approaches: one employs a 2D planar resistive random-access memory (RRAM) as our baseline NVM-cache, and the other employs a 3D-stacked RRAM technology. Our baseline NVM-cache replaces the SRAM-based L2 cache with RRAM of similar area; a memory die consists of eight subarrays, each of which is a memristor island constructed as a 512x512 matrix. Since an SRAM cell occupies around 125 F² (while an RRAM cell occupies around 4 F²), the NVM-cache can offer around 30x more storage capacity than the SRAM-based cache. To make our baseline NVM-cache denser, we propose a 3D-stacked NVM-cache, which stacks four memory layers, each with a single pre-decode logic.
  • Keywords
    "Nonvolatile memory","Graphics processing units","Random access memory","Computer architecture","Integrated circuit modeling","Energy efficiency","Instruction sets"
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Architecture and Compilation (PACT), 2015 International Conference on
  • ISSN
    1089-795X
  • Type

    conf

  • DOI
    10.1109/PACT.2015.60
  • Filename
    7429338
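
The capacity figures quoted in the abstract can be checked with a little arithmetic. The sketch below is only a rough illustration: the constant and function names are hypothetical, the cell count assumes each of the eight subarrays is one full 512x512 matrix, and the stacked figure is an idealized upper bound that ignores any area overhead from stacking or peripheral logic.

```python
# Back-of-the-envelope check of the density figures quoted in the abstract.
# All names are illustrative; none of them come from the paper.

SRAM_CELL_AREA_F2 = 125   # approximate SRAM cell area in F^2 (from the abstract)
RRAM_CELL_AREA_F2 = 4     # approximate RRAM cell area in F^2 (from the abstract)
SUBARRAYS_PER_DIE = 8     # subarrays per memory die (from the abstract)
MATRIX_DIM = 512          # 512x512 memristor matrix (from the abstract)
STACKED_LAYERS = 4        # memory layers in the 3D-stacked design (from the abstract)


def density_ratio(sram_area_f2: float, rram_area_f2: float) -> float:
    """Number of RRAM cells that fit in the area of one SRAM cell."""
    return sram_area_f2 / rram_area_f2


def cells_per_die(subarrays: int, dim: int) -> int:
    """Cell count, ASSUMING every subarray is one full dim x dim matrix."""
    return subarrays * dim * dim


planar_ratio = density_ratio(SRAM_CELL_AREA_F2, RRAM_CELL_AREA_F2)

# ASSUMPTION: layers stack with no extra area overhead, so this is an
# idealized upper bound rather than a figure reported in the abstract.
stacked_ratio_upper_bound = planar_ratio * STACKED_LAYERS

print(planar_ratio)                                   # 31.25
print(stacked_ratio_upper_bound)                      # 125.0
print(cells_per_die(SUBARRAYS_PER_DIE, MATRIX_DIM))   # 2097152
```

The planar ratio of 31.25 is consistent with the "around 30x" capacity claim; the stacked value should be read only as a ceiling under the stated assumptions.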