• DocumentCode
    3198103
  • Title
    A memory optimization technique for software-managed scratchpad memory in GPUs
  • Author
    Moazeni, Maryam; Bui, Alex; Sarrafzadeh, Majid
  • Author_Institution
    Comput. Sci. Dept., Univ. of California, Los Angeles, CA, USA
  • fYear
    2009
  • fDate
    27-28 July 2009
  • Firstpage
    43
  • Lastpage
    49
  • Abstract
    With the appearance of massively parallel, inexpensive platforms such as the G80 generation of NVIDIA GPUs, more real-life applications will be designed for or ported to these platforms. This requires structured transformation methods that remove application bottlenecks on these platforms. Balancing the use of on-chip resources, which are used to improve application performance, is often non-intuitive, and some applications will run into resource limits. In this paper, we present a memory optimization technique for the software-managed scratchpad memory in the G80 architecture that alleviates the constraints of using the scratchpad memory. The proposed scheme minimizes memory usage by discovering opportunities for memory reuse, with the goal of maximizing application performance. Our solution is based on graph coloring. We evaluated the scheme with a set of experiments on an image processing benchmark suite from the medical imaging domain, using an NVIDIA Quadro FX 5600 and CUDA. Implementations based on the proposed scheme showed up to a 37% decrease in execution time compared with their naive GPU implementations.
  • Keywords
    coprocessors; graph colouring; storage management; CUDA; G80 architecture; NVIDIA GPU; NVIDIA Quadro FX 5600; graph coloring; graphics processing unit; image processing; medical imaging domain; memory optimization; memory reuse; memory space; on-chip resources; software-managed scratchpad memory; Application software; Bandwidth; Central Processing Unit; Computer architecture; Computer science; Delay; Memory management; Parallel processing; Runtime; Yarn; GPU Computing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Application Specific Processors, 2009. SASP '09. IEEE 7th Symposium on
  • Conference_Location
    San Francisco, CA
  • Print_ISBN
    978-1-4244-4939-2
  • Electronic_ISBN
    978-1-4244-4938-5
  • Type
    conf
  • DOI
    10.1109/SASP.2009.5226334
  • Filename
    5226334
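
The abstract describes reducing scratchpad usage by finding memory-reuse opportunities through graph coloring. The sketch below illustrates that general idea only and is not the authors' algorithm: buffers whose live ranges overlap interfere, a greedy coloring assigns non-interfering buffers to the same slot, and each slot is sized to its largest buffer. The buffer names, sizes, and phase numbers are hypothetical.

```python
from itertools import combinations


def build_interference(lifetimes):
    """Build an interference graph from live ranges.

    lifetimes maps a buffer name to (first_use, last_use), both inclusive.
    Two buffers interfere when their live ranges overlap.
    """
    graph = {name: set() for name in lifetimes}
    for a, b in combinations(lifetimes, 2):
        (a0, a1), (b0, b1) = lifetimes[a], lifetimes[b]
        if a0 <= b1 and b0 <= a1:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def greedy_color(graph, order):
    """Give each node the smallest color not used by a colored neighbor."""
    colors = {}
    for node in order:
        taken = {colors[n] for n in graph[node] if n in colors}
        color = 0
        while color in taken:
            color += 1
        colors[node] = color
    return colors


def plan_slots(lifetimes, sizes):
    """Map buffers to shared slots; a slot is as large as its biggest buffer."""
    graph = build_interference(lifetimes)
    # Heuristic (an assumption, not from the paper): place large buffers first.
    order = sorted(lifetimes, key=lambda n: (-sizes[n], n))
    colors = greedy_color(graph, order)
    slot_sizes = {}
    for name, color in colors.items():
        slot_sizes[color] = max(slot_sizes.get(color, 0), sizes[name])
    return graph, colors, slot_sizes


# Hypothetical kernel with four scratchpad buffers (sizes in bytes).
lifetimes = {"tile_in": (0, 1), "grad": (1, 2), "hist": (2, 3), "tile_out": (3, 4)}
sizes = {"tile_in": 4096, "grad": 4096, "hist": 1024, "tile_out": 4096}

graph, colors, slot_sizes = plan_slots(lifetimes, sizes)
naive_bytes = sum(sizes.values())
shared_bytes = sum(slot_sizes.values())
print(f"naive: {naive_bytes} bytes, with reuse: {shared_bytes} bytes")
```

In this toy case two slots serve four buffers. Greedy coloring is a heuristic and does not guarantee a minimum number of slots for arbitrary interference graphs.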