• DocumentCode
    624370
  • Title

    A highly efficient, thread-safe software cache implementation for tightly-coupled multicore clusters

  • Author

    Pinto, Claudio ; Benini, Luca

  • Author_Institution
    DEI Dept., Univ. of Bologna, Bologna, Italy
  • fYear
    2013
  • fDate
    5-7 June 2013
  • Firstpage
    281
  • Lastpage
    288
  • Abstract
    A widely adopted design paradigm for many-core accelerators features processing elements grouped in clusters. For reasons of area, power and design simplicity, processors in the same cluster are often not equipped with data caches but instead share a tightly coupled data memory (TCDM). Although a TCDM is more energy- and area-efficient than a cache, it requires a higher programming effort because memory must be explicitly managed with DMA-based L3-to-TCDM copies. In this context, software caches can be used to automatically transfer data between the local TCDM and the external memory, simplifying the programmer's task. In this paper we present an implementation of a software cache for the STMicroelectronics STHORM many-core accelerator, which features an L1 TCDM shared by 16 processors in a cluster. Our main contribution is the design of a fast and thread-safe cache allowing parallel access from different processing elements inside the same cluster. We evaluate our implementation with micro-benchmarks as well as a real-world application from the computer vision domain. Results show that a software cache provides major performance improvements with respect to L3 allocation of large data structures, even when it is aggressively shared among many parallel threads.
  • Keywords
    cache storage; data structures; electronic data interchange; multi-threading; multiprocessing systems; DMA-based L3; L1 TCDM; L3 allocation; STMicroelectronics STHORM many-core accelerator; TCDM copy; computer vision domain; data-caches; design paradigm; external memory; local TCDM; many-core accelerators; micro-benchmarks; parallel access; parallel threads; performance improvements; processing elements; programming effort; software caches; thread-safe cache; thread-safe software cache implementation; tightly coupled data memory; tightly-coupled multicore clusters; transfer data; Data structures; Hardware; Program processors; Radiation detectors; Synchronization
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Application-Specific Systems, Architectures and Processors (ASAP), 2013 IEEE 24th International Conference on
  • Conference_Location
    Washington, DC
  • ISSN
    2160-0511
  • Print_ISBN
    978-1-4799-0494-5
  • Type

    conf

  • DOI
    10.1109/ASAP.2013.6567591
  • Filename
    6567591