  • DocumentCode
    1564184
  • Title

    Automatic synthesis of customized local memories for multicluster application accelerators

  • Author

    Kudlur, Manjunath ; Fan, Kevin ; Chu, Michael ; Mahlke, Scott

  • Author_Institution
    Adv. Comput. Archit. Lab., Michigan Univ., Ann Arbor, MI, USA
  • fYear
    2004
  • Firstpage
    304
  • Lastpage
    314
  • Abstract
    Distributed local memories, or scratchpads, have been shown to effectively reduce cost and power consumption of application-specific accelerators while maintaining performance. The design of the local memory organization must take several factors into account, including the memory bandwidth and size requirements of the program and the distribution of program data among the memories. In addition, when register structures and function units in the accelerator are clustered, the effects of intercluster communication should be taken into account. This work proposes a technique to synthesize the local memory architecture of a clustered accelerator using a phase-ordered approach. First, the dataflow graph is pre-partitioned to define a performance-centric grouping of the operations. Second, memory synthesis is performed by combining multiple data structures into a set of physical memories that minimizes cost while maintaining a performance threshold. Finally, post-partitioning is performed to determine the final assignment of operations to clusters given the memory organization. Results show that customization reduces memory cost by 2% to 59% relative to a naive scheme that uses one physical memory per program data structure. Further, pre-partitioning is shown to reduce the intercluster communication required to achieve a fixed performance.
  • Keywords
    data flow graphs; data structures; distributed memory systems; high level synthesis; memory architecture; application-specific accelerators; customized local memories; dataflow graph; distributed local memories; intercluster communication; local memory architecture; local memory organization; memory bandwidth; memory synthesis; multicluster application accelerators; program data structure; register function units; register structures; scratchpads; Acceleration; Application software; Bandwidth; Computer architecture; Costs; Data structures; Delay; Energy consumption; Laboratories; Registers
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Application-Specific Systems, Architectures and Processors, 2004. Proceedings. 15th IEEE International Conference on
  • ISSN
    2160-0511
  • Print_ISBN
    0-7695-2226-2
  • Type

    conf

  • DOI
    10.1109/ASAP.2004.1342480
  • Filename
    1342480
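
The memory-synthesis step described in the abstract (combining several program data structures into fewer physical memories without violating a performance threshold) can be illustrated with a small sketch. This is not the authors' algorithm: it substitutes plain first-fit decreasing bin packing, and everything in it is an assumption made for illustration, including the names `pack_memories` and `memory_cost`, the single-ported memory model, the use of a cycle budget `ii` per loop iteration as the performance threshold, and the fixed per-memory `overhead` in the cost function.

```python
def pack_memories(accesses, ii):
    """Group data structures into single-ported memories (illustrative only).

    accesses: dict mapping data structure name -> accesses per loop iteration.
    ii: hypothetical cycle budget per iteration; a memory is feasible when the
        accesses of everything stored in it fit within ii cycles.
    Uses first-fit decreasing bin packing as a stand-in heuristic.
    """
    memories = []  # each entry is a list of data structure names
    loads = []     # accesses per iteration already assigned to each memory
    # Visit the most heavily accessed structures first; break ties by name.
    for name in sorted(accesses, key=lambda n: (-accesses[n], n)):
        need = accesses[name]
        if need > ii:
            raise ValueError(f"{name} alone exceeds the bandwidth budget")
        for i, load in enumerate(loads):
            if load + need <= ii:
                memories[i].append(name)
                loads[i] += need
                break
        else:
            # No existing memory has bandwidth left, so open a new one.
            memories.append([name])
            loads.append(need)
    return memories


def memory_cost(memories, sizes, overhead=64):
    """Assumed cost model: fixed overhead per memory plus its total capacity."""
    return sum(overhead + sum(sizes[n] for n in mem) for mem in memories)


# Hypothetical program data structures (sizes in words).
sizes = {"coeff": 64, "input": 256, "output": 256, "table": 128}
accesses = {"coeff": 1, "input": 2, "output": 1, "table": 1}

naive = [[name] for name in sizes]        # one memory per data structure
packed = pack_memories(accesses, ii=3)    # merged under the bandwidth budget

print(packed)
print(memory_cost(naive, sizes), memory_cost(packed, sizes))
```

Under this assumed cost model, merging only saves the per-memory overhead, so the saving depends entirely on how many memories the bandwidth budget allows to be shared. The pre- and post-partitioning phases, which account for intercluster communication, are not modeled here.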