• DocumentCode
    55653
  • Title

    MAHA: An Energy-Efficient Malleable Hardware Accelerator for Data-Intensive Applications

  • Author

    Paul, Somnath ; Krishna, Aswin ; Wenchao Qian ; Karam, Robert ; Bhunia, Swarup

  • Author_Institution
    Intel Corp., Santa Clara, CA, USA
  • Volume
    23
  • Issue
    6
  • fYear
    2015
  • fDate
    June 2015
  • Firstpage
    1005
  • Lastpage
    1016
  • Abstract
    For data-intensive applications, energy expended in on-chip computation constitutes only a small fraction of the total energy consumption. The primary contribution comes from transporting data between off-chip memory and on-chip computing elements, a limitation referred to as the Von-Neumann bottleneck. In such a scenario, improving the compute energy through parallel processing or on-chip hardware acceleration brings only minor improvements to the total energy requirement of the system. We note that an effective solution to mitigate the Von-Neumann bottleneck is to develop a framework that enables computing in off-chip nonvolatile memory arrays, where the data reside permanently. In this paper, we present a malleable hardware (MAHA) reconfigurable framework that modifies a nonvolatile CMOS-compatible flash memory array for on-demand reconfigurable computing. MAHA is a spatio-temporal mixed-granular hardware reconfigurable framework, which utilizes the memory for storage as well as lookup table-based computation (hence malleable) and uses a low-overhead hierarchical interconnect fabric for communication between processing elements. A detailed design of the malleable hardware together with a comprehensive application mapping flow is presented. Design overheads carefully estimated at the 45-nm technology node indicate that for a set of common kernels, MAHA achieves a 91X improvement in energy efficiency over a software-only solution with negligible impact on memory performance in normal mode. The proposed design changes incur only 6% memory area overhead.
  • Keywords
    energy conservation; flash memories; power aware computing; MAHA reconfigurable framework; Von-Neumann bottleneck; complementary metal oxide semiconductors; data-intensive applications; energy consumption; energy-efficient malleable hardware accelerator; hierarchical interconnect fabric; lookup table-based computation; nonvolatile CMOS-compatible flash memory array; off-chip memory element; off-chip nonvolatile memory arrays; on-chip computation; on-chip computing element; on-chip hardware acceleration; parallel processing; processing elements; spatio-temporal mixed-granular hardware reconfigurable framework; Arrays; Ash; Hardware; Nonvolatile memory; System-on-chip; Table lookup; Energy efficiency; NAND flash; in-memory computing
  • fLanguage
    English
  • Journal_Title
    Very Large Scale Integration (VLSI) Systems, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1063-8210
  • Type

    jour

  • DOI
    10.1109/TVLSI.2014.2332538
  • Filename
    6891374