• DocumentCode
    2947517
  • Title
    Dynamically Specialized Datapaths for energy efficient computing
  • Author
    Govindaraju, Venkatraman ; Ho, Chen-Han ; Sankaralingam, Karthikeyan
  • Author_Institution
    Vertical Res. Group, Univ. of Wisconsin-Madison, Madison, WI, USA
  • fYear
    2011
  • fDate
    12-16 Feb. 2011
  • Firstpage
    503
  • Lastpage
    514
  • Abstract
    Due to limits in technology scaling, energy efficiency of logic devices is decreasing in successive generations. To provide continued performance improvements without increasing power, regardless of the sequential or parallel nature of the application, microarchitectural energy efficiency must improve. We propose Dynamically Specialized Datapaths to improve the energy efficiency of general purpose programmable processors. The key insights of this work are the following. First, applications execute in phases and these phases can be determined by creating a path-tree of basic-blocks rooted at the inner-most loop. Second, specialized datapaths corresponding to these path-trees, which we refer to as DySER blocks, can be constructed by interconnecting a set of heterogeneous computation units with a circuit-switched network. These blocks can be easily integrated with a processor pipeline. A synthesized RTL implementation using an industry 55nm technology library shows that a 64-functional-unit DySER block occupies approximately the same area as a 64 KB single-ported SRAM and can execute at 2 GHz. We extend the GCC compiler to identify path-trees and map code to DySER, and evaluate the PARSEC, SPEC and Parboil benchmark suites. Our results show that in most cases two DySER blocks can achieve the same performance (within 5%) as having a specialized hardware module for each path-tree. A 64-FU DySER block can cover 12% to 100% of the dynamically executed instruction stream. When integrated with a dual-issue out-of-order processor, two DySER blocks provide a geometric mean speedup of 2.1X (1.15X to 10X) and a geometric mean energy reduction of 40% (up to 70%), or a 60% energy reduction if no performance improvement is required.
  • Keywords
    SRAM chips; logic circuits; microprocessor chips; pipeline processing; power aware computing; programmable logic devices; 64-functional-unit DySER block; GCC compiler; PARSEC; RTL; SPEC; circuit-switched network; code-mapping; dynamic specialized datapath; energy efficient computing; general purpose programmable processors; geometric mean energy reduction; logic devices; microarchitectural energy efficiency; path-trees; pipeline processing; single-ported SRAM; Arrays; Benchmark testing; Decoding; Hardware; Pipelines; Program processors; Registers;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computer Architecture (HPCA), 2011 IEEE 17th International Symposium on
  • Conference_Location
    San Antonio, TX
  • ISSN
    1530-0897
  • Print_ISBN
    978-1-4244-9432-3
  • Type
    conf
  • DOI
    10.1109/HPCA.2011.5749755
  • Filename
    5749755