• DocumentCode
    1205638
  • Title
    Instruction replication for reducing delays due to inter-PE communication latency

  • Author
    Aggarwal, Aneesh ; Franklin, Manoj

  • Author_Institution
    Dept. of Electr. & Comput. Eng., Binghamton Univ., NY, USA
  • Volume
    54
  • Issue
    12
  • fYear
    2005
  • Firstpage
    1496
  • Lastpage
    1507
  • Abstract
    As feature sizes shrink, wire delays are becoming critical. Clustering is a popular decentralization approach for reducing the impact of shrinking technologies on clock speed. In this approach, the centralized instruction window is replaced with multiple smaller windows, called clusters (PEs). The performance of these clustered processors depends on the amount of inter-PE communication and load imbalance incurred by the algorithm used to distribute instructions among the PEs. In this paper, we investigate a novel approach to reducing the impact of inter-PE communication latency while preserving good load balance. The basic idea is to selectively replicate instructions in those PEs where their results are required. Replication is guided by heuristics that weigh its potential benefits. We found that, with instruction replication, the IPC of a clustered processor is significantly higher than that obtained without instruction replication and is within just 8 percent of that of a superscalar configuration with a centralized instruction scheduler.
  • Keywords
    clocks; delays; parallel architectures; processor scheduling; resource allocation; centralized instruction scheduler window; clock speed; clustered processor; delays reduction; distribution algorithm; instruction replication; inter-PE communication latency; interconnection latency; load balance; multiple smaller windows; shrinking technologies; superscalar configuration; task assignment; Clocks; Clustering algorithms; Computer Society; Delay; Dynamic scheduling; Energy efficiency; Load management; MOSFETs; Processor scheduling; Wire; Index Terms: Clustered processors; instruction replication; interconnection latency; load balancing; task assignment
  • fLanguage
    English
  • Journal_Title
    Computers, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9340
  • Type
    jour
  • DOI
    10.1109/TC.2005.197
  • Filename
    1524932