• DocumentCode
    3473155
  • Title

    Instruction Format Based Selective Execution for Register Port Complexity Reduction in High-Performance Processors

  • Author

    Sangireddy, Rama

  • Author_Institution
    Dept. of Electr. Eng., Univ. of Texas at Dallas, TX
  • fYear
    2006
  • fDate
    10-12 April 2006
  • Firstpage
    227
  • Lastpage
    232
  • Abstract
    As the width of the processor grows, the complexity of a register file (RF) with multiple ports grows more than linearly, leading to longer register access time and higher power consumption. Analysis of the Spec2000 benchmark programs run on an 8-wide processor reveals that two or fewer two-source instructions (those that require both source registers) are executed in a cycle for a significant portion of total execution time (more than 98% of the time for Spec2000 integer and 93% of the time for Spec2000 floating-point). The register port bandwidth is therefore highly underutilized for a significant portion of the time in general-purpose computing. In this paper, we propose a novel technique that significantly reduces the number of register ports through a very minor modification to the select logic, which issues only a limited number of two-source instructions per cycle. This is achieved with no significant impact on the processor's performance. The novelty of the technique is that it is easy to implement and reduces the access time, power, and area of the register file without shifting the burden, in terms of these factors, to any other logic on the chip. With this technique in an 8-wide processor, as compared to a conventional 128-entry RF with 16 read ports, a register file for Spec2000 integer programs can be designed with 11 or 10 read ports, as these configurations degrade instructions per cycle (IPC) by only 0.929% and 3.38%, respectively. This low IPC degradation is achieved while reducing the register access time by 9% and 12%, respectively, and reducing power by 35% and 50%, respectively. For Spec2000 floating-point programs, a register file can be designed with 12 read ports (1.16% IPC loss, 8% less access time, and 28% less power) or with 11 read ports (3.5% IPC loss, 9% less access time, and 35% less power).
    The paper analyzes the performance of all possible flavors of the proposed technique for the register file in both 4-wide and 8-wide processors, and presents the designer with a choice of performance and register port complexity combinations.
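    The port-limiting idea in the abstract can be illustrated with a small sketch. This is not the paper's select logic. It assumes an oldest-first selection policy and assumes that the cap on two-source instructions equals the number of read ports minus the issue width (for example, 10 ports in an 8-wide machine would allow 2 two-source instructions per cycle). The names Instr, select, issue_width, and read_ports are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Instr:
    name: str
    num_sources: int  # register source operands needed: 0, 1, or 2


def select(ready, issue_width, read_ports):
    """Pick instructions to issue this cycle, oldest first.

    Assumption (not from the paper): capping two-source instructions at
    read_ports - issue_width keeps total register reads within read_ports,
    since every other issued instruction needs at most one read.
    """
    max_two_source = read_ports - issue_width
    issued = []
    two_source = 0
    for ins in ready:
        if len(issued) == issue_width:
            break
        if ins.num_sources == 2:
            if two_source == max_two_source:
                continue  # left in the queue for a later cycle
            two_source += 1
        issued.append(ins)
    return issued


# Five two-source instructions followed by five one-source instructions.
ready = [Instr(f"add{i}", 2) for i in range(5)] + [Instr(f"mov{i}", 1) for i in range(5)]
limited = select(ready, issue_width=8, read_ports=10)
full = select(ready, issue_width=8, read_ports=16)
print(len(limited), sum(i.num_sources for i in limited))
print(len(full), sum(i.num_sources for i in full))
```

    In this example the 10-port configuration issues fewer instructions in the cycle than the 16-port one, which is the kind of IPC cost the abstract quantifies against the savings in access time and power.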
  • Keywords
    computational complexity; file organisation; instruction sets; 8-wide processor; Spec2000 benchmark program; Spec2000 floating-point program; Spec2000 integer program; high performance processor; instructions per cycle degradation; processor width; register access time; register file complexity; register port bandwidth; register port complexity combination; register port complexity reduction; two-source instruction format; Bandwidth; Degradation; Energy consumption; High performance computing; Laboratories; Logic; Performance analysis; Power engineering computing; Radio frequency; Registers;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Technology: New Generations, 2006. ITNG 2006. Third International Conference on
  • Conference_Location
    Las Vegas, NV
  • Print_ISBN
    0-7695-2497-4
  • Type

    conf

  • DOI
    10.1109/ITNG.2006.73
  • Filename
    1611598