• DocumentCode
    2845694
  • Title
    Optimizing Issue Queue Reliability to Soft Errors on Simultaneous Multithreaded Architectures
  • Author
    Fu, Xin ; Zhang, Wangyuan ; Li, Tao ; Fortes, José
  • Author_Institution
    Dept. of ECE, Univ. of Florida, Gainesville, FL
  • fYear
    2008
  • fDate
    9-12 Sept. 2008
  • Firstpage
    190
  • Lastpage
    197
  • Abstract
    The issue queue (IQ) is a key microarchitecture structure for exploiting instruction-level and thread-level parallelism in dynamically scheduled simultaneous multithreaded (SMT) processors. However, exploiting more parallelism yields high susceptibility to transient faults on a conventional IQ. With the rapidly increasing soft error rates, the IQ is likely to be a reliability hot-spot on SMT processors fabricated with advanced technology nodes using smaller and denser transistors with lower threshold voltages and tighter noise margins. In this paper, we explore microarchitecture techniques to optimize IQ reliability to soft errors on SMT architectures. We propose to use off-line instruction vulnerability profiling to identify reliability-critical instructions. The gathered information is then used to guide reliability-aware instruction scheduling and resource allocation in multithreaded execution environments. We evaluate the efficiency of the proposed schemes across various SMT workload mixes. Extensive simulation results show that, on average, our microarchitecture-level soft error mitigation techniques can reduce IQ vulnerability by 42% with a 1% performance improvement. To maintain runtime IQ reliability for pre-defined thresholds, we propose dynamic vulnerability management (DVM) mechanisms. Experimental results show that our DVM techniques can effectively achieve desired reliability/performance tradeoffs.
  • Keywords
    multi-threading; multiprocessing systems; parallel architectures; scheduling; SMT processors; dynamic vulnerability management; dynamically scheduled simultaneous multithreaded processors; instruction-level parallelism; issue queue reliability; multithreaded execution environments; reliability-aware instruction scheduling; resource allocation; simultaneous multithreaded architectures; thread-level parallelism; Computer errors; Dynamic scheduling; Error analysis; Microarchitecture; Processor scheduling; Resource management; Surface-mount technology; Threshold voltage; Transistors; Working environment noise
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Processing, 2008. ICPP '08. 37th International Conference on
  • Conference_Location
    Portland, OR
  • ISSN
    0190-3918
  • Print_ISBN
    978-0-7695-3374-2
  • Electronic_ISBN
    0190-3918
  • Type
    conf
  • DOI
    10.1109/ICPP.2008.23
  • Filename
    4625849