• DocumentCode
    169104
  • Title
    Profiling and Reducing Micro-Architecture Bottlenecks at the Hardware Level
  • Author
    Moreira, Francis B. ; Alves, Marco A. Z. ; Diener, Matthias ; Navaux, Philippe Olivier Alexandre ; Koren, Israel
  • Author_Institution
    Inf. Inst., Fed. Univ. of Rio Grande do Sul, Porto Alegre, Brazil
  • fYear
    2014
  • fDate
    22-24 Oct. 2014
  • Firstpage
    222
  • Lastpage
    229
  • Abstract
    Most mechanisms in current superscalar processors, such as branch predictors and prefetchers, use instruction-granularity information for speculation. However, much of this information can be obtained at the basic block level, increasing the amount of code that can be covered while requiring less space to store the data. Furthermore, analyzing the different instruction types inside a block allows the code to be profiled more accurately and yields a wider variety of information. Because of these advantages, block-level analysis offers more opportunities for mechanisms that use this information. For example, information about branch prediction and memory accesses can be integrated to give speculative mechanisms precise information, increasing accuracy and performance. We propose the Block-Level Architecture Profiler (BLAP), an online mechanism that profiles bottlenecks at the microarchitectural level, such as delinquent memory loads, hard-to-predict branches, and contention for functional units. BLAP works at the basic block level, providing information that can be used to reduce the impact of these bottlenecks. A prefetch dropping mechanism and a memory controller policy were developed to use the profiled information provided by BLAP. Together, these mechanisms improve performance by up to 17.39% (3.90% on average). Our technique showed average gains of 13.14% when evaluated under high memory pressure caused by highly aggressive prefetching.
  • Keywords
    computer architecture; storage management; BLAP; block-level analysis; block-level architecture profiler; branch predictors; instruction granularity information; instruction types; memory controller policy; microarchitecture bottlenecks; prefetch dropping mechanism; prefetchers; superscalar processors; Buffer storage; Correlation; Hardware; Multiplexing; Program processors; Radiation detectors; Registers
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Architecture and High Performance Computing (SBAC-PAD), 2014 IEEE 26th International Symposium on
  • Conference_Location
    Jussieu
  • ISSN
    1550-6533
  • Type
    conf
  • DOI
    10.1109/SBAC-PAD.2014.19
  • Filename
    6970668