DocumentCode :
169104
Title :
Profiling and Reducing Micro-Architecture Bottlenecks at the Hardware Level
Author :
Moreira, Francis B. ; Alves, Marco A. Z. ; Diener, Matthias ; Navaux, Philippe Olivier Alexandre ; Koren, Israel
Author_Institution :
Inf. Inst., Fed. Univ. of Rio Grande do Sul, Porto Alegre, Brazil
fYear :
2014
fDate :
22-24 Oct. 2014
Firstpage :
222
Lastpage :
229
Abstract :
Most speculative mechanisms in current superscalar processors, such as branch predictors and prefetchers, use instruction-granularity information. However, much of this information can be obtained at the basic block level, increasing the amount of code that can be covered while requiring less space to store the data. Furthermore, the code can be profiled more accurately and yield a wider variety of information by analyzing the different instruction types inside a block. Because of these advantages, block-level analysis offers more opportunities for mechanisms that use this information. For example, it is possible to integrate information about branch prediction and memory accesses to provide precise information for speculative mechanisms, increasing accuracy and performance. We propose the Block-Level Architecture Profiler (BLAP), an online mechanism that profiles bottlenecks at the microarchitectural level, such as delinquent memory loads, hard-to-predict branches, and contention for functional units. BLAP works at the basic block level, providing information that can be used to reduce the impact of these bottlenecks. A prefetch dropping mechanism and a memory controller policy were developed to use the profiled information provided by BLAP. Together, these mechanisms improve performance by up to 17.39% (3.90% on average). Our technique showed average gains of 13.14% when evaluated under high memory pressure caused by highly aggressive prefetching.
Keywords :
computer architecture; storage management; BLAP; block-level analysis; block-level architecture profiler; branch predictors; instruction granularity information; instruction types; memory controller policy; microarchitecture bottlenecks; prefetch dropping mechanism; prefetchers; superscalar processors; Buffer storage; Correlation; Hardware; Multiplexing; Program processors; Radiation detectors; Registers;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Computer Architecture and High Performance Computing (SBAC-PAD), 2014 IEEE 26th International Symposium on
Conference_Location :
Jussieu
ISSN :
1550-6533
Type :
conf
DOI :
10.1109/SBAC-PAD.2014.19
Filename :
6970668