DocumentCode :
1882730
Title :
Control-Flow Decoupling
Author :
Sheikh, Rami ; Tuck, James ; Rotenberg, Eric
fYear :
2012
fDate :
1-5 Dec. 2012
Firstpage :
329
Lastpage :
340
Abstract :
Mobile and PC/server class processor companies continue to roll out flagship core microarchitectures that are faster than their predecessors. Meanwhile placing more cores on a chip coupled with constant supply voltage puts per-core energy consumption at a premium. Hence, the challenge is to find future microarchitecture optimizations that not only increase performance but also conserve energy. Eliminating branch mispredictions -- which waste both time and energy -- is valuable in this respect. We first explore the control-flow landscape by characterizing mispredictions in four benchmark suites. We find that a third of mispredictions-per-1K-instructions (MPKI) come from what we call separable branches: branches with large control-dependent regions (not suitable for if-conversion), whose backward slices do not depend on their control-dependent instructions or have only a short dependence. We propose control-flow decoupling (CFD) to eradicate mispredictions of separable branches. The idea is to separate the loop containing the branch into two loops: the first contains only the branch's predicate computation and the second contains the branch and its control-dependent instructions. The first loop communicates branch outcomes to the second loop through an architectural queue. Microarchitecturally, the queue resides in the fetch unit to drive timely, non-speculative fetching or skipping of successive dynamic instances of the control-dependent region. Either the programmer or compiler can transform a loop for CFD, and we evaluate both. On a microarchitecture configured similar to Intel's Sandy Bridge core, CFD increases performance by up to 43%, and reduces energy consumption by up to 41%. Moreover, for some applications, CFD is a necessary catalyst for future complexity-effective large-window architectures to tolerate memory latency.
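The loop split described in the abstract can be illustrated with a minimal software sketch. This is not the authors' implementation: the function names, the `threshold` predicate, and the use of a Python `deque` as a stand-in for the architectural queue are all assumptions made for illustration, and the sketch ignores queue capacity and the fetch-unit mechanism entirely.

```python
from collections import deque


def original_loop(data, threshold):
    """Baseline: the branch and its control-dependent region share one loop."""
    total = 0
    for x in data:
        if x > threshold:      # hypothetical hard-to-predict branch
            total += x * x     # stands in for a large control-dependent region
    return total


def decoupled_loops(data, threshold):
    """CFD-style split: a predicate loop feeds a consumer loop through a queue."""
    # Software stand-in for the architectural queue (assumed unbounded here).
    branch_queue = deque()

    # First loop: only the branch's predicate computation; push each outcome.
    for x in data:
        branch_queue.append(x > threshold)

    # Second loop: the branch consumes queued outcomes in order instead of
    # recomputing the predicate, then runs the control-dependent region.
    total = 0
    for x in data:
        if branch_queue.popleft():
            total += x * x
    return total


if __name__ == "__main__":
    sample = [3, 8, 1, 9, 4]
    print(original_loop(sample, 3), decoupled_loops(sample, 3))
```

In this sketch the split only works because the predicate `x > threshold` does not depend on anything computed inside the control-dependent region, which mirrors the abstract's definition of a separable branch.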
Keywords :
computer architecture; microprocessor chips; multiprocessing systems; program compilers; CFD; Intel Sandy Bridge core; MPKI; PC-server class processor companies; architectural queue; branch mispredictions; branch predicate computation; compiler; complexity-effective large-window architectures; constant supply voltage; control-dependent instructions; control-flow decoupling; control-flow landscape; flagship core microarchitectures; memory latency; microarchitecture optimizations; mispredictions-per-1K-instructions; mobile class processor companies; nonspeculative fetching; per-core energy consumption; programmer; ISA extensions; branch prediction; hardware/software codesign; pre-execution; predication; superscalar processor;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Microarchitecture (MICRO), 2012 45th Annual IEEE/ACM International Symposium on
Conference_Location :
Vancouver, BC
ISSN :
1072-4451
Print_ISBN :
978-1-4673-4819-5
Type :
conf
DOI :
10.1109/MICRO.2012.38
Filename :
6493631