Title :
A simple mechanism for detecting ineffectual instructions in slipstream processors
Author :
Koppanalil, Jinson J. ; Rotenberg, Eric
Author_Institution :
ARM Inc., Austin, TX, USA
fDate :
4/1/2004
Abstract :
A slipstream processor accelerates a program by speculatively removing repeatedly ineffectual instructions. Detecting the roots of ineffectual computation (unreferenced writes, nonmodifying writes, and correctly predicted branches) is straightforward. In contrast, detecting ineffectual instructions in the backward slices of these root instructions currently requires complex back-propagation circuitry. We observe that, by logically monitoring the speculative program (instead of the original program), back-propagation can be reduced to detecting unreferenced writes. That is, once root instructions are actually removed, instructions at the next higher level in the backward slice become newly exposed unreferenced writes in the speculative program. This new algorithm, called implicit back-propagation, eliminates complex hardware and achieves an average performance improvement of 11.8 percent, only marginally lower than the 12.3 percent improvement achieved with explicit back-propagation. We further simplify the hardware component by electing not to detect ineffectual memory writes, focusing only on ineffectual register writes. A minimal implementation consisting of only a register-indexed table (similar to an architectural register file) achieves a good balance between complexity and performance: an 11.2 percent average performance improvement with implicit back-propagation and without detection of ineffectual memory writes.
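The idea of implicit back-propagation can be illustrated with a small software sketch. This is not the authors' hardware design; it is a toy model under stated assumptions: the names `Instr`, `find_unreferenced_writes`, and `implicit_back_propagation` are hypothetical, only register writes are tracked (as in the minimal implementation described above), and each pass over the trace stands in for a later execution of the same code with previously detected instructions already removed. Nonmodifying writes, branch prediction, and removal-confidence mechanisms are omitted.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    """Hypothetical instruction record used only for this sketch."""
    pc: int
    dest: Optional[str]      # destination register, or None (e.g. a store or branch)
    srcs: Tuple[str, ...]    # source registers


def find_unreferenced_writes(trace, removed):
    """One pass over the speculative program (the trace minus removed PCs).

    Uses a register-indexed table: for each register, the PC of its last
    writer and a flag saying whether that value has been read since.
    A write is flagged when it is overwritten before being referenced.
    """
    table = {}       # register -> [writer_pc, referenced]
    found = set()
    for ins in trace:
        if ins.pc in removed:
            continue                      # already removed from the speculative program
        for reg in ins.srcs:
            if reg in table:
                table[reg][1] = True      # the last write to reg is now referenced
        if ins.dest is not None:
            prev = table.get(ins.dest)
            if prev is not None and not prev[1]:
                found.add(prev[0])        # overwritten without ever being read
            table[ins.dest] = [ins.pc, False]
    return found


def implicit_back_propagation(trace):
    """Repeat the pass until no newly exposed unreferenced writes appear."""
    removed = set()
    passes = []
    while True:
        new = find_unreferenced_writes(trace, removed)
        if not new:
            return removed, passes
        passes.append(sorted(new))
        removed |= new


# A dependence chain r1 -> r2 -> r3 whose final result is overwritten unused.
TRACE = [
    Instr(0, "r1", ()),                   # r1 = ...
    Instr(1, "r2", ("r1",)),              # r2 = f(r1)
    Instr(2, "r3", ("r2",)),              # r3 = g(r2), never read
    Instr(3, "r3", ()),                   # r3 = const
    Instr(4, "r2", ()),                   # r2 = const
    Instr(5, "r1", ()),                   # r1 = const
    Instr(6, None, ("r1", "r2", "r3")),   # consumer of the final values
]

removed, passes = implicit_back_propagation(TRACE)
print(passes)   # [[2], [1], [0]]
```

In this example only instruction 2 is an unreferenced write at first. Once it is removed, instruction 1 loses its only consumer and is detected on the next pass, and then instruction 0 in turn, so the backward slice is peeled away one level at a time without any explicit back-propagation logic. Writes that are still live at the end of the trace are conservatively never flagged.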
Keywords :
backpropagation; circuit complexity; multi-threading; parallel architectures; program control structures; program diagnostics; architectural register file; chip multiprocessor; implicit back-propagation; ineffectual instruction detection; memory write; microarchitecture; multithreading; register-indexed table; slipstream processor; speculative program; Acceleration; Backpropagation algorithms; Circuits; Clocks; Hardware; Monitoring; Multithreading; Parallel processing; Registers; Throughput
Journal_Title :
Computers, IEEE Transactions on
DOI :
10.1109/TC.2004.1268397