Title :
Reduced code size modulo scheduling in the absence of hardware support
Author :
Llosa, Josep ; Freudenberger, Stefan M.
Abstract :
Modulo scheduling is a very effective instruction scheduling technique that exploits Instruction Level Parallelism (ILP) in loop bodies by overlapping the execution of successive iterations. Unfortunately, modulo scheduling has been shown to cause heavy code expansion. To avoid the penalties of code expansion, some processors have dedicated hardware support for modulo scheduled loops. However, this dedicated hardware support has a cost in chip area, cycle time, processor complexity, and compiler complexity. This paper shows that the right combination of scheduling heuristics, together with speculative modulo scheduling, can significantly reduce code expansion. In addition, several code generation schema heuristics are proposed to further reduce code expansion. The evaluations show that loops can be effectively modulo scheduled with an average code expansion of only 1.5 times the original loop size. Compared with a state-of-the-art modulo scheduler, our code-size-sensitive heuristics reduce the size of embedded-domain benchmark binaries by 30% on average. While performance is mostly unchanged, some applications show speed-ups of up to 20% due to a reduction in instruction cache capacity misses.
Keywords :
parallel architectures; program compilers; program control structures; scheduling; Instruction Level Parallelism; VLIW architectures; code generation; compiler complexity; embedded domain benchmarks; hardware support; instruction scheduling; loop bodies; modulo scheduling; processor complexity; software pipelining; Application software; Computer aided instruction; Concurrent computing; Costs; Digital signal processing; Hardware; Laboratories; Pipeline processing; Processor scheduling; Registers
Conference_Title :
Microarchitecture, 2002. (MICRO-35). Proceedings. 35th Annual IEEE/ACM International Symposium on
Print_ISBN :
0-7695-1859-1
DOI :
10.1109/MICRO.2002.1176242