DocumentCode :
2214591
Title :
Aggressive compiler optimization and parallelization with thread-level speculation
Author :
Chen, Li-Ling ; Wu, Youfeng
Author_Institution :
Intel Labs., Intel Corp., Santa Clara, CA
fYear :
2003
fDate :
9-9 Oct. 2003
Firstpage :
607
Lastpage :
614
Abstract :
We present a technique that exploits close collaboration between the compiler and speculative multithreaded hardware to explore aggressive optimizations and parallelization for scalar programs. The compiler aggressively optimizes the frequently executed code in user programs by predicting an execution path or the values of long-latency instructions. Based on the predicted hot execution path, the compiler forms regions with greatly simplified data-flow and control-flow graphs and then performs aggressive optimizations on those regions. Thread-level speculation (TLS) helps expose program parallelism and guarantees program correctness when a prediction is incorrect. With the collaboration of compilers and speculative multithreaded support, program performance can be significantly improved. Preliminary results with simple trace regions show that the performance gain in dynamic compiler schedule cycles can reach 33% for some benchmarks and averages about 10% across all eight SpecInt95 benchmarks. For SpecInt2k, the performance gain is up to 23% with the conservative execution model. On a cycle-accurate simulator with the conservative execution model, which accounts for runtime factors (e.g., cache misses and branch mispredictions), the overall performance gain for vortex and m88ksim is 12% and 14.7%, respectively. The performance gain can be higher with more sophisticated region formation and region-based optimizations.
Keywords :
data flow graphs; multi-threading; optimising compilers; parallel architectures; parallelising compilers; performance evaluation; SpecInt95 benchmarks; aggressive compiler optimizations; conservative execution model; control flow graphs; cycle accurate simulator; dynamic compiler; high-performance architecture; long-latency instructions; program correctness; program parallelism; program performance; region formation; region-based optimizations; scalar programs; speculative execution; speculative multithreaded hardware; thread-level parallelism; user programs; Collaboration; Dynamic compiler; Dynamic scheduling; Flow graphs; Hardware; Optimizing compilers; Performance gain; Program processors; Runtime; Yarn;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel Processing, 2003. Proceedings. 2003 International Conference on
Conference_Location :
Kaohsiung
ISSN :
0190-3918
Print_ISBN :
0-7695-2017-0
Type :
conf
DOI :
10.1109/ICPP.2003.1240629
Filename :
1240629