DocumentCode :
315889
Title :
Combining loop fusion with prefetching on shared-memory multiprocessors
Author :
Manjikian, Naraig
Author_Institution :
Dept. of Electr. & Comput. Eng., Toronto Univ., Ont., Canada
fYear :
1997
fDate :
11-15 Aug 1997
Firstpage :
78
Lastpage :
82
Abstract :
The performance of programs consisting of parallel loops on shared-memory multiprocessors is limited by long memory latencies as processor speeds increase more rapidly than memory speeds. Two complementary techniques for addressing memory latency and improving performance are: (a) cache locality enhancement for latency reduction and (b) data prefetching for latency tolerance. This paper studies the benefit of combining loop fusion for locality enhancement with prefetching. Experimental results are reported for multiprocessors with support for prefetching. For a complete application on an SGI Power Challenge R10000, combining loop fusion with prefetching improves parallel speedup by 46%.
Keywords :
cache storage; shared memory systems; software performance evaluation; SGI Power Challenge R10000; cache locality enhancement; data prefetching; latency reduction; long memory latencies; loop fusion; memory latency; parallel loops; prefetching; shared-memory multiprocessors; Concurrent computing; Delay; Filters; Fuses; Hardware; Jacobian matrices; Lapping; Microprocessors; Parallel processing; Prefetching
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Parallel Processing, 1997., Proceedings of the 1997 International Conference on
Conference_Location :
Bloomington, IL
ISSN :
0190-3918
Print_ISBN :
0-8186-8108-X
Type :
conf
DOI :
10.1109/ICPP.1997.622560
Filename :
622560