DocumentCode
3350998
Title
Applying array contraction to a sequence of DOALL loops
Author
Song, Yonghong ; Li, Zhiyuan
Author_Institution
Sun Microsystems Inc., Santa Clara, CA, USA
fYear
2004
fDate
15-18 Aug. 2004
Firstpage
46
Abstract
Efficient program execution on multiprocessor computers requires both sufficient parallelism and good data locality. Recent research found that, using a combination of loop shifting, loop fusion, and array contraction, one can reduce the memory required to execute a sequence of serial loops, thereby improving cache locality. This paper studies how to extend such a memory-reduction scheme to a sequence of DOALL loops, which are executed in parallel on multiprocessors. Two methods are proposed to overcome difficulties caused by loop-carried dependences. Data copy-in is performed to remove anti-dependences between different parallel threads, and computation duplication is performed to remove flow dependences. Experiments performed on a number of benchmark programs show that the proposed technique improves both cache locality and parallel execution speed for the DOALL loops. The scheme achieves an average speedup of 1.41 for 17 programs on a 4-processor SUN machine.
Keywords
cache storage; multi-threading; multiprocessing systems; parallel machines; program control structures; 4-processor SUN machine; DOALL loops; array contraction; benchmark program; cache locality; loop fusion; loop shifting; memory-reduction scheme; parallel execution; parallel multiprocessor computer; parallel threads; program execution; Computer networks; Concurrent computing; Data flow computing; Intelligent networks; Parallel processing; Sun; Yarn;
fLanguage
English
Publisher
IEEE
Conference_Titel
Parallel Processing, 2004. ICPP 2004. International Conference on
ISSN
0190-3918
Print_ISBN
0-7695-2197-5
Type
conf
DOI
10.1109/ICPP.2004.1327903
Filename
1327903