DocumentCode :
1647213
Title :
Impact of heterogeneity on DSM performance
Author :
Figueiredo, Renato J O ; Fortes, José A B
Author_Institution :
Sch. of Electr. & Comput. Eng., Purdue Univ., West Lafayette, IN, USA
fYear :
2000
fDate :
2000
Firstpage :
26
Lastpage :
35
Abstract :
This paper explores area/parallelism tradeoffs in the design of distributed shared-memory (DSM) multiprocessors built out of large single-chip computing nodes. In this context, area-efficiency arguments motivate a heterogeneous organization consisting of a few nodes with large caches designed for single-thread parallelism, and a larger number of nodes with smaller caches designed for multi-thread parallelism. Quantitative performance of such an organization is reported for a set of homogeneous multiprocessor programs from the SPLASH-2 benchmark suite. These programs are mapped onto the heterogeneous processors without source code modifications via static thread assignment policies. Simulation-based analysis is used to compare the performance of heterogeneous and homogeneous DSMs that occupy the same silicon area. The analysis shows that a 4-node heterogeneous DSM with 21 processors outperforms its homogeneous counterpart with 4 processors by an average of 36% for the studied multiprocessor workload, while having the same performance for sequential codes. A sensitivity analysis based on a factorial design experiment is used to study the implications of processor, memory, and network heterogeneity on overall cost and performance of a heterogeneous DSM. The studied benchmarks are affected, on average, primarily by heterogeneity in processor performance (59.3%), followed by cache sizes (18.2%), memory latency (14.6%), and network latency (5.6%).
Keywords :
distributed shared memory systems; performance evaluation; sensitivity analysis; sequential codes; SPLASH-2 benchmark suite; area/parallelism tradeoffs; cache sizes; distributed shared memory performance; heterogeneity; heterogeneous organization; homogeneous multiprocessor programs; memory latency; multi-thread parallelism; network latency; quantitative performance; simulation-based analysis; single-chip computing nodes; static thread assignment; Analytical models; Concurrent computing; Costs; Delay; Distributed computing; Parallel processing; Performance analysis; Silicon; Yarn
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
High-Performance Computer Architecture, 2000. HPCA-6. Proceedings. Sixth International Symposium on
Conference_Location :
Toulouse
Print_ISBN :
0-7695-0550-3
Type :
conf
DOI :
10.1109/HPCA.2000.824336
Filename :
824336