DocumentCode :
2834947
Title :
Tuning system-dependent applications with alternative MPI calls: a case study
Author :
Le, Thuy T.
Author_Institution :
Dept. of Electr. Eng., San Jose State Univ., CA, USA
fYear :
2005
fDate :
11-13 Aug. 2005
Firstpage :
137
Lastpage :
143
Abstract :
This paper shows the effectiveness of using optimized MPI calls for MPI-based applications on different architectures. Using optimized MPI calls can yield a reasonable performance gain for most MPI-based applications running on most high-performance distributed systems. Since the relative performance of different MPI function calls can be uncorrelated across system architectures, tuning system-dependent MPI applications by exploring alternative MPI calls is the simplest but most effective optimization method. The paper first shows that, on a particular system, there are noticeable performance differences among various MPI calls that produce the same communication pattern. These performance differences are not consistent across systems. The paper then shows that good performance for an MPI application on different systems can be obtained by using different MPI calls on each system. The communication patterns examined in this paper include point-to-point and collective communications. The MPI-based application used for this study is a general-purpose transient dynamic finite element application, and the benchmark problems are public-domain 3D car crash problems. The experimental results show that, for the same communication purpose, using alternative MPI calls can result in quite different communication performance on the Fujitsu HPC2500 system and the 8-node AMD Athlon cluster, but nearly the same performance on the other systems, such as the Intel Itanium2 and AMD Opteron clusters.
Keywords :
message passing; optimisation; performance evaluation; remote procedure calls; 3D car crash problem; 8-node AMD Athlon cluster; AMD Opteron cluster; Fujitsu HPC2500 system; Intel Itanium2; MPI call; collective communication; general-purpose transient dynamic finite element application; high-performance distributed system; optimization method; point-to-point communication; system-dependent application tuning; Application software; Computer aided software engineering; Concurrent computing; Delay; Finite element methods; Hardware; Message passing; Optimization methods; Performance gain; Vehicle crash testing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Software Engineering Research, Management and Applications, 2005. Third ACIS International Conference on
Print_ISBN :
0-7695-2297-1
Type :
conf
DOI :
10.1109/SERA.2005.67
Filename :
1563154