Title :
Performance analysis and optimization on a parallel atmospheric general circulation model code
Author :
Lou, John Z. ; Farrara, John D.
Author_Institution :
Jet Propulsion Lab., California Inst. of Technol., Pasadena, CA, USA
Abstract :
An analysis is presented of the primary factors influencing the performance of a parallel implementation of the UCLA atmospheric general circulation model (AGCM) on distributed-memory, massively parallel computer systems. Several modifications to the original parallel AGCM code, aimed at improving its numerical efficiency, load balance, and single-node performance, are discussed. The impact of these optimization strategies on performance on two state-of-the-art parallel computers, the Intel Paragon and the Cray T3D, is presented and analyzed. It is found that implementing a load-balanced FFT algorithm reduces overall execution time by approximately 45% compared to the original convolution-based algorithm. Preliminary results from applying a load-balancing scheme to the physics part of the AGCM code suggest that additional reductions in execution time of 15-20% can be achieved. Finally, several strategies for improving the single-node performance of the code are presented, and the results obtained thus far suggest that reductions in execution time in the range of 30-40% are possible.
Keywords :
climatology; distributed memory systems; geophysics computing; parallel programming; resource allocation; software performance evaluation; terrestrial atmosphere; AGCM; Cray T3D; Intel Paragon; convolution based algorithm; distributed memory massively parallel computer systems; execution time; load balance; load balanced FFT algorithm; load balancing scheme; numerical efficiency; parallel atmospheric general circulation model code; parallel implementation; performance analysis; single node code performance; single node performance; Atmospheric modeling; Cloud computing; Computational modeling; Concurrent computing; Distributed computing; Grid computing; Laboratories; Performance analysis; Propulsion; Scalability;
Conference_Title :
Parallel Processing Symposium, 1997. Proceedings., 11th International
Conference_Location :
Geneva
Print_ISBN :
0-8186-7793-7
DOI :
10.1109/IPPS.1997.580879