Title :
Parallel Algorithm Design and Performance Evaluation of FDTD on 3 Different Architectures: Cluster, Homogeneous Multicore and Cell/B.E.
Author :
Xu, Meilian ; Thulasiraman, Parimala
Author_Institution :
Dept. of Comput. Sci., Univ. of Manitoba, Winnipeg, MB
Abstract :
Clusters built from single-core systems are a cost-effective way to improve performance and availability. However, hardware constraints limit the performance of single-core systems, making it difficult to meet the growing high-performance requirements of diverse applications at different levels of general-purpose computing. A promising solution is novel multi-core systems, which extend parallelism to the CPU level by integrating multiple processing units on a single die. This paper uses the finite-difference time-domain (FDTD) algorithm as a case study and designs parallel FDTD algorithms for three architectures: distributed-memory machines with single-core processors, shared-memory machines with dual-core processors, and the Cell Broadband Engine (Cell/B.E.) processor with nine heterogeneous cores. The experimental results show that, at the processor level, the Cell/B.E. processor using 8 SPEs achieves speedups of 7.05 over an AMD single-core Opteron processor and 3.37 over an AMD dual-core Opteron processor.
Keywords :
distributed memory systems; finite difference time-domain analysis; general purpose computers; mathematics computing; parallel algorithms; shared memory systems; AMD dual-core Opteron processor; AMD single-core Opteron processor; Cell Broadband Engine processor; Cell-BE architecture; FDTD; cluster architecture; distributed-memory machines; dual-core processors; finite-difference time-domain algorithm; general purpose computing; homogeneous multicore architecture; parallel algorithm design; shared-memory machines; Algorithm design and analysis; Availability; Central Processing Unit; Finite difference methods; Hardware; High performance computing; Multicore processing; Parallel algorithms; Parallel processing; Time domain analysis; Cell Broadband Engine Processor (Cell/B.E.); Direct Memory Access (DMA); Finite-Difference Time-Domain (FDTD); Message Passing Interface (MPI); Multi-Core Processor; Synergistic Processor Element (SPE);
Conference_Title :
High Performance Computing and Communications, 2008. HPCC '08. 10th IEEE International Conference on
Conference_Location :
Dalian
Print_ISBN :
978-0-7695-3352-0
DOI :
10.1109/HPCC.2008.85