Title :
Experimental Analysis of SMP Scalability in the Presence of Coherence Traffic and Snoop Filtering
Author :
Al-Mouhamed, M.A. ; Daud, K.A.
Author_Institution :
Dept. of Comput. Eng., King Fahd Univ. of Pet. & Miner., Dhahran, Saudi Arabia
Abstract :
Commodity multi-core SMPs may generate an enormous amount of coherency traffic. However, the impact of coherence traffic and snoop filtering on parallel program scalability has not attracted sufficient attention. We experimentally analyze the shared data access patterns of four typical applications with different memory layouts. An OpenMP optimized execution model is derived for each application, with emphasis on data dependencies and implied coherence messages. Using an 8-core SMP, we present the obtained speedups as the number of cores and the problem scale vary. We discuss potential limitations on scalability due to the application or the SMP. To assess coherence behavior and its impact on the scalability of parallel programs, we present a synthetic benchmark that alternates data block ownership between two cores of the same or different processors. We find that coherence overheads, including snoop filtering, are responsible for significant limitations on parallel program scalability. For 8-core SMPs, speedup can be reduced by factors of 2.5 and 5 for row-major and column-major access patterns, respectively, compared with the use of private data. A truly parallel coherence protocol implementation is needed to provide a truly scalable shared-memory model.
Keywords :
application program interfaces; distributed memory systems; information retrieval; memory architecture; message passing; parallel programming; performance evaluation; protocols; 8-core SMP; OpenMp optimized execution model; coherence behavior assessment; coherency traffic; column-major access patterns; commodity multicore SMP scalability; data block ownership; parallel coherence protocol implementation; parallel program scalability; potential limitation; row-major access patterns; scalable shared-memory model; shared data access patterns; snoop filtering; speedup reduction; Arrays; Coherence; Computational modeling; Jacobian matrices; Mathematical model; Program processors; Scalability; HPC; distributed-memory; parallel programming; performance evaluation and speedup; shared-memory system;
Conference_Title :
High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems (HPCC-ICESS), 2012 IEEE 14th International Conference on
Conference_Location :
Liverpool
Print_ISBN :
978-1-4673-2164-8
DOI :
10.1109/HPCC.2012.21