• DocumentCode
    1825899
  • Title
    Experimental Analysis of SMP Scalability in the Presence of Coherence Traffic and Snoop Filtering
  • Author
    Al-Mouhamed, M.A. ; Daud, K.A.
  • Author_Institution
    Dept. of Comput. Eng., King Fahd Univ. of Pet. & Miner., Dhahran, Saudi Arabia
  • fYear
    2012
  • fDate
    25-27 June 2012
  • Firstpage
    81
  • Lastpage
    88
  • Abstract
    Commodity multi-core SMPs may generate an enormous amount of coherency traffic. However, the impact of coherence traffic and snoop filtering on parallel program scalability has not received sufficient attention. We experimentally analyze the shared data access patterns of four typical applications with different memory layouts. An OpenMP-optimized execution model is derived for each application, with emphasis on data dependencies and the implied coherence messages. Using an 8-core SMP, we present the speedups obtained as the number of cores and the problem scale vary. We discuss potential limitations on scalability due to the application or the SMP. To assess coherence behavior and its impact on the scalability of parallel programs, we present a synthetic benchmark that alternates data block ownership between two cores of the same or different processors. We find that coherence overheads, including snoop filtering, are responsible for significant limitations on parallel program scalability. For 8-core SMPs, speedup can be reduced by factors of 2.5 and 5 for row-major and column-major access patterns, respectively, compared to the use of private data. A truly parallel coherence protocol implementation is needed to provide a truly scalable shared-memory model.
  • Keywords
    application program interfaces; distributed memory systems; information retrieval; memory architecture; message passing; parallel programming; performance evaluation; protocols; 8-core SMP; OpenMP optimized execution model; coherence behavior assessment; coherency traffic; column-major access patterns; commodity multicore SMP scalability; data block ownership; parallel coherence protocol implementation; parallel program scalability; potential limitation; row-major access patterns; scalable shared-memory model; shared data access patterns; snoop filtering; speedup reduction; Arrays; Coherence; Computational modeling; Jacobian matrices; Mathematical model; Program processors; Scalability; HPC; distributed-memory; parallel programming; performance evaluation and speedup; shared-memory system
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems (HPCC-ICESS), 2012 IEEE 14th International Conference on
  • Conference_Location
    Liverpool
  • Print_ISBN
    978-1-4673-2164-8
  • Type
    conf
  • DOI
    10.1109/HPCC.2012.21
  • Filename
    6332162
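
The row-major and column-major shared-data access patterns mentioned in the abstract can be illustrated with a minimal C sketch using OpenMP pragmas. This is not the authors' benchmark: the array name `grid`, the size `N`, the function names, and the `schedule(static, 1)` clause are all assumptions chosen for illustration. The sketch only shows how the two traversal orders distribute writes to a shared array across threads; it does not reproduce the reported slowdown factors.

```c
#include <assert.h>

#define N 256                 /* hypothetical matrix dimension */

static double grid[N][N];     /* shared by all threads, stored row by row in C */

/* Row-major pattern: the parallel loop splits rows among threads, so each
 * thread writes consecutive addresses within the rows it was given. */
void update_row_major(double delta) {
    #pragma omp parallel for
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            grid[i][j] += delta;
        }
    }
}

/* Column-major pattern: the parallel loop splits columns among threads.
 * schedule(static, 1) (an assumption of this sketch) hands neighbouring
 * columns to different threads, so elements of the same row that may sit
 * in one cache line are written by different cores. */
void update_col_major(double delta) {
    #pragma omp parallel for schedule(static, 1)
    for (int j = 0; j < N; j++) {
        for (int i = 0; i < N; i++) {
            grid[i][j] += delta;
        }
    }
}

/* Sanity check: every element must hold the expected value, i.e. each
 * traversal touched every element exactly once. */
void check_all(double expected) {
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            assert(grid[i][j] == expected);
        }
    }
}
```

With GCC or Clang, the pragmas take effect when compiling with `-fopenmp`; without that flag they are ignored and the loops run serially with the same result. Timing the two functions for different thread counts would be one way to compare the patterns on a given machine.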