Title :
Exploiting State-of-the-Art x86 Architectures in Scientific Computing
Author :
Heinecke, Alexander ; Auckenthaler, Thomas ; Trinitis, Carsten
Author_Institution :
Inst. für Inf., Tech. Univ. München, Garching, Germany
Abstract :
In recent years, general-purpose x86 architectures have undergone significant modifications towards high-performance computing capabilities. Lately, technologies such as wider vector units and the Fused Multiply-Add (FMA) instruction, which were mainly known from GPU architectures, have been introduced. In this paper, we examine the performance of current x86 architectures, namely Intel Sandy Bridge and AMD Bulldozer, for four different parallel workloads with different properties. These properties comprise optimally cache-blocked algorithms as well as adaptive grid structures resulting in memory-latency- and bandwidth-bound executions. The achieved performance on both architectures is very promising and, if extrapolated towards upcoming server silicon, can be regarded as on par with current high-end GPU-based accelerators.
Keywords :
cache storage; graphics processing units; instruction sets; natural sciences computing; parallel processing; x86 architectures; AMD Bulldozer; FMA instruction; GPU architectures; Intel Sandy Bridge; adaptive grid structures; bandwidth bound executions; cache-blocked algorithms; fused multiply-add instruction; high performance computing capabilities; high-end GPU based accelerators; memory latency; parallel workloads; scientific computing; server silicon; wider vector units; Bridges; Computer architecture; Land vehicles; Matrix decomposition; Registers; Symmetric matrices; Vectors; AMD; CPU architectures; Intel; Multi-core; parallel applications; vectorization
Conference_Title :
Parallel and Distributed Computing (ISPDC), 2012 11th International Symposium on
Conference_Location :
Munich/Garching, Bavaria
Print_ISBN :
978-1-4673-2599-8
DOI :
10.1109/ISPDC.2012.15