Author_Institution :
Dept. E&E Engineering, University of Stellenbosch, 7600, South Africa
Abstract :
Many recent papers on accelerating computations (in particular for the FDTD, which lends itself readily to parallel computation) have used GPGPUs [general-purpose graphics processing units], but contemporary CPUs offer a variety of options for performance acceleration, too. Often, these are somewhat easier to code. This month's contribution provides a detailed investigation of the use of streaming SIMD extensions (SSE) instructions on x86 architectures. The authors carefully analyze hardware aspects, in particular the impact of cache alignment on performance, and provide interesting results.