Title :
Performance and energy evaluation of data prefetching on Intel Xeon Phi
Author :
Guttman, Diana ; Kandemir, Mahmut Taylan ; Arunachalam, Meenakshi ; Calina, Vlad
Author_Institution :
Comput. Sci. & Eng., Pennsylvania State Univ., University Park, PA, USA
Abstract :
There is an urgent need to evaluate the existing parallelism and data locality-oriented techniques on emerging manycore machines using multithreaded applications. Data prefetching is a well-known latency hiding technique that comes with various hardware- and software-based implementations in almost all commercial machines. A well-tuned prefetcher can reduce the observed data access latencies significantly by bringing the soon-to-be-requested data into the cache ahead of time, eventually improving application execution time. Motivated by this, we present in this paper a detailed performance and power characterization of software (compiler-guided) and hardware data prefetching on an Intel Xeon Phi-based system. Our main contributions are (i) an analysis of the interactions between hardware and software prefetching, showing how hardware prefetching can throttle itself in response to software; (ii) results on the power and energy behavior of prefetching, showing how performance and energy gains outweigh the increased power cost of prefetching; and (iii) an evaluation of the use of intrinsic prefetch instructions to prefetch for applications with difficult-to-detect access patterns.
Keywords :
data encapsulation; multi-threading; parallel processing; performance evaluation; program compilers; storage management; Intel Xeon Phi; application execution time; compiler-guided data prefetching; data locality-oriented technique; energy evaluation; hardware data prefetching; hardware-based implementation; intrinsic prefetch instructions; latency hiding technique; manycore machines; multithreaded applications; parallelism technique; performance characterization; performance evaluation; power characterization; software data prefetching; software-based implementation; Benchmark testing; Coprocessors; Hardware; Measurement; Microwave integrated circuits; Prefetching;
Conference_Title :
Performance Analysis of Systems and Software (ISPASS), 2015 IEEE International Symposium on
Conference_Location :
Philadelphia, PA
DOI :
10.1109/ISPASS.2015.7095814