Title :
ACDT: Architected Composite Data Types trading-in unfettered data access for improved execution
Author :
Marquez, Andres ; Manzano, Joseph ; Song, Shuaiwen Leon ; Meister, Benoit ; Shrestha, Sunil ; St. John, Thomas ; Gao, Guang
Author_Institution :
Pacific Northwest Nat. Lab., Richland, WA, USA
Abstract :
With Exascale performance and its challenges in mind, one ubiquitous concern among architects is energy efficiency. Petascale systems projected to Exascale systems are unsustainable at current power consumption rates. One major contributor to system-wide power consumption is the number of memory operations leading to data movement and management techniques applied by the runtime system. To address this problem, we present the concept of the Architected Composite Data Types (ACDT) framework. The framework is made aware of data composites, assigning them a specific layout, transformations and operators. Data manipulation overhead is amortized over a larger number of elements and program performance and power efficiency can be significantly improved. We developed the fundamentals of an ACDT framework on a massively multithreaded adaptive runtime system geared towards Exascale clusters. Showcasing the capability of ACDT, we exercised the framework with two representative processing kernels - Matrix Vector Multiply and the Cholesky Decomposition - applied to sparse matrices. As transformation modules, we applied optimized compress/decompress engines and configured invariant operators for maximum energy/performance efficiency. Additionally, we explored two different approaches based on transformation opaqueness in relation to the application. Under the first approach, the application is agnostic to compression and decompression activity. Such an approach entails minimal changes to the original application code, but leaves out potential application-specific optimizations. The second approach exposes the decompression process to the application, thereby exposing optimization opportunities that can only be exploited with application knowledge.
The experimental results show that the two approaches have their strengths in HW and SW respectively, where the SW approach can yield performance and power improvements that are an order of magnitude better than ACDT-oblivious, hand-optimized implementations. We consider the ACDT runtime framework an important component of compute nodes that will lead towards power-efficient Exascale clusters.
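As a rough illustration of the two approaches contrasted in the abstract, the sketch below applies them to Matrix Vector Multiply on a small sparse matrix. It is not the ACDT framework or its API: the compressed sparse row (CSR) layout and the names `compress`, `decompress`, `matvec_opaque` and `matvec_exposed` are all assumptions introduced here for illustration only.

```python
from typing import List, Tuple

# Hypothetical compressed composite: CSR as (values, column indices, row pointers).
# The abstract does not name a compression format; CSR is an assumption.
CSR = Tuple[List[float], List[int], List[int]]


def compress(dense: List[List[float]]) -> CSR:
    """Keep only the non-zero entries of a dense matrix."""
    values: List[float] = []
    cols: List[int] = []
    row_ptr: List[int] = [0]
    for row in dense:
        for j, x in enumerate(row):
            if x != 0.0:
                values.append(x)
                cols.append(j)
        row_ptr.append(len(values))
    return values, cols, row_ptr


def decompress(csr: CSR, n_cols: int) -> List[List[float]]:
    """Rebuild the dense matrix from its CSR form."""
    values, cols, row_ptr = csr
    dense = []
    for i in range(len(row_ptr) - 1):
        row = [0.0] * n_cols
        for k in range(row_ptr[i], row_ptr[i + 1]):
            row[cols[k]] = values[k]
        dense.append(row)
    return dense


def matvec_opaque(csr: CSR, n_cols: int, x: List[float]) -> List[float]:
    """First approach (sketch): the kernel is agnostic to compression.
    The data is fully decompressed, then an unchanged dense kernel runs."""
    dense = decompress(csr, n_cols)
    return [sum(a * b for a, b in zip(row, x)) for row in dense]


def matvec_exposed(csr: CSR, x: List[float]) -> List[float]:
    """Second approach (sketch): the kernel sees the compressed layout
    and touches only the stored non-zero entries."""
    values, cols, row_ptr = csr
    return [
        sum(values[k] * x[cols[k]] for k in range(row_ptr[i], row_ptr[i + 1]))
        for i in range(len(row_ptr) - 1)
    ]


A = [[2.0, 0.0, 0.0],
     [0.0, 0.0, 3.0],
     [0.0, 4.0, 0.0]]
x = [1.0, 2.0, 3.0]
packed = compress(A)
y_opaque = matvec_opaque(packed, 3, x)
y_exposed = matvec_exposed(packed, x)
```

Both functions return the same product. In this sketch the opaque variant performs work proportional to the full matrix size, while the exposed variant performs work proportional to the number of non-zeros, which is one kind of application-specific optimization the second approach makes possible.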
Keywords :
database management systems; energy conservation; multi-threading; optimisation; power aware computing; power consumption; sparse matrices; ACDT framework; Cholesky decomposition; application-specific optimizations; architected composite data types; compress/decompress engines; data composites; data management techniques; data manipulation overhead; data movement; decompression activity; energy efficiency; exascale clusters; exascale performance; exascale systems; matrix vector multiply; memory operations; multithreaded adaptive runtime system; performance efficiency; petascale systems; power consumption rates; power efficiency; processing kernels; program performance; sparse matrices; transformation modules; transformation opaqueness; unfettered data access; Adaptive systems; Arrays; Engines; Instruction sets; Matrix decomposition; Runtime; Sparse matrices;
Conference_Title :
Parallel and Distributed Systems (ICPADS), 2014 20th IEEE International Conference on
DOI :
10.1109/PADSW.2014.7097820