Title :
Partially Separated Page Tables for Efficient Operating System Assisted Hierarchical Memory Management on Heterogeneous Architectures
Author :
Gerofi, B. ; Shimada, Akio ; Hori, A. ; Ishikawa, Yutaka
Author_Institution :
RIKEN Adv. Inst. for Comput. Sci., Kobe, Japan
Abstract :
Heterogeneous architectures, where a multicore processor is accompanied by a large number of simpler, but more power-efficient CPU cores optimized for parallel workloads, have recently received considerable attention. At present, these co-processors, such as the Intel Xeon Phi product family, come with limited on-board memory, which requires manually partitioning computational problems into pieces that fit into the device's RAM, as well as efficiently overlapping computation and communication. In this paper we propose an application-transparent, operating system (OS) assisted hierarchical memory management system, where the OS orchestrates data movement between the host and the device and updates the process virtual memory address space accordingly. We identify the main scalability issues of frequent address space changes, such as the increasing cost of TLB invalidations with the growing number of CPU cores, and propose partially separated page tables with address-range CPU masks to overcome the problem. With partially separated page tables, each core maintains its own set of mappings of the computation area, enabling the OS to perform address space updates in a scalable manner and to involve a particular CPU core in TLB invalidation only if it is absolutely necessary. Furthermore, we propose dedicated data movement cores in order to efficiently overlap computation and communication. We provide experimental results on stencil computation, a common HPC kernel, and show that OS assisted memory management has the potential for scalable transparent data movement.
Keywords :
multiprocessing systems; operating systems (computers); storage management; TLB invalidations; address-range CPU masks; data movement; heterogeneous architectures; multicore processor; operating system assisted hierarchical memory management; parallel workloads; partitioning computational problems; power-efficient CPU cores; process virtual memory address space; separated page tables; Instruction sets; Kernel; Memory management; Multicore processing; Random access memory; coprocessor; manycore; memory management; operating systems; page tables;
Conference_Title :
Cluster, Cloud and Grid Computing (CCGrid), 2013 13th IEEE/ACM International Symposium on
Conference_Location :
Delft
Print_ISBN :
978-1-4673-6465-2
DOI :
10.1109/CCGrid.2013.59