Title :
An Exploration of Page Replication for NoC-Based On-Chip Distributed Memory Systems
Author :
Weiwei Fu ; Mingmin Yuan ; Tianzhou Chen ; Li Liu
Author_Institution :
Coll. of Comput. Sci. & Technol., Zhejiang Univ., Hangzhou, China
Abstract :
By deeply exploiting the memory-level and communication-level parallelism provided by multiple memory controllers (MCs) and interconnect components, NoC-based on-chip distributed memory systems (DMSs) have received great attention as a way to tackle the "memory wall". As systems scale, memory access distance and heavy contention in MCs and interconnects become critical to system performance. Data replication is a popular optimization method that has been widely employed in multi-socket NUMA systems, last-level NUCA caches, and cloud systems. In this paper, we explore page replication schemes for NoC-based on-chip DMSs at the architectural level. We discuss the external force, internal force, and feasibility of establishing page replicas. The external force comes from the long memory access radius for shared data, which introduces massive memory access traffic. The internal force is derived from overlaps of traffic hotspots and page access centers for globally shared data. High read-write ratios for the most heavily accessed pages illustrate the feasibility of replication. Based on these observations, we propose models and algorithms to decide the trigger and actual position for establishing replicas and to estimate the benefits and overheads. In addition, we discuss details of content synchronization and replica cancellation schemes. Simulation results show the effectiveness of the suggested mechanisms in various scenarios.
Keywords :
distributed shared memory systems; network-on-chip; paged storage; parallel memories; synchronisation; NoC-based on-chip DMS; NoC-based on-chip distributed memory systems; cloud systems; communication-level parallelism; content synchronization; external force; feasibility; globally shared data; interconnect components; internal force; last level NUCA cache; memory access distance; memory access traffic; memory controllers; memory-level parallelism; multisocket NUMA system; optimization method; page access centers; page replication; read-write ratios; replica cancellation scheme; system performance; system scaling; traffic hotspots; Estimation; Force; Memory management; Monitoring; Parallel processing; Synchronization; System-on-chip; Data Replication; Network-on-chip; On-chip distributed memory systems;
Conference_Title :
Parallel, Distributed and Network-Based Processing (PDP), 2014 22nd Euromicro International Conference on
Conference_Location :
Torino
DOI :
10.1109/PDP.2014.84