Title :
Jericho: Achieving scalability through optimal data placement on multicore systems
Author :
Mavridis, Stelios ; Sfakianakis, Yannis ; Papagiannis, Anastasios ; Marazakis, Manolis ; Bilas, Angelos
Author_Institution :
Inst. of Comput. Sci. (ICS), Found. for Res. & Technol. - Hellas (FORTH), Heraklion, Greece
Abstract :
Achieving high I/O throughput on modern servers presents significant challenges. With increasing core counts, server memory architectures become less uniform in terms of both latency and bandwidth. In particular, the bandwidth of the interconnect among NUMA nodes is limited compared to local memory bandwidth. Moreover, interconnect congestion and contention introduce additional latency on remote accesses. These challenges severely limit the maximum achievable storage throughput and IOPS rate. Therefore, data and thread placement are critical for data-intensive applications running on NUMA architectures. In this paper, we present Jericho, a new I/O stack for the Linux kernel that improves affinity between application threads, kernel threads, and buffers in the storage I/O path. Jericho consists of a NUMA-aware filesystem and a DRAM cache organized in slices mapped to NUMA nodes. The Jericho filesystem implements our task placement policy by dynamically migrating application threads that issue I/Os based on the location of the corresponding I/O buffers. The Jericho DRAM I/O cache, a replacement for the Linux page-cache, splits buffer memory into slices and uses per-slice kernel I/O threads for I/O request processing. Our evaluation shows that running the FIO microbenchmark on a modern 64-core server with an unmodified Linux kernel results in only 5% of the memory accesses being served by local memory. With Jericho, more than 95% of accesses become local, with a corresponding 2x performance improvement.
Keywords :
DRAM chips; Linux; buffer circuits; buffer storage; file servers; memory architecture; multi-threading; multiprocessing systems; operating system kernels; DRAM cache; FIO microbenchmark; I/O buffers; I/O request processing; Jericho DRAM I/O cache; Jericho filesystem; Linux page-cache; NUMA architectures; NUMA nodes; NUMA-aware filesystem; application threads; buffer memory; data-intensive applications; interconnect congestion; kernel I/O threads; kernel threads; memory accesses; modern servers; multicore systems; optimal data placement; server memory architectures; storage I/O path; task placement policy; unmodified Linux kernel; Context; Instruction sets; Kernel; Pipelines; Servers; Throughput
Conference_Title :
Mass Storage Systems and Technologies (MSST), 2014 30th Symposium on
Conference_Location :
Santa Clara, CA
DOI :
10.1109/MSST.2014.6855538