NUMA-Aware Scalable and Efficient In-Memory Aggregation on Large Domains

Author

Li Wang ; Minqi Zhou ; Zhenjie Zhang ; Ming-Chien Shan ; Aoying Zhou

Author_Institution

Software Eng. Inst., East China Normal Univ., Shanghai, China

Volume

27

Issue

4

fYear

2015

fDate

April 1 2015

Firstpage

1071

Lastpage

1084

Abstract

Business Intelligence (BI) is recognized as one of the most important IT applications in the coming big data era. In recent years, non-uniform memory access (NUMA) has become the de-facto architecture of multiprocessors on the new generation of enterprise servers. Such new architecture brings new challenges to optimization techniques on traditional operators in BI. Aggregation, for example, is one of the basic building blocks of BI, while its processing performance with existing hash-based algorithms scales poorly in terms of the number of cores under NUMA architecture. In this paper, we provide new solutions to tackle the problem of parallel hash-based aggregation, especially targeting at domains of extremely large cardinality. We propose a NUMA-aware radix partitioning (NaRP) method which divides the original huge relation table into subsets, without invoking expensive remote memory access between nodes of the cores. We also present a new efficient aggregation algorithm (EAA), to aggregate the partitioned data in parallel with low cache coherence miss and locking costs. Theoretical analysis as well as empirical study on an IBM X5 server prove that our proposals are at least two times faster than existing methods.

Keywords

Big Data; cache storage; competitive intelligence; cryptography; memory architecture; optimisation; parallel processing; BI; Big Data; EAA; IBM X5 server; IT applications; NUMA architecture; NUMA-aware radix partitioning method; NUMA-aware scalable aggregation; NaRP method; business intelligence; cache coherence miss; data partitioning; de-facto architecture; efficient aggregation algorithm; enterprise servers; hash-based algorithms; in-memory aggregation; locking costs; multiprocessors; nonuniform memory access; optimization techniques; parallel hash-based aggregation; Bismuth; Coherence; Multicore processing; Partitioning algorithms; Program processors; Servers; Aggregation; cache miss; in-memory databases; radix-partitioning;

fLanguage

English

Journal_Title

Knowledge and Data Engineering, IEEE Transactions on

Publisher

ieee

ISSN

1041-4347

Type

jour

DOI

10.1109/TKDE.2014.2359675

Filename

6906264