Title :
Optimizing Cache Locality for Irregular Data Accesses on Many-Core Intel Xeon Phi Accelerator Chip
Author :
Nhat Phuong Tran ; Dong Hoon Choi ; Myungho Lee
Author_Institution :
Dept. of Comput. Sci. & Eng., Myongji Univ., Yong In, South Korea
Abstract :
Many-core accelerator chips are becoming increasingly popular these days for its high performance floating-point performance exceeding 1 Tflops per chip. Aho-Corasick (AC) is a multiple patterns string matching algorithm commonly used in computer and network security, bioinformatics, among others. In order to simultaneously match a number of string patterns with respect to the input text data, a Deterministic Finite Automata (DFA) is constructed from a given set of pattern strings. The DFA is referenced almost randomly, whereas the input data is sequentially accessed. As the number of pattern strings increases, the irregular DFA accesses lead to poor cache locality and low overall performance. In this paper, we present a cache locality optimizing parallelization on the many-core accelerator chip, the Intel Xeon Phi. A given set of pattern strings is partitioned into into multiple sets of a smaller number of patterns so that multiple small DFAs are constructed instead of single large DFA. The accesses to multiple small DFAs lead to significantly smaller cache footprints in each core´s cache and result in impressive performance improvements. Experimental results on the Intel Xeon Phi 5110P show that our approach delivers up to 2.00 × speedup compared with the previous approach using single large DFA.
Keywords :
cache storage; deterministic automata; finite automata; microprocessor chips; multiprocessing systems; parallel processing; string matching; Aho-Corasick; DFA; Intel Xeon Phi; cache footprints; cache locality optimizing parallelization; deterministic finite automata; many-core accelerator chips; multiple patterns string matching algorithm; string patterns; Automata; Computer architecture; Coprocessors; Graphics processing units; Partitioning algorithms; Pattern matching; Aho-Corasick algorithm; Many-core accelerator; cache locality; parallelization;
Conference_Titel :
High Performance Computing and Communications, 2014 IEEE 6th Intl Symp on Cyberspace Safety and Security, 2014 IEEE 11th Intl Conf on Embedded Software and Syst (HPCC,CSS,ICESS), 2014 IEEE Intl Conf on
Print_ISBN :
978-1-4799-6122-1
DOI :
10.1109/HPCC.2014.29