Title :
String matching with multicore CPUs: Performing better with the Aho-Corasick algorithm
Author :
Arudchutha, S. ; Nishanthy, T. ; Ragel, R.G.
Author_Institution :
Dept. of Comput. Eng., Univ. of Peradeniya, Peradeniya, Sri Lanka
Abstract :
Multiple string matching is known as locating all the occurrences of a given number of patterns in an arbitrary string. It is used in bio-computing applications where the algorithms are commonly used for retrieval of information such as sequence analysis and gene/protein identification. Extremely large amount of data in the form of strings has to be processed in such bio-computing applications. Therefore, improving the performance of multiple string matching algorithms is always desirable. Multicore architectures are capable of providing better performance by parallelizing the multiple string matching algorithms. The Aho-Corasick algorithm is the one that is commonly used in exact multiple string matching algorithms. The focus of this paper is the acceleration of Aho-Corasick algorithm through a multicore CPU based software implementation. Through our implementation and evaluation of results, we prove that our method performs better compared to the state of the art.
Keywords :
biocomputing; information retrieval; multiprocessing systems; string matching; Aho-Corasick algorithm; biocomputing applications; gene identification; information retrieval; multicore CPU; multicore CPU based software implementation; multicore architectures; multiple string matching algorithms; protein identification; sequence analysis; Algorithm design and analysis; Instruction sets; Multicore processing; Pattern matching; Software algorithms; Throughput; Aho-Corasick; Multicore processor; POSIX threads;
Conference_Titel :
Industrial and Information Systems (ICIIS), 2013 8th IEEE International Conference on
Conference_Location :
Peradeniya
Print_ISBN :
978-1-4799-0908-7
DOI :
10.1109/ICIInfS.2013.6731987