DocumentCode :
3588696
Title :
GPU acceleration of finding frequent patterns over large biological sequence
Author :
Shufang Du ; Longjiang Guo ; Chunyu Ai ; Jinbao Li ; Meirui Ren ; Yahong Guo
Author_Institution :
Sch. of Comput. Sci. & Technol., Heilongjiang Univ., Harbin, China
fYear :
2014
Firstpage :
648
Lastpage :
655
Abstract :
Frequent patterns in biological sequences usually correspond to important functions or structures. With the rapid growth of biological sequence data, efficiently finding frequent patterns over a large bio-sequence has become important. However, most existing algorithms generate many short patterns or projected databases, which badly hurts efficiency and increases space cost. Graphics processing units (GPUs), as many-core computing devices, have been widely used to accelerate computation in many areas. To meet the needs of biologists, we redefine the frequent pattern problem with length constraints. We present a pruning optimization method for the serial algorithm (POSA) and, based on this technique, propose a parallel algorithm (POPA) that not only reduces time complexity at a low space cost but also achieves better performance on CUDA. To validate the presented algorithms, we implemented them on a multi-core CPU and various GPU devices, and applied CUDA optimization techniques to speed up the computation. Experimental results show that, compared with the serial algorithm on a six-core CPU, POSA achieves a speedup of 1.2 to 4.5 times and POPA a speedup of 3 to 20 times.
Keywords :
bioinformatics; graphics processing units; parallel algorithms; pattern classification; CUDA optimization technique; Compute Unified Device Architecture; GPU acceleration; POPA parallel algorithm; POSA; biological frequent pattern; biological sequence; graphics processing unit; many core computing device; pruning optimization method for the serial algorithm; time complexity; Arrays; Biology; Data mining; Graphics processing units; Indexes; Parallel algorithms; Acceleration; CUDA; Frequent Pattern; Large Biological Sequence;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Parallel and Distributed Systems (ICPADS), 2014 20th IEEE International Conference on
Type :
conf
DOI :
10.1109/PADSW.2014.7097865
Filename :
7097865