Title :
NGS read data compression using parallel computing algorithm
Author :
Biji C.L.; Achuthsankar S. Nair; Arun P.R; Jojo George
Author_Institution :
Dept. of Computational Biology and Bioinformatics, University of Kerala, Thiruvananthapuram, Pin-695581, India
Abstract :
Analysing and storing high-throughput sequencing data faces major bottlenecks because Next Generation Sequencing (NGS) machines produce data in the terabyte range. This trend demands more sophisticated parallel computing algorithms to manage the data explosion. We propose a parallel implementation of the MFCompress algorithm using the Message Passing Interface (MPI) model. In the proposed NGS read compression algorithm, the input file is split into a number of parts determined by the number of nodes, and each processor compresses its part using multiple finite-context models. To test the approach, we selected read datasets ranging from 50 MB to 10 GB. The algorithm achieved a best compression of 0.33 bpb and a speedup ratio of 6, with an average 23-fold reduction in disk space.
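The split-then-compress idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: threads stand in for MPI ranks, and the function names `split_reads`, `fcm_bits`, and `bits_per_base`, the single order-2 model, the ACGT-only alphabet, and the smoothing parameter `alpha` are all assumptions made for illustration. MFCompress combines multiple finite-context models with arithmetic coding; here only the ideal code length of one adaptive model is estimated.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from functools import partial
from math import log2


def split_reads(reads, n_parts):
    """Split reads into n_parts contiguous chunks of near-equal count."""
    quota, extra = divmod(len(reads), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        end = start + quota + (1 if i < extra else 0)
        parts.append(reads[start:end])
        start = end
    return parts


def fcm_bits(reads, order=2, alpha=1.0):
    """Ideal code length (bits) under one adaptive order-k finite-context model.

    Assumptions: 4-symbol alphabet (A, C, G, T), context reset to 'A' * order
    at the start of every read, additive smoothing with parameter alpha.
    """
    counts = defaultdict(lambda: defaultdict(int))
    bits = 0.0
    for read in reads:
        ctx = "A" * order
        for base in read:
            seen = counts[ctx]
            # Probability of this base given the previous `order` bases.
            p = (seen[base] + alpha) / (sum(seen.values()) + 4 * alpha)
            bits -= log2(p)
            seen[base] += 1          # update the model after coding the symbol
            ctx = (ctx + base)[1:]   # slide the context window
    return bits


def bits_per_base(reads, n_parts, order=2):
    """Model each part independently (one worker per part) and pool the cost."""
    parts = split_reads(reads, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        part_bits = list(pool.map(partial(fcm_bits, order=order), parts))
    total_bases = sum(len(r) for r in reads)
    return sum(part_bits) / total_bases


if __name__ == "__main__":
    demo_reads = ["ACGTACGTACGT"] * 20  # hypothetical toy input
    print(bits_per_base(demo_reads, n_parts=4))
```

Because each part starts with an empty model, splitting into more parts trades some compression for parallelism: every worker has to relearn the statistics that a single sequential model would have shared.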
Keywords :
"Bioinformatics","Genomics","DNA","Encoding","Computational modeling","Europe","Random access memory"
Conference_Title :
Bioinformatics and Biomedicine (BIBM), 2015 IEEE International Conference on
DOI :
10.1109/BIBM.2015.7359890