DocumentCode :
3491332
Title :
Parallel-META: A high-performance computational pipeline for metagenomic data analysis
Author :
Su, Xiaoquan ; Xu, Jian ; Ning, Kang
Author_Institution :
Qingdao Inst. of Bioenergy & Bioprocess Technol., Chinese Acad. of Sci., Qingdao, China
fYear :
2011
fDate :
2-4 Sept. 2011
Firstpage :
173
Lastpage :
178
Abstract :
Metagenomics directly sequences and analyzes genomic information from microbial communities. A single community usually contains hundreds or more genomes from different microbial species, and the main computational tasks in metagenomic data analysis are determining the taxonomical and functional components of these genomes. Metagenomic data analysis is both data- and computation-intensive and therefore requires extensive computational power. Most current metagenomic data analysis software was designed to run on a single computer and cannot keep up with the computational requirements of the rapidly growing number of large metagenomic projects. Advanced computational methods and pipelines therefore have to be developed to meet this need for efficient analysis. In this paper, we propose Parallel-META, a GPU- and multi-core-CPU-based open-source pipeline for metagenomic data analysis that enables efficient, parallel analysis of multiple metagenomic datasets. In Parallel-META, the similarity-based database search is parallelized using GPU computing and multi-core CPU optimization. Experiments show that Parallel-META achieves at least a 15-fold speed-up over traditional metagenomic data analysis methods, with the same accuracy of results (http://www.bioenergychina.org:8800/).
Keywords :
bioinformatics; cellular biophysics; coprocessors; genomics; microorganisms; molecular biophysics; parallel processing; GPU based open source pipeline; Parallel-META; advanced computational methods; genome functional component; genome information analysis; genome taxonomical component; high performance computational pipeline; metagenomic data analysis software; microbial communities; multicore CPU based open source pipeline; similarity based database search; Communities; Data analysis; Graphics processing unit; Hidden Markov models; Instruction sets; Pipelines; Random access memory; data-intensive computing; high performance computing; metagenomics;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Systems Biology (ISB), 2011 IEEE International Conference on
Conference_Location :
Zhuhai
Print_ISBN :
978-1-4577-1661-4
Electronic_ISBN :
978-1-4577-1665-2
Type :
conf
DOI :
10.1109/ISB.2011.6033151
Filename :
6033151