Regional Information Center for Science and Technology - Parallel-META: A high-performance computational pipeline for metagenomic data analysis

DocumentCode :

3491332

Title :

Parallel-META: A high-performance computational pipeline for metagenomic data analysis

Author :

Su, Xiaoquan ; Xu, Jian ; Ning, Kang

Author_Institution :

Qingdao Inst. of Bioenergy & Bioprocess Technol., Chinese Acad. of Sci., Qingdao, China

fYear :

2011

fDate :

2-4 Sept. 2011

Firstpage :

173

Lastpage :

178

Abstract :

Metagenomics directly sequences and analyzes genome information from microbial communities. A single community usually contains hundreds of genomes or more from different microbial species, and the main computational tasks in metagenomic data analysis include determining the taxonomical and functional components of these genomes. Metagenomic data analysis is both data- and computation-intensive and therefore requires extensive computational power. Most current metagenomic data analysis software was designed to run on a single computer, which cannot keep up with the computational requirements of the rapidly growing number of large metagenomic projects. Advanced computational methods and pipelines therefore have to be developed to meet this need for efficient analysis. In this paper, we propose Parallel-META, a GPU- and multi-core-CPU-based open-source pipeline for metagenomic data analysis that enables efficient, parallel analysis of multiple metagenomic datasets. In Parallel-META, the similarity-based database search is parallelized using GPU computing and multi-core CPU computing optimization. Experiments show that Parallel-META achieves a speed-up of at least 15 times over the traditional metagenomic data analysis method, with the same accuracy of results (http://www.bioenergychina.org:8800/).
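
The abstract's idea of spreading a similarity-based database search over several CPU cores can be illustrated with a minimal Python sketch. This is not Parallel-META's code or algorithm: the shared k-mer count below is a toy stand-in for the real similarity search, and the names `kmers`, `best_hit`, `search_chunk`, `parallel_search` and the `executor_cls` parameter are hypothetical, chosen only for illustration. The sketch assumes that reads are independent, so they can be split into chunks and searched concurrently.

```python
from concurrent.futures import ProcessPoolExecutor
from functools import partial


def kmers(seq, k=4):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def best_hit(read, references, k=4):
    """Toy similarity search: pick the reference sharing the most k-mers."""
    read_kmers = kmers(read, k)
    scores = {name: len(read_kmers & kmers(ref, k))
              for name, ref in references.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]


def search_chunk(chunk, references, k=4):
    """Search every read of one chunk; this is the unit of work per worker."""
    return [best_hit(read, references, k) for read in chunk]


def parallel_search(reads, references, workers=2, k=4,
                    executor_cls=ProcessPoolExecutor):
    """Split reads into about `workers` chunks and search them concurrently.

    Results come back in the same order as the input reads.
    """
    size = max(1, -(-len(reads) // workers))  # ceiling division
    chunks = [reads[i:i + size] for i in range(0, len(reads), size)]
    work = partial(search_chunk, references=references, k=k)
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(work, chunks))
    return [hit for chunk_hits in results for hit in chunk_hits]
```

Because each read is scored independently, the parallel result should match a plain serial loop over `best_hit`; only the wall-clock time changes. When the default process pool is used, the call to `parallel_search` should sit under an `if __name__ == "__main__":` guard on platforms that start workers by spawning a new interpreter.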

Keywords :

bioinformatics; cellular biophysics; coprocessors; genomics; microorganisms; molecular biophysics; parallel processing; GPU based open source pipeline; Parallel-META; advanced computational methods; genome functional component; genome information analysis; genome taxonomical component; high performance computational pipeline; metagenomic data analysis software; microbial communities; multicore CPU based open source pipeline; similarity based database search; Communities; Data analysis; Graphics processing unit; Hidden Markov models; Instruction sets; Pipelines; Random access memory; data-intensive computing; high performance computing; metagenomics;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Systems Biology (ISB), 2011 IEEE International Conference on

Conference_Location :

Zhuhai

Print_ISBN :

978-1-4577-1661-4

Electronic_ISBN :

978-1-4577-1665-2

Type :

conf

DOI :

10.1109/ISB.2011.6033151

Filename :

6033151

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3491332