Title :
A workflow for parallel and distributed computing of large-scale genomic data
Author :
Hyun-Hwa Choi ; Byoung-Seob Kim ; Shin-Young Ahn ; Seung-Jo Bae
Author_Institution :
Dept. of Cloud Comput. Res., Electron. & Telecommun. Res. Inst., Daejeon, South Korea
Abstract :
Workflow management systems are emerging as a dominant solution in bioinformatics because they enable researchers to analyze the huge amounts of data generated by modern laboratory equipment. The growth of genomic data generated by next-generation sequencing (NGS) creates an increasing need to analyze data on distributed computer clusters. In this paper, we construct a semi-automated workflow system for the analysis of large-scale sequence data sets, describe a pipeline designed with parallel computation to perform the optimal computational steps required to analyze whole-genome sequence data, and report the overall execution time of the pipeline using cores on multiple machines.
Keywords :
bioinformatics; data handling; parallel processing; workflow management software; NGS; distributed computer clusters; distributed computing; laboratory equipment; large scale genomic data; multiple machines; next generation sequencing; parallel computing; workflow management systems; Employment; Genomics; Pipelines; Sequential analysis; Servers; Software tools; genomic data; pipeline
Conference_Title :
Internet Technology and Secured Transactions (ICITST), 2013 8th International Conference for
Conference_Location :
London
DOI :
10.1109/ICITST.2013.6750194