Regional Information Center for Science and Technology - Assembling genomes on large-scale parallel computers

DocumentCode :

2041090

Title :

Assembling genomes on large-scale parallel computers

Author :

Kalyanaraman, Anantharaman ; Emrich, Scott J. ; Schnable, Patrick S. ; Aluru, Srinivas

Author_Institution :

Dept. of Electr. & Comput. Eng., Iowa State Univ., Ames, IA, USA

fYear :

2006

fDate :

25-29 April 2006

Abstract :

Assembly of large genomes from tens of millions of short genomic fragments is computationally demanding, requiring hundreds of gigabytes of memory and tens of thousands of CPU hours. New gene-enrichment sequencing strategies are expected to further exacerbate this situation. In this paper, we present a massively parallel genome assembly framework. The unique features of our approach include space-efficient and on-demand algorithms that consume only linear space, and heuristic strategies that reduce the number of expensive pairwise sequence alignments while maintaining assembly quality. As part of the ongoing efforts in maize genome sequencing, we applied our assembly framework to the largest available collection of maize genomic data. We report the partitioning of more than 1.6 million fragments of over 1.25 billion nucleotides total size into genomic islands in 2 hours on 1,024 processors of an IBM BlueGene/L supercomputer.

Keywords :

biology computing; genetics; parallel processing; gene-enrichment sequencing strategy; heuristic strategy; large-scale parallel computer; maize genomic data; on-demand algorithm; pairwise sequence alignment; parallel genome assembly; space-efficient algorithm; Assembly; Bioinformatics; Biological cells; Biology computing; Concurrent computing; DNA; Genomics; Large-scale systems; Organisms; Sequences

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Parallel and Distributed Processing Symposium, 2006. IPDPS 2006. 20th International

Print_ISBN :

1-4244-0054-6

Type :

conf

DOI :

10.1109/IPDPS.2006.1639259

Filename :

1639259

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2041090