Regional Information Center for Science and Technology - Constructing Similarity Graphs from Large-Scale Biological Sequence Collections

DocumentCode :

167336

Title :

Constructing Similarity Graphs from Large-Scale Biological Sequence Collections

Author :

Zola, Jaroslaw

Author_Institution :

Rutgers Discovery Inf. Inst., Rutgers Univ., Piscataway, NJ, USA

fYear :

2014

fDate :

19-23 May 2014

Firstpage :

500

Lastpage :

507

Abstract :

Detecting similar pairs in large biological sequence collections is one of the most commonly performed tasks in computational biology. With the advent of high-throughput sequencing technologies, the problem has regained significance as data sets with millions of sequences have become ubiquitous. This paper is an initial report on our parallel, distributed-memory, sketching-based approach to constructing large-scale sequence similarity graphs. We develop load balancing techniques, derived from multi-way number partitioning and work stealing, to manage computational imbalance and ensure scalability on thousands of processors. Our experimental results show that the method is efficient and can be used to analyze data sets with millions of DNA sequences within acceptable time limits.

Keywords :

biocomputing; data analysis; graph theory; resource allocation; DNA sequences; computational biology; data set analysis; large-scale biological sequence collections; large-scale sequence similarity graphs; load balancing techniques; multiway number partitioning; parallel distributed memory; processors; sketching-based approach; throughput sequencing technologies; work stealing; Biology; Indexes; Load management; Matrix decomposition; Program processors; Scalability; Silicon; load balancing; min-wise independent permutations; parallel computational biology; sequence similarity

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Parallel & Distributed Processing Symposium Workshops (IPDPSW), 2014 IEEE International

Conference_Location :

Phoenix, AZ

Print_ISBN :

978-1-4799-4117-9

Type :

conf

DOI :

10.1109/IPDPSW.2014.63

Filename :

6969429

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=167336