Regional Information Center for Science and Technology - Searching in parallel for similar strings [biological sequences]

DocumentCode :

1184526

Title :

Searching in parallel for similar strings [biological sequences]

Author :

Rigoutsos, Isidore ; Califano, Andrea

Author_Institution :

IBM Thomas J. Watson Res. Center, Yorktown Heights, NY, USA

Volume :

Issue :

fYear :

1994

Firstpage :

Lastpage :

Abstract :

Distributed computation, probabilistic indexing and hashing techniques combine to create a novel approach to processing very large biological-sequence databases. Other data-intensive tasks could also benefit. Our indexing-based approach enables fast similarity searching through a large database of strings. Thanks to a redundant table-lookup scheme, recovering database items that match a test sequence requires minimal data access. We have implemented a uniprocessor version of this approach called Flash (Fast Lookup Algorithm for String Homology) as well as a distributed version, dFlash, using a cluster of seven non-dedicated workstations connected through a local area network. In this article, we present an approach for retrieving homologies in databases of proteins.

Keywords :

biology computing; distributed algorithms; file organisation; indexing; proteins; very large databases; Fast Lookup Algorithm for String Homology; Flash; biological-sequence databases; dFlash; data-intensive tasks; database item recovery; distributed computation; fast similarity searching; hashing techniques; local area network; minimal data access; nondedicated workstation cluster; parallel searching; probabilistic indexing; proteins; redundant table-lookup scheme; similar strings; uniprocessor version; Biology computing; Clustering algorithms; Distributed computing; Distributed databases; Indexing; Information retrieval; Local area networks; Proteins; Testing; Workstations;

fLanguage :

English

Journal_Title :

Computational Science & Engineering, IEEE

Publisher :

IEEE

ISSN :

1070-9924

Type :

jour

DOI :

10.1109/99.326666

Filename :

326666

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1184526