A Fast Parallel Longest Common Subsequence Algorithm Based on Pruning Rules

Author

Wei Liu

Author_Institution

Dept. of Comput. Sci., Yangzhou Univ.

Volume

1

fYear

2006

fDate

20-24 June 2006

Firstpage

27

Lastpage

34

Abstract

Searching for the longest common subsequence (LCS) of biosequences is one of the most important problems in bioinformatics. A fast algorithm for the LCS problem, FAST_LCS, is presented. The algorithm first seeks the successors of the initial identical character pairs according to a successor table to obtain all the identical pairs and their levels. By tracing back from the identical character pair at the highest level, strong pruning rules are developed. For two sequences X and Y of lengths n and m, respectively, the memory required for FAST_LCS is max{4*(n+1)+4*(m+1), L}, where L is the number of identical character pairs. The parallel time complexity is O(|LCS(X,Y)|), where |LCS(X,Y)| is the length of the LCS of X and Y. Experimental results on gene sequences from the TIGR database, obtained on the MPP parallel computer Shenteng 1800, show that our algorithm can find the exact solutions significantly more efficiently than other LCS algorithms.
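
The successor-table idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's FAST_LCS implementation. The function names (`successor_table`, `prune_dominated`, `lcs_length_by_levels`), the four-letter DNA alphabet, and the simple dominance pruning are all assumptions made for illustration, and the paper's own pruning rules may differ. The sketch computes only the LCS length, as the number of levels reached, and does not trace back to recover the subsequence.

```python
def successor_table(seq, alphabet):
    """table[c][i] = smallest 1-based position p > i with seq[p-1] == c, or 0 if none."""
    n = len(seq)
    table = {c: [0] * (n + 1) for c in alphabet}
    for c in alphabet:
        nearest = 0  # nearest occurrence of c to the right of the current column
        for i in range(n, 0, -1):
            if seq[i - 1] == c:
                nearest = i
            table[c][i - 1] = nearest
    return table


def prune_dominated(pairs):
    """Illustrative pruning (an assumption): drop (p, q) if another pair
    (p2, q2) on the same level has p2 <= p and q2 <= q."""
    kept = []
    min_q = float("inf")
    for p, q in sorted(pairs):
        if q < min_q:  # no earlier pair has both coordinates <= this one
            kept.append((p, q))
            min_q = q
    return set(kept)


def lcs_length_by_levels(x, y, alphabet="ACGT"):
    """Expand identical character pairs level by level; the last non-empty
    level equals the LCS length for sequences over `alphabet`."""
    tx = successor_table(x, alphabet)
    ty = successor_table(y, alphabet)
    frontier = {(0, 0)}  # virtual start pair before both sequences
    level = 0
    while True:
        successors = set()
        for i, j in frontier:
            for c in alphabet:
                p, q = tx[c][i], ty[c][j]
                if p and q:  # c occurs after i in x and after j in y
                    successors.add((p, q))
        if not successors:
            return level
        frontier = prune_dominated(successors)
        level += 1
```

For example, `lcs_length_by_levels("AGCAT", "GAC")` returns 2. With a four-letter alphabet each table has 4 rows and n+1 (or m+1) columns, which is at least consistent with the 4*(n+1)+4*(m+1) term in the stated memory bound. Every pair within a level is expanded independently of the others, so in this sketch the number of sequential steps equals the number of levels.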

Keywords

biology computing; computational complexity; parallel algorithms; FAST_LCS; LCS problem; MPP parallel computer; fast parallel longest common subsequence algorithm; gene sequence; parallel computing; pruning rules; tigr database; time complexity; Bioinformatics; Biology computing; Computer science; Concurrent computing; DNA; Databases; Dynamic programming; Genomics; Parallel processing; Sequences; Bioinformatics; identical character pair; longest common subsequence

fLanguage

English

Publisher

IEEE

Conference_Titel

Computer and Computational Sciences, 2006. IMSCCS '06. First International Multi-Symposiums on

Conference_Location

Hangzhou, Zhejiang

Print_ISBN

0-7695-2581-4

Type

conf

DOI

10.1109/IMSCCS.2006.6

Filename

4673521