DocumentCode
2900076
Title
A Parallel Algorithm for Solving LCS of Multiple Bioseqences
Author
Liu, Wei ; Chen, Ling
Author_Institution
Dept. of Comput. Sci., Yangzhou Univ.
fYear
2006
fDate
13-16 Aug. 2006
Firstpage
4316
Lastpage
4321
Abstract
A fast algorithm named FAST_LCS is presented for the longest common substring (LCS) problem. The algorithm firstly seeks the successors of the initial identical character tuples according to successor tables to obtain all the identical tuples and their levels. Then the result of LCS can be obtained by tracing back from the identical character tuple with the largest level. For n biosequences X 1, X2, ..., Xn, the time complexity of sequentially execution of our algorithm is O(L). Here L is the number of identical character tuples. The time complexity of our parallel algorithm is O(|LCS|), where LCS is the length of the longest common substring of X1, X2, ... , Xn. This time complexity is independent of the number of sequences n. Some experiments have been conducted on the gene sequences of tigr database using MPP parallel computer Shenteng 1800. The results show that our algorithm is accurate and more efficient than other LCS algorithms
Keywords
biology computing; computational complexity; database management systems; genetics; parallel algorithms; sequences; string matching; Shenteng 1800 MPP parallel computer; gene sequences; longest common substring problem; multiple bioseqences; parallel algorithm; tigr database; time complexity; Bioinformatics; Biology computing; Computer science; Concurrent computing; Cybernetics; DNA; Databases; Machine learning; Parallel algorithms; Proteins; Sequences; Software algorithms; Bioinformatics; identical character tuple; longest common subsequence;
fLanguage
English
Publisher
ieee
Conference_Titel
Machine Learning and Cybernetics, 2006 International Conference on
Conference_Location
Dalian, China
Print_ISBN
1-4244-0061-9
Type
conf
DOI
10.1109/ICMLC.2006.259020
Filename
4028832
Link To Document