DocumentCode :
2900076
Title :
A Parallel Algorithm for Solving LCS of Multiple Bioseqences
Author :
Liu, Wei ; Chen, Ling
Author_Institution :
Dept. of Comput. Sci., Yangzhou Univ.
fYear :
2006
fDate :
13-16 Aug. 2006
Firstpage :
4316
Lastpage :
4321
Abstract :
A fast algorithm named FAST_LCS is presented for the longest common substring (LCS) problem. The algorithm firstly seeks the successors of the initial identical character tuples according to successor tables to obtain all the identical tuples and their levels. Then the result of LCS can be obtained by tracing back from the identical character tuple with the largest level. For n biosequences X 1, X2, ..., Xn, the time complexity of sequentially execution of our algorithm is O(L). Here L is the number of identical character tuples. The time complexity of our parallel algorithm is O(|LCS|), where LCS is the length of the longest common substring of X1, X2, ... , Xn. This time complexity is independent of the number of sequences n. Some experiments have been conducted on the gene sequences of tigr database using MPP parallel computer Shenteng 1800. The results show that our algorithm is accurate and more efficient than other LCS algorithms
Keywords :
biology computing; computational complexity; database management systems; genetics; parallel algorithms; sequences; string matching; Shenteng 1800 MPP parallel computer; gene sequences; longest common substring problem; multiple bioseqences; parallel algorithm; tigr database; time complexity; Bioinformatics; Biology computing; Computer science; Concurrent computing; Cybernetics; DNA; Databases; Machine learning; Parallel algorithms; Proteins; Sequences; Software algorithms; Bioinformatics; identical character tuple; longest common subsequence;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Machine Learning and Cybernetics, 2006 International Conference on
Conference_Location :
Dalian, China
Print_ISBN :
1-4244-0061-9
Type :
conf
DOI :
10.1109/ICMLC.2006.259020
Filename :
4028832
Link To Document :
بازگشت