DocumentCode :
50830
Title :
Finding All Longest Common Segments in Protein Structures Efficiently
Author :
Yen Kaow Ng ; Linzhi Yin ; Ono, Hirotaka ; Shuai Cheng Li
Author_Institution :
Dept. of Comput. Sci., Univ. Tunku Abdul Rahman, Kampar, Malaysia
Volume :
12
Issue :
3
fYear :
2015
fDate :
May-June 1 2015
Firstpage :
644
Lastpage :
655
Abstract :
The Local/Global Alignment (Zemla, 2003), or LGA, is a popular method for the comparison of protein structures. One of the two components of LGA requires us to compute the longest common contiguous segments between two protein structures. That is, given two structures A = (a1, ... ,an) and B = (b1, ... ,bn) where ak, bk ϵ ℝ3, we are to find, among all the segments f = (ai, ... ,aj) and g = (bi, ... ,bj) that fulfill a certain criterion regarding their similarity, those of the maximum length. We consider the following criteria: (1) the root mean squared deviation (RMSD) between f and g is to be within a given t E R; (2) f and g can be superposed such that for each k, i ≤ k ≤ j, ||ak - bk|| ≤ t for a given t E R. We give an algorithm of O(n log n + ni) time complexity when the first requirement applies, where I is the maximum length of the segments fulfilling the criterion. We show an FPTAS which, for any ϵ ℝ, finds a segment of length at least l, but of RMSD up to (1 + ϵ)t, in O(n log n + n=ϵ) time. We propose an FPTAS which for any given ϵ R, finds all the segments f and g of the maximum length which can be superposed such that for each k, i ≤ k ≤ j, ||ak - bk || ≤ (1 + ϵ)t, thus fulfilling the second requirement approximately. The algorithm has a time complexity of O(n log2 n=ϵ5) when consecutive points in A are separated by the same distance (which is the case with protein structures). These worst-case runtime complexities are verified using C++ implementations of the algorithms, which we have made available at http://alcs.sourceforge.net/.
Keywords :
bioinformatics; computational complexity; molecular biophysics; molecular configurations; proteins; C++ implementations; FPTAS; LGA; O(n log n + ni) time complexity; RMSD; local/global alignment; longest common segments; protein structures; root mean squared deviation; Approximation algorithms; Bioinformatics; Computational biology; IEEE transactions; Proteins; Runtime; Time complexity; LGA; Local/Global Alignment; Local/Global Alignment (LGA); PTAS; RMSD; longest common structure;
fLanguage :
English
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher :
ieee
ISSN :
1545-5963
Type :
jour
DOI :
10.1109/TCBB.2014.2372782
Filename :
6963458
Link To Document :
بازگشت