Title :
CTSS: a robust and efficient method for protein structure alignment based on local geometrical and biological features
Author :
Can, Tolga ; Wang, Yuan-Fang
Author_Institution :
Dept. Comput. Sci., California Univ., Santa Barbara, CA, USA
Abstract :
We present a new method for conducting protein structure similarity searches, which improves on the accuracy, robustness, and efficiency of some existing techniques. Our method is grounded in the theory of differential geometry on 3D space curve matching. We generate shape signatures for proteins that are invariant, localized, robust, compact, and biologically meaningful. To improve matching accuracy, we smooth the noisy raw atomic coordinate data with spline fitting. To improve matching efficiency, we adopt a hierarchical coarse-to-fine strategy. We use an efficient hashing-based technique to screen out unlikely candidates and perform detailed pairwise alignments only for the small number of candidates that survive the screening process. Unlike other hashing-based techniques, our technique employs domain-specific information (not just geometric information) in constructing the hash key, and hence is better tuned to the domain of biology. Furthermore, the invariance, localization, and compactness of the shape signatures allow us to utilize a well-known local sequence alignment algorithm for aligning two protein structures. One measure of the efficacy of the proposed technique is that we were able to discover new, meaningful motifs that were not reported by other structure alignment methods.
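The final step described in the abstract, aligning two structures by running a local sequence alignment over their shape signatures, can be illustrated with a minimal sketch. The abstract does not name the alignment algorithm or the signature contents, so everything here is an assumption: Smith-Waterman-style dynamic programming stands in for the "well-known local sequence alignment algorithm", each signature is taken to be a (curvature, torsion) pair (the standard invariants of a 3D space curve), and the function name `local_align_score`, the distance-based scoring, and the parameters `match_bonus` and `gap_penalty` are hypothetical.

```python
from math import hypot


def local_align_score(sig_a, sig_b, match_bonus=1.0, gap_penalty=0.5):
    """Best local alignment score between two signature sequences.

    Assumptions (not from the abstract): each signature is a
    (curvature, torsion) pair, and the similarity of two signatures is
    match_bonus minus their Euclidean distance.
    """
    rows, cols = len(sig_a), len(sig_b)
    # score[i][j]: best score of a local alignment ending at
    # sig_a[i - 1] and sig_b[j - 1]; row 0 and column 0 stay at zero.
    score = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    best = 0.0
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            curv_a, tors_a = sig_a[i - 1]
            curv_b, tors_b = sig_b[j - 1]
            similarity = match_bonus - hypot(curv_a - curv_b, tors_a - tors_b)
            score[i][j] = max(
                0.0,                                  # start a new local alignment
                score[i - 1][j - 1] + similarity,     # pair the two signatures
                score[i - 1][j] - gap_penalty,        # gap in sig_b
                score[i][j - 1] - gap_penalty,        # gap in sig_a
            )
            best = max(best, score[i][j])
    return best


# Hypothetical signatures: a three-residue motif embedded in a longer chain.
chain = [(0.9, -0.5), (0.3, 0.1), (0.3, 0.1), (0.3, 0.1), (0.1, 0.8)]
motif = [(0.3, 0.1), (0.3, 0.1), (0.3, 0.1)]
motif_score = local_align_score(chain, motif)
```

Because the signatures are invariant and local, a shared motif scores highly wherever it sits in either chain, which is the property that lets a sequence-style local alignment operate on 3D structures. A fuller version would also trace back through the table to recover the aligned residues.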
Keywords :
biology computing; differential geometry; genetic algorithms; molecular biophysics; proteins; splines (mathematics); string matching; 3D space curve matching; biological features; conducting protein structure similarity searches; detailed pairwise alignments; differential geometry theory; domain specific information; hash key; hashing-based technique; hierarchical coarse-to-fine strategy; local sequence alignment algorithm; matching accuracy; matching efficiency; noisy raw atomic coordinate data; protein structure alignment; proteins structure; screening process; shape signatures; spline fitting; Atomic measurements; Biology; Computer science; Feature extraction; Geometry; Noise shaping; Protein engineering; Robustness; Sequences; Shape;
Conference_Title :
Bioinformatics Conference, 2003. CSB 2003. Proceedings of the 2003 IEEE
Print_ISBN :
0-7695-2000-6
DOI :
10.1109/CSB.2003.1227316