DocumentCode :
3614633
Title :
Towards index-based similarity search for protein structure databases
Author :
O. Camoglu;T. Kahveci;A.K. Singh
Author_Institution :
Dept. of Comput. Sci., California Univ., Santa Barbara, CA, USA
fYear :
2003
fDate :
2003
Firstpage :
148
Lastpage :
158
Abstract :
We propose two methods for finding similarities in protein structure databases. Our techniques extract feature vectors from triplets of secondary structure elements (SSEs) of proteins. These feature vectors are then indexed using a multidimensional index structure. Our first technique addresses the problem of finding proteins similar to a given query protein in a protein dataset. It uses the index structure to quickly find promising proteins, which are then aligned to the query protein using a popular pairwise alignment tool such as VAST. We also develop a novel statistical model to estimate the goodness of a match using the SSEs. Our second technique addresses the problem of joining two protein datasets to find all-to-all similarities. Experimental results show that our techniques improve the pruning time of VAST by a factor of 3 to 3.5 while keeping the sensitivity similar.
Keywords :
"Proteins","Iterative algorithms","Atomic measurements","Dynamic programming","Spatial databases","Amino acids","Heuristic algorithms","Computer science","Feature extraction","Multidimensional systems"
Publisher :
IEEE
Conference_Title :
Bioinformatics Conference, 2003. CSB 2003. Proceedings of the 2003 IEEE
Print_ISBN :
0-7695-2000-6
Type :
conf
DOI :
10.1109/CSB.2003.1227314
Filename :
1227314