DocumentCode :
983762
Title :
Finding patterns on protein surfaces: algorithms and applications to protein classification
Author :
Wang, Xiong
Author_Institution :
Dept. of Comput. Sci., California State Univ., Fullerton, CA, USA
Volume :
17
Issue :
8
fYear :
2005
Firstpage :
1065
Lastpage :
1078
Abstract :
A successful application of data mining to bioinformatics is protein classification. A number of techniques have been developed to classify proteins according to important features in their sequences, secondary structures, or three-dimensional structures. In this paper, we introduce a novel approach to protein classification based on significant patterns discovered on the surface of a protein. We define a notion called α-surface. We discuss the geometric properties of α-surface and present an algorithm that calculates the α-surface from a finite set of points in R³. We apply the algorithm to extract the α-surface of a protein and use a pattern discovery algorithm to discover frequently occurring patterns on the surfaces. The pattern discovery algorithm utilizes a new index structure called the ΔB+ tree. We use these patterns to classify the proteins. While most existing techniques focus on the binary classification problem, we apply our approach to classifying three families of proteins. Experimental results show the good performance of the proposed approach.
Keywords :
biochemistry; biology computing; data mining; pattern classification; proteins; α-surface; binary classification problem; bioinformatics; geometric properties; index structure; medicine; pattern discovery algorithm; protein classification; Classification algorithms; Drugs; Fingerprint recognition; Pharmaceuticals; Protein engineering; Sequences; Spatial databases; KDD; classification; structural pattern discovery
fLanguage :
English
Journal_Title :
Knowledge and Data Engineering, IEEE Transactions on
Publisher :
IEEE
ISSN :
1041-4347
Type :
jour
DOI :
10.1109/TKDE.2005.126
Filename :
1458700