DocumentCode :
3714359
Title :
Predicting Protein-protein interaction using co-occurring Aligned Pattern Clusters
Author :
Antonio Sze-To;Sanderz Fung;En-Shiun Annie Lee;Andrew K. C. Wong
Author_Institution :
Systems Design Engineering, University of Waterloo, Canada
fYear :
2015
Firstpage :
55
Lastpage :
60
Abstract :
Understanding protein-protein interaction (PPI) is of fundamental importance in deciphering cellular processes. Predicting PPIs is thus critical to making new discoveries in the biological domain. Traditionally, new PPIs are identified through biochemical experiments, but such methods are labor-intensive, expensive, time-consuming, and technically ineffective due to high false positive rates. Computational docking is an alternative but requires the three-dimensional structures of the target proteins, which are not always accessible. Sequence-based prediction is the most readily applicable and cost-effective method. It exploits known PPI databases to construct classifiers for predicting unknown PPIs based only on sequence data. However, existing methods that adopt features with a fixed pattern length and exact patterns are biologically unrealistic. Also, those based on SVM and string kernels are hardly biologically interpretable, since they do not explicitly compute the features. Recently, we have developed a new method for predicting PPI, known as WeMine-P2P, based on our WeMine Aligned Pattern Clustering algorithm, which discovers and identifies localized, co-occurring conserved patterns and regions while allowing variable lengths and pattern variations. As our first attempt, under 40 independent experiments, we showed that (1) WeMine-P2P outperforms the well-known algorithm PIPE2, which also utilizes co-occurring amino acid sequence segments but does not allow variable lengths and pattern variations; (2) unlike SVM-based methods, WeMine-P2P renders interpretable biological features; (3) WeMine-P2P achieves satisfactory PPI prediction performance, comparable to the SVM-based methods, particularly on unseen protein sequences, with a potential 1280-fold reduction in feature dimension. WeMine-P2P is extendable to other biosequence interactions, such as predicting protein-DNA interactions.
Keywords :
"Proteins","Support vector machines","Training"
Publisher :
IEEE
Conference_Title :
Bioinformatics and Biomedicine (BIBM), 2015 IEEE International Conference on
Type :
conf
DOI :
10.1109/BIBM.2015.7359655
Filename :
7359655