DocumentCode :
1756804
Title :
UDoNC: An Algorithm for Identifying Essential Proteins Based on Protein Domains and Protein-Protein Interaction Networks
Author :
Wei Peng ; Jianxin Wang ; Yingjiao Cheng ; Yu Lu ; Fangxiang Wu ; Yi Pan
Author_Institution :
Sch. of Inf. Sci. & Eng., Central South Univ., Changsha, China
Volume :
12
Issue :
2
fYear :
2015
fDate :
March-April 2015
Firstpage :
276
Lastpage :
288
Abstract :
Predicting essential proteins, which are crucial to an organism's survival, is important for disease analysis and drug design, as well as for understanding cellular life. Most prediction methods infer the likelihood that a protein is essential from the network topology. However, these methods are limited by the completeness of the available protein-protein interaction (PPI) data and depend on the accuracy of the network. Some computational methods have been proposed to overcome these limitations, but few of them take protein domains into account. In this work, we first analyze the correlation between the essentiality of proteins and their domain features based on data from 13 species. We find that proteins containing more protein domain types that rarely occur in other proteins tend to be essential. Accordingly, we propose a new prediction method, named UDoNC, which combines the domain features of proteins with their topological properties in the PPI network. In UDoNC, the essentiality of a protein is determined by the number and frequency of its protein domain types, as well as by the essentiality of its adjacent edges as measured by the edge clustering coefficient. Experimental results on S. cerevisiae data show that UDoNC outperforms other existing methods in terms of area under the curve (AUC). UDoNC also performs well in predicting essential proteins on E. coli data.
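The two ingredients named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The edge clustering coefficient below uses a commonly cited definition (triangles through an edge divided by min(deg(u) - 1, deg(v) - 1)), which is assumed here because the abstract does not give the formula. The function `domain_rarity_score` is a hypothetical stand-in for "number and frequency of protein domain types", and all function names and identifiers are illustrative. How UDoNC combines the two scores is not stated in the abstract, so the sketch keeps them separate.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map (node -> set of neighbours), ignoring self-loops."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def edge_clustering_coefficient(adj, u, v):
    """Assumed definition: triangles through (u, v) / min(deg(u) - 1, deg(v) - 1)."""
    denom = min(len(adj[u]) - 1, len(adj[v]) - 1)
    if denom <= 0:
        return 0.0  # an edge to a degree-1 node cannot lie on a triangle
    triangles = len(adj[u] & adj[v])  # common neighbours close a triangle
    return triangles / denom


def network_score(adj):
    """Sum the edge clustering coefficients of each node's adjacent edges."""
    return {u: sum(edge_clustering_coefficient(adj, u, v) for v in adj[u])
            for u in adj}


def domain_rarity_score(protein_domains):
    """Hypothetical domain score: each distinct domain type contributes
    1 / (number of proteins containing that type), so proteins with many
    rare domain types score higher."""
    counts = defaultdict(int)
    for domains in protein_domains.values():
        for d in set(domains):
            counts[d] += 1
    return {p: sum(1.0 / counts[d] for d in set(domains))
            for p, domains in protein_domains.items()}
```

On a toy network, a protein attached by a single edge gets a network score of zero, while a protein whose domain types occur in no other protein gets the largest domain score.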
Keywords :
biology computing; cellular biophysics; feature extraction; molecular biophysics; molecular configurations; pattern clustering; proteins; E. coli; PPI network; S. cerevisiae data; UDoNC; area under-the-curve; cellular life; disease analysis; domain features; drug design; edge clustering coefficient; essential protein identification algorithm; network accuracy; network topology; protein domain types; protein-protein interaction networks; Bioinformatics; Computational biology; Correlation; Educational institutions; Frequency-domain analysis; IEEE transactions; Proteins; Essential proteins; protein domain; protein-protein interaction networks;
fLanguage :
English
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher :
IEEE
ISSN :
1545-5963
Type :
jour
DOI :
10.1109/TCBB.2014.2338317
Filename :
6853342