DocumentCode
60018
Title
Identification of Protein Complexes from Tandem Affinity Purification/Mass Spectrometry Data via Biased Random Walk
Author
Bingjing Cai ; Haiying Wang ; Huiru Zheng ; Hui Wang
Author_Institution
Sch. of Comput. & Math., Univ. of Ulster, Newtownabbey, UK
Volume
12
Issue
2
fYear
2015
fDate
March-April 2015
Firstpage
455
Lastpage
466
Abstract
Systematic identification of protein complexes from protein-protein interaction (PPI) networks is an important application of data mining in life science. Over the past decades, various clustering techniques have been developed that model PPIs as binary relations. The non-binary information on co-complex relations (prey/bait) in PPI data derived from tandem affinity purification/mass spectrometry (TAP-MS) experiments has been unfairly disregarded. In this paper, we propose a biased random walk based algorithm for detecting protein complexes from TAP-MS data, called random walk with restarting baits (RWRB). RWRB is based on random walk with restart. Its main contribution is the incorporation of co-complex relations in TAP-MS PPI networks into the clustering process by means of a new restarting strategy during the random walk. Through experiments on unweighted and weighted TAP-MS data sets, we validated the biological significance of our results by mapping them to manually curated complexes. The results showed that incorporating non-binary co-membership information yields a significant improvement in terms of both statistical measurements and biological relevance. The improved accuracy indicates that the proposed method outperforms several state-of-the-art clustering algorithms for the detection of protein complexes in TAP-MS data.
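The abstract names random walk with restart as the basis of RWRB but does not spell out the restarting rule. As a rough illustration only, the sketch below runs a generic random walk with restart on a toy bait/prey network and compares a restart vector concentrated on a seed prey with one that shares restart mass with the bait that purified it. The function name, the toy network, the restart probability `r`, and the 50/50 split between seed and bait are all assumptions made for this example; they are not taken from the paper and should not be read as the RWRB restart strategy itself.

```python
def random_walk_with_restart(adj, restart, r=0.5, tol=1e-12, max_iter=10000):
    """Generic random walk with restart (hypothetical helper, not the paper's code).

    adj:     dict mapping node -> {neighbour: edge weight}; assumed symmetric,
             with every node present as a key and at least one neighbour each.
    restart: dict mapping node -> restart probability; values should sum to 1.
    r:       probability of jumping back to the restart distribution per step.
    Returns a dict node -> stationary visiting probability.
    """
    nodes = list(adj)
    out_weight = {u: sum(adj[u].values()) for u in nodes}
    p = {u: restart.get(u, 0.0) for u in nodes}
    for _ in range(max_iter):
        # Restart term: a fraction r of the mass returns to the restart nodes.
        nxt = {u: r * restart.get(u, 0.0) for u in nodes}
        # Walk term: the remaining mass moves along edges in proportion to weight.
        for u in nodes:
            for v, w in adj[u].items():
                nxt[v] += (1.0 - r) * p[u] * w / out_weight[u]
        diff = sum(abs(nxt[u] - p[u]) for u in nodes)
        p = nxt
        if diff < tol:
            break
    return p


# Toy TAP-MS style network (invented): baits B1 and B2, preys P1, P2, P3.
adj = {
    "B1": {"P1": 1.0, "P2": 1.0},
    "B2": {"P2": 1.0, "P3": 1.0},
    "P1": {"B1": 1.0},
    "P2": {"B1": 1.0, "B2": 1.0},
    "P3": {"B2": 1.0},
}

# Plain restart: the walker always returns to the seed protein P1.
plain = random_walk_with_restart(adj, {"P1": 1.0})

# Bait-biased restart (assumption): restart mass is shared with the bait B1.
biased = random_walk_with_restart(adj, {"P1": 0.5, "B1": 0.5})

for node in adj:
    print(node, round(plain[node], 4), round(biased[node], 4))
```

In this toy setting, moving restart mass onto the bait raises the bait's stationary probability relative to the plain walk, which is one simple way bait/prey information could shape the affinity scores that a clustering step would then use.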
Keywords
bioinformatics; data mining; mass spectroscopic chemical analysis; molecular biophysics; molecular configurations; pattern clustering; proteins; purification; random processes; statistical analysis; two-dimensional electron gas; TAP-MS data; biased random walk based algorithm; binary relations; clustering process; cocomplex relations; data mining; life science; manually curated complexes; nonbinary comembership information; nonbinary information; protein complexes identification; protein-protein interaction networks; restarting baits; state-of-the-art clustering algorithms; statistical measurements; systematic identification; tandem affinity purification-mass spectrometry data; Algorithm design and analysis; Clustering algorithms; Prediction algorithms; Proteins; Sensitivity; Tuning; Vectors; Protein-protein interaction network; protein complexes; random walk with restart; tandem affinity purification/mass spectrometry;
fLanguage
English
Journal_Title
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
Publisher
IEEE
ISSN
1545-5963
Type
jour
DOI
10.1109/TCBB.2014.2352616
Filename
6894198