• DocumentCode
    60018
  • Title
    Identification of Protein Complexes from Tandem Affinity Purification/Mass Spectrometry Data via Biased Random Walk
  • Author
    Bingjing Cai ; Haiying Wang ; Huiru Zheng ; Hui Wang
  • Author_Institution
    Sch. of Comput. & Math., Univ. of Ulster, Newtownabbey, UK
  • Volume
    12
  • Issue
    2
  • fYear
    2015
  • fDate
    March-April 2015
  • Firstpage
    455
  • Lastpage
    466
  • Abstract
    Systematic identification of protein complexes from protein-protein interaction (PPI) networks is an important application of data mining in life science. Over the past decades, various clustering techniques have been developed that model PPIs as binary relations. The non-binary information on co-complex relations (prey/bait) in PPI data derived from tandem affinity purification/mass spectrometry (TAP-MS) experiments has been unfairly disregarded. In this paper, we propose a biased random walk based algorithm for detecting protein complexes from TAP-MS data, called random walk with restarting baits (RWRB). RWRB builds on random walk with restart. Its main contribution is the incorporation of co-complex relations in TAP-MS PPI networks into the clustering process, by implementing a new restarting strategy during the random walk. Through experiments on unweighted and weighted TAP-MS data sets, we validated the biological significance of our results by mapping them to manually curated complexes. Results showed that incorporating non-binary, co-membership information yields a significant improvement in both statistical measurements and biological relevance. The improved accuracy demonstrates that the proposed method outperformed several state-of-the-art clustering algorithms for the detection of protein complexes in TAP-MS data.
  • Keywords
    bioinformatics; data mining; mass spectroscopic chemical analysis; molecular biophysics; molecular configurations; pattern clustering; proteins; purification; random processes; statistical analysis; two-dimensional electron gas; TAP-MS data; biased random walk based algorithm; binary relations; clustering process; cocomplex relations; data mining; life science; manually curated complexes; nonbinary comembership information; nonbinary information; protein complexes identification; protein-protein interaction networks; restarting baits; state-of-the-art clustering algorithms; statistical measurements; systematic identification; tandem affinity purification-mass spectrometry data; Algorithm design and analysis; Clustering algorithms; Prediction algorithms; Proteins; Sensitivity; Tuning; Vectors; Protein-protein interaction network; protein complexes; random walk with restart; tandem affinity purification/mass spectrometry;
  • fLanguage
    English
  • Journal_Title
    Computational Biology and Bioinformatics, IEEE/ACM Transactions on
  • Publisher
    IEEE
  • ISSN
    1545-5963
  • Type
    jour
  • DOI
    10.1109/TCBB.2014.2352616
  • Filename
    6894198