  • DocumentCode
    33149
  • Title
    Clustering based on multiple biological information: approach for predicting protein complexes
  • Author
    Xiwei Tang ; Qilong Feng ; Jianxin Wang ; Yiming He ; Yi Pan
  • Author_Institution
    Sch. of Inf. Sci. & Eng., Central South Univ., Changsha, China
  • Volume
    7
  • Issue
    5
  • fYear
    2013
  • fDate
    Oct-13
  • Firstpage
    223
  • Lastpage
    230
  • Abstract
    Protein complexes are a cornerstone of many biological processes. Protein-protein interaction (PPI) data enable a number of computational methods for predicting protein complexes. However, the insufficiency of the PPI data significantly lowers the accuracy of computational methods. In the current work, the authors develop a novel method named clustering based on multiple biological information (CMBI) to discover protein complexes via the integration of multiple biological resources including gene expression profiles, essential protein information and PPI data. First, CMBI defines the functional similarity of each pair of interacting proteins based on the edge-clustering coefficient and the Pearson correlation coefficient. Second, CMBI selects essential proteins as seeds to build the protein complexes. A redundancy-filtering procedure is performed to eliminate redundant complexes. In addition to the essential proteins, CMBI also uses other proteins as seeds to expand protein complexes. To check the performance of CMBI, the authors compare the complexes discovered by CMBI with the ones found by other techniques by matching the predicted complexes against the reference complexes. The authors subsequently use GO::TermFinder to analyse the complexes predicted by various methods. Finally, the effect of parameters T and R is investigated. The results from GO functional enrichment and matching analyses show that CMBI performs significantly better than the state-of-the-art methods.
  • Keywords
    bioinformatics; correlation methods; filtering theory; genetics; genomics; molecular biophysics; pattern clustering; proteins; GO functional enrichment; GO-TermFinder; Pearson correlation coefficient; biological processes; computational methods; edge-clustering coefficient; gene expression profiles; matching analysis; multiple biological information clustering; multiple biological resources; protein complexes; protein-protein interaction; redundancy-filtering procedure
  • fLanguage
    English
  • Journal_Title
    Systems Biology, IET
  • Publisher
    IET
  • ISSN
    1751-8849
  • Type
    jour
  • DOI
    10.1049/iet-syb.2012.0052
  • Filename
    6616083
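
The abstract names two quantities behind the functional similarity of an interacting protein pair: the edge-clustering coefficient of the PPI edge and the Pearson correlation coefficient of the two gene expression profiles. The following is a minimal sketch of how each could be computed, not the authors' implementation. It assumes the commonly used edge-clustering definition (shared neighbours divided by the smaller of the two degrees minus one); the abstract does not state how CMBI combines the two values, so they are returned separately. All function names, the toy network and the expression values are hypothetical.

```python
from math import sqrt


def build_adjacency(edges):
    """Build an undirected adjacency map {protein: set(neighbours)}."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def edge_clustering_coefficient(adj, u, v):
    """Assumed definition: shared neighbours / min(deg(u) - 1, deg(v) - 1)."""
    shared = len(adj[u] & adj[v])
    denom = min(len(adj[u]) - 1, len(adj[v]) - 1)
    # An edge whose endpoint has degree 1 cannot lie in any triangle.
    return shared / denom if denom > 0 else 0.0


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    # A constant profile has no defined correlation; return 0.0 by convention here.
    return cov / (sx * sy) if sx > 0 and sy > 0 else 0.0


# Hypothetical toy data: four proteins, three expression time points each.
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D")]
expression = {
    "A": [1.0, 2.0, 3.0],
    "B": [2.0, 4.0, 6.0],
    "C": [3.0, 2.0, 1.0],
    "D": [1.0, 1.0, 2.0],
}

adj = build_adjacency(edges)
ecc_ab = edge_clustering_coefficient(adj, "A", "B")
pcc_ab = pearson(expression["A"], expression["B"])
ecc_bd = edge_clustering_coefficient(adj, "B", "D")
pcc_ac = pearson(expression["A"], expression["C"])
```

In this toy network the edge A-B sits in one triangle (with C), while B-D sits in none because D has a single neighbour; the profiles of A and B rise together, whereas those of A and C move in opposite directions.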