• DocumentCode
    1521263
  • Title

    Functional Neighbors: Inferring Relationships between Nonhomologous Protein Families Using Family-Specific Packing Motifs

  • Author

    Bandyopadhyay, Deepak ; Huan, Jun ; Liu, Jinze ; Prins, Jan ; Snoeyink, Jack ; Wang, Wei ; Tropsha, Alexander

  • Author_Institution
    Dept. of Comput. & Struct. Chem., GlaxoSmithKline, Collegeville, PA, USA
  • Volume
    14
  • Issue
    5
  • fYear
    2010
  • Firstpage
    1137
  • Lastpage
    1143
  • Abstract
    We describe a new approach for inferring the functional relationships between nonhomologous protein families by looking at statistical enrichment of alternative function predictions in classification hierarchies such as Gene Ontology (GO) and Structural Classification of Proteins (SCOP). Protein structures are represented by robust graph representations, and the fast frequent subgraph mining algorithm is applied to protein families to generate sets of family-specific packing motifs, i.e., amino acid residue-packing patterns shared by most family members but infrequent in other proteins. The function of a protein is inferred by identifying in it motifs characteristic of a known family. We employ these family-specific motifs to elucidate functional relationships between families in the GO and SCOP hierarchies. Specifically, we postulate that two families are functionally related if one family is statistically enriched by motifs characteristic of another family, i.e., if the number of proteins in a family containing a motif from another family is greater than expected by chance. This function-inference method can help annotate proteins of unknown function, establish functional neighbors of existing families, and help specify alternate functions for known proteins.
  • Keywords
    bioinformatics; genetics; genomics; inference mechanisms; molecular biophysics; proteins; GO; SCOP; amino acid residue-packing patterns; function-inference method; gene ontology; nonhomologous protein families; structural classification of proteins; subgraph mining algorithm; Delaunay tessellation; Gene Ontology (GO); Structural Classification of Proteins (SCOP); enrichment evaluation; frequent subgraph mining; functional neighbors; protein structure; remote homology; Algorithms; Computational Biology; Data Mining; Genomics; Models, Molecular; NADP; Nuclear Proteins; Phosphoprotein Phosphatases; Protein Conformation; Protein Interaction Domains and Motifs; Proteins;
  • fLanguage
    English
  • Journal_Title
    Information Technology in Biomedicine, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1089-7771
  • Type

    jour

  • DOI
    10.1109/TITB.2010.2053550
  • Filename
    5491173
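
The enrichment criterion in the abstract (a family is considered enriched when more of its proteins contain a motif from another family than expected by chance) can be illustrated with a small sketch. The abstract does not say which statistical test the authors use, so the hypergeometric upper-tail test below is an assumption chosen for illustration. The function name `enrichment_p_value` and its parameters are likewise hypothetical, and the counts in the usage lines are made-up numbers, not results from the paper.

```python
from math import comb


def enrichment_p_value(hits_in_family, family_size, hits_in_background, background_size):
    """Upper-tail hypergeometric p-value (an assumed test, not confirmed by the source).

    background_size    : total number of proteins considered, family included
    hits_in_background : proteins in that total containing a motif of the other family
    family_size        : number of proteins in the family being tested
    hits_in_family     : family members containing such a motif

    Returns P(X >= hits_in_family) when family_size proteins are drawn at random,
    without replacement, from the background.
    """
    if not (0 <= family_size <= background_size):
        raise ValueError("family_size must lie between 0 and background_size")
    if not (0 <= hits_in_background <= background_size):
        raise ValueError("hits_in_background must lie between 0 and background_size")
    max_hits = min(family_size, hits_in_background)
    if not (0 <= hits_in_family <= max_hits):
        raise ValueError("hits_in_family is inconsistent with the other counts")

    # Number of ways to draw family_size proteins from the background.
    total = comb(background_size, family_size)
    # Sum the probability mass for every outcome at least as extreme as observed.
    tail = sum(
        comb(hits_in_background, i)
        * comb(background_size - hits_in_background, family_size - i)
        for i in range(hits_in_family, max_hits + 1)
    )
    return tail / total


# Illustrative, made-up counts: 12 of 40 family members carry the motif,
# against 60 carriers among 2000 proteins overall.
p = enrichment_p_value(12, 40, 60, 2000)
print(f"p-value = {p:.3e}")
```

A small p-value under this sketch would flag the two families as candidate functional neighbors. In practice a correction for testing many family pairs would also be needed, which is omitted here.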