  • DocumentCode
    2370515
  • Title
    Domain content based protein function prediction using incomplete GO annotation information

  • Author
    Tan, Lirong; Yu, Zhiwen; Wong, Hau-San
  • Author_Institution
    Dept. of Comput. Sci., City Univ. of Hong Kong, Hong Kong, China
  • fYear
    2009
  • fDate
    1-4 Nov. 2009
  • Firstpage
    50
  • Lastpage
    55
  • Abstract
    Given the essential role of proteins in life processes, the computational assignment of protein functions has become one of the most important tasks in bioinformatics. While the Gene Ontology (GO) is widely used for functional annotation, new approaches that address annotation incompleteness while leveraging the GO framework are urgently needed. In this paper, two new models are proposed to predict GO terms from domain content: a Correlation Coefficient based model (CC-M) and a Support Vector Machine (SVM) based model (SVM-M). We have developed our models in the form of predictors for all GO terms with manually curated annotations. Compared with the previously published Bayesian probabilistic approach [Forslund et al., 2008], our methods are shown to handle incomplete training data better. In particular, the CC-M method is suitable for GO terms with extremely low occurrence frequency, and the SVM-M method for the remaining GO terms. CC-M and SVM-M are therefore integrated into a single model (CC-SVM) that combines their respective advantages.
  • Keywords
    Bayes methods; bioinformatics; ontologies (artificial intelligence); proteins; support vector machines; Bayesian probabilistic approach; GO annotation information; correlation coefficient-based model; domain content-based protein function prediction; gene ontology; support vector machine; Bayesian methods; Biology computing; Computer science; Databases; Ontologies; Predictive models; Protein engineering; Training data; Domain; GO term; Protein function prediction
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Bioinformatics and Biomedicine Workshop, 2009. BIBMW 2009. IEEE International Conference on
  • Conference_Location
    Washington, DC
  • Print_ISBN
    978-1-4244-5121-0
  • Type
    conf

  • DOI
    10.1109/BIBMW.2009.5332136
  • Filename
    5332136