• DocumentCode
    875882
  • Title
    Combining Multisource Information Through Functional-Annotation-Based Weighting: Gene Function Prediction in Yeast
  • Author
    Ray, Shubhra Sankar ; Bandyopadhyay, Sanghamitra ; Pal, Sankar K.
  • Author_Institution
    Center for Soft Comput. Res., Indian Stat. Inst., Kolkata
  • Volume
    56
  • Issue
    2
  • fYear
    2009
  • Firstpage
    229
  • Lastpage
    236
  • Abstract
    Motivation: One of the important goals of biological investigation is to predict the function of unclassified genes. Although there is a rich literature on integrating multiple data sources for gene function prediction, there is hardly any comparable work on weighting data sources using functional annotations of classified genes. In this investigation, we propose a new scoring framework, called the biological score (BS), that incorporates data source weighting, for predicting the function of some of the unclassified yeast genes.
    Methods: The BS is computed by first evaluating the similarities between genes, arising from different data sources, in a common framework, and then integrating them as a weighted linear combination. The relative weight of each data source is determined adaptively by utilizing the information on yeast gene ontology (GO)-slim process annotations of classified genes, available from the Saccharomyces Genome Database (SGD). Genes are clustered by a method called K-BS, where, for each gene, a cluster comprising that gene and its K nearest neighbors is computed using the proposed score (BS). The performance of BS and K-BS is evaluated with gene annotations available from the Munich Information Center for Protein Sequences (MIPS).
    Results: We predict the functional categories of 417 classified genes from 417 clusters with a positive predictive value of 0.98 using K-BS. The functional categories of 12 unclassified yeast genes are also predicted.
    Conclusion: Our experimental results indicate that considering multiple data sources and estimating their weights with annotations of classified genes can considerably enhance the performance of BS. It has been found that even a small proportion of annotated genes can provide improvements in finding true positive gene pairs using BS.
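The two steps named in the abstract (a weighted linear combination of per-source similarities, followed by one cluster per gene built from its K nearest neighbors) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `biological_score` and `k_bs_clusters`, the normalization of weights to sum to 1, and the toy similarity matrices are all assumptions made for illustration, and the adaptive estimation of weights from GO-slim annotations is not reproduced here (weights are simply passed in).

```python
import numpy as np


def biological_score(similarities, weights):
    """Combine per-source gene-gene similarity matrices linearly.

    similarities: list of (n_genes, n_genes) arrays, one per data source.
    weights: one non-negative weight per source (hypothetical values;
    the paper derives them from annotations, which is not shown here).
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()      # assumption: weights sum to 1
    stacked = np.stack(similarities)       # shape (n_sources, n, n)
    return np.tensordot(weights, stacked, axes=1)


def k_bs_clusters(score, k):
    """For each gene, return [gene, its k highest-scoring other genes]."""
    clusters = []
    for gene in range(score.shape[0]):
        row = score[gene].astype(float).copy()
        row[gene] = -np.inf                # exclude the gene itself
        neighbours = np.argsort(-row, kind="stable")[:k]
        clusters.append([gene] + neighbours.tolist())
    return clusters


# Toy data: two hypothetical data sources over three genes.
sim_a = np.array([[1.0, 0.9, 0.1],
                  [0.9, 1.0, 0.2],
                  [0.1, 0.2, 1.0]])
sim_b = np.array([[1.0, 0.1, 0.8],
                  [0.1, 1.0, 0.3],
                  [0.8, 0.3, 1.0]])

score = biological_score([sim_a, sim_b], weights=[3, 1])
clusters = k_bs_clusters(score, k=1)
```

With these toy inputs, the first source dominates the combined score, so gene 0 and gene 1 pick each other as nearest neighbors; changing the weights changes which gene pairs end up clustered together, which is the effect the weighting is meant to control.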
  • Keywords
    bioinformatics; genetics; Saccharomyces Genome Database; combinatorial optimization; combining multisource information; functional-annotation-based weighting; gene expression; gene function prediction; phenotypic profile; protein sequence; transitive homology; yeast gene ontology-slim process annotations; Bayesian methods; Biology computing; Clustering algorithms; Databases; Fungi; Genomics; Iron; Nearest neighbor searches; Ontologies; Postal services; Proteins; Throughput; Cluster Analysis; Computational Biology; Databases, Genetic; Gene Expression Profiling; Genes, Fungal; Models, Genetic; Oligonucleotide Array Sequence Analysis; Protein Interaction Mapping; Reproducibility of Results; Saccharomyces cerevisiae; Saccharomyces cerevisiae Proteins; Sequence Analysis, Protein
  • fLanguage
    English
  • Journal_Title
    Biomedical Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9294
  • Type
    jour
  • DOI
    10.1109/TBME.2008.2005955
  • Filename
    4636709