Title :
A New Path Based Hybrid Measure for Gene Ontology Similarity
Author :
Bandyopadhyay, Sanghamitra ; Mallick, Koushik
Author_Institution :
Machine Intell. Unit, Indian Stat. Inst., Kolkata, India
Abstract :
Gene Ontology (GO) is a controlled vocabulary of terms that annotate genes and gene products, structured as a directed acyclic graph. In the graph, semantic relations connect the terms, which represent knowledge about the functional description and cellular component information of gene products. GO similarity gives a numerical representation of the biological relationship among the genes in a set, which can be used to infer biological facts such as protein interaction, structural similarity, and gene clustering. Here we introduce a new shortest-path-based hybrid measure of ontological similarity between two terms that combines the structure of the GO graph with the information content of the terms. The similarity between two terms t1 and t2, referred to as GOSimPBHM(t1,t2), has two components: one obtained from the common ancestors of t1 and t2, and the other from their remaining ancestors. The proposed path-based hybrid measure does not suffer from the well-known shallow annotation problem. Its superiority over some other popular measures is established for protein-protein interaction prediction, correlation with gene expression, and functional classification of genes in a biological pathway. Finally, the proposed measure is used to compute the average GO similarity score among the genes that are experimentally validated targets of some microRNAs (miRNAs). Results demonstrate that the targets of a given miRNA have a high degree of similarity in the biological process category of GO.
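The two ingredients the abstract names, the split of ancestors into common and remaining sets and the information content of a term, can be illustrated with a minimal sketch. This is not the GOSimPBHM formula, which the abstract does not state. The toy graph, the term names, the annotation counts, and the helper functions below are all hypothetical, and information content is assumed to be the commonly used negative log of a term's annotation frequency.

```python
import math

# Hypothetical toy DAG (not real GO terms): each term maps to its parents.
PARENTS = {
    "root": set(),
    "a": {"root"},
    "b": {"root"},
    "c": {"a"},
    "d": {"a", "b"},
}

# Hypothetical annotation counts; each count is assumed to already
# include the annotations of the term's descendants.
ANNOTATION_COUNTS = {"root": 100, "a": 60, "b": 50, "c": 20, "d": 10}


def ancestors(term, parents=PARENTS):
    """Return every ancestor of `term`, including the term itself."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def split_ancestors(t1, t2, parents=PARENTS):
    """Split the ancestors of t1 and t2 into (common, remaining) sets."""
    a1, a2 = ancestors(t1, parents), ancestors(t2, parents)
    common = a1 & a2
    remaining = (a1 | a2) - common
    return common, remaining


def information_content(term, counts=ANNOTATION_COUNTS, root="root"):
    """Assumed definition: IC(t) = -log(count(t) / count(root))."""
    return -math.log(counts[term] / counts[root])


common, remaining = split_ancestors("c", "d")
print(sorted(common), sorted(remaining))
print(round(information_content("c"), 3))
```

In this toy graph, terms c and d share the ancestors a and root, while b, c, and d fall into the remaining set. A hybrid measure of the kind described would then combine path lengths and information content over these two sets.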
Keywords :
RNA; cellular biophysics; genetics; molecular biophysics; numerical analysis; ontologies (artificial intelligence); proteins; vocabulary; GO graph; ancestors; biological pathway; biological process; cellular component; controlled vocabulary; directed acyclic graph; functional classification; gene clustering; gene expression; gene ontology similarity; gene product; gene products; miRNA; microRNA; numerical representation; protein structural similarity; protein-protein interaction; semantic relations; Correlation; Gene expression; Integrated circuits; Ontologies; Proteins; Semantics; Gene ontology similarity; functional classification of genes; information content; microRNA; protein interaction prediction; semantic similarity; term similarity;
Journal_Title :
Computational Biology and Bioinformatics, IEEE/ACM Transactions on
DOI :
10.1109/TCBB.2013.149