Title :
Determining the Semantic Similarities Among Gene Ontology Terms
Author_Institution :
Dept. of Electr. & Comput. Eng., Khalifa Univ. of Sci., Abu Dhabi, United Arab Emirates
Abstract :
We present novel techniques for determining the semantic relationships among Gene Ontology (GO) terms. We implemented these techniques in a prototype system called GoSE, which resides between the user application and the GO database. Given a set S of GO terms, GoSE returns another set S´ of GO terms, where each term in S´ is semantically related to each term in S. Most current research focuses on determining the semantic similarities among GO terms based solely on their IDs and their proximity to one another in the GO graph structure, while overlooking the contexts of the terms, which may lead to erroneous results. The context of a GO term T is the set of other terms whose existence in the GO graph structure depends on T. We propose novel techniques that determine the contexts of terms based on the concept of existence dependency. We present a stack-based sort-merge algorithm that employs these techniques to determine the semantic similarities among GO terms. We evaluated GoSE experimentally and compared it with three existing methods. In measuring the semantic similarities among genes in KEGG and Pfam pathways, retrieved from the DBGET and Sanger Pfam databases, respectively, our method outperformed the other three methods in recall and precision.
Keywords :
biology computing; genetics; graph theory; ontologies (artificial intelligence); DBGET databases; GoSE; KEGG pathways; Sanger Pfam databases; gene ontology database; gene ontology graph structure; gene ontology terms; semantic similarity; stack-based sort-merge algorithm; Color; Context; Databases; Ontologies; Proteins; Semantics; Gene ontology (GO); related terms;
Journal_Title :
Biomedical and Health Informatics, IEEE Journal of
DOI :
10.1109/JBHI.2013.2248742