DocumentCode :
3673185
Title :
Association rule mining of gene ontology annotation terms for SGD
Author :
Anurag Nagar;Michael Hahsler;Hisham Al-Mubaid
Author_Institution :
Department of Computer Science, University of Houston - Clear Lake, Houston, TX 77058
fYear :
2015
Firstpage :
1
Lastpage :
7
Abstract :
Gene Ontology is one of the largest bioinformatics projects; it seeks to consolidate knowledge about genes by annotating them with terms from three ontologies. In this work, we present a technique to find association relationships among the annotation terms for the Saccharomyces cerevisiae genome, as curated in the Saccharomyces Genome Database (SGD). We first present a normalization algorithm to ensure that the annotation terms have a similar level of specificity. Association rule mining algorithms are then used to find significant and non-trivial association rules in these normalized datasets. Metrics such as support, confidence, and lift can be used to evaluate the strength of the rules found. We conducted experiments on the entire SGD annotation dataset, and here we present the 10 strongest rules for each of the three ontologies. We verify these rules using evidence from the biomedical literature. The presented method has a number of advantages: it relies only on the structure of the Gene Ontology, has minimal memory and storage requirements, and scales easily to large genomes, such as the human genome. The technique has many applications, such as predicting GO annotations for new genes or for genes that have not been studied extensively.
Keywords :
"Ontologies","Yttrium","Association rules","Proteins","Bioinformatics","Itemsets","Genomics"
Publisher :
IEEE
Conference_Title :
Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), 2015 IEEE Conference on
Type :
conf
DOI :
10.1109/CIBCB.2015.7300289
Filename :
7300289