DocumentCode
2844997
Title
Using path length measure for gene clustering based on similarity of annotation terms
Author
Nagar, Anurag ; Al-Mubaid, Hisham
Author_Institution
Univ. of Houston-Clear Lake, Houston, TX
fYear
2008
fDate
6-9 July 2008
Firstpage
637
Lastpage
642
Abstract
In recent years, the application of semantic similarity measures to gene data using the Gene Ontology (GO) and gene annotation information has become more widely used and accepted in bioinformatics. The purposes of this application range from measuring gene similarity to gene clustering. In this paper, we investigate a simple measure of gene similarity that relies on the path length between the GO annotation terms of genes to determine the similarity between them. The similarity values computed by the proposed measure for a set of genes are then used to cluster the genes. In the evaluation, we compared the proposed measure with two widely used information-theoretic similarity measures, Resnik and Lin, using three gene datasets. The experimental results and the analysis of the clusters validated the effectiveness of the proposed path length measure.
Keywords
biology computing; data analysis; genetics; information theory; pattern clustering; annotation terms similarity; bioinformatics; gene annotation information; gene clustering; gene data; gene ontology; information-theoretic similarity measures; path length measure; semantic similarity measures; Bioinformatics; Biology computing; Biomedical measurements; Clustering algorithms; Clustering methods; Lakes; Length measurement; Ontologies; Proteins; Time measurement; Gene clustering; gene similarity
fLanguage
English
Publisher
IEEE
Conference_Titel
Computers and Communications, 2008. ISCC 2008. IEEE Symposium on
Conference_Location
Marrakech
ISSN
1530-1346
Print_ISBN
978-1-4244-2702-4
Electronic_ISBN
1530-1346
Type
conf
DOI
10.1109/ISCC.2008.4625765
Filename
4625765