Title :
Poster: Issues in functional characterization and clustering of genes by literature mining
Author :
Dasigi, V. ; Karam, Orlando ; Pydimarri, Sailaja
Abstract :
This paper studies the issues involved in characterizing the function of genes that take part in the life cycle of budding yeast, and in clustering those genes by their potential functional similarities. Clustering results for these genes have already been reported, which gives us a basis for comparison. The clustering task was done in two steps. First, keywords for all genes of interest were extracted automatically from a subset of the MEDLINE database using TF-IDF and Z-scores. Second, the classic K-means algorithm was used to group the genes into clusters based on these keyword features.
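The two-step procedure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how TF-IDF and Z-scores are combined, so the sketch assumes that each term's TF-IDF weight is standardized across genes. The gene names, token lists, and the helper names `tfidf`, `zscores`, and `kmeans` are hypothetical, and the K-means routine is a plain Lloyd's iteration seeded with the first k vectors.

```python
import math
from collections import Counter

def tfidf(docs):
    """docs maps gene -> list of tokens; returns gene -> {term: TF-IDF weight}."""
    n = len(docs)
    # Document frequency: number of genes whose text contains the term.
    df = Counter(t for toks in docs.values() for t in set(toks))
    return {
        gene: {t: (c / len(toks)) * math.log(n / df[t])
               for t, c in Counter(toks).items()}
        for gene, toks in docs.items()
    }

def zscores(weights):
    """Assumed step: standardize each term's weight across genes."""
    genes = list(weights)
    terms = sorted({t for w in weights.values() for t in w})
    vectors = {g: [] for g in genes}
    for t in terms:
        col = [weights[g].get(t, 0.0) for g in genes]
        mean = sum(col) / len(col)
        std = math.sqrt(sum((x - mean) ** 2 for x in col) / len(col))
        for g, x in zip(genes, col):
            vectors[g].append((x - mean) / std if std > 0 else 0.0)
    return vectors, terms

def kmeans(vectors, k, iters=20):
    """Lloyd's algorithm; the first k vectors serve as initial centroids."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])))
            for v in vectors
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical toy corpus: tokens taken from abstracts that mention each gene.
docs = {
    "gene1": ["cyclin", "kinase", "cell", "cycle"],
    "gene2": ["histone", "chromatin", "dna", "cell"],
    "gene3": ["cyclin", "kinase", "mitosis", "cell"],
    "gene4": ["histone", "chromatin", "dna", "replication"],
}

vectors, terms = zscores(tfidf(docs))
labels = kmeans([vectors[g] for g in docs], k=2)
print(dict(zip(docs, labels)))
```

On this toy input the genes that share "cyclin" and "kinase" fall into one cluster and the genes that share "histone", "chromatin", and "dna" fall into the other. A real run would also need tokenization, stop-word removal, and a choice of k, none of which the abstract specifies.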
Keywords :
biology computing; cellular biophysics; data mining; genetics; microorganisms; MEDLINE database; TF-IDF; Z-scores; budding yeast; classic K-means algorithm; functional characterization; gene clustering; keyword extraction; life cycle; literature mining; Abstracts; Clustering algorithms; Context; Databases; Feature extraction; Libraries; Venus;
Conference_Title :
Computational Advances in Bio and Medical Sciences (ICCABS), 2011 IEEE 1st International Conference on
Conference_Location :
Orlando, FL
Print_ISBN :
978-1-61284-851-8
DOI :
10.1109/ICCABS.2011.5729893