Title :
Biologically Supervised Hierarchical Clustering Algorithms for Gene Expression Data
Author :
Boratyn, G.M. ; Datta, Soupayan ; Datta, Soupayan
Author_Institution :
Clinical Proteomics Center, Louisville Univ., KY
fDate :
Aug. 30 2006-Sept. 3 2006
Abstract :
Cluster analysis has become a standard part of gene expression analysis. In this paper, we propose a novel semi-supervised approach that offers the same flexibility as hierarchical clustering, yet utilizes, along with the experimental gene expression data, common biological information about different genes that is being compiled at various public, Web-accessible databases. We argue that such an approach is inherently superior to the standard unsupervised approach of grouping genes based on expression data alone. It is shown that our biologically supervised methods produce better clustering results than the corresponding unsupervised methods, as judged by the distance from the model temporal profiles. R code for the clustering algorithm is available from the authors upon request.
Keywords :
biology computing; data analysis; database management systems; genetics; learning (artificial intelligence); pattern clustering; Web accessible databases; biological information; cluster analysis; gene expression data analysis; public database; semisupervised hierarchical clustering algorithms; training set; Algorithm design and analysis; Biological system modeling; Biology computing; Cities and towns; Clustering algorithms; Data analysis; Databases; Diseases; Gene expression; Measurement standards;
Conference_Title :
Engineering in Medicine and Biology Society, 2006. EMBS '06. 28th Annual International Conference of the IEEE
Conference_Location :
New York, NY
Print_ISBN :
1-4244-0032-5
Electronic_ISBN :
1557-170X
DOI :
10.1109/IEMBS.2006.260308