A Center Closeness Algorithm for the Analyses of Gene Expression Data

Author

Wang, Huakun ; Feng, Lixin ; Ying, Zhou ; Xu, Zhang ; Wang, Zhenzhen

Author_Institution

Sch. of Math. Sci., Heilongjiang Univ., Harbin, China

fYear

2011

fDate

10-12 May 2011

Firstpage

1

Lastpage

2

Abstract

Clustering is an important computational tool for identifying sets of genes with similar expression profiles. Various clustering methods have been proposed for the analysis of gene expression data. Among them, k-means is widely used because its simplicity and computational speed allow it to run on large datasets. However, k-means requires the number of clusters to be specified before clustering, and this choice greatly influences the clustering results. This paper proposes a novel center closeness clustering algorithm that automatically determines the number of clusters based on the distances between data points. We used the proposed algorithm to cluster two gene expression datasets and compared the clustering results with those obtained by k-means. The cluster validity indices indicated that our algorithm outperformed k-means.

Keywords

biology computing; data analysis; genetic algorithms; genetics; pattern clustering; very large databases; center closeness algorithm; center closeness clustering algorithm; cluster number; cluster validity indices; clustering methods; clustering results; computational speed; computational tool; data points; gene expression data; gene sets; k-means; large datasets; Algorithm design and analysis; Bioinformatics; Classification algorithms; Clustering algorithms; Gene expression; Indexes; Tumors;

fLanguage

English

Publisher

IEEE

Conference_Titel

Bioinformatics and Biomedical Engineering, (iCBBE) 2011 5th International Conference on

Conference_Location

Wuhan

ISSN

2151-7614

Print_ISBN

978-1-4244-5088-6

Type

conf

DOI

10.1109/icbbe.2011.5779974

Filename

5779974