Title :
An HMM-based hierarchical clustering method for gene expression time series data
Author :
Zhao, Guoqing ; Deng, Wei
Author_Institution :
Sch. of Comput. Sci. & Technol., Soochow Univ., Suzhou, China
Abstract :
Using DNA microarray technology, biologists obtain large volumes of gene expression time series data. Clustering is an important approach to extracting biological information from these data. This paper proposes a novel clustering method, HMM-based hierarchical clustering (HMM-HC), for analyzing gene expression time series data. We convert time-point data to discrete symbols, based on the fact that the logarithm of the data is approximately normally distributed, and build hidden Markov models from these symbols for the gene sequences. In a gene expression time series, the value at each time point is correlated with the values at other time points, and HMMs help exploit this correlation. We tested the method on two commonly used datasets. The results show that it produces high-quality clusters and identifies an appropriate number of clusters.
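The symbol-conversion step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual discretization rule, so everything specific here is an assumption made for illustration: the function name `discretize`, the per-gene z-scoring of log2 values, the three-symbol alphabet (0 = low, 1 = middle, 2 = high), and the cut points at -0.5 and 0.5 standard deviations. It covers only the conversion to symbols, not the HMM training or the hierarchical clustering.

```python
import math
import statistics


def discretize(series, thresholds=(-0.5, 0.5)):
    """Map positive expression values to integer symbols.

    Hypothetical scheme (not taken from the paper): take log2 of each
    value, standardize within the series, and count how many thresholds
    each z-score exceeds. With two thresholds this gives symbols 0, 1, 2.
    """
    logs = [math.log2(v) for v in series]  # values must be > 0
    mu = statistics.mean(logs)
    sigma = statistics.pstdev(logs)

    symbols = []
    for x in logs:
        # A constant series has no spread; treat every point as "middle".
        z = 0.0 if sigma == 0 else (x - mu) / sigma
        symbols.append(sum(z > t for t in thresholds))
    return symbols


if __name__ == "__main__":
    # A steadily rising profile becomes a low-to-high symbol sequence.
    print(discretize([1, 2, 4, 8, 16]))
```

Sequences of this kind could then serve as observation sequences for a discrete-emission HMM. The choice of alphabet size and cut points trades resolution against the number of emission parameters that must be estimated from short time series.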
Keywords :
biology computing; data handling; hidden Markov models; information retrieval; lab-on-a-chip; pattern clustering; time series; DNA microarray technology; HMM; biological information extraction; discrete symbols; gene expression; gene sequence; hidden Markov model; hierarchical clustering; time series data; Biological system modeling; Computational modeling; Hidden Markov models; Gene expression time series data; Hidden Markov Model; Hierarchical Clustering;
Conference_Title :
Bio-Inspired Computing: Theories and Applications (BIC-TA), 2010 IEEE Fifth International Conference on
Conference_Location :
Changsha
Print_ISBN :
978-1-4244-6437-1
DOI :
10.1109/BICTA.2010.5645327