Title :
Clustering of SNPs by a Structural EM Algorithm
Author :
Zhang, Yulong ; Ji, Liang
Author_Institution :
Dept. of Autom., Tsinghua Univ., Beijing, China
Abstract :
In population-based human genetic studies, unrelated individuals are sampled and their SNPs are measured. Several kinds of generative models have been proposed for data containing a large number of SNP loci, based on the characteristics of the human genome. However, such models can only deal with ordered loci. In this paper, we model the same data without using the order information. First, we present a clustering model for SNPs obtained by modifying the multi-block model used in GERBIL. It is a two-layer Bayesian network with multiple latent variables, and it does not use the order information of the loci. Second, we fit the model with a structural EM algorithm combined with a simulated annealing mechanism. A real data set was analyzed with the model, and the results show that the SNPs can be clustered effectively. Such a model is potentially useful for clustering distantly correlated SNP loci.
Keywords :
belief networks; biology computing; genetics; genomics; molecular biophysics; clustering model; human genetic method; human genome; multiblock model; multiple latent variables; simulated annealing mechanism; structural EM algorithm; two-layer Bayesian network; Bayesian methods; Bioinformatics; Clustering algorithms; Data analysis; Genetics; Genomics; Hidden Markov models; Humans; Sequences; Simulated annealing; Bayesian network; EM algorithm; block; generative model; latent variable; simulated annealing;
Conference_Title :
Bioinformatics, Systems Biology and Intelligent Computing, 2009. IJCBS '09. International Joint Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-0-7695-3739-9
DOI :
10.1109/IJCBS.2009.97