DocumentCode
595077
Title
Iterative Neighbor-Joining tree clustering algorithm for genotypic data
Author
Amornbunchornvej, Chainarong ; Limpiti, Thunyawat ; Assawamakin, Anunchai ; Intarapanich, Apichart ; Tongsima, Sissades
Author_Institution
Fac. of Eng., King Mongkut's Inst. of Technol. Ladkrabang, Bangkok, Thailand
fYear
2012
fDate
11-15 Nov. 2012
Firstpage
1827
Lastpage
1830
Abstract
Questions of interest in genotypic datasets include the number and characteristic patterns of subpopulations and, possibly, the relationships among them. Model-based clustering methods have been adopted to estimate the number of clusters and the individual assignments. However, they cannot infer genetic relationships among subpopulations the way phylogenetic trees, e.g., the widely used Neighbor-Joining (NJ) tree, can. In this paper we propose an unsupervised, iterative clustering framework called iNJclust. It performs clustering on an NJ tree with a graph-based partitioning technique. The iterative process enhances the zooming ability and corrects the topology of the final NJ trees. Inference on genetic similarities between subpopulations is also possible. As final outputs, the iNJclust algorithm provides an estimate of the number of clusters, individual assignments, a population tree, and sub-trees of the terminal nodes. We illustrate the superior clustering performance of the proposed algorithm using datasets of 27 human populations, 47 bovine breeds, and 28 sheep breeds.
Keywords
biology computing; genetics; iterative methods; pattern clustering; trees (mathematics); NJ tree; bovine breeds; genotypic datasets; graph-based partitioning technique; human populations; iNJclust algorithm; individual assignments; iterative clustering framework; iterative neighbor-joining tree clustering algorithm; iterative process; model-based clustering methods; phylogenetic trees; population tree; sheep breeds datasets; subpopulation characteristic patterns; terminal node subtrees; zooming ability; Clustering algorithms; Genetics; Humans; Sociology; Statistics; Topology; Variable speed drives
fLanguage
English
Publisher
IEEE
Conference_Titel
Pattern Recognition (ICPR), 2012 21st International Conference on
Conference_Location
Tsukuba
ISSN
1051-4651
Print_ISBN
978-1-4673-2216-4
Type
conf
Filename
6460508