• DocumentCode
    595077
  • Title
    Iterative Neighbor-Joining tree clustering algorithm for genotypic data
  • Author
    Amornbunchornvej, Chainarong ; Limpiti, Thunyawat ; Assawamakin, Anunchai ; Intarapanich, Apichart ; Tongsima, Sissades
  • Author_Institution
    Fac. of Eng., King Mongkut's Inst. of Technol. Ladkrabang, Bangkok, Thailand
  • fYear
    2012
  • fDate
    11-15 Nov. 2012
  • Firstpage
    1827
  • Lastpage
    1830
  • Abstract
    Issues to explore in genotypic datasets include the number and characteristic patterns of subpopulations and, possibly, relationships among them. Model-based clustering methods have been adopted to find the number of clusters and the individual assignments. However, they cannot infer genetic relationships among subpopulations the way phylogenetic trees, e.g., the widely used Neighbor-Joining (NJ) tree, can. In this paper we propose an unsupervised, iterative clustering framework called iNJclust. It performs clustering on an NJ tree with a graph-based partitioning technique. The iterative process enhances the zooming ability and corrects the topology of the final NJ trees. Inference on genetic similarities between subpopulations is also possible. As final outputs, the iNJclust algorithm provides an estimate of the number of clusters, individual assignments, a population tree, as well as sub-trees of the terminal nodes. We illustrate the superior clustering performance of the proposed algorithm using the Human 27 populations, bovine 47 breeds, and sheep 28 breeds datasets.
  • Keywords
    biology computing; genetics; iterative methods; pattern clustering; trees (mathematics); NJ tree; bovine breeds; genotypic datasets; graph-based partitioning technique; human populations; iNJclust algorithm; individual assignments; iterative clustering framework; iterative neighbor-joining tree clustering algorithm; iterative process; model-based clustering methods; phylogenetic trees; population tree; sheep breeds datasets; subpopulation characteristic patterns; terminal node subtrees; zooming ability; Clustering algorithms; Genetics; Humans; Sociology; Statistics; Topology; Variable speed drives;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Pattern Recognition (ICPR), 2012 21st International Conference on
  • Conference_Location
    Tsukuba
  • ISSN
    1051-4651
  • Print_ISBN
    978-1-4673-2216-4
  • Type
    conf
  • Filename
    6460508