• DocumentCode
    3724159
  • Title
    A Generative Spatial Clustering Model for Random Data through Spanning Trees
  • Author
    Leonardo Vilela Teixeira;Renato Martins Assuncao;Rosangela Helena Loschi
  • Author_Institution
    Univ. Fed. de Minas Gerais, Belo Horizonte, Brazil
  • fYear
    2015
  • Firstpage
    997
  • Lastpage
    1002
  • Abstract
    When analyzing spatial data, it is often necessary to aggregate geographical areas into larger regions, a process called regionalization or spatially constrained clustering. Existing regionalization algorithms assume that the items to be clustered are non-stochastic, an assumption that does not hold in many applications. In this work, we present a new probabilistic regionalization algorithm that allows spatially varying random variables as features. Hence, an area that differs markedly from its neighbors can still be considered a member of their cluster if it has a large variance. Our proposal is based on a Bayesian generative spatial product partition model. We build an effective Markov chain Monte Carlo algorithm that carries out a random walk on the space of all trees and the spatial partitions induced by deleting their edges. We evaluate our algorithm on synthetic data and on a regionalization problem for municipalities based on cancer incidence rates. Our method better accommodates the natural variation of the data and diminishes the effect of outliers, producing better results than state-of-the-art approaches.
  • Keywords
    "Yttrium","Partitioning algorithms","Clustering algorithms","Data models","Stochastic processes","Probability distribution","Space exploration"
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining (ICDM), 2015 IEEE International Conference on
  • ISSN
    1550-4786
  • Type
    conf
  • DOI
    10.1109/ICDM.2015.106
  • Filename
    7373425
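
The abstract mentions spatial partitions induced by deleting edges from a tree. As a minimal sketch of that single idea only (not the authors' model or their MCMC sampler), the snippet below assumes areas are numbered 0..n-1 and that a spanning tree over them is given as an edge list; deleting k tree edges leaves k+1 connected components, each of which is a contiguous cluster. The function name `partition_from_tree` and the toy tree are hypothetical and chosen for illustration.

```python
from collections import defaultdict


def partition_from_tree(n_nodes, tree_edges, removed_edges):
    """Return the clusters (connected components) that remain after
    deleting `removed_edges` from a spanning tree over nodes 0..n_nodes-1."""
    # Treat edges as undirected so (u, v) and (v, u) match.
    removed = {frozenset(e) for e in removed_edges}
    adjacency = defaultdict(list)
    for u, v in tree_edges:
        if frozenset((u, v)) not in removed:
            adjacency[u].append(v)
            adjacency[v].append(u)

    seen = set()
    clusters = []
    for start in range(n_nodes):
        if start in seen:
            continue
        # Depth-first search collects one connected component.
        stack = [start]
        seen.add(start)
        component = []
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        clusters.append(sorted(component))
    return clusters


# Hypothetical toy example: six areas joined in a chain-shaped spanning tree.
tree = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
clusters = partition_from_tree(6, tree, removed_edges=[(2, 3)])
print(clusters)  # [[0, 1, 2], [3, 4, 5]]
```

Because every cluster is a connected piece of the tree, and the tree is assumed to be built on the neighborhood graph of the areas, each cluster is spatially contiguous by construction.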