• DocumentCode
    9372
  • Title

    Scaling Up Synchronization-Inspired Partitioning Clustering

  • Author

    Wenhao Ying ; Fu-Lai Chung ; Shitong Wang

  • Author_Institution
    Sch. of Digital Media, Jiangnan Univ., Wuxi, China
  • Volume
    26
  • Issue
    8
  • fYear
    2014
  • fDate
    Aug. 2014
  • Firstpage
    2045
  • Lastpage
    2057
  • Abstract
    Based on the extensive Kuramoto model, synchronization-inspired partitioning clustering algorithm was recently proposed and is attracting more and more attentions, due to the fact that it simulates the synchronization phenomena in clustering where each data object is regarded as a phase oscillator and the dynamic behavior of the objects is simulated over time. In order to circumvent the serious difficulty that its existing version can only be effectively carried out on considerably small/medium datasets, a novel scalable synchronization-inspired partitioning clustering algorithm termed LSSPC, based on the center-constrained minimal enclosing ball and the reduced set density estimator, is proposed for large dataset applications. LSSPC first condenses a large scale dataset into its reduced dataset by using a fast minimal-enclosing-ball based approximation for the reduced set density estimator, thus achieving an asymptotic time complexity that is linear in the size of dataset and a space complexity that is independent of this size. Then it carries out clustering adaptively on the obtained reduced dataset by using Sync with the Davies-Bouldin clustering criterion and a new order parameter which can help us observe the degree of local synchronization. Finally, it finishes clustering by using the proposed algorithm CRD on the remaining objects in the large dataset, which can capture the outliers and isolated clusters effectively. The effectiveness of the proposed clustering algorithm LSSPC for large datasets is theoretically analyzed and experimentally verified by running on artificial and real datasets.
  • Keywords
    computational complexity; pattern clustering; synchronisation; CRD algorithm; Davies-Bouldin clustering criterion; Kuramoto model; LSSPC algorithm; Sync; artificial datasets; asymptotic time complexity; center-constrained minimal-enclosing-ball based approximation; dataset size; dynamic object behavior simulation; isolated cluster capturing; large-scale dataset applications; local synchronization degree; order parameter; outlier capturing; phase oscillator; real datasets; reduced set density estimator; scalable synchronization-inspired partitioning clustering algorithm; space complexity; Approximation algorithms; Approximation methods; Clustering algorithms; Heuristic algorithms; Oscillators; Partitioning algorithms; Synchronization; KDE based density estimation; large datasets; minimal enclosing ball; reduced set; synchronization-inspired partitioning clustering;
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    ieee
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2013.178
  • Filename
    6678517