• DocumentCode
    2243405
  • Title

    An algorithm for clustering heterogeneous data streams with uncertainty

  • Author

    Huang, Guo-yan ; Liang, Da-peng ; Hu, Chang-zhen ; Ren, Jia-dong

  • Author_Institution
    Coll. of Inf. Sci. & Eng., Yanshan Univ., Qinhuangdao, China
  • Volume
    4
  • fYear
    2010
  • fDate
    11-14 July 2010
  • Firstpage
    2059
  • Lastpage
    2064
  • Abstract
    Heterogeneous data streams with uncertainty are ubiquitous in many applications, yet existing methods for clustering them yield low clustering quality. This paper proposes HU-Clustering, an algorithm for clustering heterogeneous data streams with uncertainty. A Heterogeneous Uncertainty Clustering Feature (H-UCF) is presented to describe the features of such streams. Based on H-UCF, a probability frequency histogram is proposed to track the statistics of categorical attributes. The algorithm initially creates n clusters using the k-prototypes algorithm. To improve clustering quality, HU-Clustering applies a two-phase cluster selection process: first, candidate clusters are selected using a new similarity measure; second, the most similar cluster for each newly arriving tuple is selected from the candidate set according to clustering uncertainty. Experimental results show that the clustering quality of HU-Clustering is higher than that of UMicro.
  • Keywords
    data handling; pattern clustering; probability; ubiquitous computing; H-UCF; HU-clustering; clustering quality; heterogeneous data streams clustering; heterogeneous uncertainty clustering feature; k-prototypes algorithm; probability frequency histogram; ubiquitous; Algorithm design and analysis; Clustering algorithms; Cybernetics; Histograms; Machine learning; Measurement uncertainty; Uncertainty; Clustering; Heterogeneous attributes; Probability frequency histogram; Uncertain data stream;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics (ICMLC), 2010 International Conference on
  • Conference_Location
    Qingdao
  • Print_ISBN
    978-1-4244-6526-2
  • Type

    conf

  • DOI
    10.1109/ICMLC.2010.5580502
  • Filename
    5580502