• DocumentCode
    1931807
  • Title

    A Measurement of Overlap Rate Between Gaussian Components

  • Author

    Sun, Hao-jun ; Sun, Mei ; Wang, Sheng-rui

  • Author_Institution
    Hebei Univ., Baoding
  • Volume
    4
  • fYear
    2007
  • fDate
    19-22 Aug. 2007
  • Firstpage
    2373
  • Lastpage
    2378
  • Abstract
    Overlapping clusters often appear in cluster analysis in data mining. However, the phenomenon of cluster overlapping is still not mathematically well characterized, especially in multivariate cases. In this paper, we are interested in the overlap phenomenon between Gaussian clusters, since the Gaussian mixture is a fundamental data distribution model suitable for many clustering algorithms. We introduce the novel concept of the ridge curve and establish a theory on the degree of overlap between two components. Based on this theory, we develop an algorithm for calculating the overlap rate. We investigate factors that affect the value of the overlap rate, and show how the theory can be used to generate "truthed data" as well as to measure the overlap rate between a given pair of clusters or components in a mixture.
  • Keywords
    Gaussian processes; data mining; pattern clustering; Gaussian component; data distribution model; Clustering algorithms; Computer science; Cybernetics; Data analysis; Data models; Electronic mail; Machine learning; Mathematics; Multidimensional systems; Cluster analysis; Mixture model; Overlap rate; Ridge curve
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics, 2007 International Conference on
  • Conference_Location
    Hong Kong
  • Print_ISBN
    978-1-4244-0973-0
  • Electronic_ISBN
    978-1-4244-0973-0
  • Type

    conf

  • DOI
    10.1109/ICMLC.2007.4370542
  • Filename
    4370542