• DocumentCode
    2963165
  • Title
    Finding the Number of Clusters in a Dataset Using an Information Theoretic Hierarchical Algorithm
  • Author
    Aghagolzadeh, M. ; Soltanian-Zadeh, H. ; Araabi, B.N. ; Aghagolzadeh, A.
  • Author_Institution
    Tehran Univ., Tehran
  • fYear
    2006
  • fDate
    10-13 Dec. 2006
  • Firstpage
    1336
  • Lastpage
    1339
  • Abstract
    One of the most challenging problems in clustering is detecting the exact number of clusters in a dataset. Most previous methods for this problem estimate the number of clusters with model-based algorithms, which cannot detect all types of clusters and also have difficulty detecting coupled clusters. In this paper we propose a new method for finding the number of clusters in a dataset using information theory and a top-down hierarchical clustering algorithm. The algorithm starts from a large number of clusters, removes one cluster in each iteration, and allocates that cluster's data points to the remaining clusters. Finally, by measuring the information potential, the exact number of clusters in a given dataset is detected. Our method shows high capability and stability in detecting the number of clusters, even in complex datasets, and is also computationally efficient. We show the effectiveness of the proposed method through experiments on several artificial and real datasets and by comparing its results with two existing methods for finding the number of clusters in a dataset. The comparisons show the superiority of the proposed method.
  • Keywords
    information theory; pattern clustering; datasets; information potential; information theoretic hierarchical algorithm; model based algorithms; top-down hierarchical clustering algorithm; Clustering algorithms; Computational complexity; Computational efficiency; Electric variables measurement; Entropy; Face detection; Information theory; Intelligent control; Process control; Stability;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Electronics, Circuits and Systems, 2006. ICECS '06. 13th IEEE International Conference on
  • Conference_Location
    Nice
  • Print_ISBN
    1-4244-0395-2
  • Electronic_ISBN
    1-4244-0395-2
  • Type
    conf

  • DOI
    10.1109/ICECS.2006.379729
  • Filename
    4263622
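
The abstract names two ingredients: a top-down step that removes one cluster per iteration and reallocates its points, and a measurement of information potential. The following is a minimal sketch of both, not the authors' implementation. It assumes the common Parzen-window form of the quadratic information potential with a Gaussian kernel of width `sigma`, and it assumes the cluster removed at each step is the smallest one, with its points sent to the nearest remaining centroid. The record specifies neither choice, and the function names `information_potential` and `remove_one_cluster` are hypothetical.

```python
import numpy as np


def _as_2d(points):
    """Return the data as a float array of shape (n_points, n_dims)."""
    x = np.asarray(points, dtype=float)
    return x[:, None] if x.ndim == 1 else x


def information_potential(points, sigma=1.0):
    """Parzen-window estimate of the quadratic information potential.

    Assumption: Gaussian kernel of width sigma. The pairwise kernel then
    has variance 2 * sigma**2 per dimension.
    """
    x = _as_2d(points)
    n, d = x.shape
    sq_dist = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    var = 2.0 * sigma ** 2
    norm = (2.0 * np.pi * var) ** (-d / 2.0)
    return float(np.sum(norm * np.exp(-sq_dist / (2.0 * var))) / n ** 2)


def remove_one_cluster(points, labels):
    """One top-down step: drop a cluster and reallocate its points.

    Assumption: the smallest cluster is removed and each of its points
    joins the cluster with the nearest centroid.
    """
    x = _as_2d(points)
    labels = np.asarray(labels).copy()
    ids, counts = np.unique(labels, return_counts=True)
    if len(ids) < 2:
        raise ValueError("need at least two clusters to remove one")
    victim = ids[np.argmin(counts)]
    keep = ids[ids != victim]
    centroids = np.stack([x[labels == c].mean(axis=0) for c in keep])
    mask = labels == victim
    dist = np.linalg.norm(x[mask][:, None, :] - centroids[None, :, :], axis=-1)
    labels[mask] = keep[np.argmin(dist, axis=1)]
    return labels
```

Calling `remove_one_cluster` repeatedly, starting from a deliberately large number of clusters, and evaluating `information_potential` on the resulting clusters at each step gives a sequence of values to inspect. How the paper turns those values into a final cluster count is not stated in this record.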