• DocumentCode
    1247809
  • Title
    Automated variable weighting in k-means type clustering
  • Author
    Huang, Joshua Zhexue; Ng, Michael K.; Rong, Hongqiang; Li, Zichen
  • Author_Institution
    E-Business Technol. Inst., Hong Kong Univ., China
  • Volume
    27
  • Issue
    5
  • fYear
    2005
  • fDate
    May 1, 2005
  • Firstpage
    657
  • Lastpage
    668
  • Abstract
    This paper proposes a k-means type clustering algorithm that automatically calculates variable weights. A new step is added to the k-means clustering process to iteratively update the variable weights based on the current partition of the data, and a formula for calculating the weights is proposed. A convergence theorem for the new clustering process is given. The variable weights produced by the algorithm measure the importance of each variable in clustering and can be used for variable selection in data mining applications, which often involve large and complex real data. Experimental results on both synthetic and real data show that the new algorithm outperforms the standard k-means type algorithms in recovering clusters in data.
  • Keywords
    convergence; data mining; iterative methods; pattern clustering; automated variable weighting; complex real data; feature evaluation; feature selection; k-means type clustering algorithm; Additives; Clustering algorithms; Clustering methods; Cost function; Databases; Input variables; Iterative algorithms; Noise reduction; Partitioning algorithms; Clustering; feature evaluation and selection; mining methods and algorithms; Algorithms; Artificial Intelligence; Cluster Analysis; Computer Simulation; Information Storage and Retrieval; Models, Statistical; Numerical Analysis, Computer-Assisted; Pattern Recognition, Automated; Reproducibility of Results; Sensitivity and Specificity; Signal Processing, Computer-Assisted
  • fLanguage
    English
  • Journal_Title
    Pattern Analysis and Machine Intelligence, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0162-8828
  • Type
    jour
  • DOI
    10.1109/TPAMI.2005.95
  • Filename
    1407871
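
The abstract describes a k-means loop with one extra step that recomputes variable weights from the current partition. The following is a minimal pure-Python sketch of that idea, not the authors' implementation. The abstract does not state the weight formula, so the update used here (each weight is inversely related to that variable's within-cluster dispersion, with an exponent `beta` assumed to be greater than 1, and weights normalized to sum to 1) is an assumption, as are the function names `update_weights` and `weighted_kmeans`, the squared-difference distance, and the random initialization.

```python
import random


def update_weights(disp, beta):
    """Turn per-variable dispersions into weights that sum to 1.

    Assumed rule: a variable with zero dispersion gets weight 0; otherwise
    w_j = 1 / sum_t (D_j / D_t) ** (1 / (beta - 1)), where t runs over the
    variables with non-zero dispersion. Requires beta > 1.
    """
    exponent = 1.0 / (beta - 1.0)
    positive = [d for d in disp if d > 0]
    if not positive:
        # Degenerate case: no dispersion anywhere, fall back to equal weights.
        return [1.0 / len(disp)] * len(disp)
    return [
        0.0 if d == 0 else 1.0 / sum((d / t) ** exponent for t in positive)
        for d in disp
    ]


def weighted_kmeans(data, k, beta=2.0, iters=20, seed=0):
    """k-means with an added weight-update step (illustrative sketch)."""
    rng = random.Random(seed)
    n_vars = len(data[0])
    centers = [list(p) for p in rng.sample(data, k)]
    weights = [1.0 / n_vars] * n_vars
    labels = [0] * len(data)
    for _ in range(iters):
        # Step 1: assign each object to the nearest center (weighted distance).
        for i, x in enumerate(data):
            labels[i] = min(
                range(k),
                key=lambda c: sum(
                    (weights[j] ** beta) * (x[j] - centers[c][j]) ** 2
                    for j in range(n_vars)
                ),
            )
        # Step 2: recompute each center as the mean of its members.
        for c in range(k):
            members = [x for x, lab in zip(data, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
        # Step 3 (the added step): update weights from the current partition.
        disp = [
            sum((x[j] - centers[lab][j]) ** 2 for x, lab in zip(data, labels))
            for j in range(n_vars)
        ]
        weights = update_weights(disp, beta)
    return labels, centers, weights
```

Under this assumed rule, a variable whose values are tightly grouped within clusters receives a larger weight than a noisy one, which is how the weights could serve as a variable-importance measure. Like standard k-means, the sketch depends on its initial centers and may settle in a local optimum.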