  • DocumentCode
    1661935
  • Title
    Independent Component Analysis Based Seeding Method for K-Means Clustering
  • Author
    Onoda, Takashi; Sakai, Miho; Yamada, Seiji
  • Author_Institution
    Syst. Eng. Lab., Central Res. Inst. Electr. Power Ind., Tokyo, Japan
  • Volume
    3
  • fYear
    2011
  • Firstpage
    122
  • Lastpage
    125
  • Abstract
    The k-means clustering method is widely used for clustering Web data because of its simplicity and speed. However, the clustering result depends heavily on the initial cluster centers, which are chosen uniformly at random from the data points. We propose a seeding method for k-means clustering based on independent component analysis. We evaluate the proposed method and compare it with other seeding methods on benchmark datasets. We also apply the proposed method to a Web corpus provided by ODP. The experiments show that the proposed method achieves higher normalized mutual information than both the k-means clustering method and the k-means++ clustering method. Therefore, the proposed method is useful for clustering Web corpora.
  • Keywords
    Internet; independent component analysis; pattern clustering; ODP; Web corpus; benchmark datasets; clustering centers; k-means++ clustering method; normalized mutual information; seeding method; Clustering methods; Electronic mail; Iris; Measurement; Mutual information; Principal component analysis; k-means clustering; seeding
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Web Intelligence and Intelligent Agent Technology (WI-IAT), 2011 IEEE/WIC/ACM International Conference on
  • Conference_Location
    Lyon
  • Print_ISBN
    978-1-4577-1373-6
  • Electronic_ISBN
    978-0-7695-4513-4
  • Type
    conf
  • DOI
    10.1109/WI-IAT.2011.29
  • Filename
    6040821
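
The abstract compares seeding methods by normalized mutual information (NMI) between the clustering result and the reference classes. The record does not say which normalization the authors use, so the following is only a minimal sketch that assumes the geometric-mean form, NMI = I(U;V) / sqrt(H(U) H(V)). The function names are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def normalized_mutual_information(true_labels, cluster_labels):
    """NMI with geometric-mean normalization (an assumption, see lead-in)."""
    if len(true_labels) != len(cluster_labels):
        raise ValueError("label sequences must have the same length")
    n = len(true_labels)
    joint = Counter(zip(true_labels, cluster_labels))  # co-occurrence counts
    true_counts = Counter(true_labels)
    cluster_counts = Counter(cluster_labels)
    # Mutual information: sum over observed (class, cluster) pairs.
    mi = sum(
        (c / n) * math.log((c * n) / (true_counts[t] * cluster_counts[k]))
        for (t, k), c in joint.items()
    )
    h_true = entropy(true_labels)
    h_cluster = entropy(cluster_labels)
    if h_true == 0 or h_cluster == 0:
        # Degenerate case: at least one side has a single group.
        return 1.0 if h_true == h_cluster else 0.0
    return mi / math.sqrt(h_true * h_cluster)
```

Under this definition the score does not depend on how clusters are numbered: a clustering that reproduces the reference classes under different labels scores 1, and a clustering that is statistically independent of the classes scores 0.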