• DocumentCode
    1563751
  • Title
    Fuzzy co-clustering of documents and keywords
  • Author
    Kummamuru, Krishna; Dhawale, Ajay; Krishnapuram, Raghu
  • Author_Institution
    IBM India Research Lab., IIT, New Delhi, India
  • Volume
    2
  • fYear
    2003
  • Firstpage
    772
  • Abstract
    Conventional clustering algorithms such as K-means and SAHN (also known as AHC) have been well studied and used in the information retrieval community for clustering text documents. More recently, efforts have been made to cluster documents and words simultaneously. The FCCM algorithm due to Oh et al. is a fuzzy clustering algorithm that maximizes the co-occurrence of categorical attributes (keywords) and the individual patterns (documents) in clusters. However, this algorithm poses certain problems when the number of documents or the number of words is very large. In this paper, we modify the FCCM algorithm so that it can be used to cluster large text corpora. Our experiments show that the modified algorithm is scalable and produces meaningful clusters. We also show the relation between FCCM and the Spherical K-Means (SKM) algorithm and introduce the Spherical Fuzzy c-Means (SFCM) algorithm.
  • Keywords
    fuzzy set theory; information retrieval; information retrieval systems; pattern clustering; text analysis; categorical attributes; categorical multivariate data; clustering text documents; conventional clustering algorithms; document co-clustering; fuzzy clustering algorithms; individual patterns; information retrieval community; keyword co-clustering; sequential agglomerative hierarchical nonoverlapping; spherical fuzzy c-means algorithm; spherical k-means algorithm; Bipartite graph; Clustering algorithms; Frequency shift keying; Information retrieval; Partitioning algorithms; Text mining
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Fuzzy Systems, 2003. FUZZ '03. The 12th IEEE International Conference on
  • Print_ISBN
    0-7803-7810-5
  • Type
    conf
  • DOI
    10.1109/FUZZ.2003.1206527
  • Filename
    1206527