• DocumentCode
    609739
  • Title
    An integrated clustering approach for high dimensional categorical data
  • Author
    Kalaivani, K. ; Raghavendra, A.P.V.
  • Author_Institution
    Dept. of Comput. Sci. & Eng., VSB Eng. Coll., Karur, India
  • fYear
    2013
  • fDate
    14-15 March 2013
  • Firstpage
    1
  • Lastpage
    4
  • Abstract
    Clustering is an attractive and important data mining task that is used in many applications. However, earlier work on clustering focused only on categorical data, grouping similar data items by their attribute values, which leads to convergence problems in the clustering process. The proposed work enhances the existing k-means clustering process so that it handles categorical and mixed data types efficiently. The goal is an integrated clustering approach for high dimensional categorical data that also works well for data with mixed continuous and categorical features. Experimental results on several data sets suggest that the link-based cluster ensemble algorithm, integrated with the proposed k-means algorithm, produces accurate clustering results. The convergence property of the clustering process is proved for the proposed algorithm, which in turn improves the accuracy of the clustering results. The scope of this work is to provide accurate and efficient results whenever a user accesses data from the database.
  • Keywords
    data mining; information retrieval; learning (artificial intelligence); pattern clustering; attribute values; categorical features; clustering results accuracy improvement; continuous features; convergence property; data access; data mining; high dimensional categorical data; integrated clustering approach; k-means algorithm; k-means clustering process; link-based cluster ensemble algorithm; mixed data types; mixed features; Accuracy; Algorithm design and analysis; Clustering algorithms; Convergence; Data mining; Machine learning algorithms; Partitioning algorithms; Categorical Data; Clustering; Link-based Cluster Ensemble; Mixed Data; Proposed K-Means;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Green High Performance Computing (ICGHPC), 2013 IEEE International Conference on
  • Conference_Location
    Nagercoil
  • Print_ISBN
    978-1-4673-2592-9
  • Type
    conf
  • DOI
    10.1109/ICGHPC.2013.6533920
  • Filename
    6533920