Title :
An Entropy-Based Subspace Clustering Algorithm for Categorical Data
Author :
Carbonera, Joel Luis ; Abel, Mara
Author_Institution :
Inst. of Inf., Univ. Fed. do Rio Grande do Sul - UFRGS, Porto Alegre, Brazil
Abstract :
Interest in attribute weighting for soft subspace clustering has been increasing in recent years. However, most of the proposed approaches are designed to deal only with numeric data. In this paper, we focus on soft subspace clustering for categorical data. In soft subspace clustering, the attribute weighting approach plays a crucial role. For this reason, we propose an entropy-based approach for measuring the relevance of each categorical attribute in each cluster. In addition, we propose EBK-modes (entropy-based k-modes), an extension of the basic k-modes algorithm that uses our approach for attribute weighting. We performed experiments on five real-world datasets, comparing the performance of our algorithm with four state-of-the-art algorithms, using three well-known evaluation metrics: accuracy, f-measure and adjusted Rand index. According to the experiments, EBK-modes outperforms the algorithms considered in the evaluation on these metrics.
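The abstract does not give the weighting formula, so the following is only a minimal sketch of the general idea of entropy-based attribute weighting, not the paper's method. It assumes that an attribute whose values are nearly constant inside a cluster (low Shannon entropy) should count more when comparing an object with that cluster's mode. The function names and the `exp(-entropy)` normalization are illustrative assumptions.

```python
import math
from collections import Counter


def attribute_entropy(values):
    """Shannon entropy (in bits) of one categorical attribute's values."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def entropy_weights(cluster_rows):
    """Per-attribute weights for one cluster.

    Assumption (not from the paper): weight is exp(-entropy),
    normalized so the weights sum to 1 across attributes.
    """
    columns = list(zip(*cluster_rows))  # one tuple of values per attribute
    raw = [math.exp(-attribute_entropy(col)) for col in columns]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_mismatch(row, mode, weights):
    """Weighted simple-matching dissimilarity between an object and a mode."""
    return sum(w for x, m, w in zip(row, mode, weights) if x != m)


# Hypothetical cluster: attribute 0 is constant, attribute 1 is evenly split.
cluster = [("a", "x"), ("a", "y"), ("a", "x"), ("a", "y")]
weights = entropy_weights(cluster)
```

In this toy cluster the constant attribute receives the larger weight, so a mismatch on it costs more than a mismatch on the evenly split attribute.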
Keywords :
entropy; pattern clustering; EBK-modes; adjusted Rand index; attribute weighting approach; basic k-modes; categorical data; entropy-based subspace clustering algorithm; evaluation metrics; f-measure; soft subspace clustering; Accuracy; Breast cancer; Clustering algorithms; Entropy; Indexes; Partitioning algorithms; Uncertainty; attribute weighting; clustering; data mining; subspace clustering
Conference_Title :
Tools with Artificial Intelligence (ICTAI), 2014 IEEE 26th International Conference on
Conference_Location :
Limassol
DOI :
10.1109/ICTAI.2014.48