Title :
Distance Based Subspace Clustering with Flexible Dimension Partitioning
Author :
Liu, Guimei ; Li, Jinyan ; Sim, Kelvin ; Wong, Limsoon
Author_Institution :
Nat. Univ. of Singapore
Abstract :
Traditional similarity or distance measurements usually become meaningless as the dimensionality of the datasets increases, which has detrimental effects on clustering performance. In this paper, we propose a distance-based subspace clustering model, called nCluster, to find groups of objects that have similar values on subsets of dimensions. Instead of using a grid-based approach to partition the data space into non-overlapping rectangular cells, as density-based subspace clustering algorithms do, the nCluster model uses a more flexible method to partition the dimensions in order to preserve meaningful and significant clusters. We develop an efficient algorithm to mine only maximal nClusters. A set of experiments is conducted to show the efficiency of the proposed algorithm and the effectiveness of the new model in preserving significant clusters.
Keywords :
data mining; database theory; pattern clustering; distance based subspace clustering; flexible dimension partitioning; nCluster; Clustering algorithms; Distance measurement; Kelvin; Merging; Partitioning algorithms
Conference_Title :
Data Engineering, 2007. ICDE 2007. IEEE 23rd International Conference on
Conference_Location :
Istanbul
Print_ISBN :
1-4244-0802-4
Electronic_ISBN :
1-4244-0803-2
DOI :
10.1109/ICDE.2007.368985