DocumentCode :
2731751
Title :
Distance Based Subspace Clustering with Flexible Dimension Partitioning
Author :
Liu, Guimei ; Li, Jinyan ; Sim, Kelvin ; Wong, Limsoon
Author_Institution :
National University of Singapore
fYear :
2007
fDate :
15-20 April 2007
Firstpage :
1250
Lastpage :
1254
Abstract :
Traditional similarity or distance measurements usually become meaningless as the dimensionality of a dataset increases, which has detrimental effects on clustering performance. In this paper, we propose a distance-based subspace clustering model, called nCluster, to find groups of objects that have similar values on subsets of dimensions. Instead of using a grid-based approach to partition the data space into non-overlapping rectangular cells, as in density-based subspace clustering algorithms, the nCluster model uses a more flexible method to partition the dimensions so that meaningful and significant clusters are preserved. We develop an efficient algorithm to mine only maximal nClusters. A set of experiments is conducted to show the efficiency of the proposed algorithm and the effectiveness of the new model in preserving significant clusters.
Keywords :
data mining; database theory; pattern clustering; distance based subspace clustering; flexible dimension partitioning; nCluster; Clustering algorithms; Distance measurement; Merging; Partitioning algorithms
fLanguage :
English
Publisher :
IEEE
Conference_Title :
IEEE 23rd International Conference on Data Engineering (ICDE 2007)
Conference_Location :
Istanbul
Print_ISBN :
1-4244-0802-4
Electronic_ISBN :
1-4244-0803-2
Type :
conf
DOI :
10.1109/ICDE.2007.368985
Filename :
4221775