Title :
A Probability Model for Projective Clustering on High Dimensional Data
Author :
Chen, Lifei ; Jiang, Qingshan ; Wang, Shengrui
Author_Institution :
Dept. of Comput. Sci., Fujian Normal Univ., Fuzhou
Abstract :
Clustering high dimensional data is a major challenge in data mining due to the curse of dimensionality. To address this problem, projective clustering has been defined as an extension of traditional clustering that seeks projected clusters in subsets of the dimensions of a data space. In this paper, the problem of modeling projected clusters is first discussed, and an extended Gaussian model is proposed. Second, a general objective criterion for k-means-type projective clustering is presented based on the model. Finally, expressions for learning the model parameters are derived and then used in a new algorithm named FPC to perform fuzzy clustering on high dimensional data. Experimental results on document clustering show the effectiveness of the proposed clustering model.
Keywords :
Gaussian processes; data mining; fuzzy set theory; learning (artificial intelligence); pattern clustering; probability; Gaussian model; fuzzy clustering; high dimensional data; learning algorithm; probability model; projective clustering; Clustering algorithms; Clustering methods; Computer science; Extraterrestrial phenomena; Flexible printed circuits; Los Angeles Council; Monte Carlo methods; Partitioning algorithms; Random variables
Conference_Title :
Data Mining, 2008. ICDM '08. Eighth IEEE International Conference on
Conference_Location :
Pisa
Print_ISBN :
978-0-7695-3502-9
DOI :
10.1109/ICDM.2008.15