Title :
Efficient Clustering of Uncertain Data
Author :
Ngai, Wang Kay ; Kao, Ben ; Chui, Chun Kit ; Cheng, Reynold ; Chau, Michael ; Yip, Kevin Y.
Author_Institution :
Dept. of Comput. Sci., Univ. of Hong Kong, Hong Kong
Abstract :
We study the problem of clustering data objects whose locations are uncertain. Each data object is represented by an uncertainty region over which a probability density function (pdf) is defined. One method for clustering uncertain objects of this sort is the UK-means algorithm, which is based on the traditional K-means algorithm. In UK-means, an object is assigned to the cluster whose representative has the smallest expected distance to the object. For an arbitrary pdf, calculating the expected distance between an object and a cluster representative requires expensive integration. We study various pruning methods that avoid such expensive expected distance calculations.
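The assignment step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each object's pdf is approximated by a finite set of samples, that the uncertainty region is an axis-aligned box, and that distance is Euclidean. The function names (expected_distance, box_bounds, assign) are hypothetical. The pruning rule shown (discard a representative whose minimum possible distance exceeds the smallest maximum possible distance of any representative) is one simple bound-based rule chosen for illustration; the abstract does not specify which pruning methods the paper studies.

```python
import math
import random


def expected_distance(samples, rep):
    """Approximate E[dist(object, rep)] by averaging over samples of the object's pdf."""
    return sum(math.dist(s, rep) for s in samples) / len(samples)


def box_bounds(lo, hi, rep):
    """Return (min, max) Euclidean distance from rep to the axis-aligned box [lo, hi]."""
    dmin2 = dmax2 = 0.0
    for l, h, r in zip(lo, hi, rep):
        nearest = min(max(r, l), h)                       # clamp rep into the box
        farthest = l if abs(r - l) > abs(r - h) else h    # farther box face on this axis
        dmin2 += (r - nearest) ** 2
        dmax2 += (r - farthest) ** 2
    return math.sqrt(dmin2), math.sqrt(dmax2)


def assign(samples, lo, hi, reps):
    """Pick the representative with the smallest expected distance.

    Returns (index of chosen representative, number of expected distances computed).
    """
    bounds = [box_bounds(lo, hi, rep) for rep in reps]
    best_upper = min(upper for _, upper in bounds)
    # The expected distance lies between the min and max distance to the box,
    # so a representative whose lower bound exceeds best_upper cannot win.
    candidates = [j for j, (lower, _) in enumerate(bounds) if lower <= best_upper]
    if len(candidates) == 1:
        return candidates[0], 0
    computed = {j: expected_distance(samples, reps[j]) for j in candidates}
    return min(computed, key=computed.get), len(computed)


# Illustrative data (assumed): one object uniformly distributed over the unit square.
random.seed(0)
lo, hi = (0.0, 0.0), (1.0, 1.0)
samples = [(random.random(), random.random()) for _ in range(2000)]
reps = [(0.5, 0.5), (10.0, 10.0), (0.9, 0.1)]

chosen, n_computed = assign(samples, lo, hi, reps)
print(chosen, n_computed)
```

In this example the distant representative at (10, 10) is discarded from its bounds alone, so only two of the three expected distances are evaluated.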
Keywords :
data handling; pattern clustering; probability; UK-means algorithm; data object; probability density function; pruning method; uncertain data clustering; Animals; Clustering algorithms; Computer science; Costs; Gaussian distribution; Histograms; Measurement errors; Probability density function; Uncertainty; Vehicles;
Conference_Title :
Data Mining, 2006. ICDM '06. Sixth International Conference on
Conference_Location :
Hong Kong
Print_ISBN :
0-7695-2701-7
DOI :
10.1109/ICDM.2006.63