Regional Information Center for Science and Technology - Density-biased clustering based on reservoir sampling

DocumentCode :

2002442

Title :

Density-biased clustering based on reservoir sampling

Author :

Kerdprasop, Kittisak ; Kerdprasop, Nittaya ; Sattayatham, Pairote

Author_Institution :

Data Eng. & Knowledge Discovery Res. Unit, Suranaree Univ. of Technol., Thailand

fYear :

2005

fDate :

22-26 Aug. 2005

Firstpage :

1122

Lastpage :

1126

Abstract :

Clustering is the task of grouping data based on similarity. The popular k-means algorithm groups data by first assigning each data point to its closest cluster and then recomputing the cluster means, repeating these two steps until convergence. We propose a variation, called weighted k-means, to improve the scalability of clustering. To speed up the clustering process, we develop reservoir-biased sampling, a data reduction technique that is efficient because it requires only a single scan over the data set. Our algorithm is designed to group data from mixture models. We present an experimental evaluation of the proposed method.
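
Illustrative note (not part of the original record): the abstract names two building blocks, a single-scan reservoir sample and a k-means loop in which points carry weights. The sketch below shows generic versions of both, assuming plain uniform reservoir sampling (Algorithm R) and a Lloyd-style weighted update on 1-D data. The paper's density-biased variant and its exact weighting scheme are not described in this record, so the function names, parameters, and weighting used here are hypothetical and should not be read as the authors' method.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Keep a uniform random sample of k items from an iterable in one pass.

    Assumption: standard Algorithm R, not the paper's density-biased variant.
    """
    rng = rng or random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


def weighted_kmeans(points, weights, centers, max_iter=100):
    """Lloyd-style k-means on 1-D points where each point carries a weight.

    Assumption: a weight stands for how many original points a sampled
    point represents; this is a hypothetical choice for illustration.
    """
    centers = list(centers)
    for _ in range(max_iter):
        sums = [0.0] * len(centers)
        totals = [0.0] * len(centers)
        # Step 1: assign each point to its closest center.
        for x, w in zip(points, weights):
            c = min(range(len(centers)), key=lambda i: (x - centers[i]) ** 2)
            sums[c] += w * x
            totals[c] += w
        # Step 2: recompute each center as the weighted mean of its points.
        new_centers = [
            s / t if t > 0 else old
            for s, t, old in zip(sums, totals, centers)
        ]
        if new_centers == centers:  # converged: no center moved
            break
        centers = new_centers
    return centers
```

In this sketch a sample would be drawn once with reservoir_sample and then clustered with weighted_kmeans, so the full data set is scanned only one time and the iterative steps run on the reduced sample.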

Keywords :

data reduction; pattern clustering; sampling methods; very large databases; data grouping; data reduction technique; density-biased clustering; reservoir-biased sampling; weighted k-means algorithm; Clustering algorithms; Councils; Data engineering; Databases; Iterative algorithms; Knowledge engineering; Partitioning algorithms; Reservoirs; Sampling methods; Scalability;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Database and Expert Systems Applications, 2005. Proceedings. Sixteenth International Workshop on

ISSN :

1529-4188

Print_ISBN :

0-7695-2424-9

Type :

conf

DOI :

10.1109/DEXA.2005.72

Filename :

1508425

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=2002442