Regional Information Center for Science and Technology - On the quality of k-means clustering based on grouped data

Title of article :

On the quality of k-means clustering based on grouped data

Author/Authors :

Käärik, Meelis and Pärna, Kalev

Issue Information :

Periodical with serial issue number, year 2009

Pages :

From page :

3836

To page :

3841

Abstract :

Let P be a probability distribution (possibly empirical) on the real line R. Consider the problem of finding the k-mean of P, i.e. a set A of at most k points that minimizes a given loss function. It is known that the k-mean can be found using an iterative algorithm by Lloyd [1982. Least squares quantization in PCM. IEEE Transactions on Information Theory 28, 129–136]. However, depending on the complexity of the distribution P, the application of this algorithm can be quite resource-consuming. One way to overcome this problem is to group the original data and calculate the k-mean on the basis of the grouped data. As a result, the new k-mean will be biased, and our aim is to measure the loss in the quality of approximation caused by such an approach.
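The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure. It assumes a squared-error loss (the abstract only says "a given loss function"), assumes each group is an equal-width bin represented by its midpoint, and uses a simple deterministic initialization. The names lloyd_1d, loss and group_midpoints are hypothetical helpers introduced for illustration only. Lloyd's iteration may stop at a local optimum, so the sketch shows the comparison rather than a guaranteed result.

```python
import math

def lloyd_1d(points, weights, k, iters=100):
    """Weighted Lloyd iteration on the real line (squared-error loss assumed)."""
    # Assumption: start from k roughly evenly spaced distinct data values.
    pts = sorted(set(points))
    centers = [pts[(2 * i + 1) * len(pts) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        sums = [0.0] * k
        mass = [0.0] * k
        # Assignment step: each point goes to its nearest center (Voronoi cell).
        for x, w in zip(points, weights):
            j = min(range(k), key=lambda i: (x - centers[i]) ** 2)
            sums[j] += w * x
            mass[j] += w
        # Update step: each center moves to the weighted mean of its cell;
        # an empty cell keeps its previous center.
        new = [sums[j] / mass[j] if mass[j] > 0 else centers[j] for j in range(k)]
        if new == centers:
            break
        centers = new
    return sorted(centers)

def loss(points, weights, centers):
    """Weighted mean squared distance from each point to its nearest center."""
    total = sum(weights)
    return sum(w * min((x - c) ** 2 for c in centers)
               for x, w in zip(points, weights)) / total

def group_midpoints(points, width):
    """Assumed grouping: equal-width bins, each represented by its midpoint."""
    counts = {}
    for x in points:
        b = math.floor(x / width)
        counts[b] = counts.get(b, 0) + 1
    bins = sorted(counts)
    return [(b + 0.5) * width for b in bins], [float(counts[b]) for b in bins]

# Compare centers fitted on the original data with centers fitted on grouped data,
# evaluating both on the original data.
data = [0.0, 0.1, 0.2, 10.0, 10.1, 10.2]
ones = [1.0] * len(data)
full_centers = lloyd_1d(data, ones, k=2)
mids, counts = group_midpoints(data, width=1.0)
grouped_centers = lloyd_1d(mids, counts, k=2)
loss_full = loss(data, ones, full_centers)
loss_grouped = loss(data, ones, grouped_centers)
```

Here loss_grouped minus loss_full is one simple way to quantify the loss in approximation quality caused by grouping. The bin width controls the trade-off: wider bins mean fewer points for the iteration to process but a coarser, more biased set of centers.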

Keywords :

Loss function, Lloyd's algorithm, Voronoi partitions, Grouped data, k-means

Journal title :

Journal of Statistical Planning and Inference

Serial Year :

2009

Record number :

2220336

Link To Document :

https://search.isc.ac/dl/search/defaultta.aspx?DTC=10&DC=2220336