Generating Optimum Number of Clusters Using Median Search and Projection Algorithms

Author

Suresh, Lalith ; Simha, Jay B. ; Veluru, Rajappa

Author_Institution

CSE Dept., CITech, Bangalore, India

fYear

2010

fDate

20-23 April 2010

Firstpage

97

Lastpage

102

Abstract

K-means clustering is an important algorithm for identifying structure in data, and it is one of the simplest clustering algorithms. It takes a predefined number of clusters as input. The "mean" is the average location of all the members of a particular cluster. The algorithm starts from a random selection of cluster centers and iteratively improves the result. In this work, a novel approach to seeding the clusters with the latent data structure is proposed. This is expected to minimize (1) the need to specify the number of clusters a priori and (2) the time to convergence, by providing near-optimal cluster centers. The algorithms are also tested on the latest standard for data warehouses, column store databases.
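
To illustrate the baseline the abstract describes (random initial centers, then iterative refinement), here is a minimal Python sketch of standard k-means. It is not the seeding method proposed in the paper, whose details the abstract does not give. The function name `kmeans`, its parameters, and the sample points are assumptions made for this example.

```python
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Standard k-means on a list of equal-length numeric tuples.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    # Random selection of initial cluster centers (the baseline behaviour).
    centers = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center
        # (squared Euclidean distance).
        new_labels = [
            min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels


# Two well-separated groups of made-up points; k must be supplied up front.
sample_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
sample_centers, sample_labels = kmeans(sample_points, k=2)
```

The sketch shows the two costs the abstract aims to reduce: `k` has to be chosen before the run, and the number of iterations depends on how good the random starting centers happen to be.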

Keywords

data structures; data warehouses; pattern clustering; clustering algorithm; column store databases; data structure; k-means clustering; median search; near optimal cluster centers; projection algorithms; Clustering algorithms; Clustering methods; Conferences; Convergence; Data structures; Iterative algorithms; Partitioning algorithms; Performance analysis; Projection algorithms; Testing; Clustering; DBMS; Median Projection; Median Selection; k-means Algorithm

fLanguage

English

Publisher

IEEE

Conference_Titel

Advanced Information Networking and Applications Workshops (WAINA), 2010 IEEE 24th International Conference on

Conference_Location

Perth, WA

Print_ISBN

978-1-4244-6701-3

Type

conf

DOI

10.1109/WAINA.2010.196

Filename

5480848