DocumentCode
2539384
Title
A Time efficient Pattern Reduction algorithm for k-means based clustering
Author
Tsai, Chun-Wei ; Yang, Chu-Sing ; Chiang, Ming-Chao
Author_Institution
Nat. Sun Yat-sen Univ., Kaohsiung
fYear
2007
fDate
7-10 Oct. 2007
Firstpage
504
Lastpage
509
Abstract
In this paper, we present an efficient algorithm, called pattern reduction (PR) algorithm, to reduce the time required for data clustering based on iterative clustering algorithms. Conceptually similar to a lossy data compression scheme, this algorithm removes at each iteration those data patterns that are close to the centroid of a cluster or remain in the same cluster for a certain number of iterations in a row and are thus unlikely to be moved again from one cluster to another at later iterations by computing a new pattern to represent all the data patterns removed. Our simulation results - from 2 to 1,000 dimensions and 150 to 6,000,000 patterns - indicate that the proposed algorithm can reduce the computation time of k-means, genetic k-means algorithm (GKA) and k-means with genetic algorithm (KGA) from 10% up to about 80% and that for high dimensional data sets, it can even reduce the computation time for more than 70%.
Keywords
data compression; genetic algorithms; iterative methods; pattern clustering; data clustering; genetic k-means algorithm; iterative clustering algorithms; k-means based clustering; lossy data compression; time efficient pattern reduction algorithm; Algorithm design and analysis; Clustering algorithms; Computational modeling; Data compression; Delay; Euclidean distance; Genetic algorithms; Heuristic algorithms; High performance computing; Iterative algorithms;
fLanguage
English
Publisher
ieee
Conference_Titel
Systems, Man and Cybernetics, 2007. ISIC. IEEE International Conference on
Conference_Location
Montreal, Que.
Print_ISBN
978-1-4244-0990-7
Electronic_ISBN
978-1-4244-0991-4
Type
conf
DOI
10.1109/ICSMC.2007.4413608
Filename
4413608
Link To Document