Title :
Genetic Algorithm and Simulated Annealing based Approaches to Categorical Data Clustering
Author :
Saha, Indrajit ; Mukhopadhyay, Anirban
Author_Institution :
Dept. of Inf. Technol., Acad. of Technol., Adisaptagram
Abstract :
Categorical data clustering has been gaining significant attention from researchers because most real-life data sets are categorical in nature. In contrast to a numerical domain, a categorical domain has no natural ordering among its elements. Hence, no inherent distance measure, such as the Euclidean distance, can be used to compute the distance between two categorical objects. In this article, genetic algorithm based and simulated annealing based categorical data clustering algorithms are proposed. The performance of the proposed algorithms is compared with that of several well-known categorical data clustering algorithms and is demonstrated on a variety of artificial and real-life categorical data sets.
Keywords :
category theory; pattern clustering; simulated annealing; Euclidean distance; categorical data clustering; categorical domain; distance measure; genetic algorithm; natural ordering; Clustering algorithms; Computational modeling; Genetic algorithms; Information technology; Land surface temperature; Paper technology; Partitioning algorithms; Region 10; Genetic Algorithm based Clustering; K-medoids Algorithm; Minkowski Score; Simulated Annealing based Clustering; Wilcoxon's rank sum test
Conference_Title :
Industrial and Information Systems, 2008. ICIIS 2008. IEEE Region 10 and the Third International Conference on
Conference_Location :
Kharagpur
Print_ISBN :
978-1-4244-2806-9
Electronic_ISBN :
978-1-4244-2806-9
DOI :
10.1109/ICIINFS.2008.4798335