Title :
A genetic rule-based data clustering toolkit
Author :
Sarafis, I. ; Zalzala, A.M.S. ; Trinder, P.W.
Author_Institution :
Dept. of Comput. & Electr. Eng., Heriot-Watt Univ., Edinburgh, UK
fDate :
2002
Abstract :
Clustering is a hard combinatorial problem, defined as the unsupervised classification of patterns. Clusters are formed by maximizing the similarity between objects in the same cluster while minimizing the similarity between objects in distinct clusters. This paper presents a tool for database clustering using a rule-based genetic algorithm (RBCGA). RBCGA evolves individuals consisting of a fixed set of clustering rules, where each rule includes d non-binary intervals, one for each feature. The investigations attempt to alleviate certain drawbacks of the classical minimization of the square-error criterion by suggesting a flexible fitness function that takes into consideration cluster asymmetry, density, coverage and homogeneity.
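The rule representation described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the names `Rule`, `assign`, `coverage` and `square_error` are hypothetical, the first-matching-rule assignment is an assumption, and the sketch computes only the classical square-error criterion and a simple coverage ratio. The RBCGA fitness function itself is not specified in the abstract and is not reproduced here.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple

# Assumption: an interval is a closed range [low, high] on one feature.
Interval = Tuple[float, float]


@dataclass
class Rule:
    """One clustering rule: d intervals, one per feature (hypothetical layout)."""
    intervals: List[Interval]

    def covers(self, point: Sequence[float]) -> bool:
        # A point matches the rule when every feature lies inside its interval.
        return all(lo <= x <= hi for (lo, hi), x in zip(self.intervals, point))


def assign(individual: List[Rule], points: Sequence[Sequence[float]]) -> List[int]:
    """Label each point with the index of the first rule covering it, or -1."""
    labels = []
    for p in points:
        label = -1
        for k, rule in enumerate(individual):
            if rule.covers(p):
                label = k
                break
        labels.append(label)
    return labels


def coverage(labels: Sequence[int]) -> float:
    """Fraction of points covered by at least one rule."""
    return sum(1 for label in labels if label >= 0) / len(labels)


def square_error(points: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Classical criterion: summed squared distance of covered points to their cluster centroid."""
    clusters: Dict[int, List[Sequence[float]]] = {}
    for p, label in zip(points, labels):
        if label >= 0:
            clusters.setdefault(label, []).append(p)
    total = 0.0
    for members in clusters.values():
        d = len(members[0])
        centroid = [sum(m[j] for m in members) / len(members) for j in range(d)]
        total += sum((m[j] - centroid[j]) ** 2 for m in members for j in range(d))
    return total
```

In this sketch an individual is simply a fixed-length list of `Rule` objects. A genetic algorithm would evolve the interval bounds and score each individual with a fitness function; the abstract states that RBCGA's function also accounts for asymmetry, density and homogeneity, which this sketch does not attempt.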
Keywords :
data mining; database theory; genetic algorithms; knowledge based systems; least mean squares methods; pattern clustering; very large databases; cluster asymmetry; combinatorial problem; data mining; database clustering; flexible fitness function; genetic rule-based data clustering toolkit; huge databases; minimization; nonbinary intervals; object similarity; rule-based genetic algorithm; square-error criterion; unsupervised pattern classification; Clustering algorithms; Computational efficiency; Data analysis; Data mining; Delta modulation; Genetic algorithms; Gravity; Multidimensional systems; Partitioning algorithms; Spatial databases;
Conference_Title :
Evolutionary Computation, 2002. CEC '02. Proceedings of the 2002 Congress on
Conference_Location :
Honolulu, HI
Print_ISBN :
0-7803-7282-4
DOI :
10.1109/CEC.2002.1004420