Title :
Clustering large data sets based on data compression technique and weighted quality measures
Author :
Sassi, M. ; Grissa, A.
Author_Institution :
TIC Dept., Nat. Sch. of Eng. of Tunis, Tunis, Tunisia
Abstract :
Various algorithms have been proposed for clustering large data sets in both the hard and fuzzy cases, but much less work has been done on automatic clustering approaches, in which the number of clusters is unknown to the user. These approaches need measures, called validity functions, to evaluate the clustering result and to give the user the optimal number of clusters. To obtain this number, three conditions are necessary: (a) a good compression technique for data reduction under a limited memory allocation, (b) good measures for evaluating the goodness of clusters as the number of clusters varies, and (c) a good clustering algorithm that can automatically produce the number of clusters and takes the compression technique used into account. In this paper, we propose new clustering approaches that use a new compression technique based on quality measures.
Keywords :
data compression; data reduction; pattern clustering; very large databases; large data set clustering; memory allocation; validity function; weighted quality measures; Clustering algorithms; Data structures; Databases; Fuzzy sets; Tellurium; Weight measurement;
Conference_Title :
Fuzzy Systems, 2009. FUZZ-IEEE 2009. IEEE International Conference on
Conference_Location :
Jeju Island
Print_ISBN :
978-1-4244-3596-8
Electronic_ISBN :
1098-7584
DOI :
10.1109/FUZZY.2009.5277208