Title :
Supervised clustering - algorithms and benefits
Author :
Eick, Christoph F. ; Zeidat, Nidal ; Zhao, Zhenghong
Author_Institution :
Dept. of Comput. Sci., Houston Univ., TX, USA
Abstract :
This work centers on a novel data mining technique we term supervised clustering. Unlike traditional clustering, supervised clustering assumes that the examples are classified and aims to identify class-uniform clusters that have high probability densities. Four representative-based algorithms for supervised clustering are introduced: SRIDHCR, a greedy algorithm with random restart that searches for solutions by inserting single objects into and removing single objects from the current solution; SPAM, a variation of the clustering algorithm PAM; SCEC, an evolutionary computing algorithm; and TDS, a fast medoid-based top-down splitting algorithm. The four algorithms were evaluated using a benchmark consisting of four UCI machine learning data sets. In general, "greedy" algorithms such as SPAM, SRIDHCR, and TDS do not appear to perform particularly well for supervised clustering and seem to terminate prematurely too often. We also briefly describe the applications of supervised clustering.
Keywords :
data mining; evolutionary computation; greedy algorithms; learning (artificial intelligence); pattern clustering; very large databases; evolutionary computing algorithm; greedy algorithm; machine learning data sets; supervised clustering; top-down splitting algorithm; Clustering algorithms; Computer science; Data mining; Greedy algorithms; Impurities; Machine learning; Machine learning algorithms; Partitioning algorithms; Unsolicited electronic mail; Unsupervised learning;
Conference_Title :
Tools with Artificial Intelligence, 2004. ICTAI 2004. 16th IEEE International Conference on
Print_ISBN :
0-7695-2236-X
DOI :
10.1109/ICTAI.2004.111