DocumentCode
2222757
Title
Supervised clustering - algorithms and benefits
Author
Eick, Christoph F. ; Zeidat, Nidal ; Zhao, Zhenghong
Author_Institution
Dept. of Comput. Sci., Houston Univ., TX, USA
fYear
2004
fDate
15-17 Nov. 2004
Firstpage
774
Lastpage
776
Abstract
This work centers on a novel data mining technique we term supervised clustering. Unlike traditional clustering, supervised clustering assumes that the examples are classified and has the goal of identifying class-uniform clusters that have high probability densities. Four representative-based algorithms for supervised clustering are introduced: SRIDHCR, a greedy algorithm with random restart that searches for solutions by inserting and removing single objects from the current solution; SPAM, a variation of the clustering algorithm PAM; SCEC, an evolutionary computing algorithm; and TDS, a fast medoid-based top-down splitting algorithm. The four algorithms were evaluated using a benchmark consisting of four UCI machine learning data sets. In general, it seems that "greedy" algorithms, such as SPAM, SRIDHCR, and TDS, do not perform particularly well for supervised clustering and seem to terminate prematurely too often. We also briefly describe the applications of supervised clustering.
Keywords
data mining; evolutionary computation; greedy algorithms; learning (artificial intelligence); pattern clustering; very large databases; evolutionary computing algorithm; greedy algorithm; machine learning data sets; supervised clustering; top-down splitting algorithm; Clustering algorithms; Computer science; Data mining; Greedy algorithms; Impurities; Machine learning; Machine learning algorithms; Partitioning algorithms; Unsolicited electronic mail; Unsupervised learning
fLanguage
English
Publisher
IEEE
Conference_Titel
Tools with Artificial Intelligence, 2004. ICTAI 2004. 16th IEEE International Conference on
ISSN
1082-3409
Print_ISBN
0-7695-2236-X
Type
conf
DOI
10.1109/ICTAI.2004.111
Filename
1374270