DocumentCode :
3166879
Title :
Classification-Based Clustering Evaluation
Author :
Whissell, John S. ; Clarke, Charles L. A.
Author_Institution :
David R. Cheriton Sch. of Comput. Sci., Univ. of Waterloo, Waterloo, ON, Canada
fYear :
2013
fDate :
7-10 Dec. 2013
Firstpage :
1229
Lastpage :
1234
Abstract :
The evaluation of clustering quality has proven to be a difficult task. While it is generally agreed that application-specific human assessment can provide a reasonable gold standard for clustering evaluation, the use of human assessors is not practical in many real situations. As a result, machine-computable internal clustering quality measures (CQMs) are often used in the evaluation process. However, CQMs have their own drawbacks. Despite their extensive use in clustering research and applications, many CQMs have been shown to lack generality. In this paper we present a new CQM with general applicability. The basis of our CQM is a pattern recognition view of clustering's purpose: the unsupervised prediction of behavior from populations. This purpose translates naturally into our new classifier-based CQM, which we refer to as informativeness. We show that informativeness can satisfy core CQM axioms defined in prior research. Additionally, we provide experimental support, showing that informativeness can outperform many established CQMs by detecting a larger variety of meaningful structures across a range of synthetic datasets, while at the same time exhibiting good performance on each individual dataset. Our results indicate that informativeness provides a highly general and effective CQM.
Keywords :
human factors; pattern classification; pattern clustering; CQM axioms; application-specific human assessment; classification-based clustering evaluation; clustering quality evaluation; human assessors; machine computable internal clustering quality measures; pattern recognition; synthetic datasets; unsupervised behavior prediction; Clustering algorithms; Estimation; Euclidean distance; Labeling; Radio frequency; Sociology; Statistics; clustering method
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Data Mining (ICDM), 2013 IEEE 13th International Conference on
Conference_Location :
Dallas, TX
ISSN :
1550-4786
Type :
conf
DOI :
10.1109/ICDM.2013.28
Filename :
6729626