Title :
Class discovery based on K-means clustering and perturbation analysis
Author :
Xiaohu Ru;Zheng Liu;Zhitao Huang;Wenli Jiang
Author_Institution :
College of Electronic Science and Engineering, National University of Defense Technology, Changsha, Hunan, 410073, P.R. China
Abstract :
Class discovery, which aims to identify the underlying category structure of a dataset, is an important problem in pattern recognition and knowledge discovery. Its key task is estimating the number of classes. Classical estimation approaches usually suffer from low accuracy, high complexity, or difficulty in choosing an appropriate penalty function. This paper proposes an effective class discovery method. The method first uses the characteristics of the mean squared error produced by k-means clustering to obtain a coarse estimate of the number of classes, and then measures the difference between the clustering results obtained from the original dataset and from a perturbed dataset to determine the actual number of classes. Experiments on simulated and real-world data demonstrate that the proposed method performs satisfactorily in different situations. Moreover, the method depends only weakly on manually selected parameters and can therefore be applied reliably across a wide range of applications.
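The two-stage idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not give the exact coarse-estimation rule or the perturbation measure, so the choices below are assumptions made purely for illustration. Specifically, the farthest-first initialization, the "large MSE drop" candidate rule (`drop_frac`), the Gaussian perturbation (`noise`), the Rand index as the agreement measure, and the threshold `min_stability` are all hypothetical.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centers):
    # Index of the nearest center for every point.
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]

def kmeans(points, k, rng, iters=50, restarts=3):
    """Lloyd's k-means with farthest-first init (an assumption, not from the paper).
    Returns (labels, mse) of the restart with the lowest mean squared error."""
    best = None
    for _ in range(restarts):
        centers = [rng.choice(points)]
        while len(centers) < k:
            centers.append(max(points,
                               key=lambda p: min(sq_dist(p, c) for c in centers)))
        for _ in range(iters):
            labels = assign(points, centers)
            new_centers = []
            for j in range(k):
                members = [p for p, l in zip(points, labels) if l == j]
                if members:
                    new_centers.append(
                        tuple(sum(c) / len(members) for c in zip(*members)))
                else:
                    new_centers.append(centers[j])  # keep an empty cluster's center
            if new_centers == centers:
                break
            centers = new_centers
        labels = assign(points, centers)
        mse = sum(sq_dist(p, centers[l]) for p, l in zip(points, labels)) / len(points)
        if best is None or mse < best[1]:
            best = (labels, mse)
    return best

def rand_index(a, b):
    """Fraction of point pairs on which two partitions agree."""
    n = len(a)
    agree = sum((a[i] == a[j]) == (b[i] == b[j])
                for i in range(n) for j in range(i + 1, n))
    return agree / (n * (n - 1) / 2)

def estimate_num_classes(points, k_max=6, drop_frac=0.1, noise=0.1,
                         min_stability=0.9, trials=3, seed=0):
    rng = random.Random(seed)
    fits = {k: kmeans(points, k, rng) for k in range(1, k_max + 1)}
    mse = {k: fits[k][1] for k in fits}
    # Stage 1 (assumed rule): keep every k whose MSE drop is still substantial.
    candidates = [k for k in range(2, k_max + 1)
                  if mse[k - 1] - mse[k] >= drop_frac * mse[1]]
    # Stage 2 (assumed rule): keep the largest candidate whose clustering
    # stays nearly unchanged when the data are slightly perturbed.
    best_k = 1
    for k in candidates:
        scores = []
        for _ in range(trials):
            noisy = [tuple(x + rng.gauss(0, noise) for x in p) for p in points]
            scores.append(rand_index(fits[k][0], kmeans(noisy, k, rng)[0]))
        if sum(scores) / trials >= min_stability:
            best_k = k
    return best_k

# Usage on synthetic data: three well-separated 2-D Gaussian blobs.
data_rng = random.Random(1)
blobs = [(0, 0), (10, 0), (0, 10)]
data = [(cx + data_rng.gauss(0, 0.5), cy + data_rng.gauss(0, 0.5))
        for cx, cy in blobs for _ in range(30)]
print(estimate_num_classes(data))
```

The sketch separates the two roles the abstract assigns to each stage: the MSE curve only narrows the search to a few plausible values of k, and the perturbation comparison decides among them.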
Keywords :
"Indexes","Fitting","Perturbation methods","Cost function","Estimation","Robustness","Complexity theory"
Conference_Title :
Image and Signal Processing (CISP), 2015 8th International Congress on
DOI :
10.1109/CISP.2015.7408070