Title :
How many clusters?: A Ying-Yang machine based theory for a classical open problem in pattern recognition
Author_Institution :
Dept. of Comput. Sci., Chinese Univ. of Hong Kong, Shatin, Hong Kong
Abstract :
Determining the number of clusters in classical mean square error (MSE) clustering analysis (e.g., by the well-known k-means algorithm) and determining the number of Gaussians in a finite Gaussian mixture (e.g., by the EM algorithm) are well-known model selection problems that play important roles in unsupervised pattern recognition. The problem has remained open for decades because, apart from some heuristic techniques, no appropriate theory exists for solving it. This paper presents a theory for solving this problem based on the Ying-Yang machine, a Bayesian-Kullback learning scheme for unified learning (Xu, 1995, 1996). From this theory, we obtain criteria for selecting the correct number of clusters in MSE clustering or the correct number of Gaussians in a Gaussian mixture. In addition, an automatic procedure is designed for a fast implementation of the selection. Experimental results are provided to demonstrate the success of the approach.
Keywords :
approximation theory; pattern recognition; unsupervised learning; Bayesian-Kullback learning scheme; Ying-Yang machine based theory; classical mean square error; clustering analysis; finite Gaussian mixture; heuristic techniques; k-mean algorithm; model selection problems; unified learnings; Algorithm design and analysis; Bayesian methods; Clustering algorithms; Computer science; Gaussian processes; Hidden Markov models; Machine learning; Mean square error methods; Pattern recognition; Predictive models
Conference_Title :
Neural Networks, 1996., IEEE International Conference on
Conference_Location :
Washington, DC
Print_ISBN :
0-7803-3210-5
DOI :
10.1109/ICNN.1996.549130