DocumentCode :
671419
Title :
Robust EM algorithm for model-based curve clustering
Author :
Chamroukhi, Faicel
Author_Institution :
Inf. Sci. & Syst. Lab. (LSIS), Univ. of the South Toulon-Var (USTV), La Garde, France
fYear :
2013
fDate :
4-9 Aug. 2013
Firstpage :
1
Lastpage :
8
Abstract :
Model-based clustering approaches concern the paradigm of exploratory data analysis relying on the finite mixture model to automatically find a latent structure governing observed data. They are among the most popular and successful approaches in cluster analysis. The mixture density estimation is generally performed by maximizing the observed-data log-likelihood using the expectation-maximization (EM) algorithm. However, it is well known that the initialization of the EM algorithm is crucial. In addition, the standard EM algorithm requires the number of clusters to be known a priori. Some solutions have been provided in [31], [12] for model-based clustering with Gaussian mixture models for multivariate data. In this paper we focus on model-based curve clustering approaches based on regression mixtures, where the data are curves rather than vectorial data. We propose a new robust EM algorithm for clustering curves. We extend the model-based clustering approach presented in [31] for Gaussian mixture models to the case of curve clustering by regression mixtures, including polynomial regression mixtures as well as spline or B-spline regression mixtures. Our approach handles both the problem of initialization and that of choosing the optimal number of clusters as the EM learning proceeds, rather than in a two-fold scheme. This is achieved by optimizing a penalized log-likelihood criterion. A simulation study confirms the potential benefit of the proposed algorithm in terms of robustness to initialization and of finding the actual number of clusters.
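For orientation, the baseline that the abstract starts from can be sketched as follows: a standard EM fit of a polynomial regression mixture to a set of curves, with the number of clusters K fixed in advance and a random initialization. This is not the robust algorithm proposed in the paper, whose penalized criterion is not given in this record. The function name `em_polyreg_mixture`, its parameters, the assumption that all curves share the same sampling points, and the component-specific Gaussian noise variance are illustrative assumptions.

```python
import numpy as np


def em_polyreg_mixture(x, Y, K, degree=2, n_iter=100, seed=0):
    """Standard EM for a K-component polynomial regression mixture (sketch).

    x : (m,) sampling points shared by all curves (assumption).
    Y : (n, m) array, one observed curve per row.
    Returns mixing proportions, regression coefficients, noise variances,
    posterior memberships and the log-likelihood trace.
    """
    rng = np.random.default_rng(seed)
    n, m = Y.shape
    X = np.vander(x, degree + 1, increasing=True)  # (m, degree + 1) design matrix
    X_pinv = np.linalg.pinv(X)                     # least-squares solver, reused

    # Random hard partition as the starting point (EM is sensitive to this).
    tau = np.zeros((n, K))
    tau[np.arange(n), rng.integers(K, size=n)] = 1.0

    loglik_trace = []
    for _ in range(n_iter):
        # M-step: update proportions, coefficients and variances.
        pi = np.clip(tau.sum(axis=0) / n, 1e-12, None)
        beta = np.empty((K, degree + 1))
        sigma2 = np.empty(K)
        sq_err = np.empty((n, K))
        for k in range(K):
            w = tau[:, k]
            w_sum = max(w.sum(), 1e-12)
            # With a shared design matrix, weighted least squares reduces to
            # regressing the weighted mean curve on X.
            beta[k] = X_pinv @ ((w @ Y) / w_sum)
            sq_err[:, k] = ((Y - X @ beta[k]) ** 2).sum(axis=1)
            sigma2[k] = max((w @ sq_err[:, k]) / (w_sum * m), 1e-8)

        # E-step: posterior cluster memberships, computed in log space.
        log_joint = (np.log(pi)
                     - 0.5 * m * np.log(2 * np.pi * sigma2)
                     - 0.5 * sq_err / sigma2)
        top = log_joint.max(axis=1, keepdims=True)
        log_marg = top[:, 0] + np.log(np.exp(log_joint - top).sum(axis=1))
        tau = np.exp(log_joint - log_marg[:, None])
        loglik_trace.append(log_marg.sum())  # observed-data log-likelihood

    return pi, beta, sigma2, tau, loglik_trace
```

In this sketch both K and the random seed must be chosen by the user, which are the two limitations the abstract says the proposed algorithm is designed to remove.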
Keywords :
data analysis; expectation-maximisation algorithm; learning (artificial intelligence); mixture models; pattern clustering; polynomial approximation; regression analysis; splines (mathematics); B-spline regression mixture; EM learning; Gaussian mixture model; cluster analysis; expectation-maximization algorithm; exploratory data analysis; finite mixture model; latent structure; mixture density estimation; model-based curve clustering; observed-data log-likelihood maximization; penalized log-likelihood criterion; polynomial regression mixture; robust EM algorithm; Algorithm design and analysis; Clustering algorithms; Data models; Polynomials; Robustness; Splines (mathematics); Vectors;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Neural Networks (IJCNN), The 2013 International Joint Conference on
Conference_Location :
Dallas, TX
ISSN :
2161-4393
Print_ISBN :
978-1-4673-6128-6
Type :
conf
DOI :
10.1109/IJCNN.2013.6706758
Filename :
6706758