Title :
Optimized data-driven order selection method for Gaussian mixtures on clustering problems
Author :
Corona, Enrique ; Nutter, Brian ; Mitra, Sunanda
Author_Institution :
Dept. of Electr. & Comput. Eng., Texas Tech Univ., Lubbock, TX, USA
Abstract :
Perhaps the most fundamental consideration when modeling data as a mixture of Gaussians is the number of components in the mixture. To this end, numerous approaches have been proposed, ranging from the classic use of statistical hypothesis testing to the balancing of model Goodness-of-Fit (GoF) against model complexity. In this paper, we explore an existing simple yet powerful order selection method developed in the field of information theory, the Jump method. This method infers the model order by estimating, transforming, and analyzing a description of the distortion-rate function, R(D), of the input data. The description of the R(D) curve is efficiently estimated through the popular K-means clustering algorithm using proper seeding techniques. The proposed adaptations to the Jump method allow for higher sensitivity and improved performance at low dimensionality. These adaptations are experimentally tested in a clustering setting with synthetic and natural data. The results suggest better performance than the original version.
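The estimate, transform, and analyze steps mentioned in the abstract can be illustrated with a minimal sketch of the original Jump method, not the adaptations proposed in the paper, whose details are not given here. The sketch assumes the commonly used transformation exponent Y = p/2 (p being the data dimensionality), a per-dimension mean squared error as the distortion, and a small NumPy K-means with k-means++ style seeding. The function names (`kmeans_pp_seeds`, `kmeans_distortion`, `jump_order`) and all parameter defaults are hypothetical choices made for illustration.

```python
import numpy as np

def kmeans_pp_seeds(X, k, rng):
    """k-means++ style seeding: pick new centers with probability
    proportional to squared distance from the nearest chosen center."""
    centers = [X[rng.integers(len(X))]]
    for _ in range(1, k):
        diff = X[:, None, :] - np.array(centers)[None, :, :]
        d2 = (diff ** 2).sum(-1).min(axis=1)
        centers.append(X[rng.choice(len(X), p=d2 / d2.sum())])
    return np.array(centers)

def kmeans_distortion(X, k, rng, n_iter=50, n_init=5):
    """Lowest per-dimension mean squared error over several K-means runs."""
    p = X.shape[1]
    best = np.inf
    for _ in range(n_init):
        C = kmeans_pp_seeds(X, k, rng)
        for _ in range(n_iter):
            d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)
            labels = d2.argmin(axis=1)
            # Keep the old center if a cluster becomes empty.
            new_C = np.array([X[labels == j].mean(axis=0)
                              if np.any(labels == j) else C[j]
                              for j in range(k)])
            if np.allclose(new_C, C):
                break
            C = new_C
        d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)
        best = min(best, d2.min(axis=1).mean() / p)
    return best

def jump_order(X, k_max, seed=0):
    """Return the K in 1..k_max with the largest jump in transformed distortion."""
    rng = np.random.default_rng(seed)
    Y = X.shape[1] / 2.0                 # assumed exponent, Y = p/2
    transformed = np.zeros(k_max + 1)    # transformed[0] = 0 by convention
    for k in range(1, k_max + 1):
        transformed[k] = kmeans_distortion(X, k, rng) ** (-Y)
    jumps = np.diff(transformed)         # jumps[k-1] = t[k] - t[k-1]
    return int(np.argmax(jumps)) + 1, jumps

if __name__ == "__main__":
    # Synthetic example: three well-separated 2-D Gaussian clusters.
    data_rng = np.random.default_rng(42)
    means = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
    X = np.vstack([m + 0.5 * data_rng.standard_normal((100, 2)) for m in means])
    k_hat, jumps = jump_order(X, k_max=6)
    print("selected order:", k_hat)
```

The distortion is evaluated for each candidate K, raised to the power -Y, and the order is taken where the transformed curve jumps the most. On well-separated clusters such as the synthetic data above, the largest jump would be expected at the true number of clusters.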
Keywords :
Gaussian processes; pattern clustering; Gaussian mixtures; clustering problems; data driven order selection method; distortion rate function; jump method; model complexity; model goodness-of-fit; Clustering algorithms; Data compression; Distortion measurement; Information theory; Optimization methods; Performance loss; Random variables; Rate-distortion; Testing; Vector quantization; K-means clustering; Lossy data compression; Model order identification; Rate-distortion theory
Conference_Title :
Image Analysis & Interpretation (SSIAI), 2010 IEEE Southwest Symposium on
Conference_Location :
Austin, TX
Print_ISBN :
978-1-4244-7801-9
DOI :
10.1109/SSIAI.2010.5483914