Title :
Mixture Modeling and Information Criteria for Discovering Patterns in Continuous Data
Author :
Fonseca, Jaime R S
Author_Institution :
Tech. Univ. of Lisbon, Lisbon
Abstract :
This study assesses how well several theoretical information criteria perform when finite mixture modeling is used to discover patterns in continuous data. Selecting an adequate number of clusters is a key issue in deriving complex mixture structures, so the information criteria used for this purpose should be effective. To choose among the criteria that may support selection of the correct number of clusters, we conduct a simulation study to determine which are most appropriate for mixture model selection on data sets with only continuous clustering base variables. The BIC criterion performs best: for mixtures of continuous clustering base variables, it indicates the correct number of clusters in the simulated structures more often than the other criteria.
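As an illustration of the kind of model selection the abstract describes, the following is a minimal sketch and not the study's own procedure. It fits one-dimensional Gaussian mixtures by EM for several candidate numbers of clusters and keeps the one with the lowest BIC. The function names (`fit_gmm_1d`, `bic`, `select_k`), the univariate Gaussian components, the quantile-based initialization, the variance floor, and the synthetic data are all assumptions made for this example. BIC is used in its standard form, -2 log L + p log n, with p = 3k - 1 free parameters for a k-component univariate mixture.

```python
import math
import random


def _e_step(x, weights, means, variances):
    """Return (log-likelihood, responsibilities) under the current parameters."""
    loglik, resp = 0.0, []
    for v in x:
        logp = [
            math.log(w) - 0.5 * math.log(2 * math.pi * s2) - (v - m) ** 2 / (2 * s2)
            for w, m, s2 in zip(weights, means, variances)
        ]
        top = max(logp)
        lse = top + math.log(sum(math.exp(lp - top) for lp in logp))  # log-sum-exp
        loglik += lse
        resp.append([math.exp(lp - lse) for lp in logp])
    return loglik, resp


def fit_gmm_1d(x, k, n_iter=200, var_floor=1e-3):
    """Fit a k-component 1-D Gaussian mixture by EM and return its log-likelihood."""
    n = len(x)
    xs = sorted(x)
    # Hypothetical deterministic start: split the sorted data into k equal chunks.
    chunks = [xs[i * n // k:(i + 1) * n // k] for i in range(k)]
    means = [sum(c) / len(c) for c in chunks]
    variances = [
        max(sum((v - m) ** 2 for v in c) / len(c), var_floor)
        for c, m in zip(chunks, means)
    ]
    weights = [len(c) / n for c in chunks]
    for _ in range(n_iter):
        _, resp = _e_step(x, weights, means, variances)
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)  # guard against empty components
            means[j] = sum(r[j] * v for r, v in zip(resp, x)) / nj
            variances[j] = max(
                sum(r[j] * (v - means[j]) ** 2 for r, v in zip(resp, x)) / nj,
                var_floor,  # floor avoids the degenerate zero-variance solution
            )
            weights[j] = nj / n
    loglik, _ = _e_step(x, weights, means, variances)
    return loglik


def bic(loglik, k, n):
    """Standard BIC: k means + k variances + (k - 1) free weights = 3k - 1 parameters."""
    return -2.0 * loglik + (3 * k - 1) * math.log(n)


def select_k(x, k_max):
    """Return (k with the lowest BIC, dict of BIC values for k = 1..k_max)."""
    bics = {k: bic(fit_gmm_1d(x, k), k, len(x)) for k in range(1, k_max + 1)}
    return min(bics, key=bics.get), bics


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic data (an assumption): two well-separated clusters.
    data = [rng.gauss(0.0, 1.0) for _ in range(200)]
    data += [rng.gauss(10.0, 1.0) for _ in range(200)]
    best_k, bics = select_k(data, 3)
    print(best_k, bics)
```

A simulation study in the spirit of the abstract would repeat this over many generated data sets and count how often each criterion recovers the simulated number of clusters; other criteria could be compared by swapping out the `bic` function.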
Keywords :
data mining; pattern clustering; base variable clustering; continuous data; data set; finite mixture modeling; information criteria; mixture model selection; pattern discovery; Clustering algorithms; Data mining; Hybrid intelligent systems; Information analysis; Maximum likelihood estimation; Probability distribution; Proposals; Unsupervised learning; Continuous Clustering Base Variables; Finite Mixture Models; Model Selection; Patterns in Continuous Data; Quantitative Methods; Simulation experiments; Theoretical Information Criteria;
Conference_Title :
Hybrid Intelligent Systems, 2008. HIS '08. Eighth International Conference on
Conference_Location :
Barcelona
Print_ISBN :
978-0-7695-3326-1
Electronic_ISBN :
978-0-7695-3326-1
DOI :
10.1109/HIS.2008.32