Keywords:
Probabilistic Clustering Model, Probabilistic Clustering, Bayesian Information Criterion
English abstract:
One of the most important problems in the analysis of multivariate data is finding the relationships between variables. The easiest way to understand these relationships is through scatter plots. In medicine, genetics, and other fields, the goal of data analysis is often to cluster the data into homogeneous groups. Most ad hoc methods, such as hierarchical and nonhierarchical clustering, are based on maximizing within-group similarity. Because these methods depend on how the distance between two clusters is defined, and because the number of clusters is determined by an arbitrary threshold, researchers have difficulty choosing the best criterion for maximizing within-group similarity. The goal of this paper is to present a new method, "Probabilistic Clustering Model Selection Using Bayesian Information Criterion (BIC)". First, its structure is free from subjective assumptions about similarity. Second, because the method is model-based, the spectral decomposition of the covariance matrix yields criteria that describe the volume, shape, and orientation of the clusters. Furthermore, the best clustering model can be selected using BIC.
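
The abstract does not state the decomposition or the form of BIC it uses. As a hedged sketch, assuming the standard eigenvalue parameterization from the model-based clustering literature, the covariance matrix of cluster k and the criterion for a candidate model M could be written as follows. All symbols here are assumptions introduced for illustration and may differ from the paper's own notation.

```latex
% Assumed notation (not taken from the abstract):
%   \Sigma_k  : covariance matrix of cluster k
%   \lambda_k : positive scalar controlling the volume of cluster k
%   A_k       : diagonal matrix with \det A_k = 1, controlling the shape
%   D_k       : orthogonal matrix of eigenvectors, controlling the orientation
\[
  \Sigma_k = \lambda_k \, D_k A_k D_k^{\top}
\]

% BIC for a candidate model M, in the sign convention where larger is better:
%   \hat{L}_M : maximized likelihood of model M
%   m_M       : number of free parameters in model M
%   n         : number of observations
\[
  \mathrm{BIC}_M = 2 \log \hat{L}_M - m_M \log n
\]
```

Under this convention, constraining some of the three factors to be equal across clusters gives a family of candidate models, and the model with the largest BIC is preferred. Some references define BIC with the opposite sign, in which case the smallest value is preferred.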