Title :
A Bayesian approach to unsupervised feature selection and density estimation using expectation propagation
Author :
Chang, Shaorong ; Dasgupta, Nilanjan ; Carin, Lawrence
Author_Institution :
Dept. of Electr. & Comput. Eng., Duke Univ., Durham, NC, USA
Abstract :
We propose an approximate Bayesian approach for unsupervised feature selection and density estimation, where the importance of the features for clustering is used as the measure for feature selection. Traditional maximum-likelihood (ML) model-parameter optimization schemes estimate the feature saliencies for a fixed model structure (i.e., a fixed number of clusters). In practice, the number of clusters present in the data for mixture-based modeling is unknown. In an ML framework, the number of clusters typically needs to be ascertained prior to estimating the feature saliencies. We propose a density estimation scheme that addresses model complexity (number of clusters present) and model-parameter estimation (feature saliencies) in a single optimization framework. The approximate Bayesian approach presented here, based on the expectation propagation method, obtains a full posterior distribution on the saliency of the features, along with full posterior distributions of the other model parameters (including the number of clusters) that represent the underlying statistics of the data. The performance of the algorithm is analyzed based on its ability to identify the features salient for clustering the multivariate data.
Keywords :
belief networks; feature extraction; image representation; maximum likelihood estimation; optimisation; parameter estimation; pattern clustering; unsupervised learning; Bayesian approach; density estimation; expectation propagation; feature clustering; feature saliency; full posterior distribution; maximum-likelihood estimate; mixture-based modeling; model complexity; model-parameter estimation; model-parameter optimization; multivariate data; unsupervised feature selection; Bayesian methods; Clustering algorithms; Density measurement; Electric variables measurement; Maximum likelihood estimation; Parametric statistics; Probability distribution; Sampling methods; Statistical distributions; Training data;
Conference_Title :
Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on
Print_ISBN :
0-7695-2372-2
DOI :
10.1109/CVPR.2005.15