• DocumentCode
    807953
  • Title
    EM in high-dimensional spaces
  • Author
    Draper, Bruce A. ; Elliott, Daniel L. ; Hayes, Jeremy ; Baek, Kyungim
  • Author_Institution
    Comput. Sci. Dept., Colorado State Univ., Fort Collins, CO, USA
  • Volume
    35
  • Issue
    3
  • fYear
    2005
  • fDate
    6/1/2005 12:00:00 AM
  • Firstpage
    571
  • Lastpage
    577
  • Abstract
    This paper considers fitting a mixture-of-Gaussians model to high-dimensional data in scenarios where there are fewer data samples than feature dimensions. Issues that arise when using principal component analysis (PCA) to represent Gaussian distributions inside expectation-maximization (EM) are addressed, and a practical algorithm results. Unlike other algorithms that have been proposed, this algorithm does not try to compress the data to fit low-dimensional models. Instead, it models Gaussian distributions in the (N-1)-dimensional space spanned by the N data samples. We show that this algorithm converges on data sets where low-dimensional techniques do not.
  • Keywords
    image classification; maximum likelihood estimation; optimisation; principal component analysis; unsupervised learning; Gaussian distributions; Gaussians model; PCA; expectation-maximization; high-dimensional data; Clustering algorithms; Covariance matrix; Eigenvalues and eigenfunctions; Gaussian distribution; Gaussian processes; Image coding; Image converters; Pixel; Algorithms; Artificial Intelligence; Cluster Analysis; Computer Simulation; Image Enhancement; Image Interpretation, Computer-Assisted; Information Storage and Retrieval; Likelihood Functions; Models, Biological; Models, Statistical; Pattern Recognition, Automated; Reproducibility of Results; Sensitivity and Specificity; Signal Processing, Computer-Assisted; Subtraction Technique
  • fLanguage
    English
  • Journal_Title
    IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
  • Publisher
    IEEE
  • ISSN
    1083-4419
  • Type
    jour
  • DOI
    10.1109/TSMCB.2005.846670
  • Filename
    1430841