• DocumentCode
    44937
  • Title
    Outlier-Robust PCA: The High-Dimensional Case
  • Author
    Xu, Huan ; Caramanis, Constantine ; Mannor, Shie
  • Author_Institution
    Dept. of Mech. Eng., Nat. Univ. of Singapore, Singapore, Singapore
  • Volume
    59
  • Issue
    1
  • fYear
    2013
  • fDate
    Jan. 2013
  • Firstpage
    546
  • Lastpage
    572
  • Abstract
    Principal component analysis plays a central role in statistics, engineering, and science. Because of the prevalence of corrupted data in real-world applications, much research has focused on developing robust algorithms. Perhaps surprisingly, these algorithms are unequipped (indeed, unable) to deal with outliers in the high-dimensional setting where the number of observations is of the same magnitude as the number of variables of each observation, and the dataset contains some (arbitrarily) corrupted observations. We propose a high-dimensional robust principal component analysis algorithm that is efficient, robust to contaminated points, and easily kernelizable. In particular, our algorithm achieves maximal robustness: it has a breakdown point of 50% (the best possible), while all existing algorithms have a breakdown point of zero. Moreover, our algorithm recovers the optimal solution exactly in the case where the number of corrupted points grows sublinearly in the dimension.
  • Keywords
    data handling; principal component analysis; contaminated points; corrupted data; corrupted points; dataset; high-dimensional case; high-dimensional robust principal component analysis algorithm; optimal solution; outlier-robust PCA; real-world applications; robust algorithms; Approximation algorithms; Electric breakdown; Matrix decomposition; Noise; Principal component analysis; Robustness; Standards; Dimension reduction; outlier; principal component analysis (PCA); robustness; statistical learning;
  • fLanguage
    English
  • Journal_Title
    Information Theory, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9448
  • Type
    jour
  • DOI
    10.1109/TIT.2012.2212415
  • Filename
    6307864