DocumentCode :
3688625
Title :
Randomized robust subspace recovery for big data
Author :
Mostafa Rahmani;George K. Atia
Author_Institution :
Department of Electrical Engineering and Computer Science, University of Central Florida, Orlando, FL, USA
fYear :
2015
Firstpage :
1
Lastpage :
6
Abstract :
This paper proposes a randomized PCA algorithm that is robust to the presence of outliers and whose complexity is independent of the dimension of the given data matrix. Using random sampling and random embedding techniques, the given data matrix is reduced to a small set of compressed data. A subspace learning approach is proposed to extract the column subspace of the low-rank matrix from the compressed data. Two ideas for robust subspace learning are proposed, each working under a different model assumption. The first is based on the linear dependence between the columns of the low-rank matrix; the second is based on the independence between the column subspace of the low-rank matrix and the subspace spanned by the outlying columns. We derive sufficient conditions that guarantee the performance of the proposed approach with high probability. It is shown that the proposed algorithm can successfully identify the outliers using only roughly O(r^2) random linear data observations, where r is the rank of the low-rank matrix, and that it provably achieves notable speedups in comparison to existing approaches.
Keywords :
"Robustness","Principal component analysis","Data models","Sparse matrices","Algorithm design and analysis","Optimization","Yttrium"
Publisher :
IEEE
Conference_Title :
Machine Learning for Signal Processing (MLSP), 2015 IEEE 25th International Workshop on
Type :
conf
DOI :
10.1109/MLSP.2015.7324346
Filename :
7324346