  • DocumentCode
    6031
  • Title
    Subspace Learning and Imputation for Streaming Big Data Matrices and Tensors
  • Author
    Mardani, Morteza ; Mateos, Gonzalo ; Giannakis, Georgios B.
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Univ. of Minnesota, Minneapolis, MN, USA
  • Volume
    63
  • Issue
    10
  • fYear
    2015
  • fDate
    15-May-15
  • Firstpage
    2663
  • Lastpage
    2677
  • Abstract
    Extracting latent low-dimensional structure from high-dimensional data is of paramount importance in timely inference tasks encountered with “Big Data” analytics. However, increasingly noisy, heterogeneous, and incomplete datasets, as well as the need for real-time processing of streaming data, pose major challenges to this end. In this context, the present paper permeates benefits from rank minimization to scalable imputation of missing data, via tracking low-dimensional subspaces and unraveling latent (possibly multi-way) structure from incomplete streaming data. For low-rank matrix data, a subspace estimator is proposed based on an exponentially weighted least-squares criterion regularized with the nuclear norm. After recasting the nonseparable nuclear norm into a form amenable to online optimization, real-time algorithms with complementary strengths are developed, and their convergence is established under simplifying technical assumptions. In a stationary setting, the asymptotic estimates obtained offer the well-documented performance guarantees of the batch nuclear-norm regularized estimator. Under the same unifying framework, a novel online (adaptive) algorithm is developed to obtain multi-way decompositions of low-rank tensors with missing entries and perform imputation as a byproduct. Simulated tests with both synthetic as well as real Internet and cardiac magnetic resonance imagery (MRI) data confirm the efficacy of the proposed algorithms, and their superior performance relative to state-of-the-art alternatives.
  • Keywords
    Big Data; convergence; data analysis; learning (artificial intelligence); least squares approximations; matrix algebra; minimisation; real-time systems; tensors; Internet; MRI data; batch nuclear-norm regularized estimator; big data analytics; big data matrix streaming; big data tensor streaming; cardiac magnetic resonance imagery data; exponentially weighted least-squares criterion; incomplete streaming data; latent low-dimensional structure extraction; low-dimensional subspace tracking; low-rank tensors; missing data imputation; nonseparable nuclear norm; online adaptive algorithm; online optimization; rank minimization; real-time algorithms; real-time processing; subspace estimator; subspace learning; Algorithm design and analysis; Convergence; Data models; Real-time systems; Signal processing algorithms; Tensile stress; Vectors; Low rank; matrix and tensor completion; missing data; streaming analytics; subspace tracking
  • fLanguage
    English
  • Journal_Title
    Signal Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1053-587X
  • Type
    jour
  • DOI
    10.1109/TSP.2015.2417491
  • Filename
    7072498