Subspace Learning and Imputation for Streaming Big Data Matrices and Tensors
Publication:4580581
Abstract: Extracting latent low-dimensional structure from high-dimensional data is of paramount importance in timely inference tasks encountered with "Big Data" analytics. However, increasingly noisy, heterogeneous, and incomplete datasets, as well as the need for real-time processing of streaming data, pose major challenges to this end. In this context, the present paper extends the benefits of rank minimization to scalable imputation of missing data, by tracking low-dimensional subspaces and unraveling latent (possibly multi-way) structure from incomplete streaming data. For low-rank matrix data, a subspace estimator is proposed based on an exponentially weighted least-squares criterion regularized with the nuclear norm. After recasting the non-separable nuclear norm into a form amenable to online optimization, real-time algorithms with complementary strengths are developed, and their convergence is established under simplifying technical assumptions. In a stationary setting, the asymptotic estimates obtained offer the well-documented performance guarantees of the batch nuclear-norm regularized estimator. Under the same unifying framework, a novel online (adaptive) algorithm is developed to obtain multi-way decompositions of low-rank tensors with missing entries, and to perform imputation as a byproduct. Simulated tests with both synthetic and real Internet and cardiac magnetic resonance imagery (MRI) data confirm the efficacy of the proposed algorithms and their superior performance relative to state-of-the-art alternatives.
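Illustrative sketch (not taken from the paper): the abstract's recasting of the nuclear norm can be read through the standard identity that the nuclear norm of X equals the minimum of 0.5*(||L||_F^2 + ||Q||_F^2) over all factorizations X = L Q^T, which is separable across the columns of the data. The Python code below checks that identity numerically and shows one hypothetical streaming update from a partially observed column. The function names, the ridge weight `lam`, the step size `step`, and the plain gradient update are all assumptions made for illustration; they are not the authors' algorithms.

```python
import numpy as np

def nuclear_norm_bound(L, Q):
    """Separable surrogate 0.5*(||L||_F^2 + ||Q||_F^2), an upper bound on ||L Q^T||_*."""
    return 0.5 * (np.linalg.norm(L, "fro") ** 2 + np.linalg.norm(Q, "fro") ** 2)

def balanced_factors(X, rank):
    """Factors L, Q with X ~= L Q^T that attain the bound (built from the SVD)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    root = np.sqrt(s[:rank])
    return U[:, :rank] * root, Vt[:rank].T * root

def online_step(L, y, mask, lam=0.1, step=0.05):
    """One hypothetical streaming update from a partially observed vector y.

    mask is a boolean array marking the observed entries of y.
    Returns the updated subspace estimate and the imputed full vector.
    """
    L_obs = L[mask]
    r = L.shape[1]
    # Ridge-regularized projection coefficients computed from observed entries only.
    q = np.linalg.solve(L_obs.T @ L_obs + lam * np.eye(r), L_obs.T @ y[mask])
    # Residual is zero on unobserved entries, so those rows only see the shrinkage term.
    resid = np.zeros(L.shape[0])
    resid[mask] = y[mask] - L_obs @ q
    # Gradient step on 0.5*||resid||^2 + 0.5*lam*||L||_F^2 with respect to L.
    L_new = L + step * (np.outer(resid, q) - lam * L)
    return L_new, L @ q
```

In this sketch each incoming vector touches only its own coefficient vector q and the shared factor L, which is what makes the factored form convenient for one-pass processing; how the regularization weight should scale over time is left unspecified here.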
Cited in (9)
- Orthogonal self-guided similarity preserving projection for classification and clustering
- Stochastic gradients for large-scale tensor decomposition
- Streaming principal component analysis from incomplete data
- Online subspace learning and imputation by tensor-ring decomposition
- Matrix completion with column outliers and sparse noise
- An AO-ADMM Approach to Constraining PARAFAC2 on All Modes
- Online Categorical Subspace Learning for Sketching Big Data with Misses
- An Adaptive Sampling Strategy for Online Monitoring and Diagnosis of High-Dimensional Streaming Data
- Tensor decision trees for continual learning from drifting data streams