Low-Rank Approximation and Regression in Input Sparsity Time


Publication: 3177880

DOI: 10.1145/3019134
zbMath: 1426.65057
arXiv: 1207.6365
OpenAlex: W2580753685
MaRDI QID: Q3177880

Kenneth L. Clarkson, David P. Woodruff

Publication date: 2 August 2018

Published in: Journal of the ACM

Abstract: We design a new distribution over $\mathrm{poly}(r\varepsilon^{-1}) \times n$ matrices $S$ so that for any fixed $n \times d$ matrix $A$ of rank $r$, with probability at least 9/10, $\|SAx\|_2 = (1 \pm \varepsilon)\|Ax\|_2$ simultaneously for all $x \in \mathbb{R}^d$. Such a matrix $S$ is called a subspace embedding. Furthermore, $SA$ can be computed in $\mathrm{nnz}(A) + \mathrm{poly}(d\varepsilon^{-1})$ time, where $\mathrm{nnz}(A)$ is the number of non-zero entries of $A$. This improves over all previous subspace embeddings, which required at least $\Omega(nd \log d)$ time to achieve this property. We call our matrices $S$ sparse embedding matrices. Using our sparse embedding matrices, we obtain the fastest known algorithms for $(1+\varepsilon)$-approximation for overconstrained least-squares regression, low-rank approximation, approximating all leverage scores, and $\ell_p$-regression. The leading-order term in the time complexity of our algorithms is $O(\mathrm{nnz}(A))$ or $O(\mathrm{nnz}(A)\log n)$. We optimize the low-order $\mathrm{poly}(d/\varepsilon)$ terms in our running times (or, for rank-$k$ approximation, the $n \cdot \mathrm{poly}(k/\varepsilon)$ term) and show various tradeoffs. For instance, we also use our methods to design new preconditioners that improve the dependence on $\varepsilon$ in least-squares regression to $\log(1/\varepsilon)$. Finally, we provide preliminary experimental results which suggest that our algorithms are competitive in practice.
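The sparse embedding described above is the CountSketch-style construction: each column of $S$ has a single $\pm 1$ entry in a uniformly random row, so $SA$ can be accumulated in one pass over the entries of $A$. The following is a minimal NumPy sketch-and-solve illustration for overconstrained least squares, not the authors' implementation; the function names (sparse_embed, sketched_lstsq) are ours, and the sketch size m is chosen ad hoc rather than via the paper's $\mathrm{poly}(d/\varepsilon)$ bounds.

```python
import numpy as np

def sparse_embed(A, m, rng):
    """Apply a CountSketch-style sparse embedding S (m x n) to A.

    Each row i of A is hashed to a uniformly random sketch row h(i)
    and multiplied by a random sign s(i); S has one nonzero per
    column, so conceptually SA costs O(nnz(A)) time (shown here with
    dense arrays for brevity; for sparse A one would iterate over
    the nonzeros, e.g. with scipy.sparse).
    """
    n = A.shape[0]
    h = rng.integers(0, m, size=n)        # hash bucket per row of A
    s = rng.choice([-1.0, 1.0], size=n)   # random signs
    SA = np.zeros((m, A.shape[1]))
    np.add.at(SA, h, s[:, None] * A)      # accumulate signed rows
    return SA, h, s

def sketched_lstsq(A, b, m, seed=0):
    """Approximate min_x ||Ax - b||_2 via sketch-and-solve."""
    rng = np.random.default_rng(seed)
    SA, h, s = sparse_embed(A, m, rng)
    Sb = np.zeros(m)
    np.add.at(Sb, h, s * b)               # same sketch applied to b
    return np.linalg.lstsq(SA, Sb, rcond=None)[0]

# Tiny demo: the sketched solution is close to the exact one.
rng = np.random.default_rng(1)
A = rng.standard_normal((20000, 30))
b = A @ rng.standard_normal(30) + 0.01 * rng.standard_normal(20000)
x_exact = np.linalg.lstsq(A, b, rcond=None)[0]
x_sketch = sketched_lstsq(A, b, m=2000)
print(np.linalg.norm(x_sketch - x_exact) / np.linalg.norm(x_exact))
```

With $m$ on the order of $d^2$ rows, the sketched solution typically matches the exact least-squares solution to small relative error while only an $m \times d$ system, rather than the full $n \times d$ one, is solved exactly.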


Full work available at URL: https://arxiv.org/abs/1207.6365

Related Items (95)

Randomized numerical linear algebra: Foundations and algorithms
Matrix sketching for supervised classification with imbalanced classes
Spectral estimation from simulations via sketching
Randomized Local Model Order Reduction
Fast Quantum Algorithms for Least Squares Regression and Statistic Leverage Scores
Low-Rank Approximation and Regression in Input Sparsity Time
Randomized Spectral Clustering in Large-Scale Stochastic Block Models
Unnamed Item
Unnamed Item
Unnamed Item
Sparser Johnson-Lindenstrauss Transforms
Perturbations of the TCUR decomposition for tensor valued data in the Tucker format
An efficient algorithm for computing the approximate t-URV and its applications
Unnamed Item
Low-complexity $l_0$-norm penalized shrinkage linear and widely linear affine projection algorithms
Sharper Bounds for Regularized Data Fitting
Low-Rank PSD Approximation in Input-Sparsity Time
Unnamed Item
Guarantees for the Kronecker fast Johnson-Lindenstrauss transform using a coherence and sampling argument
RidgeSketch: A Fast Sketching Based Solver for Large Scale Ridge Regression
One-pass additive-error subset selection for $\ell_p$ subspace approximation and $(k, p)$-clustering
Structural conditions for projection-cost preservation via randomized matrix multiplication
Simpler is better: a comparative study of randomized pivoting algorithms for CUR and interpolative decompositions
Optimal subsampling for large-sample quantile regression with massive data
Randomized algorithms for the computation of multilinear rank-$(\mu_1,\mu_2,\mu_3)$ approximations
Distributed penalized modal regression for massive data
Randomized algorithms in numerical linear algebra
Randomized Low-Rank Approximation for Symmetric Indefinite Matrices
Randomized LU decomposition
An Improved Analysis and Unified Perspective on Deterministic and Randomized Low-Rank Matrix Approximation
Principled interpolation of Green's functions learned from data
Generalized linear models for massive data via doubly-sketching
Unnamed Item
Unnamed Item
Optimal subsampling algorithms for composite quantile regression in massive data
Max-Plus Algebraic Statistical Leverage Scores
Practical Sketching Algorithms for Low-Rank Matrix Approximation
Adaptive iterative Hessian sketch via $A$-optimal subsampling
Fast randomized numerical rank estimation for numerically low-rank matrices
A fast randomized algorithm for computing an approximate null space
On unifying randomized methods for inverse problems
On randomized sketching algorithms and the Tracy-Widom law
Hessian averaging in stochastic Newton methods achieves superlinear convergence
Sharp Analysis of Sketch-and-Project Methods via a Connection to Randomized Singular Value Decomposition
Single Pass Spectral Sparsification in Dynamic Streams
Optimal CUR Matrix Decompositions
Unnamed Item
Low Rank Approximation of a Sparse Matrix Based on LU Factorization with Column and Row Tournament Pivoting
Structural Convergence Results for Approximation of Dominant Subspaces from Block Krylov Spaces
Sketching for Principal Component Regression
Structured Random Sketching for PDE Inverse Problems
Turning Big Data Into Tiny Data: Constant-Size Coresets for $k$-Means, PCA, and Projective Clustering
Toward a unified theory of sparse dimensionality reduction in Euclidean space
An improvement of the parameterized frequent directions algorithm
Fast quantum algorithms for least squares regression and statistic leverage scores
Unnamed Item
Sublinear update time randomized algorithms for dynamic graph regression
A count sketch maximal weighted residual Kaczmarz method for solving highly overdetermined linear systems
Randomized LU decomposition using sparse projections
Unnamed Item
Unnamed Item
Unnamed Item
Fast randomized matrix and tensor interpolative decomposition using countsketch
Pass-Efficient Randomized Algorithms for Low-Rank Matrix Approximation Using Any Number of Views
Sparse random matrices have simple spectrum
RA-HOOI: rank-adaptive higher-order orthogonal iteration for the fixed-accuracy low multilinear-rank approximation of tensors
Randomized tensor wheel decomposition
Fast and accurate randomized algorithms for linear systems and eigenvalue problems
Robust and efficient subsampling algorithms for massive data logistic regression
Optimal Poisson subsampling for softmax regression
pylspack: parallel algorithms and data structures for sketching, column subset selection, regression, and leverage scores
Improving compressed matrix multiplication using control variate method
Random projections for Bayesian regression
A stochastic perturbation analysis of the QR decomposition and its applications
Optimal Subsampling for Large Sample Logistic Regression
Randomized block Krylov subspace methods for trace and log-determinant estimators
Efficient bounds and estimates for canonical angles in randomized subspace approximations
SketchySGD: reliable stochastic optimization via randomized curvature estimates
An efficient randomized algorithm for computing the approximate Tucker decomposition
Optimal Bounds for Johnson-Lindenstrauss Transformations
Frequent Directions: Simple and Deterministic Matrix Sketching
Streaming Low-Rank Matrix Approximation with an Application to Scientific Simulation
Unnamed Item
Tail bounds for gaps between eigenvalues of sparse random matrices
On Approximating Matrix Norms in Data Streams
An Implicit Representation and Iterative Solution of Randomly Sketched Linear Systems
Fast and Accurate Gaussian Kernel Ridge Regression Using Matrix Decompositions for Preconditioning
Estimating Leverage Scores via Rank Revealing Methods and Randomization
Small-deviation inequalities for sums of random matrices
Unnamed Item
Unnamed Item
Unnamed Item
Unnamed Item
Tensor-Structured Sketching for Constrained Least Squares
ISLET: Fast and Optimal Low-Rank Tensor Regression via Importance Sketching

This page was built for publication: Low-Rank Approximation and Regression in Input Sparsity Time