Low-Rank Approximation and Regression in Input Sparsity Time


Publication: 3177880

DOI: 10.1145/3019134
zbMath: 1426.65057
arXiv: 1207.6365
OpenAlex: W2580753685
MaRDI QID: Q3177880

Kenneth L. Clarkson, David P. Woodruff

Publication date: 2 August 2018

Published in: Journal of the ACM

Abstract: We design a new distribution over $\mathrm{poly}(r\varepsilon^{-1}) \times n$ matrices $S$ so that for any fixed $n \times d$ matrix $A$ of rank $r$, with probability at least 9/10, $\|SAx\|_2 = (1 \pm \varepsilon)\|Ax\|_2$ simultaneously for all $x \in \mathbb{R}^d$. Such a matrix $S$ is called a subspace embedding. Furthermore, $SA$ can be computed in $\mathrm{nnz}(A) + \mathrm{poly}(d\varepsilon^{-1})$ time, where $\mathrm{nnz}(A)$ is the number of non-zero entries of $A$. This improves over all previous subspace embeddings, which required at least $\Omega(nd \log d)$ time to achieve this property. We call our matrices $S$ sparse embedding matrices. Using our sparse embedding matrices, we obtain the fastest known algorithms for $(1+\varepsilon)$-approximation for overconstrained least-squares regression, low-rank approximation, approximating all leverage scores, and $\ell_p$-regression. The leading-order term in the time complexity of our algorithms is $O(\mathrm{nnz}(A))$ or $O(\mathrm{nnz}(A)\log n)$. We optimize the low-order $\mathrm{poly}(d/\varepsilon)$ terms in our running times (or, for rank-$k$ approximation, the $n \cdot \mathrm{poly}(k/\varepsilon)$ term) and show various tradeoffs. For instance, we also use our methods to design new preconditioners that improve the dependence on $\varepsilon$ in least-squares regression to $\log(1/\varepsilon)$. Finally, we provide preliminary experimental results which suggest that our algorithms are competitive in practice.
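The sparse embedding described above is the CountSketch-style construction: each column of $S$ has a single $\pm 1$ entry in a uniformly random row, so $SA$ can be accumulated in one pass over the entries of $A$. The following is a minimal NumPy sketch-and-solve illustration for overconstrained least squares, not the authors' implementation; the function names (sparse_embed, sketched_lstsq) are ours, and the sketch size m is chosen ad hoc rather than via the paper's $\mathrm{poly}(d/\varepsilon)$ bounds.

```python
import numpy as np

def sparse_embed(A, m, rng):
    """Apply a CountSketch-style sparse embedding S (m x n) to A.

    Each row i of A is hashed to a uniformly random sketch row h(i)
    and multiplied by a random sign s(i); S has one nonzero per
    column, so conceptually SA costs O(nnz(A)) time (shown here with
    dense arrays for brevity; for sparse A one would iterate over
    the nonzeros, e.g. with scipy.sparse).
    """
    n = A.shape[0]
    h = rng.integers(0, m, size=n)        # hash bucket per row of A
    s = rng.choice([-1.0, 1.0], size=n)   # random signs
    SA = np.zeros((m, A.shape[1]))
    np.add.at(SA, h, s[:, None] * A)      # accumulate signed rows
    return SA, h, s

def sketched_lstsq(A, b, m, seed=0):
    """Approximate min_x ||Ax - b||_2 via sketch-and-solve."""
    rng = np.random.default_rng(seed)
    SA, h, s = sparse_embed(A, m, rng)
    Sb = np.zeros(m)
    np.add.at(Sb, h, s * b)               # same sketch applied to b
    return np.linalg.lstsq(SA, Sb, rcond=None)[0]

# Tiny demo: the sketched solution is close to the exact one.
rng = np.random.default_rng(1)
A = rng.standard_normal((20000, 30))
b = A @ rng.standard_normal(30) + 0.01 * rng.standard_normal(20000)
x_exact = np.linalg.lstsq(A, b, rcond=None)[0]
x_sketch = sketched_lstsq(A, b, m=2000)
print(np.linalg.norm(x_sketch - x_exact) / np.linalg.norm(x_exact))
```

With $m$ on the order of $d^2$ rows, the sketched solution typically matches the exact least-squares solution to small relative error while only an $m \times d$ system, rather than the full $n \times d$ one, is solved exactly.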


Full work available at URL: https://arxiv.org/abs/1207.6365

Related Items (95)

Randomized numerical linear algebra: Foundations and algorithms
Matrix sketching for supervised classification with imbalanced classes
Spectral estimation from simulations via sketching
Randomized Local Model Order Reduction
Fast Quantum Algorithms for Least Squares Regression and Statistic Leverage Scores
Low-Rank Approximation and Regression in Input Sparsity Time
Randomized Spectral Clustering in Large-Scale Stochastic Block Models
Unnamed Item
Unnamed Item
Unnamed Item
Sparser Johnson-Lindenstrauss Transforms
Perturbations of the TCUR decomposition for tensor valued data in the Tucker format
An efficient algorithm for computing the approximate t-URV and its applications
Unnamed Item
Low-complexity $l_0$-norm penalized shrinkage linear and widely linear affine projection algorithms
Sharper Bounds for Regularized Data Fitting
Low-Rank PSD Approximation in Input-Sparsity Time
Unnamed Item
Guarantees for the Kronecker fast Johnson-Lindenstrauss transform using a coherence and sampling argument
RidgeSketch: A Fast Sketching Based Solver for Large Scale Ridge Regression
One-pass additive-error subset selection for $\ell_p$ subspace approximation and $(k, p)$-clustering
Structural conditions for projection-cost preservation via randomized matrix multiplication
Simpler is better: a comparative study of randomized pivoting algorithms for CUR and interpolative decompositions
Optimal subsampling for large-sample quantile regression with massive data
Randomized algorithms for the computation of multilinear rank-$(\mu_1,\mu_2,\mu_3)$ approximations
Distributed penalized modal regression for massive data
Randomized algorithms in numerical linear algebra
Randomized Low-Rank Approximation for Symmetric Indefinite Matrices
Randomized LU decomposition
An Improved Analysis and Unified Perspective on Deterministic and Randomized Low-Rank Matrix Approximation
Principled interpolation of Green's functions learned from data
Generalized linear models for massive data via doubly-sketching
Unnamed Item
Unnamed Item
Optimal subsampling algorithms for composite quantile regression in massive data
Max-Plus Algebraic Statistical Leverage Scores
Practical Sketching Algorithms for Low-Rank Matrix Approximation
Adaptive iterative Hessian sketch via $A$-optimal subsampling
Fast randomized numerical rank estimation for numerically low-rank matrices
A fast randomized algorithm for computing an approximate null space
On unifying randomized methods for inverse problems
On randomized sketching algorithms and the Tracy-Widom law
Hessian averaging in stochastic Newton methods achieves superlinear convergence
Sharp Analysis of Sketch-and-Project Methods via a Connection to Randomized Singular Value Decomposition
Single Pass Spectral Sparsification in Dynamic Streams
Optimal CUR Matrix Decompositions
Unnamed Item
Low Rank Approximation of a Sparse Matrix Based on LU Factorization with Column and Row Tournament Pivoting
Structural Convergence Results for Approximation of Dominant Subspaces from Block Krylov Spaces
Sketching for Principal Component Regression
Structured Random Sketching for PDE Inverse Problems
Turning Big Data Into Tiny Data: Constant-Size Coresets for $k$-Means, PCA, and Projective Clustering
Toward a unified theory of sparse dimensionality reduction in Euclidean space
An improvement of the parameterized frequent directions algorithm
Fast quantum algorithms for least squares regression and statistic leverage scores
Unnamed Item
Sublinear update time randomized algorithms for dynamic graph regression
A count sketch maximal weighted residual Kaczmarz method for solving highly overdetermined linear systems
Randomized LU decomposition using sparse projections
Unnamed Item
Unnamed Item
Unnamed Item
Fast randomized matrix and tensor interpolative decomposition using countsketch
Pass-Efficient Randomized Algorithms for Low-Rank Matrix Approximation Using Any Number of Views
Sparse random matrices have simple spectrum
RA-HOOI: rank-adaptive higher-order orthogonal iteration for the fixed-accuracy low multilinear-rank approximation of tensors
Randomized tensor wheel decomposition
Fast and accurate randomized algorithms for linear systems and eigenvalue problems
Robust and efficient subsampling algorithms for massive data logistic regression
Optimal Poisson subsampling for softmax regression
pylspack: parallel algorithms and data structures for sketching, column subset selection, regression, and leverage scores
Improving compressed matrix multiplication using control variate method
Random projections for Bayesian regression
A stochastic perturbation analysis of the QR decomposition and its applications
Optimal Subsampling for Large Sample Logistic Regression
Randomized block Krylov subspace methods for trace and log-determinant estimators
Efficient bounds and estimates for canonical angles in randomized subspace approximations
SketchySGD: reliable stochastic optimization via randomized curvature estimates
An efficient randomized algorithm for computing the approximate Tucker decomposition
Optimal Bounds for Johnson-Lindenstrauss Transformations
Frequent Directions: Simple and Deterministic Matrix Sketching
Streaming Low-Rank Matrix Approximation with an Application to Scientific Simulation
Unnamed Item
Tail bounds for gaps between eigenvalues of sparse random matrices
On Approximating Matrix Norms in Data Streams
An Implicit Representation and Iterative Solution of Randomly Sketched Linear Systems
Fast and Accurate Gaussian Kernel Ridge Regression Using Matrix Decompositions for Preconditioning
Estimating Leverage Scores via Rank Revealing Methods and Randomization
Small-deviation inequalities for sums of random matrices
Unnamed Item
Unnamed Item
Unnamed Item
Unnamed Item
Tensor-Structured Sketching for Constrained Least Squares
ISLET: Fast and Optimal Low-Rank Tensor Regression via Importance Sketching

This page was built for publication: Low-Rank Approximation and Regression in Input Sparsity Time