Low-Rank Approximation and Regression in Input Sparsity Time
DOI: 10.1145/3019134 · zbMath: 1426.65057 · arXiv: 1207.6365 · OpenAlex: W2580753685 · MaRDI QID: Q3177880
Kenneth L. Clarkson, David P. Woodruff
Publication date: 2 August 2018
Published in: Journal of the ACM
We design a new distribution over $\mathrm{poly}(r\,\varepsilon^{-1}) \times n$ matrices $S$ so that, for any fixed $n \times d$ matrix $A$ of rank $r$, with probability at least 9/10, $\|SAx\|_2 = (1 \pm \varepsilon)\|Ax\|_2$ simultaneously for all $x \in \mathbb{R}^d$. Such a matrix $S$ is called a \emph{subspace embedding}. Furthermore, $SA$ can be computed in $\mathrm{nnz}(A) + \mathrm{poly}(d\,\varepsilon^{-1})$ time, where $\mathrm{nnz}(A)$ is the number of non-zero entries of $A$. This improves over all previous subspace embeddings, which required at least $\Omega(nd \log d)$ time to achieve this property. We call our matrices $S$ \emph{sparse embedding matrices}. Using our sparse embedding matrices, we obtain the fastest known algorithms for $(1+\varepsilon)$-approximation for overconstrained least-squares regression, low-rank approximation, approximating all leverage scores, and $\ell_p$-regression. The leading-order term in the time complexity of our algorithms is $O(\mathrm{nnz}(A))$ or $O(\mathrm{nnz}(A)\log n)$. We optimize the low-order $\mathrm{poly}(d/\varepsilon)$ terms in our running times (or, for rank-$k$ approximation, the $n \cdot \mathrm{poly}(k/\varepsilon)$ term), and show various tradeoffs. For instance, we also use our methods to design new preconditioners that improve the dependence on $\varepsilon$ in least-squares regression to $\log(1/\varepsilon)$. Finally, we provide preliminary experimental results which suggest that our algorithms are competitive in practice.
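The sparse embedding matrix of the paper has exactly one nonzero entry per column: a uniformly random sign placed in a uniformly random row. The following is a minimal NumPy sketch (not the authors' code) of that construction used in sketch-and-solve mode for overconstrained least-squares regression; the sketch size m and the dense representation of S here are illustrative simplifications, and an input-sparsity-time implementation would instead accumulate SA directly in a single pass over the nonzeros of A.

    import numpy as np

    def sparse_embedding(m, n, rng):
        # One nonzero per column: row h(i) chosen uniformly from {0, ..., m-1},
        # sign sigma(i) chosen uniformly from {-1, +1}.
        rows = rng.integers(0, m, size=n)
        signs = rng.choice([-1.0, 1.0], size=n)
        S = np.zeros((m, n))
        S[rows, np.arange(n)] = signs
        return S

    rng = np.random.default_rng(0)
    n, d = 2000, 10
    A = rng.standard_normal((n, d))
    b = A @ rng.standard_normal(d) + 0.01 * rng.standard_normal(n)

    m = 20 * d * d          # illustrative sketch size, not the paper's exact bound
    S = sparse_embedding(m, n, rng)

    # Sketch-and-solve: replace min ||Ax - b||_2 by the much smaller problem
    # min ||S(Ax - b)||_2 and compare residuals on the original data.
    x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)
    x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)

    print("exact residual   :", np.linalg.norm(A @ x_exact - b))
    print("sketched residual:", np.linalg.norm(A @ x_sketch - b))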
Full work available at URL: https://arxiv.org/abs/1207.6365
Keywords: sketching; approximation; matrices; randomized algorithm; regression; low-rank approximation; randomized; leverage scores
Linear regression; mixed models (62J05) Computational methods for sparse matrices (65F50) Analysis of algorithms and problem complexity (68Q25) Random matrices (algebraic aspects) (15B52) Randomized algorithms (68W20)