Low-distortion subspace embeddings in input-sparsity time and applications to robust linear regression

DOI: 10.1145/2488608.2488621; zbMATH Open: 1293.68150; arXiv: 1210.3135; OpenAlex: W2134342155; MaRDI QID: Q5495779; FDO: Q5495779

Authors: Xiangrui Meng, Michael W. Mahoney

Publication date: 7 August 2014

Published in: Proceedings of the forty-fifth annual ACM symposium on Theory of Computing

Abstract: Low-distortion embeddings are critical building blocks for developing random sampling and random projection algorithms for linear algebra problems. We show that, given a matrix $A \in \mathbb{R}^{n \times d}$ with $n \gg d$ and a $p \in [1, 2)$, with a constant probability, we can construct a low-distortion embedding matrix $\Pi \in \mathbb{R}^{O(\mathrm{poly}(d)) \times n}$ that embeds $\mathcal{A}_p$, the $\ell_p$ subspace spanned by $A$'s columns, into $(\mathbb{R}^{O(\mathrm{poly}(d))}, \|\cdot\|_p)$; the distortion of our embeddings is only $O(\mathrm{poly}(d))$, and we can compute $\Pi A$ in $O(\mathrm{nnz}(A))$ time, i.e., input-sparsity time. Our result generalizes the input-sparsity time $\ell_2$ subspace embedding by Clarkson and Woodruff [STOC'13]; and for completeness, we present a simpler and improved analysis of their construction for $\ell_2$. These input-sparsity time $\ell_p$ embeddings are optimal, up to constants, in terms of their running time; and the improved running time propagates to applications such as $(1 \pm \epsilon)$-distortion $\ell_p$ subspace embedding and relative-error $\ell_p$ regression. For $\ell_2$, we show that a $(1 + \epsilon)$-approximate solution to the $\ell_2$ regression problem specified by the matrix $A$ and a vector $b \in \mathbb{R}^n$ can be computed in $O(\mathrm{nnz}(A) + d^3 \log(d/\epsilon)/\epsilon^2)$ time; and for $\ell_p$, via a subspace-preserving sampling procedure, we show that a $(1 \pm \epsilon)$-distortion embedding of $\mathcal{A}_p$ into $\mathbb{R}^{O(\mathrm{poly}(d))}$ can be computed in $O(\mathrm{nnz}(A) \cdot \log n)$ time, and we also show that a $(1 + \epsilon)$-approximate solution to the $\ell_p$ regression problem $\min_{x \in \mathbb{R}^d} \|Ax - b\|_p$ can be computed in $O(\mathrm{nnz}(A) \cdot \log n + \mathrm{poly}(d) \log(1/\epsilon)/\epsilon^2)$ time. Moreover, we can improve the embedding dimension or equivalently the sample size to $O(d^{3 + p/2} \log(1/\epsilon)/\epsilon^2)$ without increasing the complexity.
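To illustrate what "input-sparsity time" means in the $\ell_2$ case, here is a minimal Python sketch of a CountSketch-style sparse embedding, the general kind of construction the abstract attributes to Clarkson and Woodruff. It is an illustration under stated assumptions, not the paper's algorithm: the function name `sparse_embed`, the `(row, column, value)` triple format, and the `seed` parameter are all hypothetical, and the $\ell_p$ construction for $p \in [1, 2)$ is not shown.

```python
import random
from collections import defaultdict


def sparse_embed(entries, n_rows, sketch_rows, seed=0):
    """Compute Pi @ A for a sparse A given as (i, j, value) triples.

    Pi has exactly one nonzero (+1 or -1) per column: row i of A is sent
    to a random bucket and multiplied by a random sign. Drawing the
    buckets and signs costs O(n_rows); the main loop touches each nonzero
    of A exactly once, so the total work is O(n_rows + nnz(A)).
    """
    rng = random.Random(seed)
    # Assumed hashing scheme: independent uniform buckets and signs.
    bucket = [rng.randrange(sketch_rows) for _ in range(n_rows)]
    sign = [rng.choice((-1.0, 1.0)) for _ in range(n_rows)]

    sketch = defaultdict(float)  # sparse Pi @ A keyed by (bucket, column)
    for i, j, value in entries:
        sketch[(bucket[i], j)] += sign[i] * value
    return dict(sketch)


# Usage: a 5 x 2 matrix with three nonzeros, sketched down to 4 rows.
A_entries = [(0, 0, 1.5), (2, 1, -4.0), (4, 0, 2.0)]
PiA = sparse_embed(A_entries, n_rows=5, sketch_rows=4, seed=1)
```

The sketched matrix has at most as many nonzeros as $A$, and a downstream solver would then work on the small `sketch_rows`-by-$d$ problem instead of the original $n$-by-$d$ one.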

Full work available at URL: https://arxiv.org/abs/1210.3135


zbMATH Keywords

linear regression; robust regression; $\ell_p$ subspace embedding; low-distortion embedding; input-sparsity time

Mathematics Subject Classification ID

Linear regression; mixed models (62J05)
Analysis of algorithms and problem complexity (68Q25)
Randomized algorithms (68W20)
Computational difficulty of problems (lower bounds, completeness, difficulty of approximation, etc.) (68Q17)

Cited In (42)

Uses Software

LSRN

