Low-Rank Approximation and Regression in Input Sparsity Time


DOI: 10.1145/3019134
zbMATH Open: 1426.65057
arXiv: 1207.6365
OpenAlex: W2580753685
MaRDI QID: Q3177880
FDO: Q3177880

Kenneth L. Clarkson, David P. Woodruff

Publication date: 2 August 2018

Published in: Journal of the ACM; Proceedings of the forty-eighth annual ACM symposium on Theory of Computing

Abstract: We design a new distribution over $\mathrm{poly}(r\,\varepsilon^{-1}) \times n$ matrices $S$ so that for any fixed $n \times d$ matrix $A$ of rank $r$, with probability at least 9/10, $\|SAx\|_2 = (1 \pm \varepsilon)\|Ax\|_2$ simultaneously for all $x \in \mathbb{R}^d$. Such a matrix $S$ is called a \emph{subspace embedding}. Furthermore, $SA$ can be computed in $\mathrm{nnz}(A) + \mathrm{poly}(d\,\varepsilon^{-1})$ time, where $\mathrm{nnz}(A)$ is the number of non-zero entries of $A$. This improves over all previous subspace embeddings, which required at least $\Omega(nd \log d)$ time to achieve this property. We call our matrices $S$ \emph{sparse embedding matrices}. Using our sparse embedding matrices, we obtain the fastest known algorithms for $(1+\varepsilon)$-approximation for overconstrained least-squares regression, low-rank approximation, approximating all leverage scores, and $\ell_p$-regression. The leading-order term in the time complexity of our algorithms is $O(\mathrm{nnz}(A))$ or $O(\mathrm{nnz}(A)\log n)$. We optimize the low-order $\mathrm{poly}(d/\varepsilon)$ terms in our running times (or, for rank-$k$ approximation, the $n \cdot \mathrm{poly}(k/\varepsilon)$ term), and show various tradeoffs. For instance, we also use our methods to design new preconditioners that improve the dependence on $\varepsilon$ in least-squares regression to $\log(1/\varepsilon)$. Finally, we provide preliminary experimental results which suggest that our algorithms are competitive in practice.


Full work available at URL: https://arxiv.org/abs/1207.6365
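
The following is a minimal illustrative sketch, not the authors' reference implementation, of the CountSketch-style sparse embedding the abstract describes, written in NumPy. The function name `sparse_embedding`, the sketch size `m = 4*d*d`, and the synthetic least-squares data are assumptions chosen for demonstration; the paper's nnz(A)-time guarantee concerns sparse inputs and optimized constants not reflected here.

```python
import numpy as np

def sparse_embedding(A, m, seed=None):
    """Apply an m x n sparse embedding matrix S to A in one pass over A's rows.

    Each row i of A is hashed to a single row h(i) of SA and multiplied by a
    random sign sigma(i), so computing SA touches each entry of A exactly once.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    rows = rng.integers(0, m, size=n)          # hash h: [n] -> [m]
    signs = rng.choice([-1.0, 1.0], size=n)    # signs sigma: [n] -> {-1, +1}
    SA = np.zeros((m, d))
    np.add.at(SA, rows, signs[:, None] * A)    # SA[h(i)] += sigma(i) * A[i, :]
    return SA

# Example use: sketched overconstrained least-squares, minimizing ||SAx - Sb||_2
# as a surrogate for ||Ax - b||_2. The sketch size below is an illustrative
# choice, not the paper's tuned bound.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 100_000, 20
    A = rng.standard_normal((n, d))
    b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(n)

    m = 4 * d * d
    SAb = sparse_embedding(np.hstack([A, b[:, None]]), m, seed=1)
    SA, Sb = SAb[:, :d], SAb[:, d]

    x_sketch, *_ = np.linalg.lstsq(SA, Sb, rcond=None)
    x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)
    gap = (np.linalg.norm(A @ x_sketch - b) - np.linalg.norm(A @ x_exact - b)) \
          / np.linalg.norm(A @ x_exact - b)
    print("relative residual gap:", gap)
```

Sketching the stacked matrix [A, b] once, rather than A and b separately, keeps the hash and sign choices consistent across both, so the small sketched problem can be solved with any dense least-squares routine.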


Cited In (99)




