Breaking the Span Assumption Yields Fast Finite-Sum Minimization

From MaRDI portal
Publication: 6301845

arXiv: 1805.07786
MaRDI QID: Q6301845
FDO: Q6301845


Authors: Robert Hannah, Yanli Liu, Daniel O'Connor, Wotao Yin


Publication date: 20 May 2018

Abstract: In this paper, we show that SVRG and SARAH can be modified to be fundamentally faster than all of the other standard algorithms that minimize the sum of $n$ smooth functions, such as SAGA, SAG, SDCA, and SDCA without duality. Most finite-sum algorithms follow what we call the "span assumption": their updates are in the span of a sequence of component gradients chosen in a random IID fashion. In the big data regime, where the condition number $\kappa = \mathcal{O}(n)$, the span assumption prevents algorithms from converging to an approximate solution of accuracy $\epsilon$ in fewer than $n\ln(1/\epsilon)$ iterations. SVRG and SARAH do not follow the span assumption since they are updated with a hybrid of full-gradient and component-gradient information. We show that because of this, they can be up to $\Omega(1 + (\ln(n/\kappa))_+)$ times faster. In particular, to obtain an accuracy $\epsilon = 1/n^{\alpha}$ for $\kappa = n^{\beta}$ and $\alpha, \beta \in (0,1)$, modified SVRG requires $\mathcal{O}(n)$ iterations, whereas algorithms that follow the span assumption require $\mathcal{O}(n\ln(n))$ iterations. Moreover, we present lower-bound results that show this speedup is optimal, and provide analysis to help explain why it exists. With the understanding that the span assumption is a point of weakness of finite-sum algorithms, future work may purposefully exploit this to yield even faster algorithms in the big data regime.
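
The hybrid update that lets SVRG step outside the span assumption is easy to see in code. Below is a minimal sketch of standard SVRG (not the paper's modified variant) for minimizing $\frac{1}{n}\sum_{i=1}^{n} f_i(x)$: the full gradient computed at each reference point anchors the inner stochastic steps, which is exactly the ingredient a span-restricted method such as SAGA or SAG lacks. The function names, step size, epoch counts, and toy least-squares problem are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def svrg(grad_i, x0, n, lr=0.05, epochs=20, inner_steps=None):
    """Minimal SVRG sketch (illustrative, not the paper's modified variant).

    Each epoch computes a full gradient at a reference point x_ref and uses
    it to correct the inner stochastic steps, so the iterates leave the span
    of the sampled component gradients.
    """
    rng = np.random.default_rng(0)
    x = x0.copy()
    m = inner_steps if inner_steps is not None else n  # O(n) inner steps per epoch is a common choice
    for _ in range(epochs):
        x_ref = x.copy()
        # Full-gradient anchor: the hybrid ingredient that breaks the span assumption.
        full_grad = np.mean([grad_i(i, x_ref) for i in range(n)], axis=0)
        for _ in range(m):
            i = rng.integers(n)
            # Variance-reduced direction: component gradient at x, corrected by
            # the reference component gradient and the full gradient.
            x = x - lr * (grad_i(i, x) - grad_i(i, x_ref) + full_grad)
    return x

# Toy least-squares instance: f_i(x) = 0.5 * (A[i] @ x - b[i])**2.
rng = np.random.default_rng(1)
n, d = 200, 10
A, b = rng.normal(size=(n, d)), rng.normal(size=n)
x_hat = svrg(lambda i, x: (A[i] @ x - b[i]) * A[i], np.zeros(d), n)
```

A purely span-restricted method would build its iterates only from the sampled `grad_i(i, x)` terms; it is the periodically refreshed `full_grad` term that underlies the $\Omega(1 + (\ln(n/\kappa))_+)$ speedup analyzed in the paper.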