Finite-sum smooth optimization with SARAH

Abstract: The total complexity (measured as the total number of gradient computations) of a stochastic first-order optimization algorithm that finds a first-order stationary point of a finite-sum smooth nonconvex objective function $F(w) = \frac{1}{n}\sum_{i=1}^{n} f_i(w)$ has been proven to be at least $\Omega(\sqrt{n}/\epsilon)$ for $n \leq \mathcal{O}(\epsilon^{-2})$, where $\epsilon$ denotes the attained accuracy $\mathbb{E}[\|\nabla F(\tilde{w})\|^2] \leq \epsilon$ for the output approximation $\tilde{w}$ (Fang et al., 2018). In this paper, we provide a convergence analysis for a slightly modified version of the SARAH algorithm (Nguyen et al., 2017a;b) and achieve a total complexity that matches the lower-bound worst-case complexity in (Fang et al., 2018) up to a constant factor when $n \leq \mathcal{O}(\epsilon^{-2})$ for nonconvex problems. For convex optimization, we propose SARAH++ with sublinear convergence for general convex problems and linear convergence for strongly convex problems; we also provide a practical version for which numerical experiments on various datasets show improved performance.
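
As background on the method named in the abstract: SARAH maintains a recursive, biased gradient estimate. Each outer loop begins with a full gradient pass, $v_0 = \nabla F(w_0)$, and each inner step samples an index $i_t$ and updates $v_t = \nabla f_{i_t}(w_t) - \nabla f_{i_t}(w_{t-1}) + v_{t-1}$ followed by $w_{t+1} = w_t - \eta v_t$. Below is a minimal Python sketch of this recursion under assumed interfaces; the callback `grad_i`, the step size `eta`, and the loop lengths are illustrative placeholders, not the paper's analyzed settings.

```python
import numpy as np

def sarah(grad_i, n, w0, eta=0.01, outer_loops=10, inner_loops=None, seed=0):
    """Minimal sketch of the SARAH recursion (Nguyen et al., 2017a).

    grad_i(i, w) -- gradient of the i-th component f_i at w (assumed interface).
    n            -- number of component functions in the finite sum.
    eta, loop lengths -- illustrative placeholders, not tuned values.
    """
    rng = np.random.default_rng(seed)
    m = n if inner_loops is None else inner_loops  # inner loop ~ one data pass
    w = np.asarray(w0, dtype=float).copy()
    for _ in range(outer_loops):
        # Start of outer loop: exact gradient v_0 = grad F(w_0).
        v = sum(grad_i(i, w) for i in range(n)) / n
        w_prev, w = w, w - eta * v
        for _ in range(m):
            i = rng.integers(n)
            # Recursive estimator: v_t = grad f_i(w_t) - grad f_i(w_{t-1}) + v_{t-1}.
            v = grad_i(i, w) - grad_i(i, w_prev) + v
            w_prev, w = w, w - eta * v
    return w
```

For instance, on a least-squares problem with data rows `A[i]` and targets `b[i]`, one could pass `grad_i = lambda i, w: (A[i] @ w - b[i]) * A[i]`. Note that the nonconvex analysis outputs an inner iterate chosen at random rather than the final one; this sketch returns the last iterate for simplicity.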




