Efficiency of stochastic coordinate proximal gradient methods on nonseparable composite optimization
Publication: Q6504474
arXiv: 2104.13370
MaRDI QID: Q6504474
FDO: Q6504474
Authors: Ion Necoara, Flávia Chorobura
Abstract: This paper deals with large-scale composite optimization problems in which the objective function is the sum of two terms: the first has a Lipschitz continuous gradient along random subspaces and may be nonconvex, while the second is simple but possibly nonconvex and nonseparable. In this setting we design a stochastic coordinate proximal gradient method that takes into account the nonseparable composite form of the objective function. The algorithm achieves scalability by constructing, at each iteration, a local approximation model of the nonseparable objective function along a random subspace of user-determined dimension. We outline efficient techniques for selecting the random subspace, yielding an implementation with low per-iteration cost that also achieves fast convergence rates. We present a probabilistic worst-case complexity analysis for our stochastic coordinate proximal gradient method in both convex and nonconvex settings; in particular, we prove high-probability bounds on the number of iterations needed to reach a given optimality tolerance. To the best of our knowledge, this is the first work to propose a pure stochastic coordinate descent algorithm supported by global efficiency estimates for general nonseparable composite optimization problems. Preliminary numerical results confirm the efficiency of our algorithm.
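To make the abstract's algorithmic idea concrete, the following is a minimal Python sketch of one generic random-block coordinate proximal gradient step. The names random_block_prox_grad_step, grad_f, prox_g, and the least-squares-plus-l1 instantiation are illustrative assumptions, not the paper's method: the actual algorithm builds a local approximation model of the nonseparable objective along the sampled subspace, whereas this sketch simply restricts the gradient step to a random coordinate block and applies a full proximal step.

import numpy as np

def random_block_prox_grad_step(x, grad_f, prox_g, step, block_size, rng):
    # Sample a random coordinate block (the "random subspace").
    idx = rng.choice(x.size, size=block_size, replace=False)
    y = x.copy()
    # Gradient step on the smooth term f, restricted to the block.
    y[idx] -= step * grad_f(x)[idx]
    # Proximal step on the simple term g; the paper couples this with a
    # subspace model of the nonseparable g, here a full prox is used.
    return prox_g(y, step)

# Hypothetical instantiation: f(x) = 0.5*||Ax - b||^2 plus an l1 term
# (a separable g, chosen only to keep the example self-contained).
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 100))
b = rng.standard_normal(50)
grad_f = lambda x: A.T @ (A @ x - b)
prox_g = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - 0.1 * t, 0.0)

x = np.zeros(100)
step = 1.0 / np.linalg.norm(A, 2) ** 2   # conservative 1/L step size
for _ in range(2000):
    x = random_block_prox_grad_step(x, grad_f, prox_g, step, 10, rng)

The per-iteration cost advantage described in the abstract comes from updating only block_size coordinates at a time; a full implementation would also evaluate the gradient only along the sampled subspace rather than computing grad_f(x) in full, as done here for brevity.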