Sharp Analysis of Stochastic Optimization under Global Kurdyka-{\L}ojasiewicz Inequality

From MaRDI portal
Publication: Q6412902 · arXiv: 2210.01748

Ilyas Fatkhullin, Jalal Etesami, Niao He, Negar Kiyavash

Publication date: 4 October 2022

Abstract: We study the complexity of finding the global solution to stochastic nonconvex optimization when the objective function satisfies the global Kurdyka-{\L}ojasiewicz (K{\L}) inequality and the queries from stochastic gradient oracles satisfy a mild expected smoothness assumption. We first introduce a general framework to analyze Stochastic Gradient Descent (SGD) and its associated nonlinear dynamics in this setting. As a byproduct of our analysis, we obtain a sample complexity of $\mathcal{O}(\epsilon^{-(4-\alpha)/\alpha})$ for SGD when the objective satisfies the so-called $\alpha$-P{\L} condition, where $\alpha$ is the degree of gradient domination. Furthermore, we show that a modified SGD with variance reduction and restarting (PAGER) achieves an improved sample complexity of $\mathcal{O}(\epsilon^{-2/\alpha})$ when the objective satisfies the average smoothness assumption. This leads to the first optimal algorithm for the important case of $\alpha=1$, which appears in applications such as policy optimization in reinforcement learning.
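For orientation, a common form of the $\alpha$-P{\L}, i.e. gradient domination, condition referenced above is sketched below; the normalization via the constant $\mu > 0$ is an assumption here, since the abstract does not fix it:

$\|\nabla f(x)\|^{\alpha} \;\ge\; 2\mu\,\bigl(f(x) - f^{*}\bigr) \quad \text{for all } x,$

where $f^{*}$ denotes the global minimum value of $f$. Plugging $\alpha = 1$ into the rates above, SGD requires $\mathcal{O}(\epsilon^{-(4-1)/1}) = \mathcal{O}(\epsilon^{-3})$ stochastic gradient queries, while PAGER requires $\mathcal{O}(\epsilon^{-2/1}) = \mathcal{O}(\epsilon^{-2})$, which is the improvement the abstract highlights for applications such as policy optimization.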