Taming Neural Networks with TUSLA: Nonconvex Learning via Adaptive Stochastic Gradient Langevin Algorithms

From MaRDI portal
Publication:6162009

DOI: 10.1137/22M1514283; zbMATH Open: 1518.65007; arXiv: 2006.14514; OpenAlex: W3037401783; MaRDI QID: Q6162009; FDO: Q6162009


Authors: Attila Lovas, Miklós Rásonyi, Sotirios Sabanis


Publication date: 28 June 2023

Published in: SIAM Journal on Mathematics of Data Science

Abstract: Artificial neural networks (ANNs) are typically highly nonlinear systems which are finely tuned via the optimization of their associated, non-convex loss functions. In many cases, the gradient of any such loss function has superlinear growth, making the use of the widely-accepted (stochastic) gradient descent methods, which are based on Euler numerical schemes, problematic. We offer a new learning algorithm based on an appropriately constructed variant of the popular stochastic gradient Langevin dynamics (SGLD), which is called tamed unadjusted stochastic Langevin algorithm (TUSLA). We also provide a nonasymptotic analysis of the new algorithm's convergence properties in the context of non-convex learning problems with the use of ANNs. Thus, we provide finite-time guarantees for TUSLA to find approximate minimizers of both empirical and population risks. The roots of the TUSLA algorithm are based on the taming technology for diffusion processes with superlinear coefficients as developed in [tamed-euler, SabanisAoAP] and for MCMC algorithms in [tula]. Numerical experiments are presented which confirm the theoretical findings and illustrate the need for the use of the new algorithm in comparison to vanilla SGLD within the framework of ANNs.
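To illustrate the contrast the abstract draws between a tamed Langevin step and vanilla SGLD, here is a minimal sketch in Python. It is not taken from this page. It assumes a taming factor of the form 1 + sqrt(lam) * |theta|^(2r), which is one common way to tame a superlinearly growing gradient. The function names, the quartic toy loss, and all parameter values (lam, beta, r) are hypothetical choices for illustration only.

```python
import numpy as np


def sgld_step(theta, grad, lam, beta, rng):
    """One vanilla SGLD (Euler) step: gradient move plus Gaussian noise."""
    noise = rng.standard_normal(theta.shape)
    return theta - lam * grad(theta) + np.sqrt(2.0 * lam / beta) * noise


def tamed_step(theta, grad, lam, beta, r, rng):
    """One tamed Langevin step (sketch).

    The gradient is divided by 1 + sqrt(lam) * |theta|^(2r); this form of
    the taming factor is an assumption made for this illustration.
    """
    taming = 1.0 + np.sqrt(lam) * np.linalg.norm(theta) ** (2.0 * r)
    noise = rng.standard_normal(theta.shape)
    return theta - lam * grad(theta) / taming + np.sqrt(2.0 * lam / beta) * noise


def quartic_grad(theta):
    """Gradient of the toy loss |theta|^4, which grows cubically."""
    return 4.0 * np.dot(theta, theta) * theta


def run(step, theta0, n_steps, **kwargs):
    """Iterate a step function n_steps times from theta0."""
    theta = np.array(theta0, dtype=float)
    for _ in range(n_steps):
        theta = step(theta, **kwargs)
    return theta


if __name__ == "__main__":
    common = dict(grad=quartic_grad, lam=0.1, beta=1e3)
    # Overflow in the vanilla run is expected, so silence NumPy warnings.
    with np.errstate(over="ignore", invalid="ignore"):
        vanilla = run(sgld_step, [10.0], 50,
                      rng=np.random.default_rng(0), **common)
    tamed = run(tamed_step, [10.0], 200, r=1.5,
                rng=np.random.default_rng(0), **common)
    print("vanilla SGLD:", vanilla)
    print("tamed step:  ", tamed)
```

With this deliberately large step size and a far-out starting point, the vanilla iterates overshoot and overflow within a few steps, while the tamed update stays bounded because the taming factor caps the effective gradient. The exponent r = 1.5 is chosen here so that the taming factor grows at the same cubic rate as the toy gradient.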


Full work available at URL: https://arxiv.org/abs/2006.14514






Cited In (6)





