Taming Neural Networks with TUSLA: Nonconvex Learning via Adaptive Stochastic Gradient Langevin Algorithms
DOI: 10.1137/22M1514283
zbMATH Open: 1518.65007
arXiv: 2006.14514
OpenAlex: W3037401783
MaRDI QID: Q6162009
FDO: Q6162009
Authors: Attila Lovas, Iosif Lytras, Miklós Rásonyi, Sotirios Sabanis
Publication date: 28 June 2023
Published in: SIAM Journal on Mathematics of Data Science
Full work available at URL: https://arxiv.org/abs/2006.14514
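For orientation, the scheme this record refers to is the tamed unadjusted stochastic Langevin algorithm (TUSLA), whose step tames superlinearly growing stochastic gradients before adding Langevin noise. Below is a minimal runnable sketch of one such update; the toy objective, all hyperparameter values, and the function names are illustrative assumptions, not taken from this record.

```python
import numpy as np

# Sketch of a TUSLA-style update, assuming the scheme
#   theta_{n+1} = theta_n
#                 - lam * h(theta_n, X_{n+1}) / (1 + sqrt(lam) * ||theta_n||^(2r))
#                 + sqrt(2 * lam / beta) * xi_{n+1},
# where h is a stochastic gradient estimate, xi ~ N(0, I), lam is the step
# size, beta the inverse temperature, and r the taming exponent.
# The objective and parameter values below are illustrative only.

rng = np.random.default_rng(0)

def stochastic_grad(theta, noise_scale=0.1):
    # Gradient of the toy nonconvex objective 0.25*||theta||^4 - 0.5*||theta||^2,
    # perturbed by Gaussian noise to mimic minibatch gradients.
    grad = (np.dot(theta, theta) - 1.0) * theta
    return grad + noise_scale * rng.standard_normal(theta.shape)

def tusla_step(theta, lam=1e-2, beta=1e8, r=1.0):
    h = stochastic_grad(theta)
    taming = 1.0 + np.sqrt(lam) * np.linalg.norm(theta) ** (2 * r)
    noise = np.sqrt(2.0 * lam / beta) * rng.standard_normal(theta.shape)
    return theta - lam * h / taming + noise

theta = rng.standard_normal(5)
for _ in range(1000):
    theta = tusla_step(theta)
print(theta)  # iterates settle near a critical point with ||theta|| close to 1
```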
Recommendations
- Conjugate-gradient-based Adam for nonconvex stochastic optimization and its application to deep learning
- Stochastic generalized gradient methods for training nonconvex nonsmooth neural networks
- Adaptive methods using element-wise \(p\)th power of stochastic gradient for nonconvex optimization in deep neural networks
- Stochastic gradient Langevin dynamics with adaptive drifts
- Stochastic perturbation of subgradient algorithm for nonconvex deep neural networks
- Uniformly convex neural networks and non-stationary iterated network Tikhonov (iNETT) method
- Adaptivity of stochastic gradient methods for nonconvex optimization
- The convergence of stochastic gradient algorithms applied to learning in neural networks
- Optimal nonparametric inference via deep neural network
MSC Classification
- Monte Carlo methods (65C05)
- Sequential statistical analysis (62L10)
- Artificial neural networks and deep learning (68T07)
- Stochastic learning and adaptive control (93E35)
Cites Work
- Strong and weak divergence in finite time of Euler's method for stochastic differential equations with non-globally Lipschitz continuous coefficients
- Strong convergence of an explicit numerical method for SDEs with nonglobally Lipschitz continuous coefficients
- Euler approximations with varying coefficients: the case of superlinearly growing diffusion coefficients
- A note on tamed Euler approximations
- Laplace's method revisited: Weak convergence of probability measures
- High-dimensional Bayesian inference via the unadjusted Langevin algorithm
- Nonasymptotic convergence analysis for the unadjusted Langevin algorithm
- Theoretical Guarantees for Approximate Sampling from Smooth and Log-Concave Densities
- Couplings and quantitative contraction rates for Langevin dynamics
- Quantitative Harris-type theorems for diffusions and McKean-Vlasov processes
- The tamed unadjusted Langevin algorithm
- On stochastic gradient Langevin dynamics with dependent data streams in the logconcave case
- Convergence and dynamical behavior of the ADAM algorithm for nonconvex stochastic optimization
- User-friendly guarantees for the Langevin Monte Carlo with inaccurate gradient
- On Stochastic Gradient Langevin Dynamics with Dependent Data Streams: The Fully Nonconvex Case
- Higher order Langevin Monte Carlo algorithm
- Nonasymptotic estimates for stochastic gradient Langevin dynamics under local conditions in nonconvex optimization
Cited In (6)
- An inertial Newton algorithm for deep learning
- A phase transition for finding needles in nonlinear haystacks with LASSO artificial neural networks
- Statistical Finite Elements via Langevin Dynamics
- Kinetic Langevin MCMC sampling without gradient Lipschitz continuity -- the strongly convex case
- Non-asymptotic estimates for TUSLA algorithm for non-convex learning with applications to neural networks with ReLU activation function
- Non-asymptotic convergence bounds for modified tamed unadjusted Langevin algorithm in non-convex setting