scientific article; zbMATH DE number 7306906
From MaRDI portal
Publication: 5149016
Authors: Rachel Ward, Xiaoxia Wu, Léon Bottou
Publication date: 5 February 2021
Full work available at URL: https://arxiv.org/abs/1806.01811
Title: AdaGrad stepsizes: sharp convergence over nonconvex landscapes
Recommendations
- A class of gradient unconstrained minimization algorithms with adaptive stepsize
- Adaptivity of stochastic gradient methods for nonconvex optimization
- On stochastic gradient and subgradient methods with adaptive steplength sequences
- New adaptive stepsize selections in gradient methods
- Sequential convergence of AdaGrad algorithm for smooth convex optimization
- Convergence of constant step stochastic gradient descent for non-smooth non-convex functions
- Convergence and dynamical behavior of the ADAM algorithm for nonconvex stochastic optimization
- Gradient methods with adaptive step-sizes
- The adaptive \(s\)-step conjugate gradient method
- An adaptive gradient algorithm for large-scale nonlinear bound constrained optimization
Keywords: convergence; nonconvex optimization; large-scale optimization; adaptive gradient descent; stochastic offline learning
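As background for the keywords above, the sketch below illustrates an AdaGrad-Norm style update (a single scalar stepsize adapted from the accumulated squared gradient norms), the kind of adaptive gradient descent the linked preprint studies. This is a minimal illustration assuming the standard formulation; the function names and parameter values are illustrative and not taken from the publication.

```python
import numpy as np

def adagrad_norm(grad, x0, eta=1.0, b0=0.1, steps=500):
    """Minimize a smooth function using an AdaGrad-Norm style scalar stepsize.

    Illustrative sketch: the stepsize eta / b_{j+1} shrinks as squared
    gradient norms accumulate, with no tuning of a decay schedule.
    """
    x = np.asarray(x0, dtype=float)
    b2 = b0 ** 2                      # accumulated squared gradient norm b_j^2
    for _ in range(steps):
        g = grad(x)
        b2 += float(np.dot(g, g))     # b_{j+1}^2 = b_j^2 + ||g_j||^2
        x = x - (eta / np.sqrt(b2)) * g
    return x

# Toy usage: minimize f(x) = 0.5 * ||A x - y||^2 (minimizer is A^{-1} y)
A = np.array([[3.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, -1.0])
grad = lambda x: A.T @ (A @ x - y)
print(adagrad_norm(grad, x0=np.zeros(2)))
```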
Cites Work
- Adaptive subgradient methods for online learning and stochastic optimization
- Title not available
- A Stochastic Approximation Method
- Two-Point Step Size Gradient Methods
- Robust Stochastic Approximation Approach to Stochastic Programming
- Title not available
- Accelerated gradient methods for nonconvex nonlinear and stochastic programming
- Stochastic First- and Zeroth-Order Methods for Nonconvex Stochastic Programming
- Convex optimization: algorithms and complexity
- Accelerated methods for nonconvex optimization
- Finding approximate local minima faster than gradient descent
- Optimization methods for large-scale machine learning
Cited In (19)
- Random Batch Methods for Classical and Quantum Interacting Particle Systems and Statistical Samplings
- An adaptive Polyak heavy-ball method
- Convergence Properties of an Objective-Function-Free Optimization Regularization Algorithm, Including an \(\boldsymbol{\mathcal{O}(\epsilon^{-3/2})}\) Complexity Bound
- Theoretical analysis of Adam using hyperparameters close to one without Lipschitz smoothness
- Stochastic momentum methods for non-convex learning without bounded assumptions
- Provably faster gradient descent via long steps
- Gradient descent in the absence of global Lipschitz continuity of the gradients
- An improved transformer model with multi-head attention and attention to attention for low-carbon multi-depot vehicle routing problem
- Stochastic algorithms with geometric step decay converge linearly on sharp functions
- Machine learning design of volume of fluid schemes for compressible flows
- An adaptive Riemannian gradient method without function evaluations
- Dynamic regret of adaptive gradient methods for strongly convex problems
- Incremental without replacement sampling in nonconvex optimization
- Stochastic Gauss-Newton algorithms for online PCA
- A multivariate adaptive gradient algorithm with reduced tuning efforts
- High probability bounds on AdaGrad for constrained weakly convex optimization
- SVRG meets AdaGrad: painless variance reduction
- Recent Theoretical Advances in Non-Convex Optimization
- Adaptive step size rules for stochastic optimization in large-scale learning
Uses Software
This page was built for publication: AdaGrad stepsizes: sharp convergence over nonconvex landscapes