Multilevel stochastic gradient methods for nested composition optimization
From MaRDI portal
Abstract: Stochastic gradient methods are scalable for solving large-scale optimization problems that involve empirical expectations of loss functions. Existing results mainly apply to optimization problems where the objectives are one- or two-level expectations. In this paper, we consider the multi-level compositional optimization problem that involves compositions of multi-level component functions and nested expectations over a random path. It finds applications in risk-averse optimization and sequential planning. We propose a class of multi-level stochastic gradient methods that are motivated from the method of multi-timescale stochastic approximation. First we propose a basic -level stochastic compositional gradient algorithm, establish its almost sure convergence and obtain an -iteration error bound . Then we develop accelerated multi-level stochastic gradient methods by using an extrapolation-interpolation scheme to take advantage of the smoothness of individual component functions. When all component functions are smooth, we show that the convergence rate improves to for general objectives and for strongly convex objectives. We also provide almost sure convergence and rate of convergence results for nonconvex problems. The proposed methods and theoretical results are validated using numerical experiments.
Recommendations
- Multilevel composite stochastic optimization via nested variance reduction
- Stochastic multilevel composition optimization algorithms with level-independent convergence rates
- Stochastic compositional gradient descent: algorithms for minimizing compositions of expected-value functions
- A Stochastic Subgradient Method for Nonsmooth Nonconvex Multilevel Composition Optimization
- Accelerating Stochastic Composition Optimization
Cites work
- scientific article; zbMATH DE number 3761782 (Why is no real title available?)
- A proximal stochastic gradient method with progressive variance reduction
- Adaptive subgradient methods for online learning and stochastic optimization
- Coherent risk measures in inventory problems
- Minimizing finite sums with the stochastic average gradient
- Optimization of Convex Risk Functions
- Parallel stochastic gradient algorithms for large-scale matrix completion
- Risk neutral and risk averse approaches to multistage renewable investment planning under uncertainty
- Semi-stochastic coordinate descent
- Statistical estimation of composite risk functionals and risk optimization problems
- Stochastic First- and Zeroth-Order Methods for Nonconvex Stochastic Programming
- Stochastic compositional gradient descent: algorithms for minimizing compositions of expected-value functions
- Stochastic first-order methods with random constraint projection
- Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm
- Validation analysis of mirror descent stochastic approximation method
Cited in
(19)- Optimal methods for convex nested stochastic composite optimization
- A Stochastic Subgradient Method for Nonsmooth Nonconvex Multilevel Composition Optimization
- Stochastic multilevel composition optimization algorithms with level-independent convergence rates
- Multilevel Fine-Tuning: Closing Generalization Gaps in Approximation of Solution Maps under a Limited Budget for Training Data
- Solving Nonsmooth and Nonconvex Compound Stochastic Programs with Applications to Risk Measure Minimization
- Zeroth-order stochastic compositional algorithms for risk-aware learning
- Hybrid SGD algorithms to solve stochastic composite optimization problems with application in sparse portfolio selection problems
- Distributed stochastic compositional optimization problems over directed networks
- A Single Timescale Stochastic Approximation Method for Nested Stochastic Optimization
- Stochastic nested primal-dual method for nonconvex constrained composition optimization
- A stochastic subgradient method for distributionally robust non-convex and non-smooth learning
- A single timescale stochastic quasi-Newton method for stochastic optimization
- Probability maximization via Minkowski functionals: convex representations and tractable resolution
- Two-timescale stochastic gradient descent in continuous time with applications to joint online parameter estimation and optimal sensor placement
- Stochastic composition optimization of functions without Lipschitz continuous gradient
- Nested-Batch-Mode Learning and Stochastic Optimization with An Application to Sequential MultiStage Testing in Materials Science
- Stochastic compositional gradient descent: algorithms for minimizing compositions of expected-value functions
- Multilevel composite stochastic optimization via nested variance reduction
- Streaming constrained binary logistic regression with online standardized data
This page was built for publication: Multilevel stochastic gradient methods for nested composition optimization
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q4629336)