Derivative-Free Methods for Policy Optimization: Guarantees for Linear Quadratic Systems
Publication:4969058
zbMath: 1498.93784, arXiv: 1812.08305, MaRDI QID: Q4969058
Koulik Khamaru, Kush Bhatia, Martin J. Wainwright, Ashwin Pananjady, Dhruv Malik, Peter L. Bartlett
Publication date: 5 October 2020
Full work available at URL: https://arxiv.org/abs/1812.08305
Nonconvex programming, global optimization (90C26); Learning and adaptive systems in artificial intelligence (68T05); Optimal stochastic control (93E20); Linear-quadratic optimal control problems (49N10)
Related Items
- Model-free design of stochastic LQR controller from a primal-dual optimization perspective
- Tracking and Regret Bounds for Online Zeroth-Order Euclidean and Riemannian Optimization
- Analysis of the optimization landscape of Linear Quadratic Gaussian (LQG) control
- Recent Theoretical Advances in Non-Convex Optimization
- Policy Gradient Methods for the Noisy Linear Quadratic Regulator over a Finite Horizon
- Unnamed Item
- Policy Optimization for $\mathcal{H}_2$ Linear Control with $\mathcal{H}_\infty$ Robustness Guarantee: Implicit Regularization and Global Convergence
- Model-free linear quadratic regulator
- Controlled interacting particle algorithms for simulation-based reinforcement learning
Cites Work
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- A tail inequality for quadratic forms of subgaussian random vectors
- Linear Thompson sampling revisited
- On the sample complexity of the linear quadratic regulator
- Optimal Rates for Zero-Order Convex Optimization: The Power of Two Function Evaluations
- Introduction to Stochastic Search and Optimization
- Optimality of Fast-Matching Algorithms for Random Networks With Applications to Structural Controllability
- Optimization of Smooth Functions With Noisy Observations: Local Minimax Rates
- An Optimal Algorithm for Bandit and Zero-Order Convex Optimization with Two-Point Feedback
- Stochastic First- and Zeroth-Order Methods for Nonconvex Stochastic Programming
- Gradient methods for solving equations and inequalities
- A Bound on Tail Probabilities for Quadratic Forms in Independent Random Variables
- Probability