Policy gradient in Lipschitz Markov decision processes
Publication:747252
DOI: 10.1007/S10994-015-5484-1; zbMATH Open: 1354.90166; OpenAlex: W2046859786; MaRDI QID: Q747252; FDO: Q747252
Authors: Matteo Pirotta, Marcello Restelli, Luca Bascetta
Publication date: 23 October 2015
Published in: Machine Learning
Full work available at URL: https://doi.org/10.1007/s10994-015-5484-1
Recommendations
- An Online Policy Gradient Algorithm for Markov Decision Processes with Continuous States and Actions
- On linear and super-linear convergence of natural policy gradient algorithm
- Lipschitz continuous policy functions for strongly concave optimization problems
- Policy gradient in continuous time
- Lipschitz continuity of value functions in Markovian decision processes
Cites Work
- Title not available
- A Stochastic Approximation Method
- Line search algorithms with guaranteed sufficient decrease
- Stochastic optimal control. The discrete time case
- Minimization of functions having Lipschitz continuous first partial derivatives
- Solving connection and linearization problems within the Askey scheme and its \(q\)-analogue via inversion formulas
- Multivariate stochastic approximation using a simultaneous perturbation gradient approximation
- Policy search for motor primitives in robotics
- Title not available
- Title not available
- Title not available
- Lipschitz continuity of value functions in Markovian decision processes
- Collective motions of a shell structure
- Title not available
Cited In (15)
- Title not available
- 10.1162/1532443041827907
- A Small Gain Analysis of Single Timescale Actor Critic
- On the sample complexity of actor-critic method for reinforcement learning with function approximation
- Nonconvex policy search using variational inequalities
- On high-order differentiability of the policy function
- Expected policy gradients for reinforcement learning
- Importance sampling techniques for policy optimization
- Title not available
- Lipschitz continuous policy functions for strongly concave optimization problems
- On linear and super-linear convergence of natural policy gradient algorithm
- Risk-averse optimization of reward-based coherent risk measures
- Smoothing policies and safe policy gradients
- Global convergence of policy gradient methods to (almost) locally optimal policies
- Learning parametric policies and transition probability models of Markov decision processes from data