Risk-Constrained Reinforcement Learning with Percentile Risk Criteria
Publication: 4558492
zbMath: 1471.90160 · arXiv: 1512.01629 · MaRDI QID: Q4558492
Marco Pavone, Lucas Janson, Mohammad Ghavamzadeh, Yinlam Chow
Publication date: 22 November 2018
Full work available at URL: https://arxiv.org/abs/1512.01629
Markov decision process; reinforcement learning; actor-critic algorithms; conditional value-at-risk; chance-constrained optimization; policy gradient algorithms
Learning and adaptive systems in artificial intelligence (68T05); Markov and semi-Markov decision processes (90C40); General considerations in statistical decision theory (62C05)
Related Items (17)
- SAMBA: safe model-based \& active reinforcement learning
- Risk-Sensitive Reinforcement Learning via Policy Gradient Search
- On Robustness of Individualized Decision Rules
- Sim-to-lab-to-real: safe reinforcement learning with shielding and generalization guarantees
- Risk-averse optimization of reward-based coherent risk measures
- Safety-constrained reinforcement learning with a distributional safety critic
- Temporal Robustness of Stochastic Signals
- Unnamed Item
- Recent advances in reinforcement learning in finance
- Safe multi-agent reinforcement learning for multi-robot control
- Reinforcement learning with dynamic convex risk measures
- Classification with costly features as a sequential decision-making problem
- Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies
- Risk-averse policy optimization via risk-neutral policy optimization
- Risk verification of stochastic systems with neural network controllers
- Mean-Semivariance Policy Optimization via Risk-Averse Reinforcement Learning
- Peril, prudence and planning as risk, avoidance and worry
Cites Work
- An online actor-critic algorithm with function approximation for constrained Markov decision processes
- Stochastic recursive algorithms for optimization. Simultaneous perturbation methods
- An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes
- Dynamic mean-risk optimization in a binomial model
- Natural actor-critic algorithms
- Minimizing risk models in Markov decision processes with policies depending on target values
- Simple statistical gradient-following algorithms for connectionist reinforcement learning
- Mean, variance and probabilistic criteria in finite Markov decision processes: A review
- Markov decision processes with average-value-at-risk criteria
- Risk neutral and risk averse stochastic dual dynamic programming method
- Time consistent dynamic risk measures
- An actor-critic algorithm for constrained Markov decision processes
- Coherent Measures of Risk
- Risk-Constrained Markov Decision Processes
- Computing VaR and CVaR using stochastic approximation and adaptive unconstrained importance sampling
- Variance-Penalized Markov Decision Processes
- Multivariate stochastic approximation using a simultaneous perturbation gradient approximation
- On Actor-Critic Algorithms
- A Perturbation Theory for Ergodic Markov Chains and Application to Numerical Approximations
- Perturbation analysis for denumerable Markov chains with application to queueing models
- The variance of discounted Markov decision processes
- Percentile performance criteria for limiting average Markov decision processes
- Stochastic Target Hitting Time and the Problem of Early Retirement
- Stochastic Approximations and Differential Inclusions, Part II: Applications
- Envelope Theorems for Arbitrary Choice Sets
- Risk-Sensitive Markov Decision Processes
- Q-Learning for Risk-Sensitive Control
- A sensitivity formula for risk-sensitive cost and the actor-critic algorithm