Reinforcement learning with dynamic convex risk measures
From MaRDI portal
Publication:6196296
Abstract: We develop an approach for solving time-consistent risk-sensitive stochastic optimization problems using model-free reinforcement learning (RL). Specifically, we assume agents assess the risk of a sequence of random variables using dynamic convex risk measures. We employ a time-consistent dynamic programming principle to determine the value of a particular policy, and develop policy gradient update rules that aid in obtaining optimal policies. We further develop an actor-critic style algorithm using neural networks to optimize over policies. Finally, we demonstrate the performance and flexibility of our approach by applying it to three optimization problems: statistical arbitrage trading strategies, financial hedging, and obstacle avoidance robot control.
Recommendations
- Conditionally Elicitable Dynamic Risk Measures for Deep Reinforcement Learning
- Robust risk-aware reinforcement learning
- Risk-averse learning by temporal difference methods with Markov risk measures
- Risk-Sensitive Reinforcement Learning via Policy Gradient Search
- Risk-sensitive reinforcement learning
Cites work
- scientific article; zbMATH DE number 1066320
- scientific article; zbMATH DE number 1405266
- scientific article; zbMATH DE number 7370555
- scientific article; zbMATH DE number 7370615
- A closed-form solution for options with stochastic volatility with applications to bond and currency options
- Approximate Integration of Stochastic Differential Equations
- Approximate Value Iteration for Risk-Aware Markov Decision Processes
- Approximation by superpositions of a sigmoidal function
- Coherent measures of risk
- Conditional and dynamic convex risk measures
- Convex measures of risk and trading constraints
- Distribution‐invariant risk measures, information, and dynamic consistency
- Deep Q-Learning for Nash Equilibria: Nash-DQN
- Deep learning
- Deep learning volatility: a deep neural network perspective on pricing and calibration in (rough) volatility models
- Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations
- Distributionally Robust Markov Decision Processes and Their Connection to Risk Measures
- Dual representation of minimal supersolutions of convex BSDEs
- Dynamic assessment indices
- Dynamic coherent risk measures
- Dynamic risk measures
- Envelope Theorems for Arbitrary Choice Sets
- Lectures on stochastic programming. Modeling and theory.
- Markov decision processes with iterated coherent risk measures
- Markov decision processes with recursive risk measures
- Minimizing spectral risk measures applied to Markov decision processes
- Percentile Optimization for Markov Decision Processes with Parameter Uncertainty
- Policy Gradient Methods for the Noisy Linear Quadratic Regulator over a Finite Horizon
- Reinforcement learning and stochastic optimisation
- Reinforcement learning. An introduction
- Risk Measures and Comonotonicity: A Review
- Risk-averse dynamic programming for Markov decision processes
- Risk-constrained reinforcement learning with percentile risk criteria
- Risk-sensitive reinforcement learning
- Sequential Decision Making With Coherent Risk
- Solving high-dimensional partial differential equations using deep learning
- Stochastic finance. An introduction in discrete time
Cited in (8)
- Recent advances in reinforcement learning in finance
- Conditionally Elicitable Dynamic Risk Measures for Deep Reinforcement Learning
- Robust Risk-Aware Option Hedging
- A stochastic maximum principle approach for reinforcement learning with parameterized environment
- Achieving zero constraint violation for concave utility constrained reinforcement learning via primal-dual approach
- Markov decision processes with risk-sensitive criteria: an overview
- Improving reinforcement learning algorithms: Towards optimal learning rate policies
- Learning equilibrium mean‐variance strategy