Risk-averse approximate dynamic programming with quantile-based risk measures
From MaRDI portal
Publication:5219554
Abstract: In this paper, we consider a finite-horizon Markov decision process (MDP) for which the objective at each stage is to minimize a quantile-based risk measure (QBRM) of the sequence of future costs; we call the overall objective a dynamic quantile-based risk measure (DQBRM). In particular, we consider optimizing dynamic risk measures where the one-step risk measures are QBRMs, a class of risk measures that includes the popular value at risk (VaR) and the conditional value at risk (CVaR). Although there is considerable theoretical development of risk-averse MDPs in the literature, the computational challenges have not been explored as thoroughly. We propose data-driven and simulation-based approximate dynamic programming (ADP) algorithms to solve the risk-averse sequential decision problem. We address the issue of inefficient sampling for risk applications in simulated settings and present a procedure, based on importance sampling, to direct samples toward the "risky region" as the ADP algorithm progresses. Finally, we show numerical results of our algorithms in the context of an application involving risk-averse bidding for energy storage.
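For readers unfamiliar with the two quantile-based risk measures named in the abstract, the following is a minimal sketch of how VaR and CVaR could be estimated from a finite sample of costs. It is an illustration under stated assumptions, not the paper's ADP algorithm: the function names `value_at_risk` and `conditional_value_at_risk` are hypothetical, costs are assumed to follow the "larger is worse" convention, and CVaR is computed with the standard form VaR + E[(X - VaR)^+] / (1 - alpha).

```python
import math


def value_at_risk(costs, alpha):
    """Empirical alpha-quantile: the smallest sample x whose empirical CDF is >= alpha.

    Assumption: costs are losses, so larger values are worse.
    """
    if not costs:
        raise ValueError("costs must be non-empty")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    ordered = sorted(costs)
    # 1-based index of the order statistic that first reaches level alpha.
    k = max(math.ceil(alpha * len(ordered)), 1)
    return ordered[k - 1]


def conditional_value_at_risk(costs, alpha):
    """Empirical CVaR via VaR + E[(X - VaR)^+] / (1 - alpha)."""
    var = value_at_risk(costs, alpha)
    # Average excess over VaR, taken over all samples (zeros included).
    mean_excess = sum(max(c - var, 0.0) for c in costs) / len(costs)
    return var + mean_excess / (1.0 - alpha)


if __name__ == "__main__":
    sample_costs = [1.0, 2.0, 3.0, 4.0]
    print(value_at_risk(sample_costs, 0.5))
    print(conditional_value_at_risk(sample_costs, 0.5))
```

In this sketch CVaR is never smaller than VaR at the same level, which reflects why CVaR is often described as the more conservative of the two measures. With few samples, very few observations fall beyond the quantile, which is the sampling inefficiency that the abstract's importance-sampling procedure is meant to address.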
Cites work
- scientific article; zbMATH DE number 5957196
- scientific article; zbMATH DE number 3687126
- scientific article; zbMATH DE number 1321699
- scientific article; zbMATH DE number 1005357
- scientific article; zbMATH DE number 1972910
- scientific article; zbMATH DE number 3449561
- scientific article; zbMATH DE number 2107836
- A New Optimal Stepsize for Approximate Dynamic Programming
- A Space-Efficient Recursive Procedure for Estimating a Quantile of an Unknown Distribution
- A Stochastic Approximation Method
- Adaptive stepsizes for recursive estimation with applications in approximate dynamic programming
- An approximate dynamic programming algorithm for monotone value functions
- An optimal approximate dynamic programming algorithm for the lagged asset acquisition problem
- Approximate dynamic programming. Solving the curses of dimensionality
- Asynchronous stochastic approximation and Q-learning
- Coherent measures of risk
- Coherent multiperiod risk adjusted values and Bellman's principle
- Computational methods for risk-averse undiscounted transient Markov models
- Computing VaR and CVaR using stochastic approximation and adaptive unconstrained importance sampling
- Conditional Risk Mappings
- Dynamic coherent risk measures
- Dynamic monetary risk measures for bounded discrete-time processes
- Dynamic risk measures
- Dynamic sampling algorithms for multi-stage stochastic programs with risk aversion
- Evaluating policies in risk-averse multi-stage stochastic programming
- Information collection on a graph
- Introduction to rare event simulation
- Learning Algorithms for Separable Approximations of Discrete Stochastic Optimization Problems
- Measuring risk for income streams
- Multi-stage stochastic optimization applied to energy planning
- On a Stochastic Approximation Method
- On a time consistency concept in risk averse multistage stochastic programming
- On solving multistage stochastic programs with coherent risk measures
- On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
- On the convergence of a stochastic approximation procedure for estimating the quantile criterion in the case of a discontinuous distribution function
- Optimal hour-ahead bidding in the real-time electricity market with battery storage using approximate dynamic programming
- Optimization of Convex Risk Functions
- Optimizing trading decisions for hydro storage systems using approximate dual dynamic programming
- Quantile estimation with adaptive importance sampling
- Rare-Event Simulation for Multistage Production-Inventory Systems
- Risk neutral and risk averse stochastic dual dynamic programming method
- Risk-averse dynamic programming for Markov decision processes
- Robust Stochastic Approximation Approach to Stochastic Programming
- The cross-entropy method for combinatorial and continuous optimization
- The sample average approximation method for stochastic discrete optimization
- Time consistency and risk averse dynamic decision models: definition, interpretation and practical consequences
- Time consistent dynamic risk measures
- \({\mathcal Q}\)-learning
Cited in (6)
- Socially responsible merchant operations: comparison of shutdown-averse CVaR and anticipated regret policies
- Zeroth-order stochastic compositional algorithms for risk-aware learning
- Quantile Markov Decision Processes
- A unified framework for stochastic optimization
- Risk-Sensitive Reinforcement Learning via Policy Gradient Search
- Capturing deep tail risk via sequential learning of quantile dynamics