Risk-averse approximate dynamic programming with quantile-based risk measures
Publication:5219554
DOI: 10.1287/MOOR.2017.0872 | zbMATH Open: 1440.90084 | arXiv: 1509.01920 | OpenAlex: W2462780152 | MaRDI QID: Q5219554 | FDO: Q5219554
Daniel R. Jiang, Warren Powell
Publication date: 12 March 2020
Published in: Mathematics of Operations Research
Abstract: In this paper, we consider a finite-horizon Markov decision process (MDP) for which the objective at each stage is to minimize a quantile-based risk measure (QBRM) of the sequence of future costs; we call the overall objective a dynamic quantile-based risk measure (DQBRM). In particular, we consider optimizing dynamic risk measures where the one-step risk measures are QBRMs, a class of risk measures that includes the popular value at risk (VaR) and the conditional value at risk (CVaR). Although there is considerable theoretical development of risk-averse MDPs in the literature, the computational challenges have not been explored as thoroughly. We propose data-driven and simulation-based approximate dynamic programming (ADP) algorithms to solve the risk-averse sequential decision problem. We address the issue of inefficient sampling for risk applications in simulated settings and present a procedure, based on importance sampling, to direct samples toward the "risky region" as the ADP algorithm progresses. Finally, we show numerical results of our algorithms in the context of an application involving risk-averse bidding for energy storage.
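To make the abstract's terminology concrete, the following is a minimal sketch of VaR and CVaR as one-step risk measures inside a nested backward recursion on a finite-horizon MDP. It is an illustration under stated assumptions, not the authors' ADP algorithm: the one-state model `MODEL`, its costs and probabilities, the function names, and the use of exact backward induction in place of sampling are all hypothetical choices made for the example.

```python
from typing import Callable, Dict, List, Sequence, Tuple

def value_at_risk(costs: Sequence[float], probs: Sequence[float], alpha: float) -> float:
    """Smallest c with P(cost <= c) >= alpha, for a discrete cost distribution."""
    cum = 0.0
    pairs = sorted(zip(costs, probs))
    for c, p in pairs:
        cum += p
        if cum >= alpha - 1e-12:  # small tolerance for floating-point sums
            return c
    return pairs[-1][0]

def conditional_value_at_risk(costs: Sequence[float], probs: Sequence[float], alpha: float) -> float:
    """Rockafellar-Uryasev form: VaR + E[(cost - VaR)^+] / (1 - alpha)."""
    q = value_at_risk(costs, probs, alpha)
    excess = sum(p * max(c - q, 0.0) for c, p in zip(costs, probs))
    return q + excess / (1.0 - alpha)

def expectation(costs: Sequence[float], probs: Sequence[float]) -> float:
    """Risk-neutral baseline for comparison."""
    return sum(p * c for c, p in zip(costs, probs))

# Hypothetical model: state -> action -> list of (probability, next_state, cost).
Outcome = Tuple[float, int, float]
MODEL: Dict[int, Dict[str, List[Outcome]]] = {
    0: {
        "safe": [(1.0, 0, 2.0)],                 # always costs 2
        "risky": [(0.9, 0, 0.0), (0.1, 0, 10.0)],  # mean cost 1, rare cost 10
    }
}

def solve(model, horizon: int, risk: Callable[[Sequence[float], Sequence[float]], float]):
    """Nested recursion: V_t(s) = min_a risk(cost + V_{t+1}(next state))."""
    value = {s: 0.0 for s in model}  # terminal value is zero
    policy = []
    for _ in range(horizon):
        new_value, decision = {}, {}
        for s, actions in model.items():
            best = None
            for a, outcomes in actions.items():
                totals = [c + value[s2] for _, s2, c in outcomes]
                probs = [p for p, _, _ in outcomes]
                candidate = risk(totals, probs)
                if best is None or candidate < best[0]:
                    best = (candidate, a)
            new_value[s], decision[s] = best
        value = new_value
        policy.insert(0, decision)  # policy[t] is the decision rule at stage t
    return value, policy

cvar_value, cvar_policy = solve(MODEL, 3, lambda c, p: conditional_value_at_risk(c, p, 0.9))
mean_value, mean_policy = solve(MODEL, 3, expectation)
```

In this toy model the risk-neutral recursion prefers the "risky" action because its mean cost is lower, while the CVaR recursion at level 0.9 prefers "safe" because the rare cost of 10 dominates the tail. Passing the risk measure as a callable keeps the recursion unchanged when VaR, CVaR, or the expectation is swapped in.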
Full work available at URL: https://arxiv.org/abs/1509.01920
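The abstract mentions importance sampling as a way to direct samples toward the "risky region". A rough sketch of that general idea, estimating a high quantile (VaR) with a mean-shifted proposal, might look as follows. This is not the paper's procedure, which adapts the sampling as the ADP algorithm progresses; here the standard normal cost distribution, the fixed shift, and the function name `is_quantile` are assumptions made purely for illustration.

```python
import math
import random

def is_quantile(alpha: float, n: int, shift: float, rng: random.Random) -> float:
    """Estimate the alpha-quantile of N(0, 1) by sampling from N(shift, 1).

    Each draw x gets the likelihood ratio phi(x) / phi(x - shift)
    = exp(-shift * x + shift**2 / 2), so weighted tail sums remain
    unbiased estimates of tail probabilities under N(0, 1).
    """
    draws = []
    for _ in range(n):
        x = rng.gauss(shift, 1.0)
        w = math.exp(-shift * x + 0.5 * shift * shift)
        draws.append((x, w))
    draws.sort(reverse=True)  # walk down from the largest sample
    tail = 0.0
    for x, w in draws:
        tail += w / n  # weighted estimate of P(cost >= x)
        if tail >= 1.0 - alpha:
            return x
    return draws[-1][0]

# Assumed settings: 0.99-quantile, proposal centered near the tail.
estimate = is_quantile(0.99, 20000, 2.0, random.Random(0))
```

Shifting the proposal mean places most samples near the quantile of interest, so fewer draws are wasted in the bulk of the distribution than with plain Monte Carlo. The true 0.99-quantile of a standard normal is about 2.326, which gives a reference point for checking the estimate.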
Decision theory (91B06); Stochastic approximation (62L20); Dynamic programming (90C39); Stochastic learning and adaptive control (93E35)
Cited In (6)
- Risk-Sensitive Reinforcement Learning via Policy Gradient Search
- Quantile Markov Decision Processes
- Capturing deep tail risk via sequential learning of quantile dynamics
- Socially responsible merchant operations: comparison of shutdown-averse CVaR and anticipated regret policies
- Zeroth-Order Stochastic Compositional Algorithms for Risk-Aware Learning
- A unified framework for stochastic optimization