Least squares policy iteration with instrumental variables vs. direct policy search: comparison against optimal benchmarks using energy storage
DOI: 10.1080/03155986.2019.1624491 · OpenAlex: W2963530719 · Wikidata: Q114100489 · Scholia: Q114100489 · MaRDI QID: Q5882386 · FDO: Q5882386
Authors: Somayeh Moazeni, Warren Scott, Warren Powell
Publication date: 15 March 2023
Published in: INFOR: Information Systems and Operational Research
Full work available at URL: https://arxiv.org/abs/1401.0843
Recommendations
- Least squares policy evaluation algorithms with linear function approximation
- DOI: 10.1162/1532443041827907
- Hybrid least-squares algorithms for approximate policy evaluation
- Approximate policy iteration: a survey and some new methods
- Benchmarking a scalable approximate dynamic programming algorithm for stochastic control of grid-level energy storage
Keywords: dynamic programming, approximate dynamic programming, energy storage, approximate policy iteration, direct policy search, Bellman error minimization
Cites Work
- Recursive estimation and time-series analysis. An introduction for the student and practitioner
- Errors in Variables
- Title not available
- Title not available
- Title not available
- Approximate dynamic programming. Solving the curses of dimensionality
- Smoothing and parametric rules for stochastic mean-CVaR optimal execution strategy
- The Linear Programming Approach to Approximate Dynamic Programming
- A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications
- DOI: 10.1162/1532443041827907
- Linear least-squares algorithms for temporal difference learning
- An analysis of temporal-difference learning with function approximation
- Dynamic-programming approximations for stochastic time-staged integer multicommodity-flow problems
- Dynamic programming and optimal control. Vol. 2
- On Constraint Sampling in the Linear Programming Approach to Approximate Dynamic Programming
- Algorithms for reinforcement learning
- Asynchronous stochastic approximation and Q-learning
- Pricing in Electricity Markets: A Mean Reverting Jump Diffusion Model with Seasonality
- On the existence of fixed points for approximate value iteration and temporal-difference learning
- An approximate dynamic programming approach to benchmark practice-based heuristics for natural gas storage valuation
- Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
- Basis function adaptation in temporal difference reinforcement learning
- Instrumental variable methods for system identification
- The correlated knowledge gradient for simulation optimization of continuous parameters using Gaussian process regression
- Least squares temporal difference methods: An analysis under general conditions
- Title not available
- Policy evaluation with temporal differences: a survey and comparison
- An Optimal Approximate Dynamic Programming Algorithm for Concave, Scalar Storage Problems With Vector-Valued Controls
- On the complexity of energy storage problems
- Parallel Nonstationary Direct Policy Search for Risk-Averse Stochastic Optimization
- Approximate dynamic programming via iterated Bellman inequalities
- Optimal price-threshold control for battery operation with aging phenomenon: a quasiconvex optimization approach
Cited In (2)