Least squares policy iteration with instrumental variables vs. direct policy search: comparison against optimal benchmarks using energy storage

DOI: 10.1080/03155986.2019.1624491; OpenAlex: W2963530719; Wikidata: Q114100489; Scholia: Q114100489; MaRDI QID: Q5882386; FDO: Q5882386

Authors: Somayeh Moazeni, Warren Scott, Warren Powell

Publication date: 15 March 2023

Published in: INFOR: Information Systems and Operational Research

Full work available at URL: https://arxiv.org/abs/1401.0843

Recommendations

Least squares policy evaluation algorithms with linear function approximation
10.1162/1532443041827907
Hybrid least-squares algorithms for approximate policy evaluation
Approximate policy iteration: a survey and some new methods
Benchmarking a scalable approximate dynamic programming algorithm for stochastic control of grid-level energy storage

zbMATH Keywords

dynamic programming; approximate dynamic programming; energy storage; approximate policy iteration; direct policy search; Bellman error minimization

Mathematics Subject Classification ID

Systems theory; control (93-XX)

Cites Work

Recursive estimation and time-series analysis. An introduction for the student and practitioner
Errors in Variables
Title not available
Title not available
Title not available
Approximate dynamic programming. Solving the curses of dimensionality
Smoothing and parametric rules for stochastic mean-CVaR optimal execution strategy
The Linear Programming Approach to Approximate Dynamic Programming
A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications
10.1162/1532443041827907
Linear least-squares algorithms for temporal difference learning
An analysis of temporal-difference learning with function approximation
Dynamic-programming approximations for stochastic time-staged integer multicommodity-flow problems
Dynamic programming and optimal control. Vol. 2
On Constraint Sampling in the Linear Programming Approach to Approximate Dynamic Programming
Algorithms for reinforcement learning.
Asynchronous stochastic approximation and Q-learning
Pricing in Electricity Markets: A Mean Reverting Jump Diffusion Model with Seasonality
On the existence of fixed points for approximate value iteration and temporal-difference learning
An approximate dynamic programming approach to benchmark practice-based heuristics for natural gas storage valuation
Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
Basis function adaptation in temporal difference reinforcement learning
Instrumental variable methods for system identification
The correlated knowledge gradient for simulation optimization of continuous parameters using Gaussian process regression
Least squares temporal difference methods: An analysis under general conditions
Title not available
Policy evaluation with temporal differences: a survey and comparison
An Optimal Approximate Dynamic Programming Algorithm for Concave, Scalar Storage Problems With Vector-Valued Controls
On the complexity of energy storage problems
Parallel Nonstationary Direct Policy Search for Risk-Averse Stochastic Optimization
Approximate dynamic programming via iterated Bellman inequalities
Optimal price-threshold control for battery operation with aging phenomenon: a quasiconvex optimization approach

Cited In (2)

Approximate dynamic programming for the dispatch of military medical evacuation assets
Hybrid least-squares algorithms for approximate policy evaluation

Uses Software

S-PLUS
