Least squares policy iteration with instrumental variables vs. direct policy search: comparison against optimal benchmarks using energy storage
From MaRDI portal
Publication:5882386
Recommendations
- Least squares policy evaluation algorithms with linear function approximation
- 10.1162/1532443041827907
- Hybrid least-squares algorithms for approximate policy evaluation
- Approximate policy iteration: a survey and some new methods
- Benchmarking a scalable approximate dynamic programming algorithm for stochastic control of grid-level energy storage
Cites work
- scientific article; zbMATH DE number 3980333
- scientific article; zbMATH DE number 1241609
- scientific article; zbMATH DE number 1321699
- scientific article; zbMATH DE number 2104240
- 10.1162/1532443041827907
- A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications
- Algorithms for reinforcement learning.
- An Optimal Approximate Dynamic Programming Algorithm for Concave, Scalar Storage Problems With Vector-Valued Controls
- An analysis of temporal-difference learning with function approximation
- An approximate dynamic programming approach to benchmark practice-based heuristics for natural gas storage valuation
- Approximate dynamic programming via iterated Bellman inequalities
- Approximate dynamic programming. Solving the curses of dimensionality
- Asynchronous stochastic approximation and Q-learning
- Basis function adaptation in temporal difference reinforcement learning
- Dynamic programming and optimal control. Vol. 2
- Dynamic-programming approximations for stochastic time-staged integer multicommodity-flow problems
- Errors in Variables
- Instrumental variable methods for system identification
- Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
- Least squares temporal difference methods: An analysis under general conditions
- Linear least-squares algorithms for temporal difference learning
- On Constraint Sampling in the Linear Programming Approach to Approximate Dynamic Programming
- On the complexity of energy storage problems
- On the existence of fixed points for approximate value iteration and temporal-difference learning
- Optimal price-threshold control for battery operation with aging phenomenon: a quasiconvex optimization approach
- Parallel Nonstationary Direct Policy Search for Risk-Averse Stochastic Optimization
- Policy evaluation with temporal differences: a survey and comparison
- Pricing in Electricity Markets: A Mean Reverting Jump Diffusion Model with Seasonality
- Recursive estimation and time-series analysis. An introduction for the student and practitioner
- Smoothing and parametric rules for stochastic mean-CVaR optimal execution strategy
- The Linear Programming Approach to Approximate Dynamic Programming
- The correlated knowledge gradient for simulation optimization of continuous parameters using Gaussian process regression
Cited in (3)