Approximate dynamic programming via direct search in the space of value function approximations
Publication:713118
DOI: 10.1016/j.ejor.2010.11.019; zbMath: 1250.90105; OpenAlex: W2079830915; MaRDI QID: Q713118
João B. R. do Val, Edilson F. Arruda, Marcelo Dutra Fragoso
Publication date: 26 October 2012
Published in: European Journal of Operational Research
Full work available at URL: https://doi.org/10.1016/j.ejor.2010.11.019
Related Items (4)
- Solving average cost Markov decision processes by means of a two-phase time aggregation algorithm
- A perturbation approach to a class of discounted approximate value iteration algorithms with Borel spaces
- Modified iterative aggregation procedure for maintenance optimisation of multi-component systems with failure interaction
- Accelerating the convergence of value iteration by using partial transition functions
Cites Work
- A new learning algorithm for optimal stopping
- Projected equation methods for approximate solution of large linear systems
- Direct search methods: Then and now
- Practical issues in temporal difference learning
- An empirical study of policy convergence in Markov decision process value iteration
- Basis function adaptation in temporal difference reinforcement learning
- On the Convergence of Pattern Search Algorithms
- 10.1162/1532443041827907
- Convergence Results for Some Temporal Difference Methods Based on Least Squares
- Approximate Dynamic Programming
- Performance Loss Bounds for Approximate Value Iteration with State Aggregation
- Control Techniques for Complex Networks
- Discrete Dynamic Programming with Unbounded Rewards