Feature-based methods for large scale dynamic programming
Publication:1911341
DOI: 10.1007/BF00114724
zbMath: 0843.68092
MaRDI QID: Q1911341
John N. Tsitsiklis, Benjamin van Roy
Publication date: 21 April 1996
Published in: Machine Learning
Related Items
- Approximate policy iteration: a survey and some new methods
- A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications
- Dynamic programming approximation algorithms for the capacitated lot-sizing problem
- Approximate dynamic programming for stochastic \(N\)-stage optimization with application to optimal consumption under uncertainty
- Shape constraints in economics and operations research
- The actor-critic algorithm as multi-time-scale stochastic approximation.
- Data-driven models for capacity allocation of inpatient beds in a Chinese public hospital
- The Benefits of State Aggregation with Extreme-Point Weighting for Assemble-to-Order Systems
- Single sample path-based optimization of Markov chains
Cites Work
- Unnamed Item
- Unnamed Item
- Generalized polynomial approximations in Markovian decision processes
- Asynchronous stochastic approximation and Q-learning
- Practical issues in temporal difference learning
- \({\mathcal Q}\)-learning
- The convergence of \(TD(\lambda)\) for general \(\lambda\)
- Regularization Algorithms for Learning That Are Equivalent to Multilayer Networks
- Functional Approximations and Dynamic Programming
- Adaptive aggregation methods for infinite horizon dynamic programming
- Approximations of Dynamic Programs, I