Empirical Dynamic Programming
DOI: 10.1287/moor.2015.0733
zbMath: 1338.49055
arXiv: 1311.5918
OpenAlex: W2593952959
MaRDI QID: Q2806811
Dileep Kalathil, Rahul Jain, William B. Haskell
Publication date: 19 May 2016
Published in: Mathematics of Operations Research
Full work available at URL: https://arxiv.org/abs/1311.5918
Keywords: simulation; dynamic programming; Markov decision processes; random operators; empirical methods; probabilistic fixed points
MSC classification:
- Numerical mathematical programming methods (65K05)
- Dynamic programming in optimal control and differential games (49L20)
- Stochastic programming (90C15)
- Dynamic programming (90C39)
- Optimal stochastic control (93E20)
- Random operators and equations (aspects of stochastic analysis) (60H25)
- Markov and semi-Markov decision processes (90C40)
- Random linear operators (47B80)
- Simulation of dynamical systems (37M05)
- Empirical decision procedures; empirical Bayes procedures (62C12)
- Random dynamical systems (37H99)
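The title and keywords of this record point to empirical dynamic programming: the expectation in the Bellman operator is replaced by a sample average over simulated transitions, and the resulting random operator is iterated. Below is a minimal sketch of that idea on a small synthetic MDP; the MDP, sample size, and iteration counts are illustrative assumptions and are not taken from the paper.

```python
# Hedged sketch of empirical value iteration on a synthetic MDP.
# Everything here (MDP size, kernel, sample counts) is an illustrative assumption.
import numpy as np

rng = np.random.default_rng(0)

# A small synthetic MDP: S states, A actions, transition kernel P, bounded rewards r.
S, A, gamma = 5, 3, 0.9
P = rng.dirichlet(np.ones(S), size=(S, A))   # P[s, a] is a distribution over next states
r = rng.uniform(0.0, 1.0, size=(S, A))

def exact_bellman(v):
    """Classical Bellman optimality operator T v, using the true kernel P."""
    q = r + gamma * P @ v                    # q[s, a] = r(s, a) + gamma * E[v(s')]
    return q.max(axis=1)

def empirical_bellman(v, n_samples):
    """Empirical (random) Bellman operator: the expectation over next states
    is replaced by an average over n_samples simulated transitions."""
    q = np.empty((S, A))
    for s in range(S):
        for a in range(A):
            next_states = rng.choice(S, size=n_samples, p=P[s, a])
            q[s, a] = r[s, a] + gamma * v[next_states].mean()
    return q.max(axis=1)

# Empirical value iteration: iterate the random operator with fresh samples each step.
v_hat = np.zeros(S)
for _ in range(200):
    v_hat = empirical_bellman(v_hat, n_samples=100)

# Classical value iteration on the true kernel, for comparison.
v_star = np.zeros(S)
for _ in range(200):
    v_star = exact_bellman(v_star)

print("sup-norm gap between empirical and exact iterates:",
      np.abs(v_hat - v_star).max())
```

As the number of simulated next states per update grows, the iterates of the empirical operator concentrate around the classical fixed point, which is the flavor of the probabilistic fixed-point analysis the keywords allude to.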
Cites Work
- Simulation-based optimization of Markov decision processes: an empirical process theory approach
- A survey of some simulation-based algorithms for Markov decision processes
- Associative search network: A reinforcement learning associative memory
- \({\mathcal Q}\)-learning
- Convergence rate of linear two-time-scale stochastic approximation
- Learning Algorithms for Markov Decision Processes with Average Cost
- Approximate policy iteration: a survey and some new methods
- DOI: 10.1162/153244303768966102
- Functional Approximations and Dynamic Programming
- Approximate Fixed Point Iteration with an Application to Infinite Horizon Markov Decision Processes
- Simulation‐based Uniform Value Function Estimates of Markov Decision Processes
- The Complexity of Markov Decision Processes
- Analysis of recursive stochastic algorithms
- Approximations of Dynamic Programs, I
- Approximations of Dynamic Programs, II
- Using Randomization to Break the Curse of Dimensionality
- Convergence of Simulation-Based Policy Iteration
- Performance Guarantees for Empirical Markov Decision Processes with Applications to Multiperiod Inventory Models
- Actor-Critic–Type Learning Algorithms for Markov Decision Processes
- The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
- Neural Network Learning
- Q-Learning for Risk-Sensitive Control
- Stochastic Estimation of the Maximum of a Regression Function
- Stochastic Games
- A Stochastic Approximation Method