Simulation‐based Uniform Value Function Estimates of Markov Decision Processes
From MaRDI portal
Publication:3593009
Recommendations
- The value functions of Markov decision processes
- Simulation-based algorithms for Markov decision processes
- scientific article; zbMATH DE number 1509479
- Simulation-based optimization of Markov reward processes
- Uniform convergence of value iteration policies for discounted Markov decision processes
- Computing optimal policies for Markovian decision processes using simulation
- scientific article; zbMATH DE number 3916050
- Simulation-based optimization of Markov decision processes: an empirical process theory approach
- A class of procedures to compute the optimal value function in a Markovian decision problem
Cited in (9)
- A survey of some simulation-based algorithms for Markov decision processes
- Bias and variance approximation in value function estimates
- Empirical dynamic programming
- Simulation-based optimization of Markov decision processes: an empirical process theory approach
- On the convergence of reinforcement learning with Monte Carlo exploring starts
- Convergence of simulation-based policy iteration
- Computing optimal policies for Markovian decision processes using simulation
- scientific article; zbMATH DE number 1804129
- Near optimality of quantized policies in stochastic control under weak continuity conditions