Pages that link to "Item:Q4450393"
From MaRDI portal
The following pages link to CONVERGENCE OF SIMULATION-BASED POLICY ITERATION (Q4450393):
Displaying 10 items.
- New approximate dynamic programming algorithms for large-scale undiscounted Markov decision processes and their application to optimize a production and distribution system (Q320866)
- Finding optimal memoryless policies of POMDPs under the expected average reward criterion (Q418072)
- A unified approach to Markov decision problems and performance sensitivity analysis with discounted and average criteria: multichain cases (Q705478)
- Basic ideas for event-based optimization of Markov systems (Q1773104)
- Coupling based estimation approaches for the average reward performance potential in Markov chains (Q1796998)
- Empirical Dynamic Programming (Q2806811)
- Some Limit Properties of Markov Chains Induced by Recursive Stochastic Algorithms (Q5037552)
- Queueing Network Controls via Deep Reinforcement Learning (Q5084497)
- A perturbation approach to approximate value iteration for average cost Markov decision processes with Borel spaces and bounded costs (Q5227201)
- Approximation of average cost Markov decision processes using empirical distributions and concentration inequalities (Q5265786)