Pages that link to "Item:Q870662"
From MaRDI portal
The following pages link to Simulation-based algorithms for Markov decision processes (Q870662):
Displaying 23 items.
- Solving average cost Markov decision processes by means of a two-phase time aggregation algorithm (Q300040)
- Computable approximations for continuous-time Markov decision processes on Borel spaces based on empirical measures (Q302091)
- New approximate dynamic programming algorithms for large-scale undiscounted Markov decision processes and their application to optimize a production and distribution system (Q320866)
- Adaptive aggregation for reinforcement learning in average reward Markov decision processes (Q378753)
- The optimal control of just-in-time-based production and distribution systems and performance comparisons with optimized pull systems (Q421584)
- A \(Sarsa(\lambda)\) algorithm based on double-layer fuzzy reasoning (Q473823)
- Sampled fictitious play for approximate dynamic programming (Q547121)
- Simulation-based optimization of Markov decision processes: an empirical process theory approach (Q608432)
- Approximation of Markov decision processes with general state space (Q663675)
- Sleeping experts and bandits approach to constrained Markov decision processes (Q901196)
- Approximation of discounted minimax Markov control problems and zero-sum Markov games using Hausdorff and Wasserstein distances (Q1741211)
- Coupling based estimation approaches for the average reward performance potential in Markov chains (Q1796998)
- Stochastic approximations of constrained discounted Markov decision processes (Q2338706)
- Optimization of Markov decision processes under the variance criterion (Q2409311)
- Mean field Markov decision processes (Q2701089)
- Approximate policy iteration: a survey and some new methods (Q2887629)
- A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications (Q2887630)
- CONIC TRADING IN A MARKOVIAN STEADY STATE (Q2976128)
- Strategic capacity decision-making in a stochastic manufacturing environment using real-time approximate dynamic programming (Q3553741)
- What you should know about approximate dynamic programming (Q3621932)
- Computable approximations for average Markov decision processes in continuous time (Q4684960)
- Risk-Sensitive Reinforcement Learning via Policy Gradient Search (Q5102286)
- Variance-penalized Markov decision processes: dynamic programming and reinforcement learning techniques (Q5166474)