Simulation-based algorithms for Markov decision processes.

From MaRDI portal
Publication:870662

zbMath: 1155.90002, MaRDI QID: Q870662

Jiaqiao Hu, Steven I. Marcus, Michael C. Fu, Hyeong Soo Chang

Publication date: 13 March 2007

Published in: Communications and Control Engineering




Related Items (23)

Approximate policy iteration: a survey and some new methods
A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications
Solving average cost Markov decision processes by means of a two-phase time aggregation algorithm
Computable approximations for continuous-time Markov decision processes on Borel spaces based on empirical measures
New approximate dynamic programming algorithms for large-scale undiscounted Markov decision processes and their application to optimize a production and distribution system
Adaptive aggregation for reinforcement learning in average reward Markov decision processes
Optimization of Markov decision processes under the variance criterion
Simulation-based optimization of Markov decision processes: an empirical process theory approach
Risk-Sensitive Reinforcement Learning via Policy Gradient Search
Sleeping experts and bandits approach to constrained Markov decision processes
The optimal control of just-in-time-based production and distribution systems and performance comparisons with optimized pull systems
Mean field Markov decision processes
CONIC TRADING IN A MARKOVIAN STEADY STATE
A \(Sarsa(\lambda)\) algorithm based on double-layer fuzzy reasoning
Approximation of Markov decision processes with general state space
Strategic capacity decision-making in a stochastic manufacturing environment using real-time approximate dynamic programming
Approximation of discounted minimax Markov control problems and zero-sum Markov games using Hausdorff and Wasserstein distances
Variance-penalized Markov decision processes: dynamic programming and reinforcement learning techniques
Sampled fictitious play for approximate dynamic programming
Computable approximations for average Markov decision processes in continuous time
Coupling based estimation approaches for the average reward performance potential in Markov chains
What you should know about approximate dynamic programming
Stochastic approximations of constrained discounted Markov decision processes



