The policy iteration algorithm for average reward Markov decision processes with general state space

Publication:4395828

DOI: 10.1109/9.650016; zbMath: 0906.93063; Wikidata: Q114991401; Scholia: Q114991401; MaRDI QID: Q4395828

Sean P. Meyn

Publication date: 12 August 1998

Published in: IEEE Transactions on Automatic Control

Full work available at URL: https://doi.org/10.1109/9.650016

zbMATH Keywords

optimal control; queueing networks; deterministic routing; controlled Markov chains; Howard's policy iteration algorithm

Mathematics Subject Classification ID

90B15: Stochastic network models in operations research

93E20: Optimal stochastic control

60K20: Applications of Markov renewal processes (reliability, queueing networks, etc.)

Related Items

Optimal Inventory Control with Jump Diffusion and Nonlinear Dynamics in the Demand

On Iteration Improvement for Averaged Expected Cost Control for One-Dimensional Ergodic Diffusions

Average Cost Optimality Inequality for Markov Decision Processes with Borel Spaces and Universally Measurable Policies

On the Minimum Pair Approach for Average Cost Markov Decision Processes with Countable Discrete Action Spaces and Strictly Unbounded Costs

A note on the existence of optimal stationary policies for average Markov decision processes with countable states

Potential-based least-squares policy iteration for a parameterized feedback control system

Weak convergence and fluid limits in optimal time-to-empty queueing control problems

Average control of Markov decision processes with Feller transition probabilities and general action spaces

Weakly coupled event triggered output feedback system in wireless networked control systems

The policy iteration algorithm for average continuous control of piecewise deterministic Markov processes

Stochastic control via direct comparison

Completion-of-squares: revisited and extended

Policy iteration for continuous-time average reward Markov decision processes in Polish spaces

Single sample path-based optimization of Markov chains

Approximate receding horizon approach for Markov decision processes: average reward case

Planning for the long run: programming with patient, Pareto responsive preferences

On structural properties of optimal average cost functions in Markov decision processes with Borel spaces and universally measurable policies

Dispatching to parallel servers. Solutions of Poisson's equation for first-policy improvement

The policy iteration algorithm for a compound Poisson process applied to optimal dividend strategies under a Cramér-Lundberg risk model

An optimal control approach to day-to-day congestion pricing for stochastic transportation networks

Coding and control for communication networks

A policy improvement method for constrained average Markov decision processes

Reliability by design in distributed power transmission networks

Dynamic load balancing in parallel queueing systems: stability and optimal control

Dynamic safety-stocks for asymptotic optimality in stochastic networks

A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications

Unnamed Item