Policy iteration for continuous-time average reward Markov decision processes in Polish spaces (Q963139)
From MaRDI portal
Latest revision as of 15:45, 2 July 2024
scientific article
| Language | Label | Description | Also known as |
|---|---|---|---|
| English | Policy iteration for continuous-time average reward Markov decision processes in Polish spaces | scientific article | |
Statements
Policy iteration for continuous-time average reward Markov decision processes in Polish spaces (English)
8 April 2010
Summary: We study the policy iteration algorithm (PIA) for continuous-time jump Markov decision processes in general state and action spaces. The corresponding transition rates are allowed to be unbounded, and the reward rates may have neither upper nor lower bounds. The optimality criterion is the expected average reward. We propose a set of conditions under which we first establish the average reward optimality equation and present the PIA. Then, under two slightly different sets of conditions, we show that the PIA yields the optimal (maximum) reward, an average optimal stationary policy, and a solution to the average reward optimality equation.
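The paper treats general (Polish) state spaces with possibly unbounded rates; as an illustration only, the alternation between policy evaluation (solving the Poisson/average-reward optimality equation for a fixed policy) and policy improvement can be sketched in the finite-state, finite-action special case, where the generator of each stationary policy is a transition rate matrix. The function name, the data layout, and the normalization `h(0) = 0` below are assumptions of this sketch, not the paper's construction.

```python
import numpy as np

def policy_iteration_ctmdp(Q, r, max_iter=100):
    """Average-reward policy iteration for a finite-state CTMDP (illustrative sketch).

    Q[a][i, j] -- transition rate from state i to state j under action a
                  (each row of Q[a] sums to 0, off-diagonal entries >= 0).
    r[a][i]    -- reward rate in state i under action a.
    Returns (gain g, bias h, stationary policy f).
    """
    n_actions, n_states = len(Q), Q[0].shape[0]
    f = np.zeros(n_states, dtype=int)  # start from an arbitrary stationary policy
    for _ in range(max_iter):
        # Policy evaluation: solve the Poisson equation
        #   g = r_f(i) + sum_j q_f(j|i) h(j)  for all i,  with h(0) = 0 fixed.
        Qf = np.array([Q[f[i]][i] for i in range(n_states)])
        rf = np.array([r[f[i]][i] for i in range(n_states)])
        # Unknown vector x = (g, h(1), ..., h(n-1)); h(0) = 0 pins down the bias.
        A = np.zeros((n_states, n_states))
        A[:, 0] = 1.0          # coefficient of the gain g in each equation
        A[:, 1:] = -Qf[:, 1:]  # coefficients of h(1), ..., h(n-1)
        x = np.linalg.solve(A, rf)
        g, h = x[0], np.concatenate(([0.0], x[1:]))
        # Policy improvement: maximize r(i, a) + sum_j q(j|i, a) h(j) per state.
        scores = np.array([[r[a][i] + Q[a][i] @ h for a in range(n_actions)]
                           for i in range(n_states)])
        f_new = scores.argmax(axis=1)
        if np.array_equal(f_new, f):
            break  # improvement changed nothing: f is average optimal
        f = f_new
    return g, h, f
```

For a unichain model the evaluation system is nonsingular, and the loop terminates once improvement leaves the policy unchanged, at which point `(g, h)` solves the average reward optimality equation for the finite model.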
transition rates
reward rates
average reward optimality equation