Policy iteration for continuous-time average reward Markov decision processes in Polish spaces (Q963139)

From MaRDI portal
 
Cited works:
- Q3266141
- Counter examples for compact action Markov decision chains with average reward criteria
- Q4315289
- On undiscounted Markovian decision processes with compact action spaces
- Multichain Markov Renewal Programs
- Drift and monotonicity conditions for continuous-time controlled Markov chains with an average criterion
- Optimal Control of Ergodic Continuous-Time Markov Chains with Average Sample-Path Rewards
- Q4255598
- Policy iteration for average cost Markov control processes on Borel spaces
- On the Convergence of Policy Iteration in Finite State Undiscounted Markov Decision Processes: The Unichain Case
- A new policy iteration scheme for Markov decision processes using Schweitzer's formula
- The policy iteration algorithm for average reward Markov decision processes with general state space
- Convergence Properties of Policy Iteration
- Average optimality for continuous-time Markov decision processes with a policy iteration approach
- A Note on the Convergence of Policy Iteration in Markov Decision Processes with Compact Action Spaces
- Average optimality for continuous-time Markov decision processes in Polish spaces
- Average optimality inequality for continuous-time Markov decision processes in Polish spaces
- Bias and Overtaking Optimality for Continuous-Time Jump Markov Decision Processes in Polish Spaces
- Computable exponential convergence rates for stochastically ordered Markov processes
- Markov Decision Processes with Variance Minimization: A New Condition and Approach
- Another set of conditions for Markov decision processes with average sample-path costs
- Another Set of Conditions for Strong n (n = −1, 0) Discount Optimality in Markov Decision Processes
- Conditions for optimality in dynamic programming and for the limit of n-stage optimal policies to be optimal


Language: English
Label: Policy iteration for continuous-time average reward Markov decision processes in Polish spaces
Description: scientific article

    Statements

    Policy iteration for continuous-time average reward Markov decision processes in Polish spaces (English)
    8 April 2010
    Summary: We study the policy iteration algorithm (PIA) for continuous-time jump Markov decision processes in general state and action spaces. The corresponding transition rates are allowed to be unbounded, and the reward rates may have neither upper nor lower bounds. The optimality criterion we consider is the expected average reward. We propose a set of conditions under which we first establish the average reward optimality equation and present the PIA. Then, under two slightly different sets of conditions, we show that the PIA yields the optimal (maximum) reward, an average optimal stationary policy, and a solution to the average reward optimality equation.
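    The PIA the summary describes alternates two steps: policy evaluation, which solves the average reward optimality (Poisson) equation for the current stationary policy, and policy improvement, which maximizes the reward rate plus the bias-weighted transition rates. The paper's setting is general Polish state spaces with unbounded rates; as a much simpler hedged illustration, here is a sketch for a finite unichain continuous-time MDP (the function name, data layout, and example are illustrative, not from the paper):

    ```python
    import numpy as np

    def policy_iteration_ctmdp(r, Q, max_iter=100):
        """Policy iteration for a finite unichain continuous-time MDP
        under the average-reward criterion.

        r[i, a]    : reward rate in state i under action a
        Q[a][i, j] : transition rate q(j | i, a); each row sums to zero
        Returns (gain g, bias h, policy f).
        """
        n, m = r.shape
        f = np.zeros(n, dtype=int)               # initial stationary policy
        for _ in range(max_iter):
            # Policy evaluation: solve g*1 - Q_f h = r_f with h[0] = 0.
            Qf = np.array([Q[f[i]][i] for i in range(n)])  # rate matrix under f
            rf = r[np.arange(n), f]
            # Unknown vector is [g, h[1], ..., h[n-1]] (h[0] pinned to 0).
            A = np.zeros((n, n))
            A[:, 0] = 1.0                        # coefficient of the gain g
            A[:, 1:] = -Qf[:, 1:]
            sol = np.linalg.solve(A, rf)
            g, h = sol[0], np.concatenate(([0.0], sol[1:]))
            # Policy improvement: maximize r(i,a) + sum_j q(j|i,a) h(j).
            scores = np.array([[r[i, a] + Q[a][i] @ h for a in range(m)]
                               for i in range(n)])
            f_new = scores.argmax(axis=1)
            if np.array_equal(f_new, f):         # stable policy => optimal
                break
            f = f_new
        return g, h, f

    # Toy instance: 2 states, 2 actions (slow/fast switching rates).
    r = np.array([[1.0, 0.6],
                  [2.0, 3.0]])
    Q = [np.array([[-1.0, 1.0], [1.0, -1.0]]),
         np.array([[-2.0, 2.0], [2.0, -2.0]])]
    g, h, f = policy_iteration_ctmdp(r, Q)
    print(g, f)   # gain 1.8 under the policy that picks action 1 everywhere
    ```

    In this toy instance each iteration strictly increases the gain until the policy is stable, which mirrors (in the finite case) the convergence the paper establishes under its drift and continuity-compactness conditions.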
    transition rates
    reward rates
    average reward optimality equation

    Identifiers