On Convergence of Value Iteration for a Class of Total Cost Markov Decision Processes (Q5502179): Difference between revisions

From MaRDI portal
Property / cites work (rank of every entry: Normal rank)

Monotone Mappings with Application in Dynamic Programming
Stochastic optimal control. The discrete time case
An Analysis of Stochastic Shortest Path Problems
Q4257216
Q5583572
A Borel Set Not Containing a Graph
The optimal reward operator in dynamic programming
Q3527701
Value iteration and optimization of multiclass queueing networks
Real Analysis and Probability
The Expected Total Cost Criterion for Markov Decision Processes under Constraints: A Convex Analytic Approach
The Expected Total Cost Criterion for Markov Decision Processes under Constraints
Q3237805
Q3807013
Q4547438
Average Cost Markov Decision Processes with Weakly Continuous Transition Probabilities
Q3329244
A simple proof of Whittle's bridging condition in dynamic programming
Q4255598
On the Optimality of Structured Policies in Countable Stage Decision Processes. II: Positive and Negative Problems
Q5541832
Q4421713
The Optimal Reward Operator in Negative Dynamic Programming
Q4881151
Control Techniques for Complex Networks
On the Existence of Stationary Optimal Strategies
Q4315289
Conditions for optimality in dynamic programming and for the limit of n-stage optimal policies to be optimal
Stationary Policies in Dynamic Programming Models Under Compactness Assumptions
Stationary policies and Markov policies in Borel dynamic programming
Q4194027
Universally Measurable Policies in Dynamic Programming
Q4626283
Algorithms for Reinforcement Learning
Asynchronous stochastic approximation and Q-learning
Q3912356
Q4192588
A simple condition for regularity in negative programming
A Mixed Value and Policy Iteration Method for Stochastic Control with Universally Measurable Policies
Q3975565

Latest revision as of 16:22, 10 July 2024

Language: English
Label: On Convergence of Value Iteration for a Class of Total Cost Markov Decision Processes
Description: scientific article; zbMATH DE number 6473213

    Statements

    On Convergence of Value Iteration for a Class of Total Cost Markov Decision Processes (English)
    18 August 2015
    discrete-time stochastic optimal control
    Markov decision processes
    infinite spaces
    dynamic programming
    value iteration
    convergence