Infinite Horizon Average Cost Dynamic Programming Subject to Total Variation Distance Ambiguity
From MaRDI portal
Publication:5232245
Abstract: We analyze the infinite horizon minimax average cost Markov Control Model (MCM) for a class of controlled process conditional distributions belonging to a ball, with respect to the total variation distance metric, centered at a known nominal controlled conditional distribution with a given radius, in which the minimization is over the control strategies and the maximization is over the conditional distributions. Upon performing the maximization, a dynamic programming equation is obtained which includes, in addition to the standard terms, the oscillator semi-norm of the cost-to-go. First, the dynamic programming equation is analyzed for finite state and control spaces. We show that if the nominal controlled process distribution is irreducible, then for every stationary Markov control policy the maximizing conditional distribution of the controlled process is also irreducible. Second, the generalized dynamic programming equation is analyzed for Borel spaces. We derive necessary and sufficient conditions for any control strategy to be optimal. Through our analysis, new dynamic programming equations and new policy iteration algorithms are derived. The main feature of the new policy iteration algorithms (which are applied for finite alphabet spaces) is that the policy evaluation and policy improvement steps are performed using the maximizing conditional distribution, which is obtained via a water-filling solution. Finally, the application of the new dynamic programming equations and the corresponding policy iteration algorithms is illustrated via examples.
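The water-filling construction mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the paper's code: the function name and the NumPy-based implementation are assumptions, the total variation distance is taken as the sum of absolute differences (so the radius lies in [0, 2]), and ties or extreme radii are handled only in the obvious way. The idea is to shift up to half the radius of probability mass from the states where the cost-to-go is smallest to the states where it is largest, keeping each probability in [0, 1].

```python
import numpy as np

def tv_max_expectation(q, V, R):
    """Distribution maximizing E[V] over the total variation ball
    {nu : sum_x |nu(x) - q(x)| <= R} around the nominal q.

    Hypothetical water-filling sketch (not taken from the paper's code):
    move up to R/2 probability mass from the states where V is smallest
    to the states where V is largest, keeping each nu(x) in [0, 1].
    """
    q = np.asarray(q, dtype=float)
    V = np.asarray(V, dtype=float)
    nu = q.copy()
    alpha = R / 2.0

    # Pour mass onto the highest-value states first; each state holds at most 1.
    to_add = alpha
    for i in np.argsort(-V):
        delta = min(1.0 - nu[i], to_add)
        nu[i] += delta
        to_add -= delta
        if to_add <= 1e-15:
            break

    # Drain exactly the mass that was added from the lowest-value states,
    # so nu still sums to 1; each probability cannot drop below 0.
    to_remove = alpha - to_add
    for i in np.argsort(V):
        delta = min(nu[i], to_remove)
        nu[i] -= delta
        to_remove -= delta
        if to_remove <= 1e-15:
            break
    return nu
```

For example, with a uniform nominal `q = [0.25, 0.25, 0.25, 0.25]`, values `V = [1, 2, 3, 4]`, and radius `R = 0.5`, the mass `R/2 = 0.25` shifts from the cheapest state to the costliest one, raising the expectation of `V` from 2.5 to 3.25, i.e. by `(R/2) * (max V - min V)`.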
Recommendations
- Dynamic programming subject to total variation distance ambiguity
- Infinite Horizon Stochastic Programs
- Infinite horizon programs; convergence of approximate solutions
- Existence and discovery of average optimal solutions in deterministic infinite horizon optimization
- Average Cost Optimal Stationary Policies in Infinite State Markov Decision Processes with Unbounded Costs
- An infinite-horizon multistage dynamic optimization problem
- Average optimality in nonhomogeneous infinite horizon Markov decision processes
- Infinite horizon stochastic optimal control problems with running maximum cost
- Adaptive aggregation methods for infinite horizon dynamic programming
- Infinite-horizon deterministic dynamic programming in discrete time: a monotone convergence principle and a penalty method
Cites work
- scientific article; zbMATH DE number 3137662
- scientific article; zbMATH DE number 4160608
- scientific article; zbMATH DE number 107482
- scientific article; zbMATH DE number 193291
- scientific article; zbMATH DE number 700091
- scientific article; zbMATH DE number 1153603
- A Finite-Dimensional Risk-Sensitive Control Problem
- Another set of conditions for average optimality in Markov control processes
- Control of Markov Chains with Long-Run Average Cost Criterion: The Dynamic Programming Equations
- Discrete-Time Controlled Markov Processes with Average Cost Criterion: A Survey
- Distributionally Robust Counterpart in Markov Decision Processes
- Distributionally robust Markov decision processes
- Dynamic programming and stochastic control
- Dynamic programming subject to total variation distance ambiguity
- Extremum Problems With Total Variation Distance and Their Applications
- Finite horizon minimax optimal control of stochastic partially observed time varying uncertain systems
- Minimax optimal control of stochastic uncertain systems with relative entropy constraints
- Minimum principle for partially observable nonlinear risk-sensitive control problems using measure-valued decompositions
- On Choosing and Bounding Probability Metrics
- On Minimum Cost Per Unit Time Control of Markov Chains
- Risk-sensitive control and dynamic games for partially observed discrete-time nonlinear systems
- Robust MDPs with \(k\)-rectangular uncertainty
- Stochastic Uncertain Systems Subject to Relative Entropy Constraints: Induced Norms and Monotonicity Properties of Minimax Games
- \(H^ \infty\)-optimal control and related minimax design problems. A dynamic game approach.
Cited in 3 documents
This page was built for publication: Infinite Horizon Average Cost Dynamic Programming Subject to Total Variation Distance Ambiguity