Partially observable total-cost Markov decision processes with weakly continuous transition probabilities
From MaRDI portal
Publication:2806825
Abstract: This paper describes sufficient conditions for the existence of optimal policies for Partially Observable Markov Decision Processes (POMDPs) with Borel state, observation, and action sets and with expected total costs. Action sets may not be compact and one-step cost functions may be unbounded. The introduced conditions are also sufficient for the validity of optimality equations, semi-continuity of value functions, and convergence of value iterations to optimal values. Since POMDPs can be reduced to Completely Observable Markov Decision Processes (COMDPs), whose states are posterior state distributions, this paper focuses on the validity of the above-mentioned optimality properties for COMDPs. The central question is whether transition probabilities for a COMDP are weakly continuous. We introduce sufficient conditions for this and show that the transition probabilities for a COMDP are weakly continuous if the transition probabilities of the underlying Markov Decision Process are weakly continuous and the observation probabilities for the POMDP are continuous in total variation. Moreover, continuity in total variation of the observation probabilities cannot be weakened to setwise continuity. The results are illustrated with counterexamples and examples.
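The reduction mentioned in the abstract, in which the COMDP state is the posterior distribution of the hidden state, can be illustrated in the simplest finite case. The sketch below is not taken from the paper, which works with Borel spaces; it assumes finite state, action, and observation sets, and the names `belief_update`, `P`, and `Q` are hypothetical. Here `P[a][x][x2]` is assumed to be the probability of moving from state `x` to state `x2` under action `a`, and `Q[a][x2][y]` the probability of observing `y` after action `a` leads to state `x2`.

```python
def belief_update(belief, action, observation, P, Q):
    """One Bayes-filter step for a finite POMDP (illustrative assumption only).

    belief      -- list of prior probabilities over hidden states
    action      -- index of the action taken
    observation -- index of the observation received
    P           -- P[a][x][x2]: transition probabilities (hypothetical layout)
    Q           -- Q[a][x2][y]: observation probabilities (hypothetical layout)
    """
    n = len(belief)
    # Prediction step: distribution of the next hidden state before observing.
    predicted = [
        sum(belief[x] * P[action][x][x2] for x in range(n))
        for x2 in range(n)
    ]
    # Correction step: weight each next state by the observation likelihood.
    unnormalized = [
        predicted[x2] * Q[action][x2][observation] for x2 in range(n)
    ]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    # Normalize to obtain the posterior, which is the next COMDP state.
    return [u / total for u in unnormalized]


# Toy model with two hidden states, one action, two observations.
P = [[[0.9, 0.1], [0.2, 0.8]]]
Q = [[[0.8, 0.2], [0.3, 0.7]]]
posterior = belief_update([0.5, 0.5], 0, 0, P, Q)
```

In this finite setting the map from (belief, action, observation) to the posterior is the transition mechanism of the COMDP; the continuity questions studied in the paper concern the analogous map on general Borel spaces, where such a direct formula is not available in this form.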
Recommendations
- Optimality conditions for partially observable Markov decision processes
- Markov decision processes with incomplete information and semiuniform Feller transition probabilities
- On the average cost optimality equation and the structure of optimal policies for partially observable Markov decision processes
- Optimal cost almost-sure reachability in POMDPs
- A survey of solution techniques for the partially observed Markov decision process
Cites work
- scientific article; zbMATH DE number 5919866
- scientific article; zbMATH DE number 3320878
- scientific article; zbMATH DE number 3327774
- An incomplete information inventory model with presence of inventories or backorders as only observations
- Average Optimality in Dynamic Programming with General State Space
- Average cost Markov decision processes with weakly continuous transition probabilities
- Bayesian dynamic programming
- Berge's maximum theorem for noncompact image sets
- Berge's theorem for noncompact image sets
- Convergence of probability measures and Markov decision models with incomplete information
- Discrete-Time Markovian Decision Processes with Incomplete State Observation
- Fatou's lemma for weakly converging probabilities
- Incomplete information in Markovian decision models
- Limiting discounted-cost control of partially observable stochastic systems
- Optimal control of partially observable Markovian systems
- Optimality Inequalities for Average Cost Markov Decision Processes and the Stochastic Cash Balance Problem
- Optimization and convergence of observation channels in stochastic control
- Partially Observed Inventory Systems: The Case of Rain Checks
- Partially Observed Inventory Systems: The Case of Zero‐Balance Walk
- Reduction of a Controlled Markov Model with Incomplete Data to a Problem with Complete Information in the Case of Borel State and Control Space
- The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs
- Zero-Sum Ergodic Stochastic Games with Feller Transition Probabilities
Cited in (37)
- Partially observed discrete-time risk-sensitive mean field games
- Continuity of equilibria for two-person zero-sum games with noncompact action sets and unbounded payoffs
- On the optimality equation for average cost Markov decision processes and its validity for inventory control
- Average cost Markov decision processes with weakly continuous transition probabilities
- Fatou's lemma in its classical form and Lebesgue's convergence theorems for varying measures with applications to Markov decision processes
- A universal dynamic program and refined existence results for decentralized stochastic control
- Robustness to Incorrect Priors in Partially Observed Stochastic Control
- MDPs with setwise continuous transition probabilities
- Optimality conditions for partially observable Markov decision processes
- Convergence of probability measures and Markov decision models with incomplete information
- Off-policy evaluation in partially observed Markov decision processes under sequential ignorability
- Robustness to incorrect priors and controlled filter stability in partially observed stochastic control
- Semi-uniform Feller stochastic kernels
- Approximate Nash equilibria in partially observed stochastic games with mean-field interactions
- Risk measurement and risk-averse control of partially observable discrete-time Markov systems
- Information-theoretic multi-time-scale partially observable systems with inspiration from leukemia treatment
- Stochastic setup-cost inventory model with backorders and quasiconvex cost functions
- Stochastic comparative statics in Markov decision processes
- A Fenchel-Moreau-Rockafellar type theorem on the Kantorovich-Wasserstein space with applications in partially observable Markov decision processes
- Optimal cost almost-sure reachability in POMDPs
- Strong uniform value in gambling houses and partially observable Markov decision processes
- scientific article; zbMATH DE number 7625164
- Absorbing continuous-time Markov decision processes with total cost criteria
- Convergence theorems for varying measures under convexity conditions and applications
- Average cost optimality of partially observed MDPs: contraction of nonlinear filters and existence of optimal solutions and approximations
- Fatou's lemma for weakly converging measures under the uniform integrability condition
- Weak Feller property of non-linear filters
- Uniform Fatou's lemma
- Average cost Markov decision processes with semi-uniform Feller transition probabilities
- Markov decision processes with incomplete information and semiuniform Feller transition probabilities
- Equivalent conditions for weak continuity of nonlinear filters
- Optimal control of partially observable piecewise deterministic Markov processes
- Robustness to approximations and model learning in MDPs and POMDPs
- Robustness to incorrect system models in stochastic control
- Another look at partially observed optimal stochastic control: existence, ergodicity, and approximations without belief-reduction
- On the average cost optimality equation and the structure of optimal policies for partially observable Markov decision processes
- Convergence for varying measures