Partially observable total-cost Markov decision processes with weakly continuous transition probabilities
From MaRDI portal
Publication:2806825
Abstract: This paper describes sufficient conditions for the existence of optimal policies for Partially Observable Markov Decision Processes (POMDPs) with Borel state, observation, and action sets and with expected total costs. Action sets need not be compact, and one-step cost functions may be unbounded. The introduced conditions are also sufficient for the validity of optimality equations, semi-continuity of value functions, and convergence of value iterations to optimal values. Since POMDPs can be reduced to Completely Observable Markov Decision Processes (COMDPs), whose states are posterior state distributions, this paper focuses on the validity of the above-mentioned optimality properties for COMDPs. The central question is whether transition probabilities for a COMDP are weakly continuous. We introduce sufficient conditions for this and show that the transition probabilities for a COMDP are weakly continuous if the transition probabilities of the underlying Markov Decision Process are weakly continuous and the observation probabilities for the POMDP are continuous in total variation. Moreover, continuity in total variation of the observation probabilities cannot be weakened to setwise continuity. The results are illustrated with counterexamples and examples.
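The reduction mentioned in the abstract replaces the hidden state with its posterior distribution. As a minimal sketch of one such posterior update, the Python below assumes finite state, action, and observation sets, which is far simpler than the Borel setting of the paper. The function name `belief_update`, the arrays `P` and `Q`, and the numbers are illustrative assumptions, not taken from the publication.

```python
def belief_update(belief, action, observation, P, Q):
    """One Bayes-filter step for a finite POMDP (illustrative sketch).

    belief: list of probabilities over states (the current posterior).
    P[a][x][y]: assumed probability of moving from state x to state y under action a.
    Q[a][y][o]: assumed probability of observing o when the new state is y after action a.
    Returns (new_belief, probability_of_observation).
    """
    n = len(belief)
    # Predict: distribution of the next state before the observation is seen.
    predicted = [sum(belief[x] * P[action][x][y] for x in range(n)) for y in range(n)]
    # Correct: weight each next state by the likelihood of the observation.
    joint = [predicted[y] * Q[action][y][observation] for y in range(n)]
    total = sum(joint)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [w / total for w in joint], total


# Hypothetical two-state, one-action, two-observation model.
P = [[[0.9, 0.1], [0.2, 0.8]]]
Q = [[[0.75, 0.25], [0.25, 0.75]]]
new_belief, obs_prob = belief_update([0.5, 0.5], 0, 0, P, Q)
```

In the resulting completely observable model, `new_belief` plays the role of the next state, and `obs_prob` is the probability with which that next state is reached from the current belief under the chosen action.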
Recommendations
- Optimality conditions for partially observable Markov decision processes
- Markov decision processes with incomplete information and semiuniform Feller transition probabilities
- On the average cost optimality equation and the structure of optimal policies for partially observable Markov decision processes
- Optimal cost almost-sure reachability in POMDPs
- A survey of solution techniques for the partially observed Markov decision process
Cites work
- scientific article; zbMATH DE number 5919866
- scientific article; zbMATH DE number 3320878
- scientific article; zbMATH DE number 3327774
- An incomplete information inventory model with presence of inventories or backorders as only observations
- Average Optimality in Dynamic Programming with General State Space
- Average cost Markov decision processes with weakly continuous transition probabilities
- Bayesian dynamic programming
- Berge's maximum theorem for noncompact image sets
- Berge's theorem for noncompact image sets
- Convergence of probability measures and Markov decision models with incomplete information
- Discrete-Time Markovian Decision Processes with Incomplete State Observation
- Fatou's lemma for weakly converging probabilities
- Incomplete information in Markovian decision models
- Limiting discounted-cost control of partially observable stochastic systems
- Optimal control of partially observable Markovian systems
- Optimality Inequalities for Average Cost Markov Decision Processes and the Stochastic Cash Balance Problem
- Optimization and convergence of observation channels in stochastic control
- Partially Observed Inventory Systems: The Case of Rain Checks
- Partially Observed Inventory Systems: The Case of Zero‐Balance Walk
- Reduction of a Controlled Markov Model with Incomplete Data to a Problem with Complete Information in the Case of Borel State and Control Space
- The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs
- Zero-Sum Ergodic Stochastic Games with Feller Transition Probabilities
Cited in (37)
- Optimal cost almost-sure reachability in POMDPs
- Absorbing continuous-time Markov decision processes with total cost criteria
- Weak Feller property of non-linear filters
- Uniform Fatou's lemma
- Markov decision processes with incomplete information and semiuniform Feller transition probabilities
- Fatou's lemma for weakly converging measures under the uniform integrability condition
- Stochastic comparative statics in Markov decision processes
- Strong uniform value in gambling houses and partially observable Markov decision processes
- Stochastic setup-cost inventory model with backorders and quasiconvex cost functions
- Robustness to incorrect system models in stochastic control
- Partially observed discrete-time risk-sensitive mean field games
- Robustness to Incorrect Priors in Partially Observed Stochastic Control
- Off-policy evaluation in partially observed Markov decision processes under sequential ignorability
- Risk measurement and risk-averse control of partially observable discrete-time Markov systems
- A universal dynamic program and refined existence results for decentralized stochastic control
- A Fenchel-Moreau-Rockafellar type theorem on the Kantorovich-Wasserstein space with applications in partially observable Markov decision processes
- Average cost Markov decision processes with weakly continuous transition probabilities
- Optimal control of partially observable piecewise deterministic Markov processes
- Average cost optimality of partially observed MDPs: contraction of nonlinear filters and existence of optimal solutions and approximations
- Fatou's lemma in its classical form and Lebesgue's convergence theorems for varying measures with applications to Markov decision processes
- MDPs with setwise continuous transition probabilities
- On the average cost optimality equation and the structure of optimal policies for partially observable Markov decision processes
- Robustness to incorrect priors and controlled filter stability in partially observed stochastic control
- Convergence of probability measures and Markov decision models with incomplete information
- Convergence theorems for varying measures under convexity conditions and applications
- Average cost Markov decision processes with semi-uniform Feller transition probabilities
- Optimality conditions for partially observable Markov decision processes
- scientific article; zbMATH DE number 7625164
- Approximate Nash equilibria in partially observed stochastic games with mean-field interactions
- Another look at partially observed optimal stochastic control: existence, ergodicity, and approximations without belief-reduction
- Robustness to approximations and model learning in MDPs and POMDPs
- Equivalent conditions for weak continuity of nonlinear filters
- Semi-uniform Feller stochastic kernels
- Convergence for varying measures
- Continuity of equilibria for two-person zero-sum games with noncompact action sets and unbounded payoffs
- On the optimality equation for average cost Markov decision processes and its validity for inventory control
- Information-theoretic multi-time-scale partially observable systems with inspiration from leukemia treatment