Partially observable total-cost Markov decision processes with weakly continuous transition probabilities
Publication: 2806825
DOI: 10.1287/MOOR.2015.0746
zbMATH Open: 1338.90445
arXiv: 1401.2168
OpenAlex: W2963292203
MaRDI QID: Q2806825
FDO: Q2806825
Authors: Eugene A. Feinberg, Pavlo O. Kasyanov, Michael Z. Zgurovsky
Publication date: 19 May 2016
Published in: Mathematics of Operations Research
Abstract: This paper describes sufficient conditions for the existence of optimal policies for Partially Observable Markov Decision Processes (POMDPs) with Borel state, observation, and action sets and with expected total costs. Action sets may not be compact, and one-step cost functions may be unbounded. The introduced conditions are also sufficient for the validity of optimality equations, semi-continuity of value functions, and convergence of value iterations to optimal values. Since POMDPs can be reduced to Completely Observable Markov Decision Processes (COMDPs), whose states are posterior state distributions, this paper focuses on the validity of the above-mentioned optimality properties for COMDPs. The central question is whether the transition probabilities for a COMDP are weakly continuous. We introduce sufficient conditions for this and show that the transition probabilities for a COMDP are weakly continuous if the transition probabilities of the underlying Markov Decision Process are weakly continuous and the observation probabilities for the POMDP are continuous in total variation. Moreover, the continuity in total variation of the observation probabilities cannot be weakened to setwise continuity. The results are illustrated with examples and counterexamples.
Full work available at URL: https://arxiv.org/abs/1401.2168
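For readers unfamiliar with the reduction mentioned in the abstract, the following sketch records the standard belief-MDP (COMDP) construction in commonly used notation; the symbols X, Y, A, P, Q, R, H, q, and the belief space notation are illustrative and need not match the paper's.

```latex
% Sketch of the POMDP-to-COMDP (belief MDP) reduction referred to in the abstract.
% Notation is illustrative: X, Y, A are the Borel state, observation, and action
% sets; P(dx'|x,a) is the state transition kernel, Q(dy|a,x') the observation
% kernel, and \mathbb{P}(X) the set of probability measures (beliefs) on X.

% Joint distribution of the next observation and next state under a belief z:
\[
  R(B \times C \mid z, a) \;=\; \int_X \int_C Q(B \mid a, x')\, P(dx' \mid x, a)\, z(dx),
  \qquad B \subseteq Y,\ C \subseteq X \ \text{Borel}.
\]

% Observation marginal and posterior (belief) update: H(z,a,y) is a version of
% the conditional distribution of the next state given the observation y,
\[
  R'(B \mid z, a) \;=\; R(B \times X \mid z, a),
  \qquad
  R(dy \times dx' \mid z, a) \;=\; H(dx' \mid z, a, y)\, R'(dy \mid z, a).
\]

% Transition kernel of the COMDP on the belief space \mathbb{P}(X):
\[
  q(D \mid z, a) \;=\; \int_Y \mathbf{1}_D\bigl(H(z, a, y)\bigr)\, R'(dy \mid z, a),
  \qquad D \subseteq \mathbb{P}(X) \ \text{Borel}.
\]
```

The paper's central continuity statement concerns this kernel q: it is weakly continuous whenever P is weakly continuous and Q is continuous in total variation, and the counterexamples in the paper show that total-variation continuity of Q cannot be relaxed to setwise continuity.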
Recommendations
- Optimality conditions for partially observable Markov decision processes
- Markov decision processes with incomplete information and semiuniform Feller transition probabilities
- On the average cost optimality equation and the structure of optimal policies for partially observable Markov decision processes
- Optimal cost almost-sure reachability in POMDPs
- A survey of solution techniques for the partially observed Markov decision process
Cites Work
- The Optimal Control of Partially Observable Markov Processes over the Infinite Horizon: Discounted Costs
- Average Optimality in Dynamic Programming with General State Space
- Zero-Sum Ergodic Stochastic Games with Feller Transition Probabilities
- Title not available
- Partially Observed Inventory Systems: The Case of Zero‐Balance Walk
- Optimal control of partially observable Markovian systems
- Average cost Markov decision processes with weakly continuous transition probabilities
- Bayesian dynamic programming
- Incomplete information in Markovian decision models
- Berge's maximum theorem for noncompact image sets
- Reduction of a Controlled Markov Model with Incomplete Data to a Problem with Complete Information in the Case of Borel State and Control Space
- Convergence of probability measures and Markov decision models with incomplete information
- Optimality Inequalities for Average Cost Markov Decision Processes and the Stochastic Cash Balance Problem
- Discrete-Time Markovian Decision Processes with Incomplete State Observation
- Berge's theorem for noncompact image sets
- Fatou's lemma for weakly converging probabilities
- Title not available
- Partially Observed Inventory Systems: The Case of Rain Checks
- An incomplete information inventory model with presence of inventories or backorders as only observations
- Limiting discounted-cost control of partially observable stochastic systems
- Optimization and convergence of observation channels in stochastic control
- Title not available
Cited In (37)
- Optimal cost almost-sure reachability in POMDPs
- Absorbing continuous-time Markov decision processes with total cost criteria
- Weak Feller property of non-linear filters
- Uniform Fatou's lemma
- Markov decision processes with incomplete information and semiuniform Feller transition probabilities
- Fatou's lemma for weakly converging measures under the uniform integrability condition
- Stochastic comparative statics in Markov decision processes
- Strong uniform value in gambling houses and partially observable Markov decision processes
- Stochastic setup-cost inventory model with backorders and quasiconvex cost functions
- Robustness to incorrect system models in stochastic control
- Partially observed discrete-time risk-sensitive mean field games
- Robustness to Incorrect Priors in Partially Observed Stochastic Control
- Off-policy evaluation in partially observed Markov decision processes under sequential ignorability
- Risk measurement and risk-averse control of partially observable discrete-time Markov systems
- A universal dynamic program and refined existence results for decentralized stochastic control
- A Fenchel-Moreau-Rockafellar type theorem on the Kantorovich-Wasserstein space with applications in partially observable Markov decision processes
- Average cost Markov decision processes with weakly continuous transition probabilities
- Average cost optimality of partially observed MDPs: contraction of nonlinear filters and existence of optimal solutions and approximations
- Optimal control of partially observable piecewise deterministic Markov processes
- Fatou's lemma in its classical form and Lebesgue's convergence theorems for varying measures with applications to Markov decision processes
- On the average cost optimality equation and the structure of optimal policies for partially observable Markov decision processes
- MDPs with setwise continuous transition probabilities
- Robustness to incorrect priors and controlled filter stability in partially observed stochastic control
- Convergence theorems for varying measures under convexity conditions and applications
- Average cost Markov decision processes with semi-uniform Feller transition probabilities
- Convergence of probability measures and Markov decision models with incomplete information
- Optimality conditions for partially observable Markov decision processes
- Title not available
- Approximate Nash equilibria in partially observed stochastic games with mean-field interactions
- Another look at partially observed optimal stochastic control: existence, ergodicity, and approximations without belief-reduction
- Equivalent conditions for weak continuity of nonlinear filters
- Robustness to approximations and model learning in MDPs and POMDPs
- Semi-uniform Feller stochastic kernels
- Information-theoretic multi-time-scale partially observable systems with inspiration from leukemia treatment
- Convergence for varying measures
- Continuity of equilibria for two-person zero-sum games with noncompact action sets and unbounded payoffs
- On the optimality equation for average cost Markov decision processes and its validity for inventory control