Interval Markov Decision Processes with Continuous Action-Spaces
From MaRDI portal
Publication:6202090
Abstract: Interval Markov Decision Processes (IMDPs) are finite-state uncertain Markov models, where the transition probabilities belong to intervals. Recently, there has been a surge of research on employing IMDPs as abstractions of stochastic systems for control synthesis. However, due to the absence of algorithms for synthesis over IMDPs with continuous action-spaces, the action-space is assumed discrete a-priori, which is a restrictive assumption for many applications. Motivated by this, we introduce continuous-action IMDPs (caIMDPs), where the bounds on transition probabilities are functions of the action variables, and study value iteration for maximizing expected cumulative rewards. Specifically, we decompose the max-min problem associated to value iteration to max problems, where is the number of states of the caIMDP. Then, exploiting the simple form of these max problems, we identify cases where value iteration over caIMDPs can be solved efficiently (e.g., with linear or convex programming). We also gain other interesting insights: e.g., in certain cases where the action set is a polytope, synthesis over a discrete-action IMDP, where the actions are the vertices of , is sufficient for optimality. We demonstrate our results on a numerical example. Finally, we include a short discussion on employing caIMDPs as abstractions for control synthesis.
Cites work
- scientific article; zbMATH DE number 2107836 (Why is no real title available?)
- scientific article; zbMATH DE number 3894186 (Why is no real title available?)
- A linear max—min problem
- Abstraction-based synthesis for stochastic systems with omega-regular objectives
- Automated verification and synthesis of stochastic hybrid systems: a survey
- Bounded-parameter Markov decision processes
- Convex Analysis
- Formal Verification and Synthesis for Discrete-Time Stochastic Systems
- Formal and Efficient Synthesis for Continuous-Time Linear Stochastic Hybrid Processes
- Hybrid Systems: Computation and Control
- Hybrid switching diffusions. Properties and applications
- Parametric probabilistic transition systems for system design and analysis
- Robust Control of Markov Decision Processes with Uncertain Transition Matrices
- Stochastic optimal control. The discrete time case
- dReal: an SMT solver for nonlinear theories over the reals
This page was built for publication: Interval Markov Decision Processes with Continuous Action-Spaces
Report a bug (only for logged in users!)Click here to report a bug for this page (MaRDI item Q6202090)