Exact decomposition approaches for Markov decision processes: a survey (Q606196): Difference between revisions

Summary: Classical methods are intractable for Markov decision processes (MDPs) with large state spaces, so decomposition and aggregation techniques are very useful for coping with large problems. These techniques are generally special cases of the classic divide-and-conquer framework: split a large, unwieldy problem into smaller components, then solve the parts in order to construct the global solution. This paper reviews most of the decomposition approaches found in the associated literature over the past two decades, weighing their pros and cons. We consider several categories of MDPs (average, discounted, and weighted MDPs) and briefly present a variety of methodologies for finding or approximating optimal strategies.

Mathematics Subject Classification ID

90C40

MaRDI publication profile

Calculating availability and performability measures of repairable computer systems using randomization

Bounds for the Positive Eigenvectors of Nonnegative Matrices and for their Approximations by Decomposition

Q3321201

Discrete-Time Controlled Markov Processes with Average Cost Criterion: A Survey

Markov decision processes with exponentially representable discounting

On some algorithms for limiting average Markov decision processes

Q4427313

Q3174169

Q3135096

Sample-path optimality and variance-maximization for Markov decision processes

Average optimality for continuous-time Markov decision processes with a policy iteration approach

Markov Decision Processes with Variance Minimization: A New Condition and Approach

An improved algorithm for solving communicating average reward Markov decision processes

Planning and acting in partially observable stochastic domains

Q3960718

Q4699290

The Complexity of Markov Decision Processes

Algorithms for aggregated limiting average Markov decision problems

Hierarchical algorithms for discounted and weighted Markov decision processes

A decomposition algorithm for limiting average Markov decision problems.

Optimal decision procedures for finite Markov chains. Part III: General convex systems

Multichain Markov Decision Processes with a Sample Path Constraint: A Decomposition Approach

Abstraction and approximate decision-theoretic planning.

Q4506458

Q3241581

Using Expectation-Maximization for Reinforcement Learning

Q4527272

Q3266141

Q5630824

Decomposition Principle for Linear Programs

Decomposition of systems governed by Markov chains

Finite state Markovian decision processes

Q4739658

Q4315289

Weighted reward criteria in Competitive Markov Decision Processes

Q5463020

A decomposition approach for undiscounted two-person zero-sum stochastic games

Q4434179

Q3807647

Planning in a hierarchy of abstraction spaces

Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning

Q4258591

Q3666564

Q3998396

Sitelinks

Mathematics (1 entry)

mardi Publication:606196

@@ Property / author @@
-Mohammed Abbad
@@ Property / author: Mohammed Abbad / rank @@
-Normal rank
@@ Property / author @@
+Mohammed Abbad
@@ Property / author: Mohammed Abbad / rank @@
+Normal rank
@@ Property / Wikidata QID @@
+Q58650427
@@ Property / Wikidata QID: Q58650427 / rank @@
+Normal rank
@@ Property / MaRDI profile type @@
+MaRDI publication profile
@@ Property / MaRDI profile type: MaRDI publication profile / rank @@
+Normal rank
@@ Property / OpenAlex ID @@
+W2091387107
@@ Property / OpenAlex ID: W2091387107 / rank @@
+Normal rank
@@ Property / cites work @@
+Q3262596
@@ Property / cites work: Q3262596 / rank @@
+Normal rank
@@ Property / cites work @@
+Q4144754
@@ Property / cites work: Q4144754 / rank @@
+Normal rank
@@ Property / cites work @@
+Calculating availability and performability measures of repairable computer systems using randomization
+Normal rank
@@ Property / cites work @@
+Bounds for the Positive Eigenvectors of Nonnegative Matrices and for their Approximations by Decomposition
+Normal rank
@@ Property / cites work @@
+Q3321201
@@ Property / cites work: Q3321201 / rank @@
+Normal rank
@@ Property / cites work @@
+Discrete-Time Controlled Markov Processes with Average Cost Criterion: A Survey
+Normal rank
@@ Property / cites work @@
+Markov decision processes with exponentially representable discounting
+Normal rank
@@ Property / cites work @@
+On some algorithms for limiting average Markov decision processes
+Normal rank
@@ Property / cites work @@
+Q4427313
@@ Property / cites work: Q4427313 / rank @@
+Normal rank
@@ Property / cites work @@
+Q3174169
@@ Property / cites work: Q3174169 / rank @@
+Normal rank
@@ Property / cites work @@
+Q3135096
@@ Property / cites work: Q3135096 / rank @@
+Normal rank
@@ Property / cites work @@
+Sample-path optimality and variance-maximization for Markov decision processes
+Normal rank
@@ Property / cites work @@
+Average optimality for continuous-time Markov decision processes with a policy iteration approach
+Normal rank
@@ Property / cites work @@
+Markov Decision Processes with Variance Minimization: A New Condition and Approach
+Normal rank
@@ Property / cites work @@
+An improved algorithm for solving communicating average reward Markov decision processes
+Normal rank
@@ Property / cites work @@
+Planning and acting in partially observable stochastic domains
+Normal rank
@@ Property / cites work @@
+Q3960718
@@ Property / cites work: Q3960718 / rank @@
+Normal rank
@@ Property / cites work @@
+Q4699290
@@ Property / cites work: Q4699290 / rank @@
+Normal rank
@@ Property / cites work @@
+The Complexity of Markov Decision Processes
@@ Property / cites work: The Complexity of Markov Decision Processes / rank @@
+Normal rank
@@ Property / cites work @@
+Algorithms for aggregated limiting average Markov decision problems
+Normal rank
@@ Property / cites work @@
+Hierarchical algorithms for discounted and weighted Markov decision processes
+Normal rank
@@ Property / cites work @@
+A decomposition algorithm for limiting average Markov decision problems.
+Normal rank
@@ Property / cites work @@
+Optimal decision procedures for finite Markov chains. Part III: General convex systems
+Normal rank
@@ Property / cites work @@
+Multichain Markov Decision Processes with a Sample Path Constraint: A Decomposition Approach
+Normal rank
@@ Property / cites work @@
+Abstraction and approximate decision-theoretic planning.
+Normal rank
@@ Property / cites work @@
+Q4506458
@@ Property / cites work: Q4506458 / rank @@
+Normal rank
@@ Property / cites work @@
+Q3241581
@@ Property / cites work: Q3241581 / rank @@
+Normal rank
@@ Property / cites work @@
+Using Expectation-Maximization for Reinforcement Learning
+Normal rank
@@ Property / cites work @@
+Q4527272
@@ Property / cites work: Q4527272 / rank @@
+Normal rank
@@ Property / cites work @@
+Q3266141
@@ Property / cites work: Q3266141 / rank @@
+Normal rank
@@ Property / cites work @@
+Q5630824
@@ Property / cites work: Q5630824 / rank @@
+Normal rank
@@ Property / cites work @@
+Decomposition Principle for Linear Programs
@@ Property / cites work: Decomposition Principle for Linear Programs / rank @@
+Normal rank
@@ Property / cites work @@
+Decomposition of systems governed by Markov chains
+Normal rank
@@ Property / cites work @@
+Finite state Markovian decision processes
@@ Property / cites work: Finite state Markovian decision processes / rank @@
+Normal rank
@@ Property / cites work @@
+Q4739658
@@ Property / cites work: Q4739658 / rank @@
+Normal rank
@@ Property / cites work @@
+Q4315289
@@ Property / cites work: Q4315289 / rank @@
+Normal rank
@@ Property / cites work @@
+Weighted reward criteria in Competitive Markov Decision Processes
+Normal rank
@@ Property / cites work @@
+Q5463020
@@ Property / cites work: Q5463020 / rank @@
+Normal rank
@@ Property / cites work @@
+A decomposition approach for undiscounted two-person zero-sum stochastic games
+Normal rank
@@ Property / cites work @@
+Q4434179
@@ Property / cites work: Q4434179 / rank @@
+Normal rank
@@ Property / cites work @@
+Q3807647
@@ Property / cites work: Q3807647 / rank @@
+Normal rank
@@ Property / cites work @@
+Planning in a hierarchy of abstraction spaces
@@ Property / cites work: Planning in a hierarchy of abstraction spaces / rank @@
+Normal rank
@@ Property / cites work @@
+Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning
+Normal rank
@@ Property / cites work @@
+Q4258591
@@ Property / cites work: Q4258591 / rank @@
+Normal rank
@@ Property / cites work @@
+Q3666564
@@ Property / cites work: Q3666564 / rank @@
+Normal rank
@@ Property / cites work @@
+Q3998396
@@ Property / cites work: Q3998396 / rank @@
+Normal rank