Suboptimal policy determination for large-scale Markov decision processes. I: Description and bounds (Q799497): Difference between revisions

This paper is the first of two that present and evaluate an approach for determining suboptimal policies for large-scale Markov decision processes (MDPs). Part 1 is devoted to the determination of bounds that motivate the development of the suboptimal design approach and indicate its quality; Part 2 [see the following review] is concerned with the implementation and evaluation of the approach. The specific MDP considered is the infinite-horizon, expected total discounted cost MDP with finite state and action spaces. The approach can be described as follows. First, the original MDP is approximated by a specially structured MDP. The special structure suggests how to construct associated smaller, more computationally tractable MDPs. The suboptimal policy for the original MDP is then constructed from the solutions of these smaller MDPs. The key feature of this approach is that the state and action space cardinalities of the smaller MDPs are exponential reductions of the state and action space cardinalities of the original MDP.
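For orientation, the problem class named above (infinite-horizon, expected total discounted cost, finite state and action spaces) can be solved exactly by standard value iteration when the model is small. The sketch below is not the paper's suboptimal design approach; it only illustrates the exact baseline whose cost grows with the state and action space cardinalities. The function name, the array layout `P[a][s][t]` and `c[s][a]`, and the two-state example are illustrative assumptions.

```python
def value_iteration(P, c, beta, tol=1e-10, max_iter=10_000):
    """Standard value iteration for a finite discounted-cost MDP.

    Assumed (hypothetical) layout, not taken from the paper:
      P[a][s][t] : probability of moving from state s to state t under action a
      c[s][a]    : one-step cost of taking action a in state s
      beta       : discount factor, 0 <= beta < 1
    Returns the approximate optimal cost vector and a greedy policy.
    """
    n_states = len(c)
    n_actions = len(c[0])
    v = [0.0] * n_states
    for _ in range(max_iter):
        # Q-values: immediate cost plus discounted expected future cost.
        q = [
            [
                c[s][a] + beta * sum(P[a][s][t] * v[t] for t in range(n_states))
                for a in range(n_actions)
            ]
            for s in range(n_states)
        ]
        v_new = [min(row) for row in q]
        converged = max(abs(x - y) for x, y in zip(v_new, v)) < tol
        v = v_new
        if converged:
            break
    # Greedy policy with respect to the last computed Q-values.
    policy = [min(range(n_actions), key=lambda a: q[s][a]) for s in range(n_states)]
    return v, policy


# Toy example (assumed data): state 0 = working, state 1 = failed;
# action 0 = wait (stay in place), action 1 = repair (move to state 0).
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: wait
    [[1.0, 0.0], [1.0, 0.0]],  # action 1: repair
]
c = [
    [0, 2],  # costs in state 0 for wait / repair
    [5, 2],  # costs in state 1 for wait / repair
]
v, policy = value_iteration(P, c, beta=0.9)
print(v, policy)  # [0.0, 2.0] [0, 1]
```

Each sweep costs on the order of (number of states)^2 times (number of actions) operations, which is why exact methods of this kind become impractical for large-scale MDPs and why approximations built from smaller MDPs are of interest.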

Mathematics Subject Classification ID

90C40

zbMATH DE Number

3874992

zbMATH Keywords

infinite-horizon expected total discounted cost

suboptimal policies

large-scale Markov decision processes

finite state and action spaces

MaRDI profile type

MaRDI publication profile

cites work

A survey of maintenance models: The control and surveillance of deteriorating systems

Quality Control under Markovian Deterioration

Q5561586

Q3910270

Convex composite multi-objective nonsmooth programming

Applications of dynamic programming and other optimization methods in pest management

Optimal Integrated Control of Univoltine Pest Populations with Age Structure

Approximations of Dynamic Programs, I

Approximations of Dynamic Programs, II

An Iterative Aggregation Procedure for Markov Decision Processes

Multilayer control of large Markov chains

Suboptimal Design for Large Scale, Multimodule Systems

Dynamic programming and stochastic control

Sitelinks

Mathematics (1 entry)

mardi Publication:799497

@@ Property / cites work @@
+A survey of maintenance models: The control and surveillance of deteriorating systems
+Normal rank
@@ Property / cites work @@
+Quality Control under Markovian Deterioration
@@ Property / cites work: Quality Control under Markovian Deterioration / rank @@
+Normal rank
@@ Property / cites work @@
+Q5561586
@@ Property / cites work: Q5561586 / rank @@
+Normal rank
@@ Property / cites work @@
+Q3910270
@@ Property / cites work: Q3910270 / rank @@
+Normal rank
@@ Property / cites work @@
+Convex composite multi-objective nonsmooth programming
+Normal rank
@@ Property / cites work @@
+Applications of dynamic programming and other optimization methods in pest management
+Normal rank
@@ Property / cites work @@
+Optimal Integrated Control of Univoltine Pest Populations with Age Structure
+Normal rank
@@ Property / cites work @@
+Approximations of Dynamic Programs, I
@@ Property / cites work: Approximations of Dynamic Programs, I / rank @@
+Normal rank
@@ Property / cites work @@
+Approximations of Dynamic Programs, II
@@ Property / cites work: Approximations of Dynamic Programs, II / rank @@
+Normal rank
@@ Property / cites work @@
+An Iterative Aggregation Procedure for Markov Decision Processes
+Normal rank
@@ Property / cites work @@
+Multilayer control of large Markov chains
@@ Property / cites work: Multilayer control of large Markov chains / rank @@
+Normal rank
@@ Property / cites work @@
+Suboptimal Design for Large Scale, Multimodule Systems
+Normal rank
@@ Property / cites work @@
+Dynamic programming and stochastic control
@@ Property / cites work: Dynamic programming and stochastic control / rank @@
+Normal rank