Suboptimal policy determination for large-scale Markov decision processes. I: Description and bounds (Q799497)

Suboptimal policy determination for large-scale Markov decision processes. I: Description and bounds

scientific article; zbMATH DE number 3874992

Language	Label	Description	Also known as
default for all languages	No label defined
English	Suboptimal policy determination for large-scale Markov decision processes. I: Description and bounds	scientific article; zbMATH DE number 3874992

Statements

instance of

scholarly article

0 references

title

Suboptimal policy determination for large-scale Markov decision processes. I: Description and bounds (English)

0 references

published in

Journal of Optimization Theory and Applications

0 references

publication date

1985

0 references

review text

This is the first of two papers that present and evaluate an approach for determining suboptimal policies for large-scale Markov decision processes (MDPs). Part 1 is devoted to deriving bounds that motivate the suboptimal design approach and indicate its quality; Part 2 [see the following review] is concerned with the implementation and evaluation of the approach. The specific MDP considered is the infinite-horizon, expected total discounted cost MDP with finite state and action spaces. The approach can be described as follows. First, the original MDP is approximated by a specially structured MDP. The special structure suggests how to construct associated smaller, more computationally tractable MDPs. A suboptimal policy for the original MDP is then constructed from the solutions of these smaller MDPs. The key feature of this approach is that the state and action space cardinalities of the smaller MDPs are exponential reductions of the state and action space cardinalities of the original MDP.
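
For orientation, the problem class named in the review (infinite-horizon, expected total discounted cost, finite state and action spaces) can be solved exactly by standard value iteration when the MDP is small. The sketch below is that textbook baseline, not the paper's suboptimal design approach; the names `value_iteration`, `P`, `c`, and `beta`, and the two-state example, are illustrative assumptions and do not come from the article. Each sweep touches every state, action, and successor state, which is the cost that becomes prohibitive for large-scale MDPs.

```python
def value_iteration(P, c, beta, tol=1e-10, max_iter=10_000):
    """Textbook value iteration for a discounted-cost MDP (illustrative only).

    P[a][s][t] : probability of moving from state s to state t under action a
    c[s][a]    : one-step cost of taking action a in state s
    beta       : discount factor, 0 <= beta < 1
    Returns the approximate optimal cost vector and a greedy policy.
    """
    n_states = len(c)
    n_actions = len(c[0])
    v = [0.0] * n_states
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(max_iter):
        # Bellman backup: cost now plus discounted expected cost-to-go.
        q = [
            [
                c[s][a] + beta * sum(P[a][s][t] * v[t] for t in range(n_states))
                for a in range(n_actions)
            ]
            for s in range(n_states)
        ]
        v_new = [min(row) for row in q]
        diff = max(abs(x - y) for x, y in zip(v_new, v))
        v = v_new
        if diff < tol:
            break
    # Greedy (cost-minimizing) action in each state.
    policy = [min(range(n_actions), key=lambda a, s=s: q[s][a])
              for s in range(n_states)]
    return v, policy


# Hypothetical two-state example: action 0 stays put, action 1 switches state.
P_example = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
]
c_example = [
    [1.0, 1.5],  # state 0: staying costs 1.0, switching costs 1.5
    [0.0, 0.0],  # state 1: both actions are free
]
v_example, policy_example = value_iteration(P_example, c_example, beta=0.5)
```

In this toy instance the minimizing policy switches out of state 0 and then stays in state 1. The work per sweep grows with the product of the number of states squared and the number of actions, which is why reducing the state and action space cardinalities, as the review describes, matters for large problems.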

0 references

zbMATH Keywords

infinite-horizon expected total discounted cost

0 references

suboptimal policies

0 references

large-scale Markov decision processes

0 references

finite state and action spaces

0 references

MaRDI profile type

MaRDI publication profile