Optimal policies for controlled Markov chains with a constraint (Q1068783)

The authors deal with the dynamic optimization of discrete-time Markovian systems. It is well known that the behaviour of many practical systems can be described mathematically by Markov systems; examples include computer-communication networks, production operations, computer operating systems and macroeconomic systems. The aim of the paper is to address basic questions about this optimization problem when a constraint is imposed. Assumptions under which an optimal policy exists are given. Further, it is shown that this policy is always stationary and is either non-randomized (i.e. simple) or a mixture of two non-randomized policies, which is equivalent to choosing independently, at each time, one of two simple policies by the toss of a (biased) coin. Lagrangian multiplier techniques are used to derive these results.
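The coin-toss description of a mixed policy can be illustrated with a minimal Python sketch. It is not taken from the paper: the two simple policies, the state and action labels, the mixing probability `q` and the helper `make_mixed_policy` are all hypothetical, chosen only to show the mechanism of drawing one of two simple policies independently at each time.

```python
import random


def make_mixed_policy(policy_a, policy_b, q, rng=None):
    """Return a policy that, at each call, follows policy_a with
    probability q and policy_b with probability 1 - q.

    policy_a, policy_b: dicts mapping state -> action (simple policies).
    q: bias of the coin (hypothetical value, not derived from the paper).
    """
    rng = rng or random.Random()

    def act(state):
        # Independent biased coin toss at every decision epoch.
        chosen = policy_a if rng.random() < q else policy_b
        return chosen[state]

    return act


# Two hypothetical simple policies that differ only in state 1.
policy_a = {0: "slow", 1: "fast"}
policy_b = {0: "slow", 1: "slow"}

mixed = make_mixed_policy(policy_a, policy_b, q=0.3, rng=random.Random(0))

# Empirical frequency of "fast" in state 1 should be close to q.
actions = [mixed(1) for _ in range(10_000)]
frac_fast = actions.count("fast") / len(actions)
```

In a constrained problem the bias `q` would be chosen so that the constraint is met; how that value is obtained depends on the model and is not shown here.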

zbMATH Keywords

dynamic optimization

discrete-time Markovian systems

optimal policy

Lagrangian multiplier techniques

reviewed by

Vlasta Kaňková