Non-homogeneous Markov decision processes with a constraint (Q1378679)

From MaRDI portal

scientific article

Language	Label	Description	Also known as
English	Non-homogeneous Markov decision processes with a constraint	scientific article

Statements

scholarly article

Non-homogeneous Markov decision processes with a constraint (English)

Journal of Mathematical Analysis and Applications

publication date

20 April 1999

This paper considers a non-homogeneous Markov decision process with finite state and action sets. There are two reward functions. The goal is to maximize the average reward per unit time for one function, subject to the constraint that the total discounted reward for the other function equals a given value. For unconstrained problems, \textit{W. Hopp, J. Bean} and \textit{R. Smith} [Oper. Res. 35, No. 6, 875-883 (1987; Zbl 0651.90090)] introduced the notion of periodic forecast horizon optimality, which was later studied by \textit{J. Bean, R. Smith} and \textit{J. Lasserre} [Math. Oper. Res. 9, 391-401 (1990)] under the name algorithmic optimality. A policy \(\pi\) is algorithmic optimal if there is a sequence of optimal \(N\)-horizon policies that converges to \(\pi\) as \(N\to\infty\). In this paper, the author defines algorithmic optimal policies for problems with the described constraint and provides conditions under which these policies are average optimal among the policies satisfying this constraint.
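
The notion of a sequence of optimal \(N\)-horizon policies can be illustrated with a minimal sketch for the unconstrained, total-reward case only. It is not the paper's construction and does not handle the constraint. The function `n_horizon_policy`, the toy model `toy_reward` / `toy_trans`, and the reading of convergence as "the decision rule for each fixed period eventually stops changing as \(N\) grows" are all assumptions made for illustration.

```python
from typing import Callable, List

# Hypothetical model interface (not from the paper):
#   reward(t, s, a) -> one-step reward at period t in state s under action a
#   trans(t, s, a)  -> list of next-state probabilities, indexed by state
Reward = Callable[[int, int, int], float]
Trans = Callable[[int, int, int], List[float]]


def n_horizon_policy(N: int, S: int, A: int,
                     reward: Reward, trans: Trans) -> List[List[int]]:
    """Backward induction for an N-horizon, time-dependent finite MDP.

    Returns policy[t][s], the chosen action at period t in state s,
    for t = 0..N-1, maximizing expected total reward over the horizon.
    """
    value = [0.0] * S                      # terminal value after period N-1
    policy = [[0] * S for _ in range(N)]
    for t in range(N - 1, -1, -1):         # sweep periods backwards
        new_value = [0.0] * S
        for s in range(S):
            best_a, best_q = 0, float("-inf")
            for a in range(A):
                p = trans(t, s, a)
                q = reward(t, s, a) + sum(p[j] * value[j] for j in range(S))
                if q > best_q:             # ties resolved to the lowest action
                    best_a, best_q = a, q
            policy[t][s] = best_a
            new_value[s] = best_q
        value = new_value
    return policy


# Toy non-homogeneous example (assumed data): two states, two actions.
# Action 0 stays in the current state, action 1 switches state at cost 0.1.
# Only state 1 pays a reward, and that reward depends on the period t.
def toy_reward(t: int, s: int, a: int) -> float:
    base = 1.0 + 1.0 / (t + 1) if s == 1 else 0.0
    return base - 0.1 * a


def toy_trans(t: int, s: int, a: int) -> List[float]:
    nxt = s if a == 0 else 1 - s
    return [1.0 if j == nxt else 0.0 for j in range(2)]


# Print the period-0 decision rule of the N-horizon policy for growing N.
for N in range(1, 6):
    pol = n_horizon_policy(N, 2, 2, toy_reward, toy_trans)
    print(N, pol[0])
```

In this toy model the period-0 decision rule is `[0, 0]` for \(N=1\) and `[1, 0]` for every \(N\ge 2\), so the first decision stabilizes as the horizon grows; this is the kind of limiting behaviour the definition refers to, under the convergence notion assumed here.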

zbMATH Keywords

constrained Markov decision process

algorithmic optimality

average optimality

Eugene A. Feinberg

MaRDI profile type

MaRDI publication profile

full work available at URL

https://doi.org/10.1006/jmaa.1997.5610

Conditions for the Existence of Planning Horizons

Denumerable state nonhomogeneous Markov decision processes

Concepts of Forecast and Decision Horizons: Applications to Dynamic Stochastic Optimization Problems

Variance-Penalized Markov Decision Processes

A New Optimality Criterion for Nonhomogeneous Markov Decision Processes

Constrained Undiscounted Stochastic Dynamic Programming

Technical Note—Dynamic Programming and Probabilistic Constraints

Utility, probabilistic constraints, mean and variance of discounted rewards in Markov decision processes

Mean, variance and probabilistic criteria in finite Markov decision processes: A review

Identifiers

zbMATH Open document ID

Mathematics Subject Classification ID

zbMATH DE Number

10.1006/JMAA.1997.5610

Sitelinks

Mathematics(1 entry)

mardi Publication:1378679
