A variance minimization problem for a Markov decision process (Q1091952): Difference between revisions

This paper deals with a discrete-time Markov decision process with finite state and action spaces. The author investigates the problem of determining an optimal random policy that minimizes the variance of the reward, subject to a constraint on the average reward. By introducing a parametric Markov decision process, he gives a procedure for finding this optimal policy.
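
To make the two quantities in the review concrete, the following is a minimal sketch that evaluates the average reward and a reward variance for one fixed random (randomized stationary) policy. It is not the paper's procedure and does not perform the constrained minimization. The variance used here, the variance of the one-step reward under the stationary state-action frequencies, is an assumption chosen for illustration; the paper may define the variance differently. All names (`P`, `r`, `pi`, `mean_and_variance`) and the numerical data are hypothetical.

```python
# Sketch only: evaluates ONE randomized stationary policy on a toy MDP.
# Assumption: "variance of reward" is taken as the variance of the one-step
# reward under the stationary state-action frequencies x(s, a) = mu(s) * pi(a|s).

def induced_chain(P, pi):
    """Transition matrix of the Markov chain induced by policy pi."""
    n = len(P)
    return [
        [sum(pi[s][a] * P[s][a][t] for a in range(len(pi[s]))) for t in range(n)]
        for s in range(n)
    ]

def stationary_distribution(T, iters=10_000, tol=1e-12):
    """Power iteration; assumes the induced chain is irreducible and aperiodic."""
    n = len(T)
    mu = [1.0 / n] * n
    for _ in range(iters):
        nxt = [sum(mu[s] * T[s][t] for s in range(n)) for t in range(n)]
        if max(abs(a - b) for a, b in zip(mu, nxt)) < tol:
            return nxt
        mu = nxt
    return mu

def mean_and_variance(P, r, pi):
    """Average reward and (assumed) stationary one-step reward variance."""
    mu = stationary_distribution(induced_chain(P, pi))
    mean = 0.0
    second_moment = 0.0
    for s in range(len(P)):
        for a in range(len(pi[s])):
            x = mu[s] * pi[s][a]          # state-action frequency
            mean += x * r[s][a]
            second_moment += x * r[s][a] ** 2
    return mean, second_moment - mean ** 2

# Hypothetical toy data: 2 states, 2 actions.
# P[s][a] is the next-state distribution, r[s][a] the one-step reward.
P = [[[0.9, 0.1], [0.2, 0.8]],
     [[0.5, 0.5], [0.1, 0.9]]]
r = [[1.0, 3.0],
     [0.0, 2.0]]
pi = [[0.5, 0.5],
      [0.5, 0.5]]  # pi[s][a]: probability of choosing action a in state s

if __name__ == "__main__":
    mean, var = mean_and_variance(P, r, pi)
    print(mean, var)
```

For this toy data the induced chain has stationary distribution (0.4, 0.6), so the average reward is about 1.4 and the assumed variance about 1.24. A constrained variance minimization would search over `pi` (or over the frequencies `x`) while keeping the average reward within the required bound.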

zbMATH Keywords

discrete time Markov decision process

finite states

finite actions

optimal random policy

variance of reward

parametric Markov decision process

reviewed by

Makiko Nisio

MaRDI profile type

MaRDI publication profile

full work available at URL

https://doi.org/10.1016/0377-2217(87)90148-2

cites work

Calculating the variance in Markov-processes with random reward

Optimal policies for controlled Markov chains with a constraint

Finite state Markovian decision processes

Constrained Undiscounted Stochastic Dynamic Programming

Q3266141

Markov decision processes with a new optimality criterion: Discrete time

Q5618142

The variance of discounted Markov decision processes

Identifiers

zbMATH Open document ID

0623.90087

DOI

10.1016/0377-2217(87)90148-2

Mathematics Subject Classification ID

Sitelinks

Mathematics (1 entry)

mardi Publication:1091952

@@ Property / reviewed by @@
+Makiko Nisio
@@ Property / reviewed by: Makiko Nisio / rank @@
+Normal rank
@@ Property / MaRDI profile type @@
+MaRDI publication profile
@@ Property / MaRDI profile type: MaRDI publication profile / rank @@
+Normal rank
@@ Property / full work available at URL @@
+https://doi.org/10.1016/0377-2217(87)90148-2
+Normal rank
@@ Property / OpenAlex ID @@
+W1972505433
@@ Property / OpenAlex ID: W1972505433 / rank @@
+Normal rank
@@ Property / cites work @@
+Calculating the variance in Markov-processes with random reward
+Normal rank
@@ Property / cites work @@
+Optimal policies for controlled Markov chains with a constraint
+Normal rank
@@ Property / cites work @@
+Finite state Markovian decision processes
@@ Property / cites work: Finite state Markovian decision processes / rank @@
+Normal rank
@@ Property / cites work @@
+Constrained Undiscounted Stochastic Dynamic Programming
+Normal rank
@@ Property / cites work @@
+Q3266141
@@ Property / cites work: Q3266141 / rank @@
+Normal rank
@@ Property / cites work @@
+Markov decision processes with a new optimality criterion: Discrete time
+Normal rank
@@ Property / cites work @@
+Q5618142
@@ Property / cites work: Q5618142 / rank @@
+Normal rank
@@ Property / cites work @@
+The variance of discounted Markov decision processes
+Normal rank