A variance minimization problem for a Markov decision process (Q1091952): Difference between revisions
From MaRDI portal
Latest revision as of 09:51, 18 June 2024
scientific article
Language | Label | Description | Also known as |
---|---|---|---|
English | A variance minimization problem for a Markov decision process | scientific article | |
Statements
A variance minimization problem for a Markov decision process (English)
1987
This paper deals with a discrete-time Markov decision process with finitely many states and actions. The author investigates the problem of determining an optimal random policy that minimizes the variance of the reward subject to a constraint on the average reward. By introducing a parametric Markov decision process, he gives a procedure for finding this optimal policy.
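To make the problem statement concrete, the following is a minimal sketch of the constrained problem on a toy example. Everything in it is an assumption for illustration: the two-state, two-action MDP and its numbers are made up, the "variance of reward" is taken to be the variance of the one-step reward under the stationary state-action frequencies (the review does not give the exact definition), and the search is a brute-force grid over random stationary policies, not the parametric procedure of the paper.

```python
# Hypothetical MDP: P[s][a][t] is the probability of moving from state s
# to state t under action a; R[s][a] is the one-step reward. Made-up numbers.
P = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.1, 0.9]],
]
R = [
    [1.0, 3.0],
    [0.0, 2.0],
]


def stationary_distribution(policy, iterations=1000):
    """Stationary state distribution of the chain induced by a random
    stationary policy, where policy[s][a] is the probability of action a
    in state s. Uses plain power iteration (assumes the chain is ergodic)."""
    n = len(P)
    T = [
        [sum(policy[s][a] * P[s][a][t] for a in range(len(P[s]))) for t in range(n)]
        for s in range(n)
    ]
    mu = [1.0 / n] * n
    for _ in range(iterations):
        mu = [sum(mu[s] * T[s][t] for s in range(n)) for t in range(n)]
    return mu


def mean_and_variance(policy):
    """Average reward and (assumed) stationary variance of the one-step reward."""
    mu = stationary_distribution(policy)
    pairs = [(s, a) for s in range(len(P)) for a in range(len(P[s]))]
    mean = sum(mu[s] * policy[s][a] * R[s][a] for s, a in pairs)
    var = sum(mu[s] * policy[s][a] * (R[s][a] - mean) ** 2 for s, a in pairs)
    return mean, var


def min_variance_policy(min_mean, steps=20):
    """Grid search over random policies: minimize the variance subject to
    average reward >= min_mean. Returns (variance, mean, policy) or None."""
    best = None
    for i in range(steps + 1):
        for j in range(steps + 1):
            p0, p1 = i / steps, j / steps  # probability of action 0 in each state
            policy = [[p0, 1.0 - p0], [p1, 1.0 - p1]]
            mean, var = mean_and_variance(policy)
            if mean >= min_mean and (best is None or var < best[0]):
                best = (var, mean, policy)
    return best


if __name__ == "__main__":
    print(min_variance_policy(min_mean=1.5))
```

The grid search only approximates the optimum and scales poorly with the number of states and actions; it is meant to show what is being optimized and constrained, not how the paper solves it.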
discrete time Markov decision process
finite states
finite actions
optimal random policy
variance of reward
parametric Markov decision process