Deterministic policies based on maximum regrets in MDPs with imprecise rewards (Q5069649): Difference between revisions
From MaRDI portal
ReferenceBot changed an Item: Set OpenAlex properties.
Property / full work available at URL: https://doi.org/10.3233/aic-190632 (normal rank)
Property / OpenAlex ID: W3201191077 (normal rank)
Latest revision as of 08:45, 30 July 2024
Language | Label | Description | Also known as |
---|---|---|---|
English | Deterministic policies based on maximum regrets in MDPs with imprecise rewards | scientific article; zbMATH DE number 7509003 | |
Statements
Deterministic policies based on maximum regrets in MDPs with imprecise rewards (English)
19 April 2022
Markov decision process
minimax regret
unknown rewards
branch-and-bound
deterministic policy
stochastic policy