ON THE IDENTIFICATION AND MITIGATION OF WEAKNESSES IN THE KNOWLEDGE GRADIENT POLICY FOR MULTI-ARMED BANDITS (Q5358114)
From MaRDI portal
scientific article; zbMATH DE number 6776204
Language | Label | Description | Also known as |
---|---|---|---|
English | ON THE IDENTIFICATION AND MITIGATION OF WEAKNESSES IN THE KNOWLEDGE GRADIENT POLICY FOR MULTI-ARMED BANDITS | scientific article; zbMATH DE number 6776204 | |
Statements
ON THE IDENTIFICATION AND MITIGATION OF WEAKNESSES IN THE KNOWLEDGE GRADIENT POLICY FOR MULTI-ARMED BANDITS (English)
19 September 2017
stochastic dynamic programming