Some reward–penalty rules for the multi-armed bandit problem which are asymptotically optimal (Q4743532)
scientific article; zbMATH DE number 3798793
| Language | Label | Description | Also known as |
|---|---|---|---|
| default for all languages | No label defined | | |
| English | Some reward–penalty rules for the multi-armed bandit problem which are asymptotically optimal | scientific article; zbMATH DE number 3798793 | |
Statements
Some reward–penalty rules for the multi-armed bandit problem which are asymptotically optimal (English)
1983
multi-armed bandit problem
randomised allocation indices
Gittins index
mathematical learning