No label defined (Q3668675)
From MaRDI portal
scientific article; zbMATH DE number 3822951
Statements
Publication year: 1982
two-armed bandit problem
finite horizon
monotonicity properties for expected cumulative discounted reward
characterizations of optimal policies
learning algorithm