Optimal Exploration–Exploitation in a Multi-armed Bandit Problem with Non-stationary Rewards (Q5113912)
From MaRDI portal
Revision as of 10:11, 29 July 2024: ReferenceBot created the claim Wikidata QID (P12): Q126855665 (normal rank).
scientific article; zbMATH DE number 7213023
Language | Label | Description | Also known as |
---|---|---|---|
English | Optimal Exploration–Exploitation in a Multi-armed Bandit Problem with Non-stationary Rewards | scientific article; zbMATH DE number 7213023 | |
Statements
Optimal Exploration–Exploitation in a Multi-armed Bandit Problem with Non-stationary Rewards (English)
18 June 2020
multi-armed bandit
exploration/exploitation
nonstationary
dynamic oracle
minimax regret
dynamic regret