Combining multiple strategies for multiarmed bandit problems and asymptotic optimality (Q892592)

Summary: This brief paper provides a simple algorithm for stochastic multiarmed bandit problems that, at each time step, selects one strategy from a given set of strategies and plays the arm chosen by the selected strategy. The algorithm follows the idea of probabilistic \(\epsilon_t\)-switching from the \(\epsilon_t\)-greedy strategy and is asymptotically optimal in the sense that the selected strategy converges to the best strategy in the set, under some conditions on the strategies in the set and on the sequence \(\epsilon_t\).
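
The switching idea in the summary can be illustrated with a minimal sketch. It is not the paper's algorithm: the summary does not give the exact selection rule or the conditions on \(\epsilon_t\), so the code assumes that with probability \(\epsilon_t\) a strategy is drawn uniformly at random and otherwise the strategy with the highest empirical mean reward is used. The names FixedArm, combine_strategies, pull, and the default schedule min(1, 5/t) are hypothetical and chosen only for illustration.

```python
import random


class FixedArm:
    """Hypothetical toy strategy that always plays the same arm."""

    def __init__(self, arm):
        self.arm = arm

    def select_arm(self):
        return self.arm

    def update(self, arm, reward):
        # A real strategy (e.g. UCB or epsilon-greedy) would learn here.
        pass


def combine_strategies(strategies, pull, horizon,
                       epsilon=lambda t: min(1.0, 5.0 / t), rng=None):
    """Epsilon_t-switching over a set of strategies (illustrative sketch).

    strategies: objects with select_arm() and update(arm, reward)
    pull:       function arm -> reward (the bandit environment)
    epsilon:    assumed schedule t -> probability of random switching
    Returns per-strategy usage counts and empirical mean rewards.
    """
    rng = rng or random.Random(0)
    n = len(strategies)
    counts = [0] * n
    means = [0.0] * n
    for t in range(1, horizon + 1):
        if rng.random() < epsilon(t):
            # Assumed exploration step: pick a strategy uniformly at random.
            k = rng.randrange(n)
        else:
            # Assumed greedy step: pick the best strategy seen so far.
            k = max(range(n), key=lambda i: means[i])
        arm = strategies[k].select_arm()
        reward = pull(arm)
        strategies[k].update(arm, reward)
        counts[k] += 1
        means[k] += (reward - means[k]) / counts[k]  # incremental mean
    return counts, means


if __name__ == "__main__":
    # Two Bernoulli arms with success probabilities 0.2 and 0.8 (made up).
    env_rng = random.Random(42)
    probs = [0.2, 0.8]
    counts, means = combine_strategies(
        [FixedArm(0), FixedArm(1)],
        pull=lambda arm: 1.0 if env_rng.random() < probs[arm] else 0.0,
        horizon=5000,
    )
    print(counts, means)
```

In this sketch only the strategy that was actually used receives the reward feedback. Whether the unused strategies should also be updated is a design choice that the summary does not settle.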

zbMATH Keywords

multiarmed bandit problems

asymptotic optimality

multiple strategies

Identifiers

zbMATH Open document ID

1326.93115

DOI

10.1155/2015/264953

Mathematics Subject Classification ID

Sitelinks

Mathematics (1 entry)

mardi Publication:892592

Revision as of 22:04, 27 December 2023 by Daniel: Created claim: Wikidata QID (P12): Q59112383, #quickstatements; #temporary_batch_1703710783098 (Tag: QuickStatements [1.0.4])

Revision as of 16:10, 30 January 2024 by Import240129110113 (bot): Added link to MaRDI item (links / mardi / name: Publication:892592)