Tsallis-INF: an optimal algorithm for stochastic and adversarial bandits (Q4998901)

From MaRDI portal

scientific article; zbMATH DE number 7370545

Language	Label	Description	Also known as
default for all languages	No label defined
English	Tsallis-INF: an optimal algorithm for stochastic and adversarial bandits	scientific article; zbMATH DE number 7370545

Statements

scholarly article

publication date

9 July 2021

full work available at URL

https://arxiv.org/abs/1807.07623

https://jmlr.csail.mit.edu/papers/v22/19-753.html

zbMATH Keywords

bandits
online learning
best of both worlds
online mirror descent
Tsallis entropy
multi-armed bandits
stochastic
adversarial
I.I.D.

MaRDI profile type

MaRDI publication profile

Perturbation techniques in online learning and optimization
Regret bounds and minimax policies under partial monitoring
Finite-time analysis of the multiarmed bandit problem
The Nonstochastic Multiarmed Bandit Problem
Multi-player bandits revisited
Regret analysis of stochastic and nonstochastic multi-armed bandit problems
Kullback-Leibler upper confidence bounds for optimal sequential allocation
Prediction, Learning, and Games
Elements of Information Theory
Thompson sampling: an asymptotically optimal finite-time analysis
Asymptotically efficient adaptive allocation rules
Stochastic bandits robust to adversarial corruptions
A generalized online mirror descent with applications to classification and regression
Some aspects of the sequential design of experiments
Online learning and online convex optimization
Possible generalization of Boltzmann-Gibbs statistics.

Recommended article	Similarity Score	Recommender Run
Near-optimal regret bounds for Thompson sampling	0.7904950976371765	Recommender Run 4
A minimax and asymptotically optimal algorithm for stochastic bandits	0.7741712331771851	Recommender Run 4
Algorithms for adversarial bandit problems with multiple plays	0.7681640386581421	Recommender Run 4
The Nonstochastic Multiarmed Bandit Problem	0.7674009799957275	Recommender Run 4
Thompson sampling: an asymptotically optimal finite-time analysis	0.7644397616386414	Recommender Run 4

Identifiers

Mathematics Subject Classification ID

zbMATH DE Number

Sitelinks

Mathematics (1 entry)

mardi Publication:4998901
