An asymptotically optimal policy for finite support models in the multiarmed bandit problem (Q415624)

From MaRDI portal

scientific article

Language	Label	Description	Also known as
English	An asymptotically optimal policy for finite support models in the multiarmed bandit problem	scientific article

Statements

scholarly article

An asymptotically optimal policy for finite support models in the multiarmed bandit problem (English)

zbMATH Open document ID

10.1007/s10994-011-5257-4

Akimichi Takemura

Machine Learning

publication date

8 May 2012

full work available at URL

https://arxiv.org/abs/0905.2776

Mathematics Subject Classification ID

zbMATH DE Number

zbMATH Keywords

bandit problems

finite-time regret

MED policy

convex optimization

MaRDI profile type

MaRDI publication profile

The Continuum-Armed Bandit Problem

Sample mean based index policies by O(log n) regret for the multi-armed bandit problem

Finite-time analysis of the multiarmed bandit problem

The Nonstochastic Multiarmed Bandit Problem

Optimal adaptive policies for sequential allocation problems

Elements of Information Theory

Introduction to sensitivity and stability analysis in nonlinear programming

Multi-armed bandit problem revisited

The Multi-Armed Bandit Problem: Decomposition and Computation

Asymptotically efficient adaptive allocation rules

Exploration of multi-state environments: Local measures and back-propagation of uncertainty

Convergence of stochastic processes

Some aspects of the sequential design of experiments

Non-overlapping domain decomposition for evolution operators

Nonparametric bandit methods

Sitelinks

Mathematics (1 entry)

mardi Publication:415624
