An asymptotically optimal policy for finite support models in the multiarmed bandit problem (Q415624): Difference between revisions

From MaRDI portal
Importer: Created a new Item
ReferenceBot: Changed an Item
(6 intermediate revisions by 5 users not shown)
Property / Mathematics Subject Classification ID: 91A15
Property / Mathematics Subject Classification ID: 91A26
Property / Mathematics Subject Classification ID: 90C25
Property / zbMATH DE Number: 6031871
Property / zbMATH Keywords: bandit problems
Property / zbMATH Keywords: finite-time regret
Property / zbMATH Keywords: MED policy
Property / zbMATH Keywords: convex optimization
Property / Wikidata QID: Q56675674
Property / MaRDI profile type: MaRDI publication profile
Property / OpenAlex ID: W2131958277
Property / arXiv ID: 0905.2776
Property / cites work: The Continuum-Armed Bandit Problem
Property / cites work: Sample mean based index policies by O(log n) regret for the multi-armed bandit problem
Property / cites work: Finite-time analysis of the multiarmed bandit problem
Property / cites work: The Nonstochastic Multiarmed Bandit Problem
Property / cites work: Q4821526
Property / cites work: Optimal adaptive policies for sequential allocation problems
Property / cites work: Elements of Information Theory
Property / cites work: Q3046711
Property / cites work: Introduction to sensitivity and stability analysis in nonlinear programming
Property / cites work: Q4692329
Property / cites work: Multi-armed bandit problem revisited
Property / cites work: The Multi-Armed Bandit Problem: Decomposition and Computation
Property / cites work: Asymptotically efficient adaptive allocation rules
Property / cites work: Exploration of multi-state environments: Local measures and back-propagation of uncertainty
Property / cites work: Convergence of stochastic processes
Property / cites work: Some aspects of the sequential design of experiments
Property / cites work: Non-overlapping domain decomposition for evolution operators
Property / cites work: Nonparametric bandit methods

Latest revision as of 03:44, 5 July 2024

scientific article
Language: English
Label: An asymptotically optimal policy for finite support models in the multiarmed bandit problem
Description: scientific article

    Statements

    An asymptotically optimal policy for finite support models in the multiarmed bandit problem (English)
    8 May 2012
    bandit problems
    finite-time regret
    MED policy
    convex optimization
