On ergodic two-armed bandits (Q417067)

scientific article; zbMATH DE number 6034161

Language	Label	Description	Also known as
default for all languages	No label defined
English	On ergodic two-armed bandits	scientific article; zbMATH DE number 6034161

Statements

scholarly article

On ergodic two-armed bandits (English)

P. Vandekerkhove

The Annals of Applied Probability

publication date

13 May 2012

full work available at URL

https://arxiv.org/abs/0905.0463

https://projecteuclid.org/euclid.aoap/1333372004

This paper studies the Narendra two-armed bandit algorithm under the assumption that the payoff sequences are unknown and deterministic. The authors consider the Narendra algorithm under the conditions required by \textit{D. Lamberton, G. Pagès} and \textit{P. Tarrès} [Ann. Appl. Probab. 14, No. 3, 1424--1454 (2004; Zbl 1048.62079)], without monotonicity and under weak ergodic assumptions. The results show that, even with strongly dependent outcomes, the payoff probability accumulates enough statistical information on the ergodic behaviour of the two arms to induce an appropriate decision.
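As an illustration of the kind of procedure the review refers to, the following is a minimal sketch of a two-armed bandit update of linear reward-inaction type, assuming the usual form in which the probability of playing arm A moves toward 1 after a rewarded play of A and toward 0 after a rewarded play of B. The function name `narendra_bandit`, the default step sequence 1/(n+1), the seed, and the periodic payoff sequences in the usage lines are hypothetical choices made for this example; they are not taken from the paper or from this record.

```python
import random


def narendra_bandit(payoffs_a, payoffs_b, x0=0.5, step=None, seed=0):
    """Run a linear reward-inaction update on two 0/1 payoff sequences.

    payoffs_a[i] and payoffs_b[i] are the rewards each arm would give at
    time i (deterministic sequences, only revealed for the arm played).
    Returns the path x_0, x_1, ... of probabilities of playing arm A.
    """
    if step is None:
        # Assumed step sequence gamma_n = 1 / (n + 1); any values in (0, 1) keep x in [0, 1].
        step = lambda n: 1.0 / (n + 1)
    rng = random.Random(seed)  # private generator so runs are reproducible
    x = x0
    path = [x]
    for n, (a, b) in enumerate(zip(payoffs_a, payoffs_b), start=1):
        gamma = step(n)
        if rng.random() <= x:
            # Arm A is played: a reward pushes x toward 1, no reward leaves x unchanged.
            x += gamma * a * (1.0 - x)
        else:
            # Arm B is played: a reward pushes x toward 0, no reward leaves x unchanged.
            x -= gamma * b * x
        path.append(x)
    return path


# Illustrative deterministic payoffs: arm A pays 2 times out of 3, arm B 1 time out of 3.
n = 3000
eta_a = [1, 1, 0] * (n // 3)
eta_b = [1, 0, 0] * (n // 3)
path = narendra_bandit(eta_a, eta_b)
print(path[-1])  # final probability of playing arm A in this run
```

The payoff sequences here are fixed in advance, so the only randomness is in the choice of arm; this mirrors the deterministic-payoff setting described in the review, although the step sequence used is only an assumption and need not satisfy the conditions of the paper.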

Krzysztof Piasecki

zbMATH Keywords

convergence

stochastic algorithms

MaRDI profile type

MaRDI publication profile

A two armed bandit type problem

Stochastic algorithms

How Fast Is the Bandit?

A penalized bandit algorithm

When can the two-armed bandit algorithm be trusted?

Stochastic approximation with averaging innovation applied to finance

The law of the iterated logarithm for additive functionals of Markov chains

Learning Automata - A Survey

On the linear model with two absorbing barriers

A two armed bandit type problem revisited

Use of Stochastic Automata for Parameter Self-Optimization with Multimodal Performance Criteria

Recommended article

Similarity Score	Recommender Run	Article
0.92509377	Recommender Run 3
0.8933675	Recommender Run 3	A two armed bandit type problem revisited
0.8932164	Recommender Run 3	Finite-time lower bounds for the two-armed bandit problem
0.8900617	Recommender Run 3	Further contributions to the "two-armed bandit" problem
0.8827189	Recommender Run 3	Randomization in the two-armed bandit problem
0.88181907	Recommender Run 3	On two continuum armed bandit problems in high dimensions
0.8817457	Recommender Run 3	Poissonian two-armed bandit: a new approach
0.87988853	Recommender Run 3	On the Bernoulli three-armed bandit problem
0.8778867	Recommender Run 3	On the problem of the two-armed bandit with impulse controls and discounting

Identifiers

zbMATH Open document ID

DOI

10.1214/10-AAP751

Mathematics Subject Classification ID

zbMATH DE Number

6034161

Sitelinks

Mathematics (1 entry)

mardi Publication:417067

Retrieved from "https://portal.mardi4nfdi.de/w/index.php?title=Item:Q417067&oldid=42442180"