On ergodic two-armed bandits (Q417067): Difference between revisions

This paper studies the Narendra two-armed bandit algorithm under the assumption that the payoff sequences are unknown and deterministic. The authors consider the Narendra algorithm under the conditions required by \textit{D. Lamberton, G. Pagès} and \textit{P. Tarrès} [Ann. Appl. Probab. 14, No. 3, 1424--1454 (2004; Zbl 1048.62079)], but without monotonicity and under weak ergodic assumptions. The results show that, even with strongly dependent outcomes, the payoff probability accumulates enough statistical information on the ergodic behaviour of the two arms to induce the appropriate decision.
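
For orientation, the update rule usually associated with the Narendra algorithm (a linear reward-inaction scheme) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `narendra_bandit`, the parameter names, the example payoff sequences and the step schedule `1/(n+1)` are all assumptions made here, and the schedule is not claimed to satisfy the conditions under which the paper proves its results.

```python
import random


def narendra_bandit(eta_a, eta_b, gamma, x0=0.5, seed=0):
    """Hypothetical sketch of a linear reward-inaction update.

    eta_a, eta_b: deterministic 0/1 payoff sequences of arms A and B
    gamma: function n -> step size in (0, 1) (illustrative choice)
    Returns the list [x_0, x_1, ...], where x_n is the probability
    of playing arm A at step n + 1.
    """
    rng = random.Random(seed)
    x = x0
    path = [x]
    for n, (a, b) in enumerate(zip(eta_a, eta_b), start=1):
        g = gamma(n)
        if rng.random() <= x:
            # Arm A was played: move towards A only if A pays.
            x = x + g * a * (1.0 - x)
        else:
            # Arm B was played: move towards B only if B pays.
            x = x - g * b * x
        path.append(x)
    return path


# Assumed example: periodic deterministic payoffs, A pays 3 times out of 4,
# B pays 1 time out of 4.
steps = 2000
eta_a = [0 if i % 4 == 0 else 1 for i in range(steps)]
eta_b = [1 if i % 4 == 0 else 0 for i in range(steps)]
path = narendra_bandit(eta_a, eta_b, gamma=lambda n: 1.0 / (n + 1))
print(path[-1])
```

Since each step is a convex combination of the current value with 0 or 1, the iterate stays in [0, 1] whenever the step sizes lie in (0, 1). An unrewarded play leaves the probability unchanged, which is the "inaction" part of the scheme.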

reviewed by

Krzysztof Piasecki

zbMATH Keywords

convergence

stochastic algorithms

MaRDI profile type

MaRDI publication profile

cites work

A two armed bandit type problem

Q3711437

Stochastic algorithms

How Fast Is the Bandit?

A penalized bandit algorithm

When can the two-armed bandit algorithm be trusted?

Stochastic approximation with averaging innovation applied to Finance

The law of the iterated logarithm for additive functionals of Markov chains

Learning Automata - A Survey

On the linear model with two absorbing barriers

A two armed bandit type problem revisited

Use of Stochastic Automata for Parameter Self-Optimization with Multimodal Performance Criteria

Identifiers

zbMATH Open document ID

1275.62056

DOI

10.1214/10-AAP751

Mathematics Subject Classification ID

Sitelinks

Mathematics (1 entry)

mardi Publication:417067

@@ Property / arXiv ID @@
+0905.0463
@@ Property / arXiv ID: 0905.0463 / rank @@
+Normal rank
@@ Property / cites work @@
+A two armed bandit type problem
@@ Property / cites work: A two armed bandit type problem / rank @@
+Normal rank
@@ Property / cites work @@
+Q3711437
@@ Property / cites work: Q3711437 / rank @@
+Normal rank
@@ Property / cites work @@
+Stochastic algorithms
@@ Property / cites work: Stochastic algorithms / rank @@
+Normal rank
@@ Property / cites work @@
+How Fast Is the Bandit?
@@ Property / cites work: How Fast Is the Bandit? / rank @@
+Normal rank
@@ Property / cites work @@
+A penalized bandit algorithm
@@ Property / cites work: A penalized bandit algorithm / rank @@
+Normal rank
@@ Property / cites work @@
+When can the two-armed bandit algorithm be trusted?
@@ Property / cites work: When can the two-armed bandit algorithm be trusted? / rank @@
+Normal rank
@@ Property / cites work @@
+Stochastic approximation with averaging innovation applied to Finance
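@@ Property / cites work: Stochastic approximation with averaging innovation applied to Finance / rank @@
+Normal rank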
@@ Property / cites work @@
+The law of the iterated logarithm for additive functionals of Markov chains
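@@ Property / cites work: The law of the iterated logarithm for additive functionals of Markov chains / rank @@
+Normal rank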
@@ Property / cites work @@
+Learning Automata - A Survey
@@ Property / cites work: Learning Automata - A Survey / rank @@
+Normal rank
@@ Property / cites work @@
+On the linear model with two absorbing barriers
@@ Property / cites work: On the linear model with two absorbing barriers / rank @@
+Normal rank
@@ Property / cites work @@
+A two armed bandit type problem revisited
@@ Property / cites work: A two armed bandit type problem revisited / rank @@
+Normal rank
@@ Property / cites work @@
+Use of Stochastic Automata for Parameter Self-Optimization with Multimodal Performance Criteria
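@@ Property / cites work: Use of Stochastic Automata for Parameter Self-Optimization with Multimodal Performance Criteria / rank @@
+Normal rank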
@@ Property / OpenAlex ID @@
+W2136855133
@@ Property / OpenAlex ID: W2136855133 / rank @@
+Normal rank