On ergodic two-armed bandits (Q417067)

From MaRDI portal
Property / author: Pierre Vandekerkhove (normal rank)

Revision as of 16:10, 14 February 2024

scientific article
Language | Label | Description | Also known as
English | On ergodic two-armed bandits | scientific article |

    Statements

    On ergodic two-armed bandits (English)
    13 May 2012
    This paper studies the Narendra two-armed bandit algorithm under the assumption that the payoff sequences are unknown and deterministic. The authors consider the algorithm under the conditions required by \textit{D. Lamberton, G. Pagès} and \textit{P. Tarrès} [Ann. Appl. Probab. 14, No. 3, 1424--1454 (2004; Zbl 1048.62079)], without monotonicity and under weak ergodic assumptions. The results show that, even with strongly dependent outcomes, the payoff probability accumulates statistical information on the ergodic behaviour of the two arms and induces a corresponding appropriate decision.
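For orientation, the kind of procedure the review refers to can be sketched as a linear reward-inaction update: the probability x of playing arm A moves towards 1 when A is played and pays, moves towards 0 when B is played and pays, and is left unchanged otherwise. The sketch below is an illustration only and is not taken from the paper. The function name `narendra_bandit`, the step sequence 1/(n+1), and the periodic payoff sequences `pay_a` and `pay_b` are all assumptions chosen for the example; the paper's conditions on the step sequence are not reproduced here.

```python
import random


def narendra_bandit(payoff_a, payoff_b, steps, x0=0.5, seed=0):
    """Linear reward-inaction sketch for two arms with deterministic payoffs.

    payoff_a, payoff_b: functions n -> bool giving the (deterministic) reward
    each arm would yield if played at time n. These are hypothetical inputs.
    Returns the final probability of playing arm A.
    """
    rng = random.Random(seed)
    x = x0
    for n in range(1, steps + 1):
        gamma = 1.0 / (n + 1)  # assumed step sequence, for illustration only
        if rng.random() <= x:
            # Arm A is played: reinforce A only if it pays.
            if payoff_a(n):
                x += gamma * (1.0 - x)
        else:
            # Arm B is played: reinforce B only if it pays.
            if payoff_b(n):
                x -= gamma * x
    return x


# Hypothetical deterministic payoffs: A pays two times out of three,
# B pays one time out of three.
def pay_a(n):
    return n % 3 != 0


def pay_b(n):
    return n % 3 == 0


if __name__ == "__main__":
    print(narendra_bandit(pay_a, pay_b, steps=10_000))
```

The randomness here comes only from the choice of arm; the payoffs themselves are fixed sequences, which matches the deterministic-payoff setting described in the review.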
    convergence
    stochastic algorithms
