On ergodic two-armed bandits (Q417067)

scientific article; zbMATH DE number 6034161

      Statements

      On ergodic two-armed bandits (English)
      13 May 2012
      This paper focuses on the Narendra two-armed bandit algorithm under the assumption that the payoff sequences are unknown and deterministic. The authors consider the Narendra algorithm under the conditions required by \textit{D. Lamberton, G. Pagès} and \textit{P. Tarrès} [Ann. Appl. Probab. 14, No. 3, 1424--1454 (2004; Zbl 1048.62079)], without monotonicity and under weak ergodic assumptions. The results show that, even with strongly dependent outcomes, the payoff probability accumulates statistical information on the ergodic behaviour of the two arms and so induces the appropriate decision.
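
For orientation, the update rule usually associated with the Narendra two-armed bandit (a linear reward-inaction scheme) can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the initial value `x0`, the seed, and the step sequence `1 / (n + 1)` are assumptions chosen for the example, and the step sequence is not claimed to satisfy the conditions studied in the paper. Here `x` is the probability of playing arm A, and the payoffs are deterministic 0/1 sequences.

```python
import random


def narendra_step(x, gamma, eta_a, eta_b, u):
    """One update of x, the probability of playing arm A.

    u is a uniform draw on [0, 1); eta_a and eta_b are the 0/1 payoffs
    that arms A and B would give at this time step.
    """
    if u <= x:
        # Arm A is played: a reward moves x toward 1, no reward leaves x as is.
        return x + gamma * eta_a * (1.0 - x)
    # Arm B is played: a reward moves x toward 0, no reward leaves x as is.
    return x - gamma * eta_b * x


def run_narendra(payoffs_a, payoffs_b, x0=0.5, seed=0):
    """Run the scheme over two deterministic payoff sequences.

    The step sequence gamma_n = 1 / (n + 1) is an illustrative assumption.
    """
    rng = random.Random(seed)
    x = x0
    for n, (eta_a, eta_b) in enumerate(zip(payoffs_a, payoffs_b), start=1):
        gamma = 1.0 / (n + 1)
        x = narendra_step(x, gamma, eta_a, eta_b, rng.random())
    return x


if __name__ == "__main__":
    # Hypothetical periodic payoffs: arm A pays every 2nd step, arm B every 4th.
    a = [1 if i % 2 == 0 else 0 for i in range(10_000)]
    b = [1 if i % 4 == 0 else 0 for i in range(10_000)]
    print(run_narendra(a, b))
```

Because each step is a convex combination of `x` with 0 or 1, `x` stays in [0, 1] whenever every step size lies in [0, 1].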
      convergence
      stochastic algorithms
