On ergodic two-armed bandits (Q417067)

From MaRDI portal
Property / arXiv ID: 0905.0463
Property / cites work: A two armed bandit type problem
Property / cites work: Q3711437
Property / cites work: Stochastic algorithms
Property / cites work: How Fast Is the Bandit?
Property / cites work: A penalized bandit algorithm
Property / cites work: When can the two-armed bandit algorithm be trusted?
Property / cites work: Stochastic approximation with averaging innovation applied to Finance
Property / cites work: The law of the iterated logarithm for additive functionals of Markov chains
Property / cites work: Learning Automata - A Survey
Property / cites work: On the linear model with two absorbing barriers
Property / cites work: A two armed bandit type problem revisited
Property / cites work: Use of Stochastic Automata for Parameter Self-Optimization with Multimodal Performance Criteria
Property / OpenAlex ID: W2136855133

Latest revision as of 09:51, 30 July 2024

Language: English
Label: On ergodic two-armed bandits
Description: scientific article

    Statements

    On ergodic two-armed bandits (English)
    13 May 2012
    This paper studies the Narendra two-armed bandit algorithm under the assumption that the payoff sequences are unknown and deterministic. The authors analyse the algorithm under the conditions required by \textit{D. Lamberton, G. Pagès} and \textit{P. Tarrès} [Ann. Appl. Probab. 14, No. 3, 1424--1454 (2004; Zbl 1048.62079)], but without monotonicity and under weak ergodic assumptions. The results show that, even with strongly dependent outcomes, the payoff probability accumulates statistical information on the ergodic behaviour of the two arms and thereby induces a corresponding appropriate decision.
    convergence
    stochastic algorithms
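
The review does not state the update rule itself. As a rough illustration only, the sketch below assumes the linear reward-inaction form in which the two-armed bandit algorithm is commonly written: with X_n the probability of playing arm A, a rewarded play of A moves X_n towards 1 by a step gamma_n, a rewarded play of B moves it towards 0, and an unrewarded play leaves it unchanged. The function name `narendra_bandit`, the step sequence 1/(n+2), and the periodic payoff sequences are hypothetical choices for this example and are not taken from the paper.

```python
import random


def narendra_bandit(payoff_a, payoff_b, steps, x0=0.5, seed=0):
    """Run a linear reward-inaction update on fixed 0/1 payoff sequences.

    payoff_a, payoff_b: deterministic 0/1 rewards each arm would give at time n.
    steps: step sizes gamma_n, each assumed to lie in (0, 1).
    Returns the final probability of playing arm A.
    """
    rng = random.Random(seed)  # only the choice of arm is random
    x = x0
    for eta_a, eta_b, gamma in zip(payoff_a, payoff_b, steps):
        if rng.random() < x:
            # Arm A was played: reinforce A only if it paid off.
            if eta_a:
                x += gamma * (1.0 - x)
        else:
            # Arm B was played: reinforce B only if it paid off.
            if eta_b:
                x -= gamma * x
    return x


# Illustrative (assumed) deterministic payoffs: arm A pays in 2 rounds
# out of 3, arm B in 1 round out of 3, so A has the larger Cesaro mean.
n_rounds = 3000
payoff_a = [1 if n % 3 != 0 else 0 for n in range(n_rounds)]
payoff_b = [1 if n % 3 == 0 else 0 for n in range(n_rounds)]
steps = [1.0 / (n + 2) for n in range(n_rounds)]  # assumed step sequence

print(narendra_bandit(payoff_a, payoff_b, steps))
```

Because each update is a convex combination of the current value with 0 or 1, the probability stays in [0, 1] whenever every step lies in (0, 1). Whether the procedure ends up selecting the better arm depends on the step sequence and on the ergodic assumptions analysed in the paper, which this sketch does not check.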
