Further contributions to the ''two-armed bandit'' problem (Q1059967)

From MaRDI portal
scientific article

    Statements

    Further contributions to the ''two-armed bandit'' problem (English)
    1985
    A version of the two-armed bandit problem with two states of nature and two repeatable experiments is studied. With an infinite horizon, with or without discounting, an optimal procedure is to perform one experiment whenever the posterior probability of one of the states of nature exceeds a constant \(\xi^*\) and to perform the other experiment whenever that posterior is less than \(\xi^*\); the procedure is indifferent between the two experiments when the posterior equals \(\xi^*\). The threshold \(\xi^*\) is expressed in terms of expectations of ladder variables and can be calculated using Spitzer series.
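The threshold rule described in the abstract can be illustrated with a minimal Python sketch. It is not the article's method. It assumes Bernoulli observations, and the names `update_posterior`, `choose_experiment`, `simulate`, the experiment labels "A" and "B", and the numeric value passed as `xi_star` are all hypothetical. In particular, the sketch does not compute \(\xi^*\) from ladder variables or Spitzer series; it takes the threshold as a given input.

```python
import random


def update_posterior(p, obs, q1, q0):
    """Bayes update of p = P(state 1) after one Bernoulli observation.

    q1 and q0 are the success probabilities of the chosen experiment
    under state 1 and state 0 (assumed to lie strictly between 0 and 1).
    """
    like1 = q1 if obs else 1.0 - q1
    like0 = q0 if obs else 1.0 - q0
    return p * like1 / (p * like1 + (1.0 - p) * like0)


def choose_experiment(p, xi_star):
    """Threshold rule: experiment "A" above xi_star, "B" below.

    At p == xi_star either choice is acceptable; this sketch picks "A".
    """
    return "A" if p >= xi_star else "B"


def simulate(prior, xi_star, experiments, true_state, steps, seed=0):
    """Run the threshold rule for a fixed number of steps.

    experiments maps a label to (q1, q0), the success probabilities
    under state 1 and state 0. Returns a list of (label, obs, posterior).
    """
    rng = random.Random(seed)
    p = prior
    history = []
    for _ in range(steps):
        arm = choose_experiment(p, xi_star)
        q1, q0 = experiments[arm]
        q_true = q1 if true_state == 1 else q0
        obs = rng.random() < q_true  # draw the observation from the true state
        p = update_posterior(p, obs, q1, q0)
        history.append((arm, obs, p))
    return history


if __name__ == "__main__":
    # Hypothetical experiments and threshold, chosen only for illustration.
    experiments = {"A": (0.8, 0.4), "B": (0.3, 0.6)}
    for arm, obs, p in simulate(0.5, 0.5, experiments, true_state=1, steps=10):
        print(arm, obs, round(p, 4))
```

Under these assumptions the decision at each step depends only on the current posterior and a single constant, which is the structural property the abstract states for the optimal procedure.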
    dynamic programming
    random walks
    two-armed bandit
    two states
    two repeatable experiments
    infinite horizon
    discounting
    posterior probability
    Spitzer series

    Identifiers