Adaptive treatment allocation and the multi-armed bandit problem (Q1102059)

From MaRDI portal
Author: Tze Leung Lai
Reviewed by: Kevin D. Glazebrook
MaRDI profile type: MaRDI publication profile
Full work available at URL: https://doi.org/10.1214/aos/1176350495
OpenAlex ID: W1973885534

Latest revision as of 02:55, 20 March 2024

Language: English
Label: Adaptive treatment allocation and the multi-armed bandit problem
Description: scientific article

    Statements

    Adaptive treatment allocation and the multi-armed bandit problem (English)
    1987
    There are \(k\) distinct statistical populations, each specified by a univariate density function characterized by a parameter of unknown value. The question is how observations \(x_1, x_2, \ldots, x_N\) should be sampled sequentially from the \(k\) populations in order to maximize (in some sense) the mean value of their sum. A class of simple allocation rules based on upper confidence bounds for the population parameters is proposed. These rules are shown to be asymptotically optimal in both a Bayesian and a frequentist sense. A simulation study provides evidence that the rules perform well even for moderate values of \(N\).
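The idea of allocating by upper confidence bounds can be illustrated with a minimal sketch. It is an assumption-laden illustration, not the rule constructed in the article: it uses the widely known UCB1 index (sample mean plus sqrt(2 ln t / n)), which presumes rewards in [0, 1]. The names `ucb1_allocate` and `bernoulli`, and the convention that each population is a function taking a random generator, are hypothetical and chosen only for this example.

```python
import math
import random


def ucb1_allocate(arms, horizon, seed=0):
    """Sample sequentially from k populations using the UCB1 index.

    arms    : list of callables rng -> reward in [0, 1] (hypothetical interface)
    horizon : total number of observations N
    Returns (total_reward, counts), where counts[j] is the number of
    observations taken from population j.
    """
    rng = random.Random(seed)
    k = len(arms)
    counts = [0] * k    # observations taken from each population
    sums = [0.0] * k    # running sum of rewards per population
    total = 0.0

    for t in range(1, horizon + 1):
        if t <= k:
            # Initialization: sample each population once.
            arm = t - 1
        else:
            # Choose the population with the largest upper confidence bound:
            # sample mean plus an exploration bonus that shrinks as the
            # population is sampled more often.
            arm = max(
                range(k),
                key=lambda j: sums[j] / counts[j]
                + math.sqrt(2.0 * math.log(t) / counts[j]),
            )
        x = arms[arm](rng)
        counts[arm] += 1
        sums[arm] += x
        total += x

    return total, counts


def bernoulli(p):
    """Hypothetical helper: a population yielding 1 with probability p, else 0."""
    return lambda rng: 1.0 if rng.random() < p else 0.0


if __name__ == "__main__":
    populations = [bernoulli(0.2), bernoulli(0.5), bernoulli(0.8)]
    total, counts = ucb1_allocate(populations, horizon=2000)
    print("total reward:", total)
    print("observations per population:", counts)
```

Under these assumptions, the rule should concentrate most of the N observations on the population with the largest mean while still sampling the others occasionally, which is the qualitative behaviour the abstract describes.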
    adaptive treatment allocation
    multi-armed bandit problem
    boundary crossing
    adaptive control
    dynamic allocation
    upper confidence bounds
    asymptotic optimality
    simulation study
