The learning component of dynamic allocation indices

Publication:1206729

DOI: 10.1214/aos/1176348788; zbMath: 0760.62080; OpenAlex: W2063853624; MaRDI QID: Q1206729

You-Gan Wang, J. C. Gittins

Publication date: 1 April 1993

Published in: The Annals of Statistics

Full work available at URL: https://doi.org/10.1214/aos/1176348788

zbMATH Keywords

Gittins index; exponential discounting; normal distributions; multiarmed bandit problem; Bernoulli reward process; dynamical allocation index; expected immediate reward; learning component; optimal allocation rule; target processes; upward adjustment

Mathematics Subject Classification ID

Bayesian problems; characterization of Bayes procedures (62C10)
Sequential statistical methods (62L99)
Optimal stochastic control (93E20)
Markov and semi-Markov decision processes (90C40)
Sequential statistical design (62L05)

Related Items (9)

Bayesian adaptive bandit-based designs using the Gittins index for multi-armed trials with normally distributed endpoints ⋮ Generalized two-stage bandit problem ⋮ Bandit bounds from stochastic variability extrema ⋮ One-armed bandit process with a covariate ⋮ A confirmation of a conjecture on Feldman’s two-armed bandit problem ⋮ The prediction distribution for the heteroscedastic multivariate lineary models ⋮ Optimal learning with non-Gaussian rewards ⋮ Small-sample performance of Bernoulli two-armed bandit Bayesian strategies ⋮ Multi-armed bandit models for the optimal design of clinical trials: benefits and challenges
