Arm-acquiring bandits

Publication:1154396

DOI: 10.1214/aop/1176994469; zbMath: 0464.90081; OpenAlex: W2151323275; Wikidata: Q55980286; Scholia: Q55980286; MaRDI QID: Q1154396

Peter Whittle

Publication date: 1981

Published in: The Annals of Probability

Full work available at URL: https://doi.org/10.1214/aop/1176994469

zbMATH Keywords

bandit problem; allocation index; Gittins index policy; multiarmed bandit; optimal expected total discounted reward; optimality of policies

Mathematics Subject Classification ID

Dynamic programming (90C39); Statistical decision theory (62C99); Markov and semi-Markov decision processes (90C40); Nontrigonometric harmonic analysis (42C99)

Related Items (31)

A bisection/successive approximation method for computing Gittins indices ⋮ Optimal control of single-server queueing networks ⋮ Multi-armed bandit problem revisited ⋮ Stochastic scheduling and forwards induction ⋮ Open Bandit Processes with Uncountable States and Time-Backward Effects ⋮ Optimal selection of obsolescence mitigation strategies using a restless bandit model ⋮ Competing Markov decision processes ⋮ Resource capacity allocation to stochastic dynamic competitors: knapsack problem for perishable items and index-knapsack heuristic ⋮ Four proofs of Gittins' multiarmed bandit theorem ⋮ Scheduling of multi-class multi-server queueing systems with abandonments ⋮ Optimal myopic policies and index policies for stochastic scheduling problems ⋮ Multi-machine preventive maintenance scheduling with imperfect interventions: a restless bandit approach ⋮ Index policy for multiarmed bandit problem with dynamic risk measures ⋮ A perpetual search for talents across overlapping generations: a learning process ⋮ Empirical Gittins index strategies with \(\varepsilon\)-explorations for multi-armed bandit problems ⋮ Optimal Exploration–Exploitation in a Multi-armed Bandit Problem with Non-stationary Rewards ⋮ Optimal schedule of elective surgery operations subject to disruptions by emergencies ⋮ Branching bandits: A sequential search process with correlated pay-offs ⋮ Flow time distributions in a \(K\) class \(M/G/1\) priority feedback queue ⋮ On the evaluation of strategies for branching bandit processes ⋮ Stochastic scheduling: a short history of index policies and new approaches to index generation for dynamic resource allocation ⋮ Ballooning multi-armed bandits ⋮ Independently Expiring Multiarmed Bandits ⋮ New results for generalized bandit problems ⋮ Tax problems in the undiscounted case ⋮ Two-Armed Restless Bandits with Imperfect Information: Stochastic Control and Indexability ⋮ Generalized Bandit Problems ⋮ Branching Bandit Processes ⋮ Robust control of the multi-armed bandit problem ⋮ A General Theory of MultiArmed Bandit Processes with Constrained Arm Switches ⋮ Multi-armed bandit models for the optimal design of clinical trials: benefits and challenges
