Multi‐Armed Bandit Allocation Indices

From MaRDI portal
Publication:3083924

DOI: 10.1002/9780470980033
zbMath: 1401.90257
OpenAlex: W2499002200
MaRDI QID: Q3083924

Richard R. Weber, Kevin D. Glazebrook, J. C. Gittins

Publication date: 24 March 2011

Full work available at URL: https://doi.org/10.1002/9780470980033




Related Items (91)

Optimal stopping problems with restricted stopping times
Bayesian adaptive bandit-based designs using the Gittins index for multi-armed trials with normally distributed endpoints
A forwards induction approach to candidate drug selection
Conditions for indexability of restless bandits and an algorithm to compute Whittle index
Infomax strategies for an optimal balance between exploration and exploitation
Open Bandit Processes with Uncountable States and Time-Backward Effects
Minimizing the mean slowdown in a single-server queue
Incentivizing Exploration with Heterogeneous Value of Money
Control problems in online advertising and benefits of randomized bidding strategies
Four proofs of Gittins' multiarmed bandit theorem
Perspectives of approximate dynamic programming
Whittle index approach to size-aware scheduling for time-varying channels with multiple states
Bayesian Exploration: Incentivizing Exploration in Bayesian Games
Integrated Online Learning and Adaptive Control in Queueing Systems with Uncertain Payoffs
Ameso optimization: a relaxation of discrete midpoint convexity
Kullback-Leibler upper confidence bounds for optimal sequential allocation
On the computation of Whittle's index for Markovian restless bandits
The multi-armed bandit, with constraints
Multi-machine preventive maintenance scheduling with imperfect interventions: a restless bandit approach
Optimal Learning for Stochastic Optimization with Nonlinear Parametric Belief Models
Multi-round cooperative search games with multiple players
Optimal activation of halting multi‐armed bandit models
Unnamed Item
Multi-armed bandit-based hyper-heuristics for combinatorial optimization problems
On competitive analysis for polling systems
A novel statistical test for treatment differences in clinical trials using a response‐adaptive forward‐looking Gittins Index Rule
Topp-Leone distribution with an application to binomial sampling
On Submodular Search and Machine Scheduling
Unnamed Item
A foreground-background queueing model with speed or capacity modulation
Optimal dynamic resource allocation to prevent defaults
MULTI-ARMED BANDITS UNDER GENERAL DEPRECIATION AND COMMITMENT
ON THE IDENTIFICATION AND MITIGATION OF WEAKNESSES IN THE KNOWLEDGE GRADIENT POLICY FOR MULTI-ARMED BANDITS
ASYMPTOTICALLY OPTIMAL MULTI-ARMED BANDIT POLICIES UNDER A COST CONSTRAINT
Exponential asymptotic optimality of Whittle index policy
Empirical Gittins index strategies with \(\varepsilon\)-explorations for multi-armed bandit problems
Consumer strategy, vendor strategy and equilibrium in duopoly markets with production costs
Open Problem—M/G/1 Scheduling with Preemption Delays
A confirmation of a conjecture on Feldman’s two-armed bandit problem
Optimal schedule of elective surgery operations subject to disruptions by emergencies
An adversarial model for scheduling with testing
r-extreme signalling for congestion control
A unified framework for stochastic optimization
A Knowledge Gradient Policy for Sequencing Experiments to Identify the Structure of RNA Molecules Using a Sparse Additive Belief Model
A reinforcement learning approach to personalized learning recommendation systems
MYOPIC POLICIES FOR NON-PREEMPTIVE SCHEDULING OF JOBS WITH DECAYING VALUE
BANDIT STRATEGIES EVALUATED IN THE CONTEXT OF CLINICAL TRIALS IN RARE LIFE-THREATENING DISEASES
Optimal learning before choice
Adaptive Matching for Expert Systems with Uncertain Task Types
Optimal Online Learning for Nonlinear Belief Models Using Discrete Priors
Algorithms for recursive delegation
Stochastic scheduling: a short history of index policies and new approaches to index generation for dynamic resource allocation
Locks, Bombs and Testing: The Case of Independent Locks
On Bayesian index policies for sequential resource allocation
Optimal switching between cash-flow streams
A linear-quadratic Gaussian approach to dynamic information acquisition
Unnamed Item
Unnamed Item
An online algorithm for the risk-aware restless bandit
Complete expected improvement converges to an optimal budget allocation
Adaptive policies for perimeter surveillance problems
An asymptotically optimal heuristic for general nonstationary finite-horizon restless multi-armed, multi-action bandits
Optimal learning with non-Gaussian rewards
On index policies for stochastic minsum scheduling
Learning to Optimize via Information-Directed Sampling
Asymptotically optimal index policies for an abandonment queue with convex holding cost
On the dynamic allocation of assets subject to failure
Open problems in queueing theory inspired by datacenter computing
Two-Armed Restless Bandits with Imperfect Information: Stochastic Control and Indexability
Improvements and Generalizations of Stochastic Knapsack and Markovian Bandits Approximation Algorithms
Bayesian Exploration for Approximate Dynamic Programming
Bayesian Incentive-Compatible Bandit Exploration
Unnamed Item
Gittins Index for Simple Family of Markov Bandit Processes with Switching Cost and No Discounting
Technical Note—A Note on the Equivalence of Upper Confidence Bounds and Gittins Indices for Patient Agents
Learning Unknown Service Rates in Queues: A Multiarmed Bandit Approach
Matching While Learning
Unnamed Item
An asymptotically optimal strategy for constrained multi-armed bandit problems
Uncertainty in learning, choice, and visual fixation
From reinforcement learning to optimal control: a unified framework for sequential decisions
Reinforcement learning: an industrial perspective
On the Gittins index for multistage jobs
Unnamed Item
The pure exploration problem with general reward functions depending on full distributions
A General Theory of MultiArmed Bandit Processes with Constrained Arm Switches
Approximately optimal scheduling of an \(\mathrm{M}/\mathrm{G}/1\) queue with heavy tails
Multi-armed bandit models for the optimal design of clinical trials: benefits and challenges
Optimal discrete search with technological choice
Whittle index based Q-learning for restless bandits with average reward
A Restless Bandit Model for Resource Allocation, Competition, and Reservation



