The Multi-Armed Bandit Problem: Decomposition and Computation

Recommendations

On the Gittins index for multiarmed bandits
scientific article; zbMATH DE number 194374
scientific article; zbMATH DE number 4087408
Extensions of the multiarmed bandit problem: The discounted case
Computing a classic index for finite-horizon bandits

Cited in (79)

Combinatorial multi-armed bandit and its extension to probabilistically triggered arms
The performance of forwards induction policies
scientific article; zbMATH DE number 4087408
DES and RES processes and their explicit solutions
Optimal and Efficient Auctions for the Gradual Procurement of Strategic Service Provider Agents
Finite state multi-armed bandit problems: Sensitive-discount, average-reward and average-overtaking optimality
A tutorial on Gaussian process regression: modelling, exploring, and exploiting functions
Optimal learning with non-Gaussian rewards
A reinforcement learning approach for resolving inconsistencies in qualitative constraint networks
Finite-time analysis of the multiarmed bandit problem
Optimal stopping of Markov chains and three abstract optimization problems
On the solution of stochastic optimization and variational problems in imperfect information regimes
The learning component of dynamic allocation indices
scientific article; zbMATH DE number 7380836
Tax problems in the undiscounted case
Branching Bandit Processes
Partially observed Markov decision process multiarmed bandits-structural results
A faster index algorithm and a computational study for bandits with switching costs
Four proofs of Gittins' multiarmed bandit theorem
City streets parking enforcement inspection decisions: the Chinese postman's perspective
The \(K\)-armed dueling bandits problem
On the optimal allocation of service to impatient tasks
Robust control of the multi-armed bandit problem
Dynamic allocation policies for the finite horizon one armed bandit problem
An asymptotically optimal policy for finite support models in the multiarmed bandit problem
Index policies for discounted bandit problems with availability constraints
Denumerable-Armed Bandits
Optimal activation of halting multi-armed bandit models
scientific article; zbMATH DE number 7038557
A Structured Multiarmed Bandit Problem and the Greedy Policy
An Approximation Approach for Response-Adaptive Clinical Trial Design
scientific article; zbMATH DE number 194374
Response-adaptive designs for clinical trials: simultaneous learning from multiple patients
A verification theorem for threshold-indexability of real-state discounted restless bandits
Asymptotically optimal multi-armed bandit policies under a cost constraint
Ameso optimization: a relaxation of discrete midpoint convexity
A perpetual search for talents across overlapping generations: a learning process
Continue, quit, restart probability model
Perspectives of approximate dynamic programming
Resource capacity allocation to stochastic dynamic competitors: knapsack problem for perishable items and index-knapsack heuristic
An optimal stopping policy for car rental businesses with purchasing customers
Derman's book as inspiration: some results on LP for MDPs
The multi-armed bandit, with constraints
On Solving Finite State Multi-Armed Bandit Problem by Linear Programming
Rule-based knowledge graph completion (invited paper)
Optimal control of single-server queueing networks
A Note on M. N. Katehakis' and Y.-R. Chen's Computation of the Gittins Index
Risk-Sensitive and Risk-Neutral Multiarmed Bandits
Enhancing gene expression programming based on space partition and jump for symbolic regression
Computing a classic index for finite-horizon bandits
On the sensitivity of restless bandits solutions to uncertainty in the models of the arms
A stochastic differential equation driven by Poisson random measure and its application in a duopoly market
Percentile optimization in multi-armed bandit problems
Consumer strategy, vendor strategy and equilibrium in duopoly markets with production costs
An optimal selection for ensembles of influential projects
Index policy for multiarmed bandit problem with dynamic risk measures
Polynomial-Time Algorithms for Multiple-Arm Identification with Full-Bandit Feedback
A generalized Gittins index for a Markov chain and its recursive calculation
Stochastic scheduling: a short history of index policies and new approaches to index generation for dynamic resource allocation
On the resolution of misspecified convex optimization and monotone variational inequality problems
Index Policies for Shooting Problems
Stochastic scheduling and forwards induction
Optimal online learning for nonlinear belief models using discrete priors
Extensions of the multiarmed bandit problem: The discounted case
Survey of linear programming for standard and nonstandard Markovian control problems. Part II: Applications
Discounted Multiarmed Bandit Problems on a Collection of Machines with Varying Speeds
Incentivizing exploration with heterogeneous value of money
Optimistic Gittins Indices
Empirical Gittins index strategies with \(\varepsilon\)-explorations for multi-armed bandit problems
A bisection/successive approximation method for computing Gittins indices
A dynamic Roy model of academic specialization
Competing Markov decision processes
Testing indexability and computing Whittle and Gittins index in subcubic time
Adaptive approaches to stochastic programming
Multi-armed bandits under general depreciation and commitment
The Nonstochastic Multiarmed Bandit Problem
The Irrevocable Multiarmed Bandit Problem
The efficacy of league formats in ranking teams
A common value experimentation with multiarmed bandits

