Extensions of the multiarmed bandit problem: The discounted case

optimal strategies; sequential design of experiments; semi-Markov models; stochastic resource allocation; Gittins indices; arrivals; bandit problems; complex decision structures

Mathematics Subject Classification ID

Deterministic scheduling theory in operations research (90B35)
Dynamic programming (90C39)
Markov and semi-Markov decision processes (90C40)
Adaptive control/observation systems (93C40)

Recommendations

Generalized Bandit Problems
Optimal strategies for families of alternative bandit processes
Denumerable-Armed Bandits
The Multi-Armed Bandit Problem: Decomposition and Computation
scientific article; zbMATH DE number 194374

Cited in

(70)

A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning
Multi-armed bandit models for the optimal design of clinical trials: benefits and challenges
scientific article; zbMATH DE number 4078557
A general theory of multiarmed bandit processes with constrained arm switches
Sequencing an N-Stage Process with Feedback
Optimal strategies for families of alternative bandit processes
Stationary multi-choice bandit problems
Asymptotic properties of bandit processes with geometric responses
Optimal, recursive procedures of identification
scientific article; zbMATH DE number 47588
scientific article; zbMATH DE number 3854141
The one-armed \(\mathrm{Erlang}(k)\) bandit reward process
Stochastic scheduling of parallel queues with set-up costs
A new algorithm for the multi-item exponentially discounted optimal selection problem
Independently Expiring Multiarmed Bandits
The Multi-Armed Bandit Problem: Decomposition and Computation
New results for generalized bandit problems
Tax problems in the undiscounted case
Branching Bandit Processes
On Gittins' index theorem in continuous time
Reinforcement learning and evolutionary algorithms for non-stationary multi-armed bandit problems
Generalized Bandit Problems
Bandit and covariate processes, with finite or non-denumerable set of arms
Four proofs of Gittins' multiarmed bandit theorem
Performance evaluation of scheduling control of queueing networks: Fluid model heuristics
Reading policies for joins: an asymptotic analysis
Flow time distributions in a \(K\) class \(M/G/1\) priority feedback queue
Dynamic allocation policies for the finite horizon one armed bandit problem
Optimal stopping for Brownian motion with applications to sequential analysis and option pricing
Simultaneous optimization of flow control and scheduling in a single server queue with two job classes
Simultaneous optimization of flow-control and scheduling in a single server queue with two job classes: Numerical results and approximation
Denumerable-Armed Bandits
Evaluating strategies for generalized bandit problems
On the Worth of Perfect Information in Bandits with Random Discounting
scientific article; zbMATH DE number 548896
A survey of Markov decision models for control of networks of queues
On the Gittins index in the M/G/1 queue
A perpetual search for talents across overlapping generations: a learning process
Resource capacity allocation to stochastic dynamic competitors: knapsack problem for perishable items and index-knapsack heuristic
Derman's book as inspiration: some results on LP for MDPs
The multi-armed bandit, with constraints
Optimal stopping problems for multiarmed bandit processes with arms' independence
Optimal control of single-server queueing networks
The achievable region method in the optimal control of queueing systems; formulations, bounds and policies
Open bandit processes with uncountable states and time-backward effects
Dynamic priority allocation via restless bandit marginal productivity indices
A discounted uniform one-armed bandit problem
Index policy for multiarmed bandit problem with dynamic risk measures
A generalized Gittins index for a Markov chain and its recursive calculation
Sample path methods in the control of queues
Stochastic scheduling: a short history of index policies and new approaches to index generation for dynamic resource allocation
Stochastic scheduling and forwards induction
Optimal intensity control of a multi-class queue
Survey of linear programming for standard and nonstandard Markovian control problems. Part II: Applications
Multi-armed bandits in discrete and continuous time
Discounted Multiarmed Bandit Problems on a Collection of Machines with Varying Speeds
On the problem of the two-armed bandit with impulse controls and discounting
Discrete multiarmed bandits and multiparameter processes
Multi-armed bandit problem revisited
Optimistic Gittins Indices
Dynamic stochastic dominance in bandit decision problems
On an Optimal Stopping Problem for Multi-Parameter Diffusion Processes
Empirical Gittins index strategies with \(\varepsilon\)-explorations for multi-armed bandit problems
A bisection/successive approximation method for computing Gittins indices
Competing Markov decision processes
On the evaluation of strategies for branching bandit processes
scientific article; zbMATH DE number 4056829
Multi-armed bandits under general depreciation and commitment
A comparative study of ad hoc techniques and evolutionary methods for multi-armed bandit problems
scientific article; zbMATH DE number 3891095

This page was built for publication: Extensions of the multiarmed bandit problem: The discounted case

MaRDI item Q3682272