Extensions of the multiarmed bandit problem: The discounted case
DOI: 10.1109/TAC.1985.1103989; zbMATH Open: 0566.90096; MaRDI QID: Q3682272; FDO: Q3682272
Authors: Jean Walrand, Cagatay Buyukkoc, Pravin Varaiya
Publication date: 1985
Published in: IEEE Transactions on Automatic Control
Keywords: optimal strategies; sequential design of experiments; semi-Markov models; stochastic resource allocation; Gittins indices; arrivals; bandit problems; complex decision structures
Deterministic scheduling theory in operations research (90B35); Dynamic programming (90C39); Markov and semi-Markov decision processes (90C40); Adaptive control/observation systems (93C40)
Cited In (70)
- On the Worth of Perfect Information in Bandits with Random Discounting
- Title not available
- Open bandit processes with uncountable states and time-backward effects
- Index policy for multiarmed bandit problem with dynamic risk measures
- A discounted uniform one-armed bandit problem
- Optimistic Gittins Indices
- A bisection/successive approximation method for computing Gittins indices
- A general theory of multiarmed bandit processes with constrained arm switches
- Title not available
- Multi-armed bandit models for the optimal design of clinical trials: benefits and challenges
- Sequencing an N-Stage Process with Feedback
- Optimal strategies for families of alternative bandit processes
- Stationary multi-choice bandit problems
- Asymptotic properties of bandit processes with geometric responses
- Optimal, recursive procedures of identification
- Title not available
- The one-armed \(\mathrm{Erlang}(k)\) bandit reward process
- Title not available
- Stochastic scheduling of parallel queues with set-up costs
- A new algorithm for the multi-item exponentially discounted optimal selection problem
- Independently Expiring Multiarmed Bandits
- New results for generalized bandit problems
- The Multi-Armed Bandit Problem: Decomposition and Computation
- Tax problems in the undiscounted case
- Branching Bandit Processes
- On Gittins' index theorem in continuous time
- Generalized Bandit Problems
- Reinforcement learning and evolutionary algorithms for non-stationary multi-armed bandit problems
- Bandit and covariate processes, with finite or non-denumerable set of arms
- Four proofs of Gittins' multiarmed bandit theorem
- Performance evaluation of scheduling control of queueing networks: Fluid model heuristics
- Reading policies for joins: an asymptotic analysis
- Flow time distributions in a \(K\) class \(M/G/1\) priority feedback queue
- Dynamic allocation policies for the finite horizon one armed bandit problem
- Optimal stopping for Brownian motion with applications to sequential analysis and option pricing
- Simultaneous optimization of flow control and scheduling in a single server queue with two job classes
- Simultaneous optimization of flow-control and scheduling in a single server queue with two job classes: Numerical results and approximation
- Denumerable-Armed Bandits
- Evaluating strategies for generalized bandit problems
- On the Gittins index in the M/G/1 queue
- A survey of Markov decision models for control of networks of queues
- A perpetual search for talents across overlapping generations: a learning process
- Resource capacity allocation to stochastic dynamic competitors: knapsack problem for perishable items and index-knapsack heuristic
- Derman's book as inspiration: some results on LP for MDPs
- The multi-armed bandit, with constraints
- Optimal stopping problems for multiarmed bandit processes with arms' independence
- Optimal control of single-server queueing networks
- The achievable region method in the optimal control of queueing systems; formulations, bounds and policies
- Dynamic priority allocation via restless bandit marginal productivity indices
- A generalized Gittins index for a Markov chain and its recursive calculation
- Sample path methods in the control of queues
- Stochastic scheduling: a short history of index policies and new approaches to index generation for dynamic resource allocation
- Survey of linear programming for standard and nonstandard Markovian control problems. Part II: Applications
- Discounted Multiarmed Bandit Problems on a Collection of Machines with Varying Speeds
- Stochastic scheduling and forwards induction
- Optimal intensity control of a multi-class queue
- On the problem of the two-armed bandit with impulse controls and discounting
- Multi-armed bandits in discrete and continuous time
- Discrete multiarmed bandits and multiparameter processes
- Multi-armed bandit problem revisited
- Empirical Gittins index strategies with \(\varepsilon\)-explorations for multi-armed bandit problems
- On an Optimal Stopping Problem for Multi-Parameter Diffusion Processes
- Dynamic stochastic dominance in bandit decision problems
- Competing Markov decision processes
- On the evaluation of strategies for branching bandit processes
- Title not available
- Multi-armed bandits under general depreciation and commitment
- Title not available
- A comparative study of ad hoc techniques and evolutionary methods for multi-armed bandit problems
- A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning