Extensions of the multiarmed bandit problem: The discounted case

From MaRDI portal
Publication:3682272

DOI: 10.1109/TAC.1985.1103989
zbMath: 0566.90096
MaRDI QID: Q3682272

Cagatay Buyukkoc, Jean Walrand, Pravin P. Varaiya

Publication date: 1985

Published in: IEEE Transactions on Automatic Control




Related Items

A bisection/successive approximation method for computing Gittins indices
Optimal control of single-server queueing networks
Multi-armed bandit problem revisited
Stochastic scheduling and forwards induction
Open Bandit Processes with Uncountable States and Time-Backward Effects
Bandit and covariate processes, with finite or non-denumerable set of arms
Optimistic Gittins Indices
Simultaneous optimization of flow control and scheduling in a single server queue with two job classes
Simultaneous optimization of flow-control and scheduling in a single server queue with two job classes: Numerical results and approximation
Competing Markov decision processes
On Gittins' index theorem in continuous time
Resource capacity allocation to stochastic dynamic competitors: knapsack problem for perishable items and index-knapsack heuristic
Four proofs of Gittins' multiarmed bandit theorem
Stochastic scheduling of parallel queues with set-up costs
A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning
On an Optimal Stopping Problem for Multi-Parameter Diffusion Processes
The multi-armed bandit, with constraints
Derman's book as inspiration: some results on LP for MDPs
Sample path methods in the control of queues
The achievable region method in the optimal control of queueing systems; formulations, bounds and policies
Performance evaluation of scheduling control of queueing networks: Fluid model heuristics
Index policy for multiarmed bandit problem with dynamic risk measures
A perpetual search for talents across overlapping generations: a learning process
MULTI-ARMED BANDITS UNDER GENERAL DEPRECIATION AND COMMITMENT
Empirical Gittins index strategies with \(\varepsilon\)-explorations for multi-armed bandit problems
Dynamic priority allocation via restless bandit marginal productivity indices
Flow time distributions in a \(K\) class \(M/G/1\) priority feedback queue
On the evaluation of strategies for branching bandit processes
Stochastic scheduling: a short history of index policies and new approaches to index generation for dynamic resource allocation
A generalized Gittins index for a Markov chain and its recursive calculation
Reading policies for joins: an asymptotic analysis
A comparative study of ad hoc techniques and evolutionary methods for multi-armed bandit problems
Reinforcement learning and evolutionary algorithms for non-stationary multi-armed bandit problems
Optimal stopping for Brownian motion with applications to sequential analysis and option pricing
On the Gittins index in the M/G/1 queue
Stationary multi-choice bandit problems.
Independently Expiring Multiarmed Bandits
New results for generalized bandit problems
Tax problems in the undiscounted case
Survey of linear programming for standard and nonstandard Markovian control problems. Part II: Applications
A survey of Markov decision models for control of networks of queues
Optimal intensity control of a multi-class queue
Branching Bandit Processes
Dynamic allocation policies for the finite horizon one armed bandit problem
Multi-armed bandits in discrete and continuous time
A General Theory of MultiArmed Bandit Processes with Constrained Arm Switches
Discrete multiarmed bandits and multiparameter processes
Multi-armed bandit models for the optimal design of clinical trials: benefits and challenges
Optimal stopping problems for multiarmed bandit processes with arms' independence



