The Multi-Armed Bandit Problem: Decomposition and Computation
DOI: 10.1287/MOOR.12.2.262; zbMATH Open: 0618.90097; DBLP: journals/mor/KatehakisV87; OpenAlex: W2118309135; Wikidata: Q29306812; Scholia: Q29306812; MaRDI QID: Q3755256; FDO: Q3755256
Arthur F. jun. Veinott, Michael N. Katehakis
Publication date: 1987
Published in: Mathematics of Operations Research
Full work available at URL: https://semanticscholar.org/paper/e4fe28113fed71999a0db30a930e0b42d3ce55f1
Keywords: multi-armed bandit problem; approximately optimal policy; approximate largest-index rule; largest-index rule; sparse transition matrices
Mathematics Subject Classification: Numerical mathematical programming methods (65K05); Dynamic programming (90C39); Markov and semi-Markov decision processes (90C40)
Cited In (67)
- Title not available
- The performance of forwards induction policies
- Finite state multi-armed bandit problems: Sensitive-discount, average-reward and average-overtaking optimality
- Incentivizing Exploration with Heterogeneous Value of Money
- Finite-time analysis of the multiarmed bandit problem
- Optimal learning with non-Gaussian rewards
- A tutorial on Gaussian process regression: modelling, exploring, and exploiting functions
- ASYMPTOTICALLY OPTIMAL MULTI-ARMED BANDIT POLICIES UNDER A COST CONSTRAINT
- Optimal stopping of Markov chains and three abstract optimization problems
- Title not available
- On the solution of stochastic optimization and variational problems in imperfect information regimes
- The learning component of dynamic allocation indices
- Tax problems in the undiscounted case
- Branching Bandit Processes
- Four proofs of Gittins' multiarmed bandit theorem
- City streets parking enforcement inspection decisions: the Chinese postman's perspective
- On the optimal allocation of service to impatient tasks
- The \(K\)-armed dueling bandits problem
- Optimal Online Learning for Nonlinear Belief Models Using Discrete Priors
- Dynamic allocation policies for the finite horizon one armed bandit problem
- Robust control of the multi-armed bandit problem
- Index policies for discounted bandit problems with availability constraints
- An asymptotically optimal policy for finite support models in the multiarmed bandit problem
- Denumerable-Armed Bandits
- Title not available
- A Structured Multiarmed Bandit Problem and the Greedy Policy
- An Approximation Approach for Response-Adaptive Clinical Trial Design
- Title not available
- Ameso optimization: a relaxation of discrete midpoint convexity
- Response-adaptive designs for clinical trials: simultaneous learning from multiple patients
- A perpetual search for talents across overlapping generations: a learning process
- An optimal stopping policy for car rental businesses with purchasing customers
- Continue, quit, restart probability model
- Perspectives of approximate dynamic programming
- Resource capacity allocation to stochastic dynamic competitors: knapsack problem for perishable items and index-knapsack heuristic
- On Solving Finite State Multi-Armed Bandit Problem by Linear Programming
- Derman's book as inspiration: some results on LP for MDPs
- The multi-armed bandit, with constraints
- Optimal control of single-server queueing networks
- MULTI-ARMED BANDITS UNDER GENERAL DEPRECIATION AND COMMITMENT
- A Note on M. N. Katehakis' and Y.-R. Chen's Computation of the Gittins Index
- Enhancing gene expression programming based on space partition and jump for symbolic regression
- Consumer strategy, vendor strategy and equilibrium in duopoly markets with production costs
- Polynomial-Time Algorithms for Multiple-Arm Identification with Full-Bandit Feedback
- A generalized Gittins index for a Markov chain and its recursive calculation
- DES AND RES PROCESSES AND THEIR EXPLICIT SOLUTIONS
- Stochastic scheduling: a short history of index policies and new approaches to index generation for dynamic resource allocation
- Extensions of the multiarmed bandit problem: The discounted case
- On the resolution of misspecified convex optimization and monotone variational inequality problems
- Survey of linear programming for standard and nonstandard Markovian control problems. Part II: Applications
- Discounted Multiarmed Bandit Problems on a Collection of Machines with Varying Speeds
- Stochastic scheduling and forwards induction
- Competing Markov decision processes
- A bisection/successive approximation method for computing Gittins indices
- Adaptive approaches to stochastic programming
- The Nonstochastic Multiarmed Bandit Problem
- The Irrevocable Multiarmed Bandit Problem
- The efficacy of league formats in ranking teams
- A common value experimentation with multiarmed bandits
- Combinatorial multi-armed bandit and its extension to probabilistically triggered arms
- Optimal and Efficient Auctions for the Gradual Procurement of Strategic Service Provider Agents
- A Verification Theorem for Threshold-Indexability of Real-State Discounted Restless Bandits
- Optimal activation of halting multi‐armed bandit models
- A stochastic differential equation driven by Poisson random measure and its application in a duopoly market
- Index policy for multiarmed bandit problem with dynamic risk measures
- Optimistic Gittins Indices
- Testing indexability and computing Whittle and Gittins index in subcubic time