The Multi-Armed Bandit Problem: Decomposition and Computation

From MaRDI portal
Publication:3755256

DOI: 10.1287/moor.12.2.262
zbMath: 0618.90097
OpenAlex: W2118309135
Wikidata: Q29306812
Scholia: Q29306812
MaRDI QID: Q3755256

Arthur F. jun. Veinott, Michael N. Katehakis

Publication date: 1987

Published in: Mathematics of Operations Research

Full work available at URL: https://semanticscholar.org/paper/e4fe28113fed71999a0db30a930e0b42d3ce55f1



Related Items

A bisection/successive approximation method for computing Gittins indices
Optimal control of single-server queueing networks
The performance of forwards induction policies
Stochastic scheduling and forwards induction
Optimistic Gittins Indices
Response-adaptive designs for clinical trials: simultaneous learning from multiple patients
Competing Markov decision processes
Incentivizing Exploration with Heterogeneous Value of Money
Resource capacity allocation to stochastic dynamic competitors: knapsack problem for perishable items and index-knapsack heuristic
Four proofs of Gittins' multiarmed bandit theorem
Continue, quit, restart probability model
Perspectives of approximate dynamic programming
On the optimal allocation of service to impatient tasks
Ameso optimization: a relaxation of discrete midpoint convexity
The multi-armed bandit, with constraints
Derman's book as inspiration: some results on LP for MDPs
Optimal activation of halting multi‐armed bandit models
Index policy for multiarmed bandit problem with dynamic risk measures
Testing indexability and computing Whittle and Gittins index in subcubic time
A perpetual search for talents across overlapping generations: a learning process
An asymptotically optimal policy for finite support models in the multiarmed bandit problem
Optimal and Efficient Auctions for the Gradual Procurement of Strategic Service Provider Agents
MULTI-ARMED BANDITS UNDER GENERAL DEPRECIATION AND COMMITMENT
DES AND RES PROCESSES AND THEIR EXPLICIT SOLUTIONS
ASYMPTOTICALLY OPTIMAL MULTI-ARMED BANDIT POLICIES UNDER A COST CONSTRAINT
Consumer strategy, vendor strategy and equilibrium in duopoly markets with production costs
Index policies for discounted bandit problems with availability constraints
A Verification Theorem for Threshold-Indexability of Real-State Discounted Restless Bandits
A common value experimentation with multiarmed bandits
The efficacy of league formats in ranking teams
Optimal Online Learning for Nonlinear Belief Models Using Discrete Priors
A tutorial on Gaussian process regression: modelling, exploring, and exploiting functions
Stochastic scheduling: a short history of index policies and new approaches to index generation for dynamic resource allocation
An Approximation Approach for Response-Adaptive Clinical Trial Design
A generalized Gittins index for a Markov chain and its recursive calculation
City streets parking enforcement inspection decisions: the Chinese postman's perspective
Survey of linear programming for standard and nonstandard Markovian control problems. Part II: Applications
Optimal learning with non-Gaussian rewards
Enhancing gene expression programming based on space partition and jump for symbolic regression
On the Solution of Stochastic Optimization and Variational Problems in Imperfect Information Regimes
Optimal stopping of Markov chains and three abstract optimization problems
Branching Bandit Processes
Finite state multi-armed bandit problems: Sensitive-discount, average-reward and average-overtaking optimality
On the resolution of misspecified convex optimization and monotone variational inequality problems
Dynamic allocation policies for the finite horizon one armed bandit problem
An optimal stopping policy for car rental businesses with purchasing customers
Robust control of the multi-armed bandit problem
Adaptive approaches to stochastic programming