Optimal adaptive policies for sequential allocation problems

DOI 10.1006/AAMA.1996.0007

MaRDI QID Q1922542

Authors A. N. Burnetas, Michael N. Katehakis

Publication date 22 January 1997

Published in Advances in Applied Mathematics

Full work available at URL https://semanticscholar.org/paper/2fa3f78bd544c4bdb7986b5dd9feda47492b1e34

zbMATH Keywords

sequential sampling allocation problems

Mathematics Subject Classification ID

Limit theorems in probability theory (60F99) Mathematical programming (90C99)

Recommendations

Asymptotically efficient adaptive allocation rules
Optimal sequential allocation with imperfect feedback information
Optimal Adaptive Policies for Markov Decision Processes
scientific article; zbMATH DE number 4003938
scientific article; zbMATH DE number 4045510
Adaptive Sequential Stochastic Optimization
Dynamic allocation policies for the finite horizon one armed bandit problem

Cited in (32)

Robustness of stochastic bandit policies
Learning the distribution with largest mean: two bandit frameworks
The multi-armed bandit problem: an efficient nonparametric solution
Kullback-Leibler upper confidence bounds for optimal sequential allocation
Adaptive policies for perimeter surveillance problems
Exploration-exploitation policies with almost sure, arbitrarily slow growing asymptotic regret
Irreversible adaptive allocation rules
Reading policies for joins: an asymptotic analysis
Structured Policies for a Sequential Design Problem with General Distributions
Robust control of the multi-armed bandit problem
An asymptotically optimal policy for finite support models in the multiarmed bandit problem
Response-adaptive designs for clinical trials: simultaneous learning from multiple patients
Asymptotically optimal multi-armed bandit policies under a cost constraint
Tracking the mean of a piecewise stationary sequence
On bidding for a fixed number of items in a sequence of auctions
A perpetual search for talents across overlapping generations: a learning process
Sequential Bayes-optimal policies for multiple comparisons with a known standard
Normal bandits of unknown means and variances
Self-Optimizing and Pareto-Optimal Policies in General Environments based on Bayes-Mixtures
Asymptotically optimal algorithms for budgeted multiple play bandits
Explore first, exploit next: the true shape of regret in bandit problems
scientific article; zbMATH DE number 1538064
Infinite Arms Bandit: Optimality via Confidence Bounds
Pair-matching: link prediction with adaptive queries
Optimal sequential sampling from two populations.
Optimal and asymptotically optimal decision rules for sequential screening and resource allocation
scientific article; zbMATH DE number 6982311
Adaptive aggregation for reinforcement learning in average reward Markov decision processes
Consistency of sequential Bayesian sampling policies
Adaptive policies for sequential sampling under incomplete information and a cost constraint
Multi-armed bandits under general depreciation and commitment
A non-parametric solution to the multi-armed bandit problem with covariates
