The multi-armed bandit problem with covariates
From MaRDI portal
Publication:355096
DOI: 10.1214/13-AOS1101; zbMath: 1360.62436; arXiv: 1110.6084; Wikidata: Q56675681; Scholia: Q56675681; MaRDI QID: Q355096
Vianney Perchet, Philippe Rigollet
Publication date: 24 July 2013
Published in: The Annals of Statistics
Full work available at URL: https://arxiv.org/abs/1110.6084
nonparametric bandit; multi-armed bandit; regret bounds; contextual bandit; adaptive partition; sequential allocation; successive elimination
62G08: Nonparametric regression and quantile regression
62H30: Classification and discrimination; cluster analysis (statistical aspects)
68T05: Learning and adaptive systems in artificial intelligence
62L05: Sequential statistical design
62L12: Sequential estimation
Related Items
- Unnamed Item
- Nonstochastic Multi-Armed Bandits with Graph-Structured Feedback
- Learning the distribution with largest mean: two bandit frameworks
- Unnamed Item
- Infinite Arms Bandit: Optimality via Confidence Bounds
- Nonparametric Pricing Analytics with Customer Covariates
- Bayesian adaptive bandit-based designs using the Gittins index for multi-armed trials with normally distributed endpoints
- Smoothness-Adaptive Contextual Bandits
- Smooth Contextual Bandits: Bridging the Parametric and Nondifferentiable Regret Regimes
- A Single-Index Model With a Surface-Link for Optimizing Individualized Dose Rules
- Ranking and Selection with Covariates for Personalized Decision Making
- An adaptive multiclass nearest neighbor classifier
- Online Decision Making with High-Dimensional Covariates
- Online Learning of Nash Equilibria in Congestion Games
- Randomized allocation with arm elimination in a bandit problem with covariates
- Statistical Inference for Online Decision Making: In a Contextual Bandit Setting
- Learning in Repeated Auctions
- Treatment recommendation with distributional targets
- Batched bandit problems
- A non-parametric solution to the multi-armed bandit problem with covariates
- Gaussian process bandits with adaptive discretization
- Randomized allocation with nonparametric estimation for contextual multi-armed bandits with delayed rewards
- Bandit and covariate processes, with finite or non-denumerable set of arms
- Dynamic Assortment Personalization in High Dimensions
Cites Work
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- UCB revisited: improved regret bounds for the stochastic multi-armed bandit problem
- Woodroofe's one-armed bandit problem revisited
- Fast learning rates for plug-in classifiers
- Asymptotically efficient adaptive allocation rules
- Smooth discrimination analysis
- Randomized allocation with nonparametric estimation for a multi-armed bandit problem with covariates
- Optimal aggregation of classifiers in statistical learning
- An Asymptotic Minimax Theorem for the Two Armed Bandit Problem
- A One-Armed Bandit Problem with a Concomitant Variable
- A Note on Performance Limitations in Bandit Problems With Side Information
- Online Learning with Prior Knowledge
- Prediction, Learning, and Games
- Some aspects of the sequential design of experiments
- Finite-time analysis of the multiarmed bandit problem