The multi-armed bandit problem with covariates

From MaRDI portal

DOI: 10.1214/13-AOS1101 · zbMATH Open: 1360.62436 · arXiv: 1110.6084 · OpenAlex: W3100895096 · Wikidata: Q56675681 · Scholia: Q56675681 · MaRDI QID: Q355096 · FDO: Q355096


Authors: Vianney Perchet, Philippe Rigollet


Publication date: 24 July 2013

Published in: The Annals of Statistics

Abstract: We consider a multi-armed bandit problem in a setting where each arm produces a noisy reward realization that depends on an observable random covariate. As opposed to the traditional static multi-armed bandit problem, this setting allows for dynamically changing rewards that better describe applications where side information is available. We adopt a nonparametric model where the expected rewards are smooth functions of the covariate and where the hardness of the problem is captured by a margin parameter. To maximize the expected cumulative reward, we introduce a policy called Adaptively Binned Successive Elimination (ABSE) that adaptively decomposes the global problem into suitably "localized" static bandit problems. This policy constructs an adaptive partition using a variant of the Successive Elimination (SE) policy. Our results include sharper regret bounds for the SE policy in a static bandit problem and minimax optimal regret bounds for the ABSE policy in the dynamic problem.
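The SE policy on which ABSE is built can be illustrated with a short sketch. The code below is a minimal, hedged rendition of Successive Elimination for a static bandit, not the paper's exact formulation: arms are pulled in rounds, and an arm is dropped once its empirical mean is separated from the current best by more than the combined confidence radii. The confidence-radius formula and the `delta` parameter are illustrative choices (a standard union-bound Hoeffding radius), and the `arms` interface (callables returning noisy rewards) is an assumption for this example.

```python
import math
import random

def successive_elimination(arms, horizon, delta=0.05):
    """Illustrative Successive Elimination (SE) policy for a static
    K-armed bandit. `arms` is a list of callables returning noisy
    rewards (an assumption for this sketch); returns the indices of
    the arms that survive elimination within the sampling budget."""
    surviving = list(range(len(arms)))
    counts = [0] * len(arms)
    means = [0.0] * len(arms)
    t = 0
    while t < horizon and len(surviving) > 1:
        # One round: pull every surviving arm once.
        for i in list(surviving):
            if t >= horizon:
                break
            reward = arms[i]()
            counts[i] += 1
            means[i] += (reward - means[i]) / counts[i]  # running mean
            t += 1
        # Hoeffding-style confidence radius with a union bound
        # (an illustrative choice, not the paper's exact constant).
        def radius(i):
            return math.sqrt(
                math.log(2 * len(arms) * horizon / delta) / (2 * counts[i])
            )
        best = max(surviving, key=lambda i: means[i])
        # Eliminate arms whose upper confidence bound falls below the
        # best arm's lower confidence bound.
        surviving = [
            i for i in surviving
            if means[i] + radius(i) >= means[best] - radius(best)
        ]
    return surviving
```

ABSE applies this elimination idea locally: the covariate space is partitioned adaptively, and within each cell a localized static problem of this form is played, with cells refined as arms are eliminated.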


Full work available at URL: https://arxiv.org/abs/1110.6084










Cited In (42)





