A linear response bandit problem


Publication:5168867


DOI: 10.1214/11-SSY032
zbMath: 1352.91009
OpenAlex: W2069129115
MaRDI QID: Q5168867

Assaf J. Zeevi, Alexander Goldenshluger

Publication date: 21 July 2014

Full work available at URL: https://doi.org/10.1214/11-ssy032

zbMATH Keywords

minimax; estimation; regret; bandit problems; sequential allocation; rate-optimal policy

Mathematics Subject Classification ID

Stopping times; optimal stopping problems; gambling theory (60G40)
Probabilistic games; gambling (91A60)

Related Items

Smoothness-Adaptive Contextual Bandits
Smooth Contextual Bandits: Bridging the Parametric and Nondifferentiable Regret Regimes
Bandit Theory: Applications to Learning Healthcare Systems and Clinical Trials
Ranking and Selection with Covariates for Personalized Decision Making
Optimal designs for the development of personalized treatment rules
A general characterization of optimal tie-breaker designs
Nearly Dimension-Independent Sparse Linear Bandit over Small Action Spaces via Best Subset Selection
Transfer learning for contextual multi-armed bandits
Online Decision Making with High-Dimensional Covariates
Infinite Arms Bandit: Optimality via Confidence Bounds
Randomized allocation with arm elimination in a bandit problem with covariates
Dynamic Assortment Personalization in High Dimensions
Regret lower bound and optimal algorithm for high-dimensional contextual linear bandit
Statistical Inference for Online Decision Making via Stochastic Gradient Descent
Nonparametric Pricing Analytics with Customer Covariates
Statistical Inference for Online Decision Making: In a Contextual Bandit Setting
Unnamed Item

