Linearly Parameterized Bandits
Publication:3169099
DOI: 10.1287/moor.1100.0446 · zbMath: 1217.93190 · arXiv: 0812.3465 · OpenAlex: W2061753713 · MaRDI QID: Q3169099
Paat Rusmevichientong, John N. Tsitsiklis
Publication date: 27 April 2011
Published in: Mathematics of Operations Research
Full work available at URL: https://arxiv.org/abs/0812.3465
Robustness and adaptive procedures (parametric inference) (62F35); Application models in control theory (93C95); Adaptive control/observation systems (93C40); Stochastic learning and adaptive control (93E35); Sequential estimation (62L12)
Related Items (32)
A linear response bandit problem ⋮ Ranking and Selection with Covariates for Personalized Decision Making ⋮ Dynamic Learning and Decision Making via Basis Weight Vectors ⋮ Technical note—Knowledge gradient for selection with covariates: Consistency and computation ⋮ Unnamed Item ⋮ Online Resource Allocation with Personalized Learning ⋮ A tractable online learning algorithm for the multinomial logit contextual bandit ⋮ Nonstochastic Multi-Armed Bandits with Graph-Structured Feedback ⋮ Optimal Learning in Linear Regression with Combinatorial Feature Selection ⋮ Nearly Dimension-Independent Sparse Linear Bandit over Small Action Spaces via Best Subset Selection ⋮ Empirical Gittins index strategies with \(\varepsilon\)-explorations for multi-armed bandit problems ⋮ Multi-armed linear bandits with latent biases ⋮ MNL-Bandit: A Dynamic Learning Approach to Assortment Selection ⋮ Online Decision Making with High-Dimensional Covariates ⋮ Online Network Revenue Management Using Thompson Sampling ⋮ Learning in Combinatorial Optimization: What and How to Explore ⋮ Unnamed Item ⋮ Unnamed Item ⋮ Unnamed Item ⋮ Randomized allocation with arm elimination in a bandit problem with covariates ⋮ Learning to Optimize via Information-Directed Sampling ⋮ Active Learning of Bayesian Linear Models with High-Dimensional Binary Features by Parameter Confidence-Region Estimation ⋮ Best arm identification in generalized linear bandits ⋮ Online Collaborative Filtering on Graphs ⋮ Regret lower bound and optimal algorithm for high-dimensional contextual linear bandit ⋮ Technical Note—A Note on the Equivalence of Upper Confidence Bounds and Gittins Indices for Patient Agents ⋮ Dynamic Pricing with Multiple Products and Partially Specified Demand Distribution ⋮ Unnamed Item ⋮ Learning to Optimize via Posterior Sampling ⋮ Stochastic continuum-armed bandits with additive models: minimax regrets and adaptive algorithm ⋮ A Bandit-Learning Approach to Multifidelity Approximation ⋮ Satisficing in Time-Sensitive Bandit Learning