Linearly Parameterized Bandits

Publication:3169099

DOI: 10.1287/moor.1100.0446
zbMath: 1217.93190
arXiv: 0812.3465
OpenAlex: W2061753713
MaRDI QID: Q3169099

Paat Rusmevichientong, John N. Tsitsiklis

Publication date: 27 April 2011

Published in: Mathematics of Operations Research

Full work available at URL: https://arxiv.org/abs/0812.3465




Related Items (32)

A linear response bandit problem
Ranking and Selection with Covariates for Personalized Decision Making
Dynamic Learning and Decision Making via Basis Weight Vectors
Technical note—Knowledge gradient for selection with covariates: Consistency and computation
Unnamed Item
Online Resource Allocation with Personalized Learning
A tractable online learning algorithm for the multinomial logit contextual bandit
Nonstochastic Multi-Armed Bandits with Graph-Structured Feedback
Optimal Learning in Linear Regression with Combinatorial Feature Selection
Nearly Dimension-Independent Sparse Linear Bandit over Small Action Spaces via Best Subset Selection
Empirical Gittins index strategies with \(\varepsilon\)-explorations for multi-armed bandit problems
Multi-armed linear bandits with latent biases
MNL-Bandit: A Dynamic Learning Approach to Assortment Selection
Online Decision Making with High-Dimensional Covariates
Online Network Revenue Management Using Thompson Sampling
Learning in Combinatorial Optimization: What and How to Explore
Unnamed Item
Unnamed Item
Unnamed Item
Randomized allocation with arm elimination in a bandit problem with covariates
Learning to Optimize via Information-Directed Sampling
Active Learning of Bayesian Linear Models with High-Dimensional Binary Features by Parameter Confidence-Region Estimation
Best arm identification in generalized linear bandits
Online Collaborative Filtering on Graphs
Regret lower bound and optimal algorithm for high-dimensional contextual linear bandit
Technical Note—A Note on the Equivalence of Upper Confidence Bounds and Gittins Indices for Patient Agents
Dynamic Pricing with Multiple Products and Partially Specified Demand Distribution
Unnamed Item
Learning to Optimize via Posterior Sampling
Stochastic continuum-armed bandits with additive models: minimax regrets and adaptive algorithm
A Bandit-Learning Approach to Multifidelity Approximation
Satisficing in Time-Sensitive Bandit Learning



