Pages that link to "Item:Q1017665"
From MaRDI portal
The following pages link to Exploration-exploitation tradeoff using variance estimates in multi-armed bandits (Q1017665):
Displayed 38 items.
- Optimal learning with Bernstein Online Aggregation (Q72768)
- Kullback-Leibler upper confidence bounds for optimal sequential allocation (Q366995)
- Exploration and exploitation of scratch games (Q374139)
- Robustness of stochastic bandit policies (Q391739)
- UCB revisited: improved regret bounds for the stochastic multi-armed bandit problem (Q653803)
- Corruption-tolerant bandit learning (Q669323)
- Boundary crossing probabilities for general exponential families (Q722599)
- Multi-armed bandits with episode context (Q766259)
- Improving multi-armed bandit algorithms in online pricing settings (Q1644914)
- An adaptive and robust biological network based on the vacant-particle transportation model (Q1670644)
- On Bayesian index policies for sequential resource allocation (Q1750289)
- Time-uniform, nonparametric, nonasymptotic confidence sequences (Q2039804)
- Dismemberment and design for controlling the replication variance of regret for the multi-armed bandit (Q2081727)
- A PAC algorithm in relative precision for bandit problem with costly sampling (Q2084297)
- Trading utility and uncertainty: applying the value of information to resolve the exploration-exploitation dilemma in reinforcement learning (Q2094051)
- Tune and mix: learning to rank using ensembles of calibrated multi-class classifiers (Q2251439)
- Mixing time estimation in reversible Markov chains from a single sample path (Q2330466)
- Bayesian optimistic Kullback-Leibler exploration (Q2425228)
- Constructing effective personalized policies using counterfactual inference from biased data sets with many features (Q2425241)
- Pure exploration in finitely-armed and continuous-armed bandits (Q2431430)
- Concentration inequalities for sampling without replacement (Q2515502)
- Primal-Dual Algorithms for Optimization with Stochastic Dominance (Q2954172)
- (Q4558206)
- (Q4558474)
- Adaptive Sampling Strategies for Stochastic Optimization (Q4562248)
- Finite-Time Analysis for the Knowledge-Gradient Policy (Q4610155)
- Learning Unknown Service Rates in Queues: A Multiarmed Bandit Approach (Q4994160)
- (Q4998881)
- (Q5053221)
- Nonasymptotic Analysis of Monte Carlo Tree Search (Q5060499)
- Setting Reserve Prices in Second-Price Auctions with Unobserved Bids (Q5060778)
- EXPLORATION–EXPLOITATION POLICIES WITH ALMOST SURE, ARBITRARILY SLOW GROWING ASYMPTOTIC REGRET (Q5070864)
- Data-Driven Decisions for Problems with an Unspecified Objective Function (Q5137432)
- ASYMPTOTICALLY OPTIMAL MULTI-ARMED BANDIT POLICIES UNDER A COST CONSTRAINT (Q5358116)
- Functional Sequential Treatment Allocation (Q5881136)
- Robust supervised learning with coordinate gradient descent (Q6172182)
- Multi-armed linear bandits with latent biases (Q6198758)
- A confirmation of a conjecture on Feldman’s two-armed bandit problem (Q6198964)