scientific article
Publication:3093383
zbMath: 1222.68195
MaRDI QID: Q3093383
Yishay Mansour, Shie Mannor, Eyal Even-Dar
Publication date: 12 October 2011
Full work available at URL: http://www.jmlr.org/papers/v7/evendar06a.html
Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.
Related Items (33)
Best Arm Identification for Contaminated Bandits
The multi-armed bandit problem with covariates
Posterior-Based Stopping Rules for Bayesian Ranking-and-Selection Procedures
Minimax PAC bounds on the sample complexity of reinforcement learning with a generative model
Unnamed Item
Learning approximately optimal contracts
Learning approximately optimal contracts
A quality assuring, cost optimal multi-armed bandit mechanism for expertsourcing
Variable Selection Via Thompson Sampling
Unnamed Item
Good arm identification via bandit feedback
Exploratory machine learning with unknown unknowns
Optimal Scheduling of Entropy Regularizer for Continuous-Time Linear-Quadratic Reinforcement Learning
Learning the distribution with largest mean: two bandit frameworks
The \(K\)-armed dueling bandits problem
Tractable Sampling Strategies for Ordinal Optimization
UCB revisited: improved regret bounds for the stochastic multi-armed bandit problem
Dynamic pricing with finite price sets: a non-parametric approach
Unnamed Item
An analysis of model-based interval estimation for Markov decision processes
Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization
Active ranking from pairwise comparisons and when parametric assumptions do not help
Rollout sampling approximate policy iteration
Randomized allocation with arm elimination in a bandit problem with covariates
Adaptive racing ranking-based immune optimization approach solving multi-objective expected value programming
A bad arm existence checking problem: how to utilize asymmetric problem structure?
Polynomial-Time Algorithms for Multiple-Arm Identification with Full-Bandit Feedback
Bayesian Incentive-Compatible Bandit Exploration
Multi-armed bandits with episode context
Nonasymptotic sequential tests for overlapping hypotheses applied to near-optimal arm identification in bandit models
Unnamed Item
Choosing the best arm with guaranteed confidence
The pure exploration problem with general reward functions depending on full distributions