Publication: Action Elimination and Stopping Conditions for the Multi-armed Bandit and Reinforcement Learning Problems

From MaRDI portal


zbMath: 1222.68195
MaRDI QID: Q3093383

Yishay Mansour, Shie Mannor, Eyal Even-Dar

Publication date: 12 October 2011

Full work available at URL: http://www.jmlr.org/papers/v7/evendar06a.html


68T05: Learning and adaptive systems in artificial intelligence


Related Items

- Learning the distribution with largest mean: two bandit frameworks
- Nonasymptotic sequential tests for overlapping hypotheses applied to near-optimal arm identification in bandit models
- Posterior-Based Stopping Rules for Bayesian Ranking-and-Selection Procedures
- Tractable Sampling Strategies for Ordinal Optimization
- Best Arm Identification for Contaminated Bandits
- Randomized allocation with arm elimination in a bandit problem with covariates
- Variable Selection Via Thompson Sampling
- Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization
- The multi-armed bandit problem with covariates
- Minimax PAC bounds on the sample complexity of reinforcement learning with a generative model
- The \(K\)-armed dueling bandits problem
- UCB revisited: improved regret bounds for the stochastic multi-armed bandit problem
- Multi-armed bandits with episode context
- An analysis of model-based interval estimation for Markov decision processes
- A quality assuring, cost optimal multi-armed bandit mechanism for expertsourcing
- Adaptive racing ranking-based immune optimization approach solving multi-objective expected value programming
- Rollout sampling approximate policy iteration
- Choosing the best arm with guaranteed confidence
- The pure exploration problem with general reward functions depending on full distributions
- Dynamic pricing with finite price sets: a non-parametric approach
- Active ranking from pairwise comparisons and when parametric assumptions do not help
- A bad arm existence checking problem: how to utilize asymmetric problem structure?
- Good arm identification via bandit feedback
- Polynomial-Time Algorithms for Multiple-Arm Identification with Full-Bandit Feedback
- Bayesian Incentive-Compatible Bandit Exploration