Publication: 3093383
From MaRDI portal
zbMath: 1222.68195
MaRDI QID: Q3093383
Yishay Mansour, Shie Mannor, Eyal Even-Dar
Publication date: 12 October 2011
Full work available at URL: http://www.jmlr.org/papers/v7/evendar06a.html
68T05: Learning and adaptive systems in artificial intelligence
Related Items
Learning the distribution with largest mean: two bandit frameworks
Nonasymptotic sequential tests for overlapping hypotheses applied to near-optimal arm identification in bandit models
Posterior-Based Stopping Rules for Bayesian Ranking-and-Selection Procedures
Tractable Sampling Strategies for Ordinal Optimization
Best Arm Identification for Contaminated Bandits
Randomized allocation with arm elimination in a bandit problem with covariates
Variable Selection Via Thompson Sampling
Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization
The multi-armed bandit problem with covariates
Minimax PAC bounds on the sample complexity of reinforcement learning with a generative model
The \(K\)-armed dueling bandits problem
UCB revisited: improved regret bounds for the stochastic multi-armed bandit problem
Multi-armed bandits with episode context
An analysis of model-based interval estimation for Markov decision processes
A quality assuring, cost optimal multi-armed bandit mechanism for expertsourcing
Adaptive racing ranking-based immune optimization approach solving multi-objective expected value programming
Rollout sampling approximate policy iteration
Choosing the best arm with guaranteed confidence
The pure exploration problem with general reward functions depending on full distributions
Dynamic pricing with finite price sets: a non-parametric approach
Active ranking from pairwise comparisons and when parametric assumptions do not help
A bad arm existence checking problem: how to utilize asymmetric problem structure?
Good arm identification via bandit feedback
Polynomial-Time Algorithms for Multiple-Arm Identification with Full-Bandit Feedback
Bayesian Incentive-Compatible Bandit Exploration