scientific article

From MaRDI portal

Publication:3093383

zbMath: 1222.68195; MaRDI QID: Q3093383

Yishay Mansour, Shie Mannor, Eyal Even-Dar

Publication date: 12 October 2011

Full work available at URL: http://www.jmlr.org/papers/v7/evendar06a.html

Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.



Related Items (33)

Best Arm Identification for Contaminated Bandits
The multi-armed bandit problem with covariates
Posterior-Based Stopping Rules for Bayesian Ranking-and-Selection Procedures
Minimax PAC bounds on the sample complexity of reinforcement learning with a generative model
Unnamed Item
Learning approximately optimal contracts
Learning approximately optimal contracts
A quality assuring, cost optimal multi-armed bandit mechanism for expertsourcing
Variable Selection Via Thompson Sampling
Unnamed Item
Good arm identification via bandit feedback
Exploratory machine learning with unknown unknowns
Optimal Scheduling of Entropy Regularizer for Continuous-Time Linear-Quadratic Reinforcement Learning
Learning the distribution with largest mean: two bandit frameworks
The \(K\)-armed dueling bandits problem
Tractable Sampling Strategies for Ordinal Optimization
UCB revisited: improved regret bounds for the stochastic multi-armed bandit problem
Dynamic pricing with finite price sets: a non-parametric approach
Unnamed Item
An analysis of model-based interval estimation for Markov decision processes
Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization
Active ranking from pairwise comparisons and when parametric assumptions do not help
Rollout sampling approximate policy iteration
Randomized allocation with arm elimination in a bandit problem with covariates
Adaptive racing ranking-based immune optimization approach solving multi-objective expected value programming
A bad arm existence checking problem: how to utilize asymmetric problem structure?
Polynomial-Time Algorithms for Multiple-Arm Identification with Full-Bandit Feedback
Bayesian Incentive-Compatible Bandit Exploration
Multi-armed bandits with episode context
Nonasymptotic sequential tests for overlapping hypotheses applied to near-optimal arm identification in bandit models
Unnamed Item
Choosing the best arm with guaranteed confidence
The pure exploration problem with general reward functions depending on full distributions



