The \(K\)-armed dueling bandits problem
Publication:440003
DOI: 10.1016/J.JCSS.2011.12.028 ⋮ zbMath: 1283.68181 ⋮ DBLP: journals/jcss/YueBKJ12 ⋮ OpenAlex: W2044493620 ⋮ Wikidata: Q29300682 ⋮ Scholia: Q29300682 ⋮ MaRDI QID: Q440003
Thorsten Joachims, Yisong Yue, Josef Broder, Robert D. Kleinberg
Publication date: 17 August 2012
Published in: Journal of Computer and System Sciences
Full work available at URL: https://doi.org/10.1016/j.jcss.2011.12.028
Computational learning theory (68Q32) ⋮ Learning and adaptive systems in artificial intelligence (68T05) ⋮ Probabilistic games; gambling (91A60) ⋮ Online algorithms; streaming algorithms (68W27)
Related Items (13)
Top-\(\kappa\) selection with pairwise comparisons ⋮ Parallel distributed block coordinate descent methods based on pairwise comparison oracle ⋮ Unnamed Item ⋮ Unnamed Item ⋮ The \(K\)-armed dueling bandits problem ⋮ Lexicographic refinements in stationary possibilistic Markov decision processes ⋮ How good is a two-party election game? ⋮ Active ranking from pairwise comparisons and when parametric assumptions do not help ⋮ Preference-based reinforcement learning: evolutionary direct policy search using a preference-based racing algorithm ⋮ Global optimization based on active preference learning with radial basis functions ⋮ On testing transitivity in online preference learning ⋮ Lexicographic refinements in possibilistic decision trees and finite-horizon Markov decision processes ⋮ Unnamed Item
Cites Work
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- The \(K\)-armed dueling bandits problem
- Asymptotically efficient adaptive allocation rules
- Regret bounds for sleeping experts and bandits
- Computing with Noisy Information
- A PAC-Bayesian margin bound for linear classifiers
- The Nonstochastic Multiarmed Bandit Problem
- 10.1162/153244303321897663
- 10.1162/1532443041827916
- Probability Inequalities for Sums of Bounded Random Variables
- Regret Minimization Under Partial Monitoring
- Robust Reductions from Ranking to Classification
- Some aspects of the sequential design of experiments
- Finite-time analysis of the multiarmed bandit problem