Combinatorial bandits
Publication:439986
DOI: 10.1016/j.jcss.2012.01.001; zbMath: 1262.91052; Wikidata: Q59538560; Scholia: Q59538560; MaRDI QID: Q439986
Gábor Lugosi, Nicolò Cesa-Bianchi
Publication date: 17 August 2012
Published in: Journal of Computer and System Sciences
Full work available at URL: https://doi.org/10.1016/j.jcss.2012.01.001
Related Items
- Nonstochastic Multi-Armed Bandits with Graph-Structured Feedback
- Sequential Shortest Path Interdiction with Incomplete Information
- Unnamed Item
- Online Learning over a Finite Action Set with Limited Switching
- Learning Unknown Service Rates in Queues: A Multiarmed Bandit Approach
- Bounded Regret for Finitely Parameterized Multi-Armed Bandits
- Continuous Assortment Optimization with Logit Choice Probabilities and Incomplete Information
- Learning in Combinatorial Optimization: What and How to Explore
- Nested-Batch-Mode Learning and Stochastic Optimization with An Application to Sequential MultiStage Testing in Materials Science
- Per-Round Knapsack-Constrained Linear Submodular Bandits
- Online learning of network bottlenecks via minimax paths
- Multi-armed bandits with censored consumption of resources
- Variable Selection Via Thompson Sampling
- Online team formation under different synergies
- Online learning of energy consumption for navigation of electric vehicles
- A combinatorial multi-armed bandit approach to correlation clustering
- Bandit online optimization over the permutahedron
- Combining initial segments of lists
- An improved upper bound on the expected regret of UCB-type policies for a matching-selection bandit problem
- Adaptive policies for perimeter surveillance problems
- Asymptotically optimal algorithms for budgeted multiple play bandits
- Multi-channel transmission scheduling with hopping scheme under uncertain channel states
- A Combinatorial Metrical Task System Problem Under the Uniform Metric
- Polynomial-Time Algorithms for Multiple-Arm Identification with Full-Bandit Feedback
Cites Work
- Local characteristics, entropy and limit theorems for spanning trees and domino tilings via transfer-impedances
- Efficient algorithms for online decision problems
- Probability on Trees and Networks
- A polynomial-time approximation algorithm for the permanent of a matrix with nonnegative entries
- Polynomial-Time Approximation Algorithms for the Ising Model
- Adaptive routing with end-to-end feedback
- Robbing the bandit
- How to Get a Perfectly Random Sample from a Generic Markov Chain and Generate a Random Spanning Tree of a Directed Graph
- Learning Theory
- The Nonstochastic Multiarmed Bandit Problem
- 10.1162/1532443041424328
- Learning Permutations with Exponential Weights
- Prediction, Learning, and Games
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item