Introduction to Multi-Armed Bandits

Publication:5213200

DOI: 10.1561/2200000068, zbMath: 1478.68006, arXiv: 1904.07272, OpenAlex: W4206275166, Wikidata: Q126833114, Scholia: Q126833114, MaRDI QID: Q5213200

Aleksandrs Slivkins

Publication date: 31 January 2020

Published in: Foundations and Trends® in Machine Learning

Full work available at URL: https://arxiv.org/abs/1904.07272

zbMATH Keywords

reinforcement learning, randomized algorithms, online optimization, probabilistic games

Mathematics Subject Classification ID

Introductory exposition (textbooks, tutorial papers, etc.) pertaining to computer science (68-01)
Learning and adaptive systems in artificial intelligence (68T05)
Introductory exposition (textbooks, tutorial papers, etc.) pertaining to probability theory (60-01)
Stopping times; optimal stopping problems; gambling theory (60G40)
Rationality and learning in game theory (91A26)
Multistage and repeated games (91A20)
Randomized algorithms (68W20)
Probabilistic games; gambling (91A60)
Online algorithms; streaming algorithms (68W27)

Related Items

Dynamic Learning and Market Making in Spread Betting Markets with Informed Bettors, Bayesian Exploration: Incentivizing Exploration in Bayesian Games, Multiplayer Bandits Without Observing Collision Information, Maximizing revenue for publishers using header bidding and ad exchange auctions, Multi-round cooperative search games with multiple players, Optimal activation of halting multi‐armed bandit models, Multi-armed bandit-based hyper-heuristics for combinatorial optimization problems, Online Resource Allocation with Personalized Learning, Regret minimization in online Bayesian persuasion: handling adversarial receiver's types under full and partial feedback models, Online learning of network bottlenecks via minimax paths, Multi-armed bandits with censored consumption of resources, A central limit theorem, loss aversion and multi-armed bandits, Convergence rate analysis for optimal computing budget allocation algorithms, Semi-Supervised Node Classification via Semi-Global Graph Transformer Based on Homogeneity Augmentation, Universal regression with adversarial responses, Control-data separation and logical condition propagation for efficient inference on probabilistic programs, Efficient and generalizable tuning strategies for stochastic gradient MCMC, Improving Hoeffding's inequality using higher moments information, Quantum greedy algorithms for multi-armed bandits, Ballooning multi-armed bandits, Bayesian adversarial multi-node bandit for optimal smart grid protection against cyber attacks, Multi-armed bandit with sub-exponential rewards, Unnamed Item, Reinforcement Learning Based Interactive Agent for Personalized Mathematical Skill Enhancement, Learning in Repeated Auctions, Bypassing the Monster: A Faster and Simpler Optimal Algorithm for Contextual Bandits Under Realizability
