
A survey on online learning methods: Thompson sampling and others

From MaRDI portal
Publication:3176079

DOI: 10.15960/J.CNKI.ISSN.1007-6093.2017.04.006
zbMATH Open: 1399.68104
MaRDI QID: Q3176079
FDO: Q3176079

Simai He, Yu Jia Jin, Dongdong Ge, Hua Wang

Publication date: 18 July 2018





Recommendations

  • A Tutorial on Thompson Sampling
  • An information-theoretic analysis of Thompson sampling
  • Multi-Armed Bandits: Theory and Applications to Online Learning in Networks
  • Learning to optimize via posterior sampling
  • Near-optimal regret bounds for Thompson sampling


zbMATH Keywords

multi-armed bandit, online learning, upper confidence bound, online convex optimization, Thompson sampling


Mathematics Subject Classification ID

  • Learning and adaptive systems in artificial intelligence (68T05)
  • Convex programming (90C25)
  • Sampling theory, sample surveys (62D05)
  • Stopping times; optimal stopping problems; gambling theory (60G40)



Cited In (1)

  • A Survey of Preference-Based Online Learning with Bandit Algorithms







Retrieved from "https://portal.mardi4nfdi.de/w/index.php?title=Publication:3176079&oldid=16411759"
This page was last edited on 4 February 2024, at 05:00. Warning: this page may not contain recent updates.