Modification of improved upper confidence bounds for regulating exploration in Monte-Carlo tree search
From MaRDI portal
Cites work
- Asymptotically efficient adaptive allocation rules
- Finite-time analysis of the multiarmed bandit problem
- Pure exploration in multi-armed bandits problems
- Simple regret optimization in online planning for Markov decision processes
- Thompson sampling: an asymptotically optimal finite-time analysis
- UCB revisited: improved regret bounds for the stochastic multi-armed bandit problem
Cited in (4)
MaRDI item: Q307787