Some reward–penalty rules for the multi-armed bandit problem which are asymptotically optimal

From MaRDI portal
Publication:4743532

DOI: 10.2307/1426995
zbMATH Open: 0506.60067
OpenAlex: W2326969892
MaRDI QID: Q4743532
FDO: Q4743532


Authors: Kevin D. Glazebrook


Publication date: 1983

Published in: Advances in Applied Probability

Full work available at URL: https://doi.org/10.2307/1426995





zbMATH Keywords

Gittins index; randomised allocation indices; mathematical learning; multi-armed bandit problem


Mathematics Subject Classification ID

Applications of Markov chains and discrete-time Markov processes on general state spaces (social mobility, learning theory, industrial processes, etc.) (60J20)
Decision theory for games (91A35)








Retrieved from "https://portal.mardi4nfdi.de/w/index.php?title=Publication:4743532&oldid=19011044"
This page was last edited on 7 February 2024, at 22:13. Warning: Page may not contain recent updates.