Multi-armed bandits based on a variant of simulated annealing (Q2520136): Difference between revisions

From MaRDI portal
Set OpenAlex properties.
ReferenceBot
Changed an Item
 
Property / cites work: Finite-time analysis of the multiarmed bandit problem / rank: Normal rank
Property / cites work: The Nonstochastic Multiarmed Bandit Problem / rank: Normal rank
Property / cites work: The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning / rank: Normal rank
Property / cites work: Stochastic approximation. A dynamical systems viewpoint. / rank: Normal rank
Property / cites work: Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems / rank: Normal rank
Property / cites work: An Adaptive Sampling Algorithm for Solving Markov Decision Processes / rank: Normal rank
Property / cites work: An Asymptotically Efficient Simulation-Based Algorithm for Finite Horizon Stochastic Dynamic Programming / rank: Normal rank
Property / cites work: The Irrevocable Multiarmed Bandit Problem / rank: Normal rank
Property / cites work: Adaptive game playing using multiplicative weights / rank: Normal rank

Latest revision as of 02:32, 13 July 2024

scientific article
Language: English
Label: Multi-armed bandits based on a variant of simulated annealing
Description: scientific article
Also known as: (none listed)
