Online Learning in Markov Decision Processes with Continuous Actions (Q2835638): Difference between revisions

From MaRDI portal
Set OpenAlex properties.
ReferenceBot
Changed an Item
 
Property / cites work: The Nonstochastic Multiarmed Bandit Problem (rank: Normal rank)
Property / cites work: Q2896090 (rank: Normal rank)
Property / cites work: Q5302093 (rank: Normal rank)
Property / cites work: From Bandits to Monte-Carlo Tree Search: The Optimistic Principle Applied to Optimization and Planning (rank: Normal rank)

Latest revision as of 01:07, 13 July 2024

Language: English
Label: Online Learning in Markov Decision Processes with Continuous Actions
Description: scientific article
Also known as: (none listed)

    Statements