Batch mode reinforcement learning based on the synthesis of artificial trajectories (Q378762): Difference between revisions

From MaRDI portal
Property / author: Susan A. Murphy / rank: Normal rank
Property / describes a project that uses: Approxrl / rank: Normal rank
Property / MaRDI profile type: MaRDI publication profile / rank: Normal rank
Property / OpenAlex ID: W2134689794 / rank: Normal rank
Property / Wikidata QID: Q42258641 / rank: Normal rank
Property / cites work: Q3241581 / rank: Normal rank
Property / cites work: Technical update: Least-squares temporal difference learning / rank: Normal rank
Property / cites work: Q5477859 / rank: Normal rank
Property / cites work: Machine Learning: ECML 2003 / rank: Normal rank
Property / cites work: Q3093261 / rank: Normal rank
Property / cites work: Towards Min Max Generalization in Reinforcement Learning / rank: Normal rank
Property / cites work: 10.1162/1532443041827907 / rank: Normal rank
Property / cites work: Q5405216 / rank: Normal rank
Property / cites work: Q3096132 / rank: Normal rank
Property / cites work: Optimal Dynamic Treatment Regimes / rank: Normal rank
Property / cites work: Marginal Mean Models for Dynamic Regimes / rank: Normal rank
Property / cites work: Least squares policy evaluation algorithms with linear function approximation / rank: Normal rank
Property / cites work: Kernel-based reinforcement learning / rank: Normal rank
Property / cites work: A new approach to causal inference in mortality studies with a sustained exposure period—application to control of the healthy worker survivor effect / rank: Normal rank

Latest revision as of 01:48, 7 July 2024

Language: English
Label: Batch mode reinforcement learning based on the synthesis of artificial trajectories
Description: scientific article

    Statements

    Batch mode reinforcement learning based on the synthesis of artificial trajectories (English)
    12 November 2013
    reinforcement learning
    optimal control
    artificial trajectories
    function approximators