Batch policy learning in average reward Markov decision processes (Q2112817): Difference between revisions

RedirectionBot
Removed claim: author (P16): Item:Q548553
RedirectionBot
Changed an Item
Property / author
 
Property / author: Susan A. Murphy / rank
 
Normal rank

Revision as of 08:23, 16 February 2024

scientific article
Language: English
Label: Batch policy learning in average reward Markov decision processes
Description: scientific article
Also known as: (none listed)

    Statements

    Batch policy learning in average reward Markov decision processes (English)
    12 January 2023
    Markov decision process
    average reward
    policy optimization
    doubly robust estimator

    Identifiers