Reinforcement learning based algorithms for average cost Markov decision processes
DOI: 10.1007/s10626-006-0003-y · zbMath: 1146.90521 · OpenAlex: W2061769118 · MaRDI QID: Q2643632
Mohammed Shahid Abdulla, Shalabh Bhatnagar
Publication date: 27 August 2007
Published in: Discrete Event Dynamic Systems
Full work available at URL: https://doi.org/10.1007/s10626-006-0003-y
Keywords: Markov decision processes; reinforcement learning; policy iteration; actor-critic algorithms; simultaneous perturbation stochastic approximation; normalized Hadamard matrices; TD-learning; two-timescale stochastic approximation
MSC classification: Learning and adaptive systems in artificial intelligence (68T05); Markov and semi-Markov decision processes (90C40)
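For orientation, the keywords above name the main ingredients of the paper's algorithms. The sketch below is a rough illustration only, not a reproduction of the authors' method: it combines a one-measurement SPSA gradient estimate (cf. the cited one-measurement SPSA work) with an average-cost TD(0) critic in a two-timescale arrangement, on a hypothetical toy MDP. All names, step sizes, and constants are assumptions.

```python
# Minimal sketch (assumption-laden, NOT the paper's algorithm): a
# two-timescale actor-critic for an average-cost MDP, with the actor
# driven by a one-measurement SPSA gradient estimate and the critic
# running average-cost TD(0) on the faster timescale.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy MDP: P[a, s, s'] transition probabilities, cost[s, a].
nS, nA = 4, 2
P = rng.dirichlet(np.ones(nS), size=(nA, nS))
cost = rng.uniform(size=(nS, nA))

def softmax_policy(theta, s):
    """Randomized policy: action probabilities from per-state logits."""
    z = np.exp(theta[s] - theta[s].max())
    return z / z.sum()

def critic_pass(theta, s, n_steps):
    """Fast timescale: average-cost TD(0) under the given policy.

    Returns the average-cost estimate rho and the final state."""
    rho, h = 0.0, np.zeros(nS)  # average cost and differential values
    for k in range(1, n_steps + 1):
        a = rng.choice(nA, p=softmax_policy(theta, s))
        s_next = rng.choice(nS, p=P[a, s])
        td = cost[s, a] - rho + h[s_next] - h[s]  # average-cost TD error
        h[s] += td / k
        rho += (cost[s, a] - rho) / k
        s = s_next
    return rho, s

theta = np.zeros((nS, nA))  # actor parameters (logits)
delta = 0.1                 # SPSA perturbation magnitude
s = 0
for n in range(1, 201):
    a_n = 1.0 / n           # slow (actor) step size
    # Rademacher (+/-1) perturbations here; the keywords suggest the paper
    # instead uses deterministic perturbations from normalized Hadamard
    # matrices.
    Delta = rng.choice([-1.0, 1.0], size=theta.shape)
    # Critic evaluates the perturbed policy between actor updates.
    rho_plus, s = critic_pass(theta + delta * Delta, s, n_steps=500)
    # One-measurement SPSA gradient estimate:
    # d(rho)/d(theta_i) ~ rho_plus / (delta * Delta_i).
    theta -= a_n * rho_plus / (delta * Delta)

print("average cost after tuning:", critic_pass(theta, s, 5_000)[0])
```

The two-timescale structure is what makes the scheme coherent: the critic takes many TD steps per actor update, so it is effectively equilibrated relative to the slowly moving actor, which licenses treating rho_plus as a measurement of the perturbed policy's average cost.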
Related Items
A constrained optimization perspective on actor-critic algorithms and application to network routing, Multiscale Q-learning with linear function approximation, Natural actor-critic algorithms
Cites Work
- A one-measurement form of simultaneous perturbation stochastic approximation
- Dynamic programming and stochastic control
- Actor-critic algorithms for hierarchical Markov decision processes
- Average cost temporal-difference learning
- Multivariate stochastic approximation using a simultaneous perturbation gradient approximation
- An analysis of temporal-difference learning with function approximation
- Asynchronous Stochastic Approximations
- On Actor-Critic Algorithms
- Actor-Critic-Type Learning Algorithms for Markov Decision Processes
- The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
- A Simultaneous Perturbation Stochastic Approximation-Based Actor–Critic Algorithm for Markov Decision Processes
- The actor-critic algorithm as multi-time-scale stochastic approximation.