The actor-critic algorithm as multi-time-scale stochastic approximation.



Publication:5955801


DOI: 10.1007/BF02745577
zbMath: 1075.90557
OpenAlex: W2047364871
MaRDI QID: Q5955801

Vijaymohan R. Konda, Vivek S. Borkar

Publication date: 18 February 2002

Published in: Sādhanā

Full work available at URL: https://doi.org/10.1007/bf02745577

Mathematics Subject Classification ID

Stochastic approximation (62L20)
Markov and semi-Markov decision processes (90C40)

Related Items (1)

Reinforcement learning based algorithms for average cost Markov decision processes

