scientific article; zbMATH DE number 7370615

From MaRDI portal

Publication:4999029

MaRDI QIDQ4999029

Jason D. Lee, Alekh Agarwal, Sham M. Kakade, Gaurav Mahajan

Publication date: 9 July 2021

Full work available at URL: https://arxiv.org/abs/1908.00261

Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.

zbMATH Keywords

reinforcement learning; policy gradient

Mathematics Subject Classification ID

Learning and adaptive systems in artificial intelligence (68T05)

Related Items (11)

A Two-Timescale Stochastic Algorithm Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic
Model-free design of stochastic LQR controller from a primal-dual optimization perspective
Scalable Reinforcement Learning for Multiagent Networked Systems
On linear and super-linear convergence of natural policy gradient algorithm
Softmax policy gradient methods can take exponential time to converge
Geometry and convergence of natural policy gradient methods
Recent advances in reinforcement learning in finance
Learning Stationary Nash Equilibrium Policies in \(n\)-Player Stochastic Games with Independent Chains
Multi-agent natural actor-critic reinforcement learning algorithms
Towards multi‐agent reinforcement learning‐driven over‐the‐counter market simulations
Reinforcement learning with dynamic convex risk measures
