scientific article

From MaRDI portal

Publication:3093234

zbMath: 1222.68207, MaRDI QID: Q3093234

Evan Greensmith, Peter L. Bartlett, Jonathan Baxter

Publication date: 12 October 2011

Full work available at URL: http://www.jmlr.org/papers/v5/greensmith04a.html

Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.

zbMATH Keywords

baseline; reinforcement learning; actor-critic; policy gradient; GPOMDP

Mathematics Subject Classification ID

Markov processes: estimation; hidden Markov models (62M05) ⋮ Learning and adaptive systems in artificial intelligence (68T05)

Related Items

The factored policy-gradient planner ⋮ Adaptive playouts for online learning of policies during Monte Carlo tree search ⋮ Learning to control a structured-prediction decoder for detection of HTTP-layer DDoS attackers ⋮ Unnamed Item ⋮ Optimistic reinforcement learning by forward Kullback-Leibler divergence optimization ⋮ Personalized dynamic treatment regimes in continuous time: a Bayesian approach for optimizing clinical decisions with timing ⋮ A Bayesian decision framework for optimizing sequential combination antiretroviral therapy in people with HIV ⋮ Scalable Control Variates for Monte Carlo Methods Via Stochastic Optimization ⋮ Optimised graded metamaterials for mechanical energy confinement and amplification via reinforcement learning ⋮ Analysis and improvement of policy gradient estimation ⋮ Efficient Sample Reuse in Policy Gradients with Parameter-Based Exploration ⋮ Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies ⋮ Deep Reinforcement Learning: A State-of-the-Art Walkthrough ⋮ On-line policy gradient estimation with multi-step sampling ⋮ Importance sampling in reinforcement learning with an estimated behavior policy ⋮ TD-regularized actor-critic methods ⋮ Natural actor-critic algorithms ⋮ Reinforcement Learning in Sparse-Reward Environments With Hindsight Policy Gradients
