scientific article; zbMATH DE number 1753152
From MaRDI portal
Publication:4533362
zbMATH Open: 0994.68119; MaRDI QID: Q4533362; FDO: Q4533362
Authors: Jonathan Baxter, Peter L. Bartlett
Publication date: 13 October 2002
Title of this publication is not available
Recommendations
- scientific article; zbMATH DE number 1753153
- Approximate gradient methods in policy-space optimization of Markov reward processes
- Estimation and approximation bounds for gradient-based reinforcement learning
- Policy gradient in continuous time
- Variance reduction techniques for gradient estimates in reinforcement learning
Cited In (48)
- L2SR: learning to sample and reconstruct for accelerated MRI via reinforcement learning
- Title not available
- Geometry and convergence of natural policy gradient methods
- Policy gradient in continuous time
- Variational actor-critic algorithms
- Derivatives of logarithmic stationary distributions for policy gradient reinforcement learning
- Risk-Sensitive Reinforcement Learning via Policy Gradient Search
- The factored policy-gradient planner
- Estimation and approximation bounds for gradient-based reinforcement learning
- Hessian matrix distribution for Bayesian policy gradient reinforcement learning
- Variance-constrained actor-critic algorithms for discounted and average reward MDPs
- A novel online gait optimization approach for biped robots with point-feet
- Finding optimal memoryless policies of POMDPs under the expected average reward criterion
- A policy gradient method for semi-Markov decision processes with application to call admission control
- An incremental off-policy search in a model-free Markov decision process using a single sample path
- Policy gradient approach of event-based optimization and its online implementation
- Transient-state natural gas transmission in gunbarrel pipeline networks
- Infinite Time Horizon Maximum Causal Entropy Inverse Reinforcement Learning
- Reinforcement learning
- Basic ideas for event-based optimization of Markov systems
- Adaptive critic design with graph Laplacian for online learning control of nonlinear systems
- Queueing network controls via deep reinforcement learning
- Natural actor-critic algorithms
- Simulation-based optimization of Markov decision processes: an empirical process theory approach
- Synaptic dynamics: linear model and adaptation algorithm
- Policy gradient in Lipschitz Markov decision processes
- Finite-time analysis of natural actor-critic for POMDPs
- A unified approach to Markov decision problems and performance sensitivity analysis with discounted and average criteria: multichain cases
- Finding intrinsic rewards by embodied evolution and constrained reinforcement learning
- Title not available
- A performance gradient perspective on gradient-based policy iteration and a modified value iteration
- Model-based reinforcement learning with dimension reduction
- Variance reduction techniques for gradient estimates in reinforcement learning
- A stochastic policy search model for matching behavior
- Recurrent policy gradients
- Smoothing policies and safe policy gradients
- Parameterized Markov decision process and its application to service rate control
- Efficiently Breaking the Curse of Horizon in Off-Policy Evaluation with Double Reinforcement Learning
- Multi-agent reinforcement learning: a selective overview of theories and algorithms
- Title not available
- Title not available
- Dealing with multiple experts and non-stationarity in inverse reinforcement learning: an application to real-life problems
- Risk-constrained reinforcement learning with percentile risk criteria
- On-line policy gradient estimation with multi-step sampling
- Risk-averse policy optimization via risk-neutral policy optimization
- Asymptotic bias of stochastic gradient search
- Global convergence of policy gradient methods to (almost) locally optimal policies
- Reinforcement learning algorithms with function approximation: recent advances and applications