On Actor-Critic Algorithms

DOI 10.1137/S0363012901385691

MaRDI QID Q4443033

Authors Vijay R. Konda, John N. Tsitsiklis

Publication date 8 January 2004

Published in SIAM Journal on Control and Optimization

Full work available at URL https://doi.org/10.1137/s0363012901385691

zbMATH Keywords

stochastic approximation, Markov decision processes, reinforcement learning, actor-critic algorithms

Mathematics Subject Classification ID

Learning and adaptive systems in artificial intelligence (68T05)
Stochastic learning and adaptive control (93E35)

Recommendations

An actor-critic algorithm for constrained Markov decision processes
Actor-Critic–Type Learning Algorithms for Markov Decision Processes
Actor-critic algorithms with online feature adaptation
Natural actor-critic algorithms
A convergent online single time scale actor critic algorithm

Cited in

(only showing first 100 items)

A Spiking Neural Network Model of an Actor-Critic Learning Agent
Performance optimization for a class of generalized stochastic Petri nets
Actor prioritized experience replay
Stochastic optimization for real time service capacity allocation under random service demand
Global convergence of policy gradient methods to (almost) locally optimal policies
Reinforcement learning algorithms with function approximation: recent advances and applications
Policy optimization for \(\mathcal{H}_2\) linear control with \(\mathcal{H}_\infty\) robustness guarantee: implicit regularization and global convergence
Fundamental design principles for reinforcement learning algorithms
Mixed density methods for approximate dynamic programming
From infinite to finite programs: explicit error bounds with applications to approximate dynamic programming
Bayesian sequential optimal experimental design for nonlinear models using policy gradient reinforcement learning
Neural circuits for learning context-dependent associations of stimuli
Reinforcement learning for a class of continuous-time input constrained optimal control problems
Deep reinforcement learning for infinite horizon mean field problems in continuous spaces
TD-regularized actor-critic methods
Geometry and convergence of natural policy gradient methods
Tutorial on Amortized Optimization
A convergent online single time scale actor critic algorithm
Variational actor-critic algorithms
Derivatives of logarithmic stationary distributions for policy gradient reinforcement learning
Estimation and approximation bounds for gradient-based reinforcement learning
Actor-Critic–Like Stochastic Adaptive Search for Continuous Simulation Optimization
Convergence rate of linear two-time-scale stochastic approximation
Dynamic programming and suboptimal control: a survey from ADP to MPC
Variance-constrained actor-critic algorithms for discounted and average reward MDPs
Recent advances in reinforcement learning in finance
Multi-agent natural actor-critic reinforcement learning algorithms
An incremental off-policy search in a model-free Markov decision process using a single sample path
Adaptive-resolution reinforcement learning with polynomial exploration in deterministic domains
Actor-Critic–Type Learning Algorithms for Markov Decision Processes
Dynamic treatment regimes: technical challenges and applications
An adaptive actor-critic algorithm with multi-step simulated experiences for controlling nonholonomic mobile robots
Actor-critic algorithms for hierarchical Markov decision processes
Concentration bounds for temporal difference learning with linear function approximation: the case of batch data and uniform sampling
A constrained optimization perspective on actor-critic algorithms and application to network routing
A new learning algorithm for optimal stopping
Actor-critic method for high dimensional static Hamilton-Jacobi-Bellman partial differential equations based on neural networks
A Small Gain Analysis of Single Timescale Actor Critic
Tutorial series on brain-inspired computing. IV: Reinforcement learning: machine learning and natural learning
Multivariate compact law of the iterated logarithm for averaged stochastic approximation algorithms
On the sample complexity of actor-critic method for reinforcement learning with function approximation
Hebbian versus gradient training of ESN actors in closed-loop ACD
Approximate Newton Policy Gradient Algorithms
Deep Reinforcement Learning: A State-of-the-Art Walkthrough
Simple and optimal methods for stochastic variational inequalities. II: Markovian noise and policy evaluation in reinforcement learning
An Actor-Critic Algorithm With Second-Order Actor and Critic
Toward multi-target self-organizing pursuit in a partially observable Markov game
Approximation of average cost Markov decision processes using empirical distributions and concentration inequalities
Stabilization of stochastic approximation by step size adaptation
Natural actor-critic algorithms
Multiscale Q-learning with linear function approximation
Reinforcement Learning, Spike-Time-Dependent Plasticity, and the BCM Rule
Queueing network controls via deep reinforcement learning
Asymptotic analysis of temporal-difference learning algorithms with constant step-sizes
An online actor-critic algorithm with function approximation for constrained Markov decision processes
Actor-critic algorithms based on symmetric perturbation sampling
Weak convergence of dynamical systems in two timescales
Autonomous reinforcement learning with experience replay
Immediate return preference emerged from a synaptic learning rule for return maximization
Softmax policy gradient methods can take exponential time to converge
Sell or store? An ADP approach to marketing renewable energy
Reinforcement learning for a biped robot based on a CPG-actor-critic method
A perturbation approach to approximate value iteration for average cost Markov decision processes with Borel spaces and bounded costs
Multi-agent off-policy actor-critic algorithm for distributed multi-task reinforcement learning
Global convergence of natural policy gradient with Hessian-aided momentum variance reduction
Finite-time analysis of natural actor-critic for POMDPs
Finding intrinsic rewards by embodied evolution and constrained reinforcement learning
On the convergence of simulation-based iterative methods for solving singular linear systems
Stochastic approximation and reinforcement learning: the interface and a little beyond
Efficient model-based reinforcement learning for approximate online optimal control
A tutorial on the cross-entropy method
Error controlled actor-critic
Non-iterative generation of an optimal mesh for a blade passage using deep reinforcement learning
An actor-critic algorithm with policy gradients to solve the job shop scheduling problem using deep double recurrent agents
Reward-respecting subtasks for model-based reinforcement learning
Approximate stochastic annealing for online control of infinite horizon Markov decision processes
An accelerated proximal algorithm for regularized nonconvex and nonsmooth bi-level optimization
Control strategy of speed servo systems based on deep reinforcement learning
Asynchronous stochastic approximation with differential inclusions
What is the value of the cross-sectional approach to deep reinforcement learning?
Artificial Intelligence and Soft Computing - ICAISC 2004
On centralized critics in multi-agent reinforcement learning
A stabilizing reinforcement learning approach for sampled systems with partially unknown models
An Improved Unconstrained Approach for Bilevel Optimization
Real-time reinforcement learning by sequential actor-critics and experience replay
Smoothing policies and safe policy gradients
Blackbox simulation optimization
Model-based reinforcement learning for approximate optimal regulation
Two Time-Scale Stochastic Approximation with Controlled Markov Noise and Off-Policy Temporal-Difference Learning
An actor-critic algorithm for constrained Markov decision processes
Reinforcement learning in the brain
Preference-based reinforcement learning: a formal framework and a policy iteration algorithm
Multi-agent reinforcement learning: a selective overview of theories and algorithms
Natural actor-critic based on batch recursive least-squares
An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes
A new approximate dynamic programming algorithm based on an actor–critic framework for optimal control of alkali–surfactant–polymer flooding
Reinforcement learning based algorithms for average cost Markov decision processes
scientific article; zbMATH DE number 7625165
Linear stochastic approximation driven by slowly varying Markov chains
