Linear least-squares algorithms for temporal difference learning

Recommendations

Linear least-squares algorithms for temporal difference learning
Technical update: Least-squares temporal difference learning
The convergence of \(TD(\lambda)\) for general \(\lambda\)
Least squares temporal difference methods: An analysis under general conditions

Cites work

scientific article; zbMATH DE number 3875113
scientific article; zbMATH DE number 4066707
scientific article; zbMATH DE number 3514781
A Stochastic Approximation Method
Asynchronous stochastic approximation and Q-learning
Instrumental variable methods for system identification
On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
Practical issues in temporal difference learning
Recursive estimation and time-series analysis. An introduction
The convergence of \(TD(\lambda)\) for general \(\lambda\)
\({\mathcal Q}\)-learning

Cited in (39)

A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning
Model-Based Reinforcement Learning for Partially Observable Games with Sampling-Based State Estimation
On average versus discounted reward temporal-difference learning
Off-policy estimation of long-term average outcomes with applications to mobile health
A least squares temporal difference actor–critic algorithm with applications to warehouse management
Derivatives of logarithmic stationary distributions for policy gradient reinforcement learning
Technical update: Least-squares temporal difference learning
The optimal unbiased value estimator and its relation to LSTD, TD and MC
Multilevel preconditioners for temporal-difference learning methods related to recommendation engines
Temporal difference-based policy iteration for optimal control of stochastic systems
Kalman temporal differences
Deep learning in computational mechanics: a review
Reinforcement learning
Transfer learning via inter-task mappings for temporal difference learning
scientific article; zbMATH DE number 6276214
Variance regularization in sequential Bayesian optimization
scientific article; zbMATH DE number 2087264
Regularized feature selection in reinforcement learning
Approximating the stationary Bellman equation by hierarchical tensor products
A functional model method for nonconvex nonsmooth conditional stochastic optimization
On the worst-case analysis of temporal-difference learning algorithms
scientific article; zbMATH DE number 1753141
On the convergence of temporal-difference learning with linear function approximation
A Two-Time-Scale Stochastic Optimization Framework with Applications in Control and Reinforcement Learning
Least squares policy evaluation algorithms with linear function approximation
On the convergence of simulation-based iterative methods for solving singular linear systems
Linear least-squares algorithms for temporal difference learning
True online temporal-difference learning
Generalized TD learning
Optimal policy evaluation using kernel-based temporal difference methods
Average cost temporal-difference learning
Eligibility traces and forgetting factor in recursive least-squares-based temporal difference
Multikernel recursive least-squares temporal difference learning
Proximal gradient temporal difference learning: stable reinforcement learning with polynomial sample complexity
Temporal difference learning with incremental nearest neighbors in continuous spaces
Learning algorithms based on linearization
Finite-time performance of distributed temporal-difference learning with linear function approximation
A finite time analysis of temporal difference learning with linear function approximation
Proximal algorithms and temporal difference methods for solving fixed point problems

This page was built for publication: Linear least-squares algorithms for temporal difference learning (MaRDI item Q1911340)