Linear least-squares algorithms for temporal difference learning
Cites work
- scientific article; zbMATH DE number 3875113
- scientific article; zbMATH DE number 4066707
- scientific article; zbMATH DE number 3514781
- A Stochastic Approximation Method
- Asynchronous stochastic approximation and Q-learning
- Instrumental variable methods for system identification
- On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
- Practical issues in temporal difference learning
- Recursive estimation and time-series analysis. An introduction
- The convergence of \(TD(\lambda)\) for general \(\lambda\)
- \({\mathcal Q}\)-learning
Cited in (39)
- A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning
- Model-Based Reinforcement Learning for Partially Observable Games with Sampling-Based State Estimation
- On average versus discounted reward temporal-difference learning
- Off-policy estimation of long-term average outcomes with applications to mobile health
- A least squares temporal difference actor–critic algorithm with applications to warehouse management
- Derivatives of logarithmic stationary distributions for policy gradient reinforcement learning
- Technical update: Least-squares temporal difference learning
- The optimal unbiased value estimator and its relation to LSTD, TD and MC
- Multilevel preconditioners for temporal-difference learning methods related to recommendation engines
- Temporal difference-based policy iteration for optimal control of stochastic systems
- Kalman temporal differences
- Deep learning in computational mechanics: a review
- Reinforcement learning
- Transfer learning via inter-task mappings for temporal difference learning
- scientific article; zbMATH DE number 6276214
- Variance regularization in sequential Bayesian optimization
- scientific article; zbMATH DE number 2087264
- Regularized feature selection in reinforcement learning
- Approximating the stationary Bellman equation by hierarchical tensor products
- A functional model method for nonconvex nonsmooth conditional stochastic optimization
- On the worst-case analysis of temporal-difference learning algorithms
- scientific article; zbMATH DE number 1753141
- On the convergence of temporal-difference learning with linear function approximation
- A Two-Time-Scale Stochastic Optimization Framework with Applications in Control and Reinforcement Learning
- Least squares policy evaluation algorithms with linear function approximation
- On the convergence of simulation-based iterative methods for solving singular linear systems
- Linear least-squares algorithms for temporal difference learning
- True online temporal-difference learning
- Generalized TD learning
- Optimal policy evaluation using kernel-based temporal difference methods
- Average cost temporal-difference learning
- Eligibility traces and forgetting factor in recursive least-squares-based temporal difference
- Multikernel recursive least-squares temporal difference learning
- Proximal gradient temporal difference learning: stable reinforcement learning with polynomial sample complexity
- Temporal difference learning with incremental nearest neighbors in continuous spaces
- Learning algorithms based on linearization
- Finite-time performance of distributed temporal-difference learning with linear function approximation
- A finite time analysis of temporal difference learning with linear function approximation
- Proximal algorithms and temporal difference methods for solving fixed point problems
This page was built for publication: Linear least-squares algorithms for temporal difference learning
MaRDI item Q1911340