Technical update: Least-squares temporal difference learning
Publication:1604819
DOI: 10.1023/A:1017936530646, zbMath: 1014.68072, MaRDI QID: Q1604819
Publication date: 8 July 2002
Published in: Machine Learning
Related Items (21)
Approximate policy iteration: a survey and some new methods ⋮ An online prediction algorithm for reinforcement learning with linear function approximation using cross entropy method ⋮ Restricted gradient-descent algorithm for value-function approximation in reinforcement learning ⋮ A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning ⋮ Batch mode reinforcement learning based on the synthesis of artificial trajectories ⋮ An approximate dynamic programming approach to the admission control of elective patients ⋮ Deep reinforcement trading with predictable returns ⋮ The optimal unbiased value estimator and its relation to LSTD, TD and MC ⋮ Reinforcement learning algorithms with function approximation: recent advances and applications ⋮ Asymptotic analysis of value prediction by well-specified and misspecified models ⋮ A two-level optimization model for elective surgery scheduling with downstream capacity constraints ⋮ On Generalized Bellman Equations and Temporal-Difference Learning ⋮ Basis function adaptation in temporal difference reinforcement learning ⋮ Derivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning ⋮ Proximal algorithms and temporal difference methods for solving fixed point problems ⋮ Convergence of the standard RLS method and UDU^T factorisation of covariance matrix for solving the algebraic Riccati equation of the DLQR via heuristic approximate dynamic programming ⋮ Projected equation methods for approximate solution of large linear systems ⋮ Approximate optimal adaptive control for weakly coupled nonlinear systems: A neuro-inspired approach ⋮ Unnamed Item ⋮ Unnamed Item ⋮ Solving factored MDPs using non-homogeneous partitions