Eligibility traces and forgetting factor in recursive least-squares-based temporal difference
Publication:6495643
DOI: 10.1002/ACS.3282
Wikidata: Q107162054
Scholia: Q107162054
MaRDI QID: Q6495643
FDO: Q6495643
Authors: Simone Baldi, Zichen Zhang, Di Liu
Publication date: 30 April 2024
Published in: International Journal of Adaptive Control and Signal Processing
Keywords: instrumental variable method, least squares, reinforcement learning, temporal difference, eligibility traces
Cites Work
- The convergence of \(TD(\lambda)\) for general \(\lambda\)
- Adaptive Control Tutorial
- An analysis of temporal-difference learning with function approximation
- Linear Quadratic Tracking Control of Partially-Unknown Continuous-Time Systems Using Reinforcement Learning
- Reinforcement learning for adaptive optimal control of continuous-time linear periodic systems
- Technical update: Least-squares temporal difference learning
- Adaptive Dynamic Programming and Adaptive Optimal Output Regulation of Linear Systems
- Convergence results for single-step on-policy reinforcement-learning algorithms
- Optimal tracking control of nonlinear partially-unknown constrained-input systems using integral reinforcement learning
- Linear least-squares algorithms for temporal difference learning
- Title not available
- Practical issues in temporal difference learning
- Adaptive Optimal Control for Large-Scale Nonlinear Systems
- An adaptive optimization scheme with satisfactory transient performance
- Composite Model Reference Adaptive Control with Parameter Convergence Under Finite Excitation
- Adaptive Control Design Based on Adaptive Optimization Principles
- Stability of Stochastic Approximations With “Controlled Markov” Noise and Temporal Difference Learning
- Initial Excitation-Based Iterative Algorithm for Approximate Optimal Control of Completely Unknown LTI Systems
- Adaptive critic design with graph Laplacian for online learning control of nonlinear systems
- \(Q(\lambda )\)-learning adaptive fuzzy logic controllers for pursuit-evasion differential games
- Adaptive dynamic programming for model-free tracking of trajectories with time-varying parameters