Reinforcement learning with replacing eligibility traces
From MaRDI portal
Cites work
- scientific article; zbMATH DE number 3126094
- scientific article; zbMATH DE number 3841285
- scientific article; zbMATH DE number 4160608
- scientific article; zbMATH DE number 3633709
- A Note on the Inversion of Matrices by Random Walks
- Asynchronous stochastic approximation and Q-learning
- On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
- Practical issues in temporal difference learning
- Temporal-difference methods and Markov models
- The convergence of \(TD(\lambda)\) for general \(\lambda\)
Cited in (13)
- The optimal unbiased value estimator and its relation to LSTD, TD and MC
- scientific article; zbMATH DE number 1881080
- A Gentle Introduction to Reinforcement Learning
- Machine Learning: ECML 2004
- Learning to control by neural networks using eligibility traces
- Reinforcement learning with goal-directed eligibility traces
- Guiding exploration by pre-existing knowledge without modifying reward
- Off-policy learning with eligibility traces: a survey
- Reinforcement learning with replacing eligibility traces
- Eligibility traces and forgetting factor in recursive least-squares-based temporal difference
- Risk-averse policy optimization via risk-neutral policy optimization
- Policy mirror descent inherently explores action space
- Experimental analysis on Sarsa(λ) and Q(λ) with different eligibility traces strategies
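Several of the citing works above (e.g. the Sarsa(λ)/Q(λ) eligibility-trace comparisons) turn on the distinction named in this publication's title. As a hedged illustration, not taken from the record itself, the difference between accumulating and replacing traces in tabular TD(λ) can be sketched as follows; all variable names and parameter values are illustrative:

```python
# Illustrative sketch of tabular TD(lambda) with a switch between
# replacing and accumulating eligibility traces. With accumulating
# traces the trace of the visited state is incremented; with replacing
# traces it is reset to 1 on each visit.
import numpy as np

def td_lambda_step(V, e, s, s_next, r, alpha=0.1, gamma=0.99, lam=0.9,
                   replacing=True):
    """One TD(lambda) update of value table V using trace vector e."""
    e *= gamma * lam                      # decay all traces
    if replacing:
        e[s] = 1.0                        # replacing trace: reset to 1
    else:
        e[s] += 1.0                       # accumulating trace: increment
    delta = r + gamma * V[s_next] - V[s]  # TD error for this transition
    V += alpha * delta * e                # credit all recently visited states
    return V, e
```

With `replacing=True` a state revisited within an episode keeps a trace of at most 1, which is the variant the paper's title refers to; with `replacing=False` repeated visits let the trace grow above 1.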
MaRDI item: Q1911343