Error controlled actor-critic
Publication:6205028
DOI: 10.1016/j.ins.2022.08.079 · arXiv: 2109.02517 · MaRDI QID: Q6205028 · FDO: Q6205028
Authors: Xingen Gao, Fei Chao, Chang-Le Zhou, Zhen Ge, Longzhi Yang, Xiang Chang, Changjing Shang, Qiang Shen
Publication date: 11 April 2024
Published in: Information Sciences
Abstract: The approximation error of the value function inevitably causes an overestimation phenomenon and has a negative impact on the convergence of reinforcement learning algorithms. To mitigate these negative effects, we propose Error Controlled Actor-Critic, which confines the approximation error of the value function. We present an analysis of how the approximation error can hinder the optimization process of actor-critic methods. We then derive an upper bound on the approximation error of the Q-function approximator and find that the error can be lowered by restricting the KL-divergence between every two consecutive policies during policy training. Experiments on a range of continuous control tasks demonstrate that the proposed actor-critic algorithm markedly reduces the approximation error and significantly outperforms other model-free RL algorithms.
Full work available at URL: https://arxiv.org/abs/2109.02517
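The mechanism the abstract describes, penalizing the KL-divergence between two consecutive policies during the actor update, can be sketched as follows. This is a minimal illustrative sketch, not the authors' reference implementation; the names (PolicyNet, kl_regularized_actor_loss, kl_coef), the Gaussian policy parameterization, and the soft-penalty form of the constraint are assumptions made for illustration.

```python
# Minimal sketch (illustrative assumptions, not the paper's code) of an actor
# update that restricts the KL-divergence between consecutive policies, the
# mechanism the abstract credits with bounding the Q-approximation error.
import torch
import torch.nn as nn
from torch.distributions import Normal, kl_divergence


class PolicyNet(nn.Module):
    """Gaussian policy: maps states to a Normal distribution over actions."""

    def __init__(self, state_dim: int, action_dim: int, hidden: int = 64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(state_dim, hidden), nn.Tanh())
        self.mu = nn.Linear(hidden, action_dim)
        self.log_std = nn.Parameter(torch.zeros(action_dim))

    def forward(self, state: torch.Tensor) -> Normal:
        h = self.body(state)
        return Normal(self.mu(h), self.log_std.exp())


def kl_regularized_actor_loss(policy, old_policy, critic, states, kl_coef=0.1):
    """Actor loss = -E[Q(s, a)] + kl_coef * KL(pi_old || pi_new).

    `critic` is assumed to be a callable mapping (states, actions) to Q
    estimates; `old_policy` is a frozen copy of the policy from the previous
    update step.
    """
    dist = policy(states)
    actions = dist.rsample()            # reparameterized sample, keeps gradients
    q_values = critic(states, actions)  # Q estimate under the current critic
    with torch.no_grad():
        old_dist = old_policy(states)   # previous policy; no gradient through it
    kl = kl_divergence(old_dist, dist).sum(-1).mean()
    return -q_values.mean() + kl_coef * kl
```

In a full agent, kl_coef (or an explicit trust-region-style constraint in place of this soft penalty) would be tuned to trade off learning speed against how tightly the approximation error is controlled.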
Cites Work
- A theory of learning from different domains
- Numerical solution of systems of second-order boundary value problems using continuous genetic algorithm
- Reinforcement learning algorithms with function approximation: recent advances and applications
- Practical issues in temporal difference learning
- Synthesis of nonlinear control surfaces by a layered associative search network
- Fuzzy conformable fractional differential equations: novel extended approach and new numerical solutions