Error controlled actor-critic
Publication: 6205028
Abstract: An approximation error in the value function inevitably causes overestimation and has a negative impact on the convergence of actor-critic algorithms. To mitigate these negative effects, we propose Error Controlled Actor-Critic, which confines the approximation error in the value function. We first present an analysis of how the approximation error can hinder the optimization process of actor-critic methods. We then derive an upper bound on the approximation error of the Q-function approximator and find that the error can be lowered by restricting the KL divergence between every two consecutive policies during policy training. Experiments on a range of continuous control tasks demonstrate that the proposed actor-critic algorithm markedly reduces the approximation error and significantly outperforms other model-free RL algorithms.
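The abstract's core mechanism, keeping the critic's approximation error bounded by restricting the KL divergence between consecutive policies, can be illustrated with a short sketch. This is not the authors' released implementation: the network sizes, the penalty weight `kl_coef`, and the toy batch of states below are illustrative assumptions.

```python
# Minimal sketch of a KL-restricted actor update, assuming a Gaussian
# policy and a Q-function critic. Hyperparameters are placeholders.
import copy
import torch
import torch.nn as nn
from torch.distributions import Normal, kl_divergence

obs_dim, act_dim, kl_coef = 4, 2, 1.0  # illustrative sizes and weight

class GaussianActor(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh())
        self.mu = nn.Linear(64, act_dim)
        self.log_std = nn.Parameter(torch.zeros(act_dim))

    def dist(self, obs):
        h = self.body(obs)
        return Normal(self.mu(h), self.log_std.exp())

critic = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.Tanh(),
                       nn.Linear(64, 1))
actor = GaussianActor()
actor_old = copy.deepcopy(actor)           # snapshot of the previous policy
opt = torch.optim.Adam(actor.parameters(), lr=3e-4)

obs = torch.randn(32, obs_dim)             # toy batch of states

for step in range(3):
    pi = actor.dist(obs)
    act = pi.rsample()                     # reparameterized action sample
    q = critic(torch.cat([obs, act], dim=-1))
    with torch.no_grad():                  # old policy is held fixed
        pi_old = actor_old.dist(obs)
    # KL(pi_new || pi_old), averaged over the batch: restricting this
    # term is what the abstract claims keeps the Q-error bounded.
    kl = kl_divergence(pi, pi_old).sum(-1).mean()
    loss = -q.mean() + kl_coef * kl        # maximize Q, penalize policy shift
    opt.zero_grad()
    loss.backward()
    opt.step()
    actor_old = copy.deepcopy(actor)       # current policy becomes "previous"
```

A fixed penalty weight is the simplest way to restrict the divergence between consecutive policies; a trust-region style hard constraint on the same KL term would serve the same purpose.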
Cites work
- A theory of learning from different domains
- Fuzzy conformable fractional differential equations: novel extended approach and new numerical solutions
- Numerical solution of systems of second-order boundary value problems using continuous genetic algorithm
- Practical issues in temporal difference learning
- Reinforcement learning algorithms with function approximation: recent advances and applications
- Reinforcement learning. An introduction
- Synthesis of nonlinear control surfaces by a layered associative search network