On the Analysis of Model-free Methods for the Linear Quadratic Regulator
Publication: Q6344644
arXiv: 2007.03861 · MaRDI QID: Q6344644 · FDO: Q6344644
Authors: Zeyu Jin, Johann Michael Schmitt, Zaiwen Wen
Publication date: 7 July 2020
Abstract: Many reinforcement learning methods achieve great success in practice but lack a theoretical foundation. In this paper, we study the convergence of model-free methods for the Linear Quadratic Regulator (LQR) problem. Global linear convergence properties and sample complexities are established for several popular algorithms, including the policy gradient algorithm, TD-learning, and the actor-critic (AC) algorithm. Our results show that the actor-critic algorithm can reduce the sample complexity compared with the policy gradient algorithm. Although our analysis is still preliminary, it explains the benefit of the AC algorithm in a certain sense.
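The abstract's central object, policy gradient descent on the LQR cost, can be illustrated on a scalar system. The sketch below is not the paper's algorithm or analysis; it is a minimal assumed example (system parameters, step size, and the finite-difference gradient are all illustrative choices) showing gradient descent on the closed-form LQR cost converging to the Riccati-optimal gain:

```python
# Scalar discrete-time LQR: x_{t+1} = a*x_t + b*u_t, cost sum_t (q*x_t^2 + r*u_t^2).
# All constants here are illustrative, not taken from the paper.
a, b, q, r = 1.2, 1.0, 1.0, 1.0

def cost(k, x0=1.0):
    """Infinite-horizon cost of the linear policy u = -k*x (needs |a - b*k| < 1)."""
    acl = a - b * k  # closed-loop dynamics coefficient
    assert abs(acl) < 1, "policy must be stabilizing"
    return (q + r * k * k) * x0 ** 2 / (1 - acl * acl)

def grad(k, eps=1e-6):
    # Central finite-difference gradient of the cost; a crude stand-in for
    # the model-free gradient estimates whose sample complexity the paper studies.
    return (cost(k + eps) - cost(k - eps)) / (2 * eps)

# Reference solution: fixed-point iteration on the scalar discrete Riccati equation.
p = q
for _ in range(1000):
    p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
k_star = a * b * p / (r + b * b * p)  # optimal feedback gain

# Plain gradient descent on the policy parameter k from a stabilizing start.
k = 1.0  # |a - b*k| = 0.2 < 1, so the initial policy is stabilizing
for _ in range(500):
    k -= 1e-2 * grad(k)

# k is now numerically indistinguishable from k_star
```

The global linear convergence observed here for a stabilizing initial gain is exactly the qualitative behavior that the paper makes quantitative, including the number of samples needed when the gradient must be estimated from trajectories rather than computed from the closed-form cost.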