On the Analysis of Model-free Methods for the Linear Quadratic Regulator
Publication: Q6344644
arXiv: 2007.03861 · MaRDI QID: Q6344644 · FDO: Q6344644
Authors: Zeyu Jin, Johann Michael Schmitt, Zaiwen Wen
Publication date: 7 July 2020
Abstract: Many reinforcement learning methods achieve great success in practice but lack a theoretical foundation. In this paper, we study the convergence of model-free methods for the Linear Quadratic Regulator (LQR) problem. Global linear convergence properties and sample complexities are established for several popular algorithms, including the policy gradient algorithm, TD-learning, and the actor-critic (AC) algorithm. Our results show that the actor-critic algorithm can reduce the sample complexity compared with the policy gradient algorithm. Although our analysis is still preliminary, it explains the benefit of the AC algorithm in a certain sense.
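The abstract's central object, policy gradient descent on the LQR cost, can be illustrated on a scalar system. The sketch below is not the paper's algorithm or analysis; it is a minimal assumed example (system parameters, step size, and the finite-difference gradient are all illustrative choices) showing gradient descent on the closed-form LQR cost converging to the Riccati-optimal gain:

```python
# Scalar discrete-time LQR: x_{t+1} = a*x_t + b*u_t, cost sum_t (q*x_t^2 + r*u_t^2).
# All constants here are illustrative, not taken from the paper.
a, b, q, r = 1.2, 1.0, 1.0, 1.0

def cost(k, x0=1.0):
    """Infinite-horizon cost of the linear policy u = -k*x (needs |a - b*k| < 1)."""
    acl = a - b * k  # closed-loop dynamics coefficient
    assert abs(acl) < 1, "policy must be stabilizing"
    return (q + r * k * k) * x0 ** 2 / (1 - acl * acl)

def grad(k, eps=1e-6):
    # Central finite-difference gradient of the cost; a crude stand-in for
    # the model-free gradient estimates whose sample complexity the paper studies.
    return (cost(k + eps) - cost(k - eps)) / (2 * eps)

# Reference solution: fixed-point iteration on the scalar discrete Riccati equation.
p = q
for _ in range(1000):
    p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
k_star = a * b * p / (r + b * b * p)  # optimal feedback gain

# Plain gradient descent on the policy parameter k from a stabilizing start.
k = 1.0  # |a - b*k| = 0.2 < 1, so the initial policy is stabilizing
for _ in range(500):
    k -= 1e-2 * grad(k)

# k is now numerically indistinguishable from k_star
```

The global linear convergence observed here for a stabilizing initial gain is exactly the qualitative behavior that the paper makes quantitative, including the number of samples needed when the gradient must be estimated from trajectories rather than computed from the closed-form cost.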