A Small Gain Analysis of Single Timescale Actor Critic
Abstract: We consider a version of actor-critic which uses proportional step-sizes and only one critic update with a single sample from the stationary distribution per actor step. We provide an analysis of this method using the small-gain theorem. Specifically, we prove that this method can be used to find a stationary point, and that the resulting sample complexity improves the state of the art for actor-critic methods to $O(\mu^{-2}\epsilon^{-2})$ to find an $\epsilon$-approximate stationary point, where $\mu$ is the condition number associated with the critic.
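In words, the scheme pairs each policy-gradient (actor) step with exactly one temporal-difference (critic) update computed from a single sampled transition, and runs both updates with proportional step-sizes, so actor and critic share a single timescale. The sketch below illustrates that structure on a toy problem; the random MDP, tabular softmax policy, step-size constants, and on-trajectory sampling (the paper instead draws the critic's sample from the stationary distribution) are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of a single-timescale actor-critic loop: one TD(0) critic
# update from a single sampled transition per actor step, with actor and
# critic step-sizes kept proportional. Toy MDP and constants are assumptions.
import numpy as np

rng = np.random.default_rng(0)

# --- hypothetical toy MDP: 3 states, 2 actions, random dynamics and rewards ---
n_states, n_actions = 3, 2
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a] -> dist over s'
R = rng.uniform(0.0, 1.0, size=(n_states, n_actions))             # reward r(s, a)
gamma = 0.95

def softmax_policy(theta, s):
    """Action distribution of a softmax (Gibbs) policy with tabular logits."""
    z = theta[s] - theta[s].max()
    p = np.exp(z)
    return p / p.sum()

theta = np.zeros((n_states, n_actions))  # actor parameters (policy logits)
w = np.zeros(n_states)                   # critic parameters (tabular values)

# Proportional step-sizes: the two rates differ only by a constant factor,
# which is what makes this a single-timescale method.
alpha_critic = 0.05
alpha_actor = 0.5 * alpha_critic

s = rng.integers(n_states)
for t in range(20_000):
    pi = softmax_policy(theta, s)
    a = rng.choice(n_actions, p=pi)
    s_next = rng.choice(n_states, p=P[s, a])
    r = R[s, a]

    # One critic update per actor step, from this single transition (TD(0)).
    td_error = r + gamma * w[s_next] - w[s]
    w[s] += alpha_critic * td_error

    # Actor update: policy gradient, with the TD error as the advantage signal.
    grad_log_pi = -pi
    grad_log_pi[a] += 1.0
    theta[s] += alpha_actor * td_error * grad_log_pi

    s = s_next

print("learned values:", np.round(w, 3))
```

The proportional coupling of `alpha_actor` and `alpha_critic` is the point of contrast with classical two-timescale actor-critic, where the critic's step-size must decay on a faster schedule than the actor's.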
Recommendations
- A convergent online single time scale actor critic algorithm
- Real-time reinforcement learning by sequential actor-critics and experience replay
- Actor Critic Learning: A Near Set Approach
- Reinforcement learning in finite MDPs: PAC analysis
- A Simultaneous Perturbation Stochastic Approximation-Based Actor–Critic Algorithm for Markov Decision Processes
- TD-regularized actor-critic methods
- On the sample complexity of actor-critic method for reinforcement learning with function approximation
Cites work
- scientific article; zbMATH DE number 5833176
- A Two-Timescale Stochastic Algorithm Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic
- A convergent online single time scale actor critic algorithm
- Achieving Geometric Convergence for Distributed Optimization Over Time-Varying Graphs
- Actor-Critic–Type Learning Algorithms for Markov Decision Processes
- An analysis of temporal-difference learning with function approximation
- Deep Reinforcement Learning: A State-of-the-Art Walkthrough
- Fundamental design principles for reinforcement learning algorithms
- Global convergence of policy gradient methods to (almost) locally optimal policies
- Introduction to nonlinear optimization: theory, algorithms, and applications with MATLAB
- On Actor-Critic Algorithms
- Online Reinforcement Learning of Optimal Threshold Policies for Markov Decision Processes
- Policy Gradient Methods for the Noisy Linear Quadratic Regulator over a Finite Horizon
- Policy gradient in Lipschitz Markov decision processes
- Taylor series expansions for stationary Markov chains
- The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
Cited in (1)