A constrained optimization perspective on actor-critic algorithms and application to network routing





Abstract: We propose a novel actor-critic algorithm with guaranteed convergence to an optimal policy for a discounted-reward Markov decision process. The actor incorporates a descent direction motivated by the solution of a certain nonlinear optimization problem. We also discuss an extension that incorporates function approximation, and we demonstrate the practicality of our algorithms on a network routing application.








