A constrained optimization perspective on actor-critic algorithms and application to network routing

DOI: 10.1016/J.SYSCONLE.2016.02.020, MaRDI QID: Q286519

Authors: L. A. Prashanth, H. L. Prasad, Shalabh, Chandra Prakash

Publication date: 20 May 2016

Published in: Systems & Control Letters

Full work available at URL https://arxiv.org/abs/1507.07984

Keywords: constrained optimization, reinforcement learning, actor-critic algorithm

Markov and semi-Markov decision processes (90C40); Nonlinear systems in control theory (93C10); Optimal stochastic control (93E20)

Abstract: We propose a novel actor-critic algorithm with guaranteed convergence to an optimal policy for a discounted reward Markov decision process. The actor incorporates a descent direction that is motivated by the solution of a certain non-linear optimization problem. We also discuss an extension to incorporate function approximation and demonstrate the practicality of our algorithms on a network routing application.

Cited in (6)
