scientific article; zbMATH DE number 5957388

From MaRDI portal

Publication:3174029

MaRDI QID: Q3174029

Authors: Shalabh Bhatnagar, Vivek Borkar, Madhukar Akarapu

Publication date: 12 October 2011

Full work available at URL: http://www.jmlr.org/papers/v7/bhatnagar06a.html

zbMATH Keywords

Markov decision processes; reinforcement learning; optimal control conditioned on a rare event; simulation based algorithms; SPSA with deterministic perturbations

Mathematics Subject Classification ID

Markov chains (discrete-time Markov processes on discrete state spaces) (60J10); Stochastic systems in control theory (general) (93E03)

Cited in (2)
