scientific article; zbMATH DE number 5957388
Publication:3174029
zbMATH Open: 1222.93210; MaRDI QID: Q3174029; FDO: Q3174029
Vivek Borkar, Madhukar Akarapu, Shalabh Bhatnagar
Publication date: 12 October 2011
Full work available at URL: http://www.jmlr.org/papers/v7/bhatnagar06a.html
Title of this publication is not available
Keywords: Markov decision processes; reinforcement learning; optimal control conditioned on a rare event; simulation based algorithms; SPSA with deterministic perturbations
Classification: Markov chains (discrete-time Markov processes on discrete state spaces) (60J10); Stochastic systems in control theory (general) (93E03)
Cited In (1)
Recommendations
- Ergodic control of partially observed Markov chains
- Ergodic control of partially observed Markov processes with equivalent transition probabilities
- Ergodic Control of Markov Chains with Constraints - the General Case
- Ergodic Control of Continuous-Time Markov Chains with Pathwise Constraints
- An approximation approach to ergodic semi-Markov control processes
- Approximating Ergodic Average Reward Continuous-Time Controlled Markov Chains
- A partially observed control problem for Markov chains