Time-varying policy rule under learning
From MaRDI portal
Recommendations
- Stochastic Learning of Time-Varying Parameters in Random Environment
- Policy gradient in continuous time
- A time-varying model of rational learning
- Dynamic policy programming
- Bellman's principle of optimality and deep reinforcement learning for time-varying tasks
- Adaptive policies for time-varying stochastic systems under discounted criterion
- Learning and planning for time-varying MDPs using maximum likelihood estimation
Cited in (3)
This page was built for publication: Time-varying policy rule under learning
MaRDI item Q500481