Variance-constrained actor-critic algorithms for discounted and average reward MDPs (Q1689603)
From MaRDI portal
scientific article
Language | Label | Description | Also known as |
---|---|---|---|
English | Variance-constrained actor-critic algorithms for discounted and average reward MDPs | scientific article | |
Statements
Variance-constrained actor-critic algorithms for discounted and average reward MDPs (English)
12 January 2018
Markov decision process (MDP)
reinforcement learning (RL)
risk sensitive RL
actor-critic algorithms
multi-time-scale stochastic approximation
simultaneous perturbation stochastic approximation (SPSA)
smoothed functional (SF)