A sensitivity formula for risk-sensitive cost and the actor-critic algorithm
Publication:5958425
DOI: 10.1016/S0167-6911(01)00152-9
zbMath: 0987.93080
OpenAlex: W1990437501
Wikidata: Q127227136 (Scholia: Q127227136)
MaRDI QID: Q5958425
Publication date: 3 March 2002
Published in: Systems & Control Letters
Full work available at URL: https://doi.org/10.1016/s0167-6911(01)00152-9
Keywords: Markov decision processes; stochastic approximation; reinforcement learning; actor-critic algorithms; parametric sensitivity; risk-sensitive control
MSC: Optimal stochastic control (93E20); Stochastic approximation (62L20); Markov and semi-Markov decision processes (90C40)
Related Items
- Oja's algorithm for graph clustering, Markov spectral decomposition, and risk sensitive control
- Risk-Sensitive Reinforcement Learning via Policy Gradient Search
- Variance-constrained actor-critic algorithms for discounted and average reward MDPs
- Unnamed Item
- On tight bounds for function approximation error in risk-sensitive reinforcement learning
- Risk-Constrained Reinforcement Learning with Percentile Risk Criteria
Cites Work
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Unnamed Item
- Risk sensitive control of Markov processes in countable state space
- Connections between stochastic control and dynamic games
- Stochastic approximation with two time scales
- Risk sensitive control of finite state Markov chains in discrete time, with applications to portfolio management
- Multiplicative ergodicity and large deviations for an irreducible Markov chain.
- Analysis of recursive stochastic algorithms
- Perturbation realization, potentials, and sensitivity analysis of Markov processes
- Risk-Sensitive Control of Finite State Machines on an Infinite Horizon I
- Asynchronous Stochastic Approximations
- On Actor-Critic Algorithms
- Simulation-based optimization of Markov reward processes
- Risk-Sensitive Control of Discrete-Time Markov Processes with Infinite Horizon
- Actor-Critic–Type Learning Algorithms for Markov Decision Processes
- The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
- Risk-Sensitive Optimal Control for Markov Decision Processes with Monotone Cost
- Q-Learning for Risk-Sensitive Control