Q-Learning for Risk-Sensitive Control

From MaRDI portal

Publication:5704076


DOI: 10.1287/moor.27.2.294.324; zbMath: 1082.90576; OpenAlex: W2139914196; MaRDI QID: Q5704076

Vivek S. Borkar

Publication date: 11 November 2005

Published in: Mathematics of Operations Research

Full work available at URL: https://doi.org/10.1287/moor.27.2.294.324

zbMATH Keywords

dynamic programming; Markov decision processes; stochastic approximation; reinforcement learning; risk-sensitive control; Q-learning

Mathematics Subject Classification ID

Dynamic programming in optimal control and differential games (49L20); Stochastic learning and adaptive control (93E35); Markov and semi-Markov decision processes (90C40)

Related Items

Oja's algorithm for graph clustering, Markov spectral decomposition, and risk sensitive control; Risk-Sensitive Reinforcement Learning via Policy Gradient Search; Variance-constrained actor-critic algorithms for discounted and average reward MDPs; Unnamed Item; Risk-Sensitive Reinforcement Learning; On tight bounds for function approximation error in risk-sensitive reinforcement learning; Risk-Constrained Reinforcement Learning with Percentile Risk Criteria; A sensitivity formula for risk-sensitive cost and the actor-critic algorithm; Empirical Dynamic Programming; Risk-averse autonomous systems: a brief history and recent developments from the perspective of optimal control; Risk-averse policy optimization via risk-neutral policy optimization
