Risk-Sensitive Reinforcement Learning via Policy Gradient Search
Publication:5102286
DOI: 10.1561/2200000091; OpenAlex: W4285213594; MaRDI QID: Q5102286
Publication date: 6 September 2022
Published in: Foundations and Trends® in Machine Learning
Full work available at URL: https://arxiv.org/abs/1810.09126
optimization; stochastic optimization; simulation; Markov decision processes; risk analysis; reinforcement learning; learning and statistical methods
Learning and adaptive systems in artificial intelligence (68T05); Research exposition (monographs, survey articles) pertaining to computer science (68-02)
Cites Work
- Stochastic Estimation of the Maximum of a Regression Function
- A Stochastic Approximation Method
- Stochastic finance. An introduction in discrete time
- A sensitivity formula for risk-sensitive cost and the actor-critic algorithm
- Stochastic finance. An introduction in discrete time.
- Stochastic recursive algorithms for optimization. Simultaneous perturbation methods
- Handbook of simulation optimization
- Risk-averse dynamic programming for Markov decision processes
- An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes
- On general minimax theorems
- Properties of distortion risk measures
- Convergence rate and averaging of nonlinear two-time-scale stochastic approximation algorithms
- Simulation-based algorithms for Markov decision processes.
- Nonconvergence to unstable points in urn models and stochastic approximations
- Deviation inequalities for an estimator of the conditional value-at-risk
- Stochastic approximation. A dynamical systems viewpoint
- Natural actor-critic algorithms
- Sampling derivatives of probabilities
- Advances in prospect theory: cumulative representation of uncertainty
- Stochastic approximation methods for constrained and unconstrained systems
- A test of generalized expected utility theory
- Risk-sensitive and minimax control of discrete-time, finite-state Markov decision processes
- Violations of the betweenness axiom and nonlinearity in probability
- Risk sensitive control of Markov processes in countable state space
- Stochastic approximation with two time scales
- Convex measures of risk and trading constraints
- Risk-sensitive reinforcement learning
- Variance-constrained actor-critic algorithms for discounted and average reward MDPs
- Probabilistically distorted risk-sensitive infinite-horizon dynamic programming
- Average cost temporal-difference learning
- Existence of risk-sensitive optimal stationary policies for controlled Markov processes
- Simulation-based optimization: Parametric optimization techniques and reinforcement learning
- Convergence rate of linear two-time-scale stochastic approximation
- Do stochastic algorithms avoid traps?
- Simulation-based algorithms for Markov decision processes
- Concentration bounds for temporal difference learning with linear function approximation: the case of batch data and uniform sampling
- Stochastic optimization with momentum: convergence, fluctuations, and traps avoidance
- Concentration bounds for empirical conditional value-at-risk: the unbounded case
- Algorithmic aspects of mean-variance optimization in Markov decision processes
- Large deviations bounds for estimating conditional value-at-risk
- Dynamic coherent risk measures
- An actor-critic algorithm for constrained Markov decision processes
- Finite state Markovian decision processes
- Stochastic simulation: Algorithms and analysis
- Learning Algorithms for Markov Decision Processes with Average Cost
- Coherent Measures of Risk
- Risk-Sensitive Markov Control Processes
- Policy Gradients for CVaR-Constrained MDPs
- Parameter-Free Elicitation of Utility and Probability Weighting Functions
- A Learning Algorithm for Risk-Sensitive Cost
- Computing VaR and CVaR using stochastic approximation and adaptive unconstrained importance sampling
- Variance-Penalized Markov Decision Processes
- Multivariate stochastic approximation using a simultaneous perturbation gradient approximation
- Acceleration of Stochastic Approximation by Averaging
- Prospect Theory: An Analysis of Decision under Risk
- An analysis of temporal-difference learning with function approximation
- Risk-sensitive optimal control of hidden Markov models: structural results
- Optimization with Stochastic Dominance Constraints
- Mixed risk-neutral/minimax control of discrete-time, finite-state Markov decision processes
- The Probability Weighting Function
- Stochastic Approximation for Nonexpansive Maps: Application to Q-Learning Algorithms
- Simulation-based optimization of Markov reward processes
- Risk-Constrained Reinforcement Learning with Percentile Risk Criteria
- Optimization Methods for Large-Scale Machine Learning
- Stochastic Optimization in a Cumulative Prospect Theory Framework
- Discrete-Time Controlled Markov Processes with Average Cost Criterion: A Survey
- Sensitivity Analysis for Simulations via Likelihood Ratios
- The variance of discounted Markov decision processes
- Percentile performance criteria for limiting average Markov decision processes
- Risk-Sensitive Control on an Infinite Time Horizon
- Optimal Investment Policies for a Firm With a Random Risk Process: Exponential Utility and Minimizing the Probability of Ruin
- Actor-Critic-Type Learning Algorithms for Markov Decision Processes
- The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
- Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies
- More Risk-Sensitive Markov Decision Processes
- Risk-Averse Approximate Dynamic Programming with Quantile-Based Risk Measures
- Risk-Averse Control of Undiscounted Transient Markov Models
- Adaptive System Optimization Using Random Directions Stochastic Approximation
- Chance Constrained Programming with Joint Constraints
- Conditional Risk Mappings
- Stochastic First- and Zeroth-Order Methods for Nonconvex Stochastic Programming
- Convex Approximations of Chance Constrained Programs
- Risk-Sensitive Markov Decision Processes
- Risk-Sensitive Optimal Control for Markov Decision Processes with Monotone Cost
- Q-Learning for Risk-Sensitive Control
- Robust Dynamic Programming