Risk-Sensitive Reinforcement Learning via Policy Gradient Search (Q5102286): Difference between revisions

@@ Property / MaRDI profile type @@
+MaRDI publication profile
@@ Property / MaRDI profile type: MaRDI publication profile / rank @@
+Normal rank
@@ Property / OpenAlex ID @@
+W4285213594
@@ Property / OpenAlex ID: W4285213594 / rank @@
+Normal rank
@@ Property / cites work @@
+Parameter-Free Elicitation of Utility and Probability Weighting Functions
+Normal rank
@@ Property / cites work @@
+Learning Algorithms for Markov Decision Processes with Average Cost
+Normal rank
@@ Property / cites work @@
+Stochastic Approximation for Nonexpansive Maps: Application to Q-Learning Algorithms
+Normal rank
@@ Property / cites work @@
+Q4264741
@@ Property / cites work: Q4264741 / rank @@
+Normal rank
@@ Property / cites work @@
+Discrete-Time Controlled Markov Processes with Average Cost Criterion: A Survey
+Normal rank
@@ Property / cites work @@
+Q5618987
@@ Property / cites work: Q5618987 / rank @@
+Normal rank
@@ Property / cites work @@
+Coherent Measures of Risk
@@ Property / cites work: Coherent Measures of Risk / rank @@
+Normal rank
@@ Property / cites work @@
+Stochastic simulation: Algorithms and analysis
@@ Property / cites work: Stochastic simulation: Algorithms and analysis / rank @@
+Normal rank
@@ Property / cites work @@
+Properties of distortion risk measures
@@ Property / cites work: Properties of distortion risk measures / rank @@
+Normal rank
@@ Property / cites work @@
+Stochastic optimization with momentum: convergence, fluctuations, and traps avoidance
+Normal rank
@@ Property / cites work @@
+Computing VaR and CVaR using stochastic approximation and adaptive unconstrained importance sampling
+Normal rank
@@ Property / cites work @@
+Q4533362
@@ Property / cites work: Q4533362 / rank @@
+Normal rank
@@ Property / cites work @@
+A Learning Algorithm for Risk-Sensitive Cost
@@ Property / cites work: A Learning Algorithm for Risk-Sensitive Cost / rank @@
+Normal rank
@@ Property / cites work @@
+More Risk-Sensitive Markov Decision Processes
@@ Property / cites work: More Risk-Sensitive Markov Decision Processes / rank @@
+Normal rank
@@ Property / cites work @@
+Q3241581
@@ Property / cites work: Q3241581 / rank @@
+Normal rank
@@ Property / cites work @@
+Q2925454
@@ Property / cites work: Q2925454 / rank @@
+Normal rank
@@ Property / cites work @@
+Q4257216
@@ Property / cites work: Q4257216 / rank @@
+Normal rank
@@ Property / cites work @@
+An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes
+Normal rank
@@ Property / cites work @@
+Q3174029
@@ Property / cites work: Q3174029 / rank @@
+Normal rank
@@ Property / cites work @@
+Stochastic recursive algorithms for optimization. Simultaneous perturbation methods
+Normal rank
@@ Property / cites work @@
+Natural actor-critic algorithms
@@ Property / cites work: Natural actor-critic algorithms / rank @@
+Normal rank
@@ Property / cites work @@
+A sensitivity formula for risk-sensitive cost and the actor-critic algorithm
+Normal rank
@@ Property / cites work @@
+Q-Learning for Risk-Sensitive Control
@@ Property / cites work: Q-Learning for Risk-Sensitive Control / rank @@
+Normal rank
@@ Property / cites work @@
+An actor-critic algorithm for constrained Markov decision processes
+Normal rank
@@ Property / cites work @@
+Stochastic approximation. A dynamical systems viewpoint.
+Normal rank
@@ Property / cites work @@
+The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
+Normal rank
@@ Property / cites work @@
+Risk-Sensitive Optimal Control for Markov Decision Processes with Monotone Cost
+Normal rank
@@ Property / cites work @@
+Stochastic approximation with two time scales
@@ Property / cites work: Stochastic approximation with two time scales / rank @@
+Normal rank
@@ Property / cites work @@
+Optimization Methods for Large-Scale Machine Learning
+Normal rank
@@ Property / cites work @@
+Do stochastic algorithms avoid traps?
@@ Property / cites work: Do stochastic algorithms avoid traps? / rank @@
+Normal rank
@@ Property / cites work @@
+Large deviations bounds for estimating conditional value-at-risk
+Normal rank
@@ Property / cites work @@
+Optimal Investment Policies for a Firm With a Random Risk Process: Exponential Utility and Minimizing the Probability of Ruin
+Normal rank
@@ Property / cites work @@
+Violations of the betweenness axiom and nonlinearity in probability
+Normal rank
@@ Property / cites work @@
+Risk-Averse Control of Undiscounted Transient Markov Models
+Normal rank
@@ Property / cites work @@
+Simulation-based algorithms for Markov decision processes.
+Normal rank
@@ Property / cites work @@
+Simulation-based algorithms for Markov decision processes
+Normal rank
@@ Property / cites work @@
+Risk-Constrained Reinforcement Learning with Percentile Risk Criteria
+Normal rank
@@ Property / cites work @@
+Risk-sensitive and minimax control of discrete-time, finite-state Markov decision processes
+Normal rank
@@ Property / cites work @@
+Q4227194
@@ Property / cites work: Q4227194 / rank @@
+Normal rank
@@ Property / cites work @@
+Mixed risk-neutral/minimax control of discrete-time, finite-state Markov decision processes
+Normal rank
@@ Property / cites work @@
+Optimization with Stochastic Dominance Constraints
+Normal rank
@@ Property / cites work @@
+Finite state Markovian decision processes
@@ Property / cites work: Finite state Markovian decision processes / rank @@
+Normal rank
@@ Property / cites work @@
+Risk-sensitive optimal control of hidden Markov models: structural results
+Normal rank
@@ Property / cites work @@
+Variance-Penalized Markov Decision Processes
@@ Property / cites work: Variance-Penalized Markov Decision Processes / rank @@
+Normal rank
@@ Property / cites work @@
+Percentile performance criteria for limiting average Markov decision processes
+Normal rank
@@ Property / cites work @@
+Risk-Sensitive Control on an Infinite Time Horizon
+Normal rank
@@ Property / cites work @@
+Convex measures of risk and trading constraints
@@ Property / cites work: Convex measures of risk and trading constraints / rank @@
+Normal rank
@@ Property / cites work @@
+Stochastic finance. An introduction in discrete time
+Normal rank
@@ Property / cites work @@
+Stochastic finance. An introduction in discrete time.
+Normal rank
@@ Property / cites work @@
+Handbook of simulation optimization
@@ Property / cites work: Handbook of simulation optimization / rank @@
+Normal rank
@@ Property / cites work @@
+Stochastic First- and Zeroth-Order Methods for Nonconvex Stochastic Programming
+Normal rank
@@ Property / cites work @@
+Simulation-based optimization: Parametric optimization techniques and reinforcement learning
+Normal rank
@@ Property / cites work @@
+Q4782076
@@ Property / cites work: Q4782076 / rank @@
+Normal rank
@@ Property / cites work @@
+Risk sensitive control of Markov processes in countable state space
+Normal rank
@@ Property / cites work @@
+Existence of risk-sensitive optimal stationary policies for controlled Markov processes
+Normal rank
@@ Property / cites work @@
+Risk-Sensitive Markov Decision Processes
@@ Property / cites work: Risk-Sensitive Markov Decision Processes / rank @@
+Normal rank
@@ Property / cites work @@
+Robust Dynamic Programming
@@ Property / cites work: Robust Dynamic Programming / rank @@
+Normal rank
@@ Property / cites work @@
+Risk-Averse Approximate Dynamic Programming with Quantile-Based Risk Measures
+Normal rank
@@ Property / cites work @@
+Stochastic Optimization in a Cumulative Prospect Theory Framework
+Normal rank
@@ Property / cites work @@
+Prospect Theory: An Analysis of Decision under Risk
+Normal rank
@@ Property / cites work @@
+Stochastic Estimation of the Maximum of a Regression Function
+Normal rank
@@ Property / cites work @@
+Concentration bounds for empirical conditional value-at-risk: the unbounded case
+Normal rank
@@ Property / cites work @@
+Actor-Critic-Type Learning Algorithms for Markov Decision Processes
+Normal rank
@@ Property / cites work @@
+Convergence rate of linear two-time-scale stochastic approximation.
+Normal rank
@@ Property / cites work @@
+Stochastic approximation methods for constrained and unconstrained systems
+Normal rank
@@ Property / cites work @@
+Probabilistically distorted risk-sensitive infinite-horizon dynamic programming
+Normal rank
@@ Property / cites work @@
+Algorithmic aspects of mean-variance optimization in Markov decision processes
+Normal rank
@@ Property / cites work @@
+Simulation-based optimization of Markov reward processes
+Normal rank
@@ Property / cites work @@
+Q5284147
@@ Property / cites work: Q5284147 / rank @@
+Normal rank
@@ Property / cites work @@
+Q4902563
@@ Property / cites work: Q4902563 / rank @@
+Normal rank
@@ Property / cites work @@
+Risk-sensitive reinforcement learning
@@ Property / cites work: Risk-sensitive reinforcement learning / rank @@
+Normal rank
@@ Property / cites work @@
+Chance Constrained Programming with Joint Constraints
+Normal rank
@@ Property / cites work @@
+Convergence rate and averaging of nonlinear two-time-scale stochastic approximation algorithms
+Normal rank
@@ Property / cites work @@
+Convex Approximations of Chance Constrained Programs
+Normal rank
@@ Property / cites work @@
+Nonconvergence to unstable points in urn models and stochastic approximations
+Normal rank
@@ Property / cites work @@
+Sampling derivatives of probabilities
@@ Property / cites work: Sampling derivatives of probabilities / rank @@
+Normal rank
@@ Property / cites work @@
+Q4335417
@@ Property / cites work: Q4335417 / rank @@
+Normal rank
@@ Property / cites work @@
+Acceleration of Stochastic Approximation by Averaging
+Normal rank
@@ Property / cites work @@
+Policy Gradients for CVaR-Constrained MDPs
@@ Property / cites work: Policy Gradients for CVaR-Constrained MDPs / rank @@
+Normal rank
@@ Property / cites work @@
+Adaptive System Optimization Using Random Directions Stochastic Approximation
+Normal rank
@@ Property / cites work @@
+Variance-constrained actor-critic algorithms for discounted and average reward MDPs
+Normal rank
@@ Property / cites work @@
+Concentration bounds for temporal difference learning with linear function approximation: the case of batch data and uniform sampling
+Normal rank
@@ Property / cites work @@
+Q5638126
@@ Property / cites work: Q5638126 / rank @@
+Normal rank
@@ Property / cites work @@
+The Probability Weighting Function
@@ Property / cites work: The Probability Weighting Function / rank @@
+Normal rank
@@ Property / cites work @@
+Q4315289
@@ Property / cites work: Q4315289 / rank @@
+Normal rank
@@ Property / cites work @@
+Sensitivity Analysis for Simulations via Likelihood Ratios
+Normal rank
@@ Property / cites work @@
+Dynamic coherent risk measures
@@ Property / cites work: Dynamic coherent risk measures / rank @@
+Normal rank
@@ Property / cites work @@
+A Stochastic Approximation Method
@@ Property / cites work: A Stochastic Approximation Method / rank @@
+Normal rank
@@ Property / cites work @@
+Q3683893
@@ Property / cites work: Q3683893 / rank @@
+Normal rank
@@ Property / cites work @@
+Risk-averse dynamic programming for Markov decision processes
+Normal rank
@@ Property / cites work @@
+Conditional Risk Mappings
@@ Property / cites work: Conditional Risk Mappings / rank @@
+Normal rank
@@ Property / cites work @@
+Q2925334
@@ Property / cites work: Q2925334 / rank @@
+Normal rank
@@ Property / cites work @@
+Risk-Sensitive Markov Control Processes
@@ Property / cites work: Risk-Sensitive Markov Control Processes / rank @@
+Normal rank
@@ Property / cites work @@
+On general minimax theorems
@@ Property / cites work: On general minimax theorems / rank @@
+Normal rank
@@ Property / cites work @@
+The variance of discounted Markov decision processes
+Normal rank
@@ Property / cites work @@
+A test of generalized expected utility theory
@@ Property / cites work: A test of generalized expected utility theory / rank @@
+Normal rank
@@ Property / cites work @@
+Multivariate stochastic approximation using a simultaneous perturbation gradient approximation
+Normal rank
@@ Property / cites work @@
+Q4626283
@@ Property / cites work: Q4626283 / rank @@
+Normal rank
@@ Property / cites work @@
+An analysis of temporal-difference learning with function approximation
+Normal rank
@@ Property / cites work @@
+Average cost temporal-difference learning
@@ Property / cites work: Average cost temporal-difference learning / rank @@
+Normal rank
@@ Property / cites work @@
+Advances in prospect theory: cumulative representation of uncertainty
+Normal rank
@@ Property / cites work @@
+Deviation inequalities for an estimator of the conditional value-at-risk
+Normal rank
@@ Property / cites work @@
+Q3997540
@@ Property / cites work: Q3997540 / rank @@
+Normal rank
@@ Property / cites work @@
+Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies
+Normal rank