Risk-Sensitive Reinforcement Learning via Policy Gradient Search (Q5102286): Difference between revisions

From MaRDI portal
Added link to MaRDI item.
ReferenceBot (talk | contribs)
Changed an Item
 
(2 intermediate revisions by 2 users not shown)
Property / MaRDI profile type
 
Property / MaRDI profile type: MaRDI publication profile / rank
 
Normal rank
Property / OpenAlex ID
 
Property / OpenAlex ID: W4285213594 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Parameter-Free Elicitation of Utility and Probability Weighting Functions / rank
 
Normal rank
Property / cites work
 
Property / cites work: Learning Algorithms for Markov Decision Processes with Average Cost / rank
 
Normal rank
Property / cites work
 
Property / cites work: Stochastic Approximation for Nonexpansive Maps: Application to Q-Learning Algorithms / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4264741 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Discrete-Time Controlled Markov Processes with Average Cost Criterion: A Survey / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q5618987 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Coherent Measures of Risk / rank
 
Normal rank
Property / cites work
 
Property / cites work: Stochastic simulation: Algorithms and analysis / rank
 
Normal rank
Property / cites work
 
Property / cites work: Properties of distortion risk measures / rank
 
Normal rank
Property / cites work
 
Property / cites work: Stochastic optimization with momentum: convergence, fluctuations, and traps avoidance / rank
 
Normal rank
Property / cites work
 
Property / cites work: Computing VaR and CVaR using stochastic approximation and adaptive unconstrained importance sampling / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4533362 / rank
 
Normal rank
Property / cites work
 
Property / cites work: A Learning Algorithm for Risk-Sensitive Cost / rank
 
Normal rank
Property / cites work
 
Property / cites work: More Risk-Sensitive Markov Decision Processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q3241581 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q2925454 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4257216 / rank
 
Normal rank
Property / cites work
 
Property / cites work: An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q3174029 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Stochastic recursive algorithms for optimization. Simultaneous perturbation methods / rank
 
Normal rank
Property / cites work
 
Property / cites work: Natural actor-critic algorithms / rank
 
Normal rank
Property / cites work
 
Property / cites work: A sensitivity formula for risk-sensitive cost and the actor-critic algorithm / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q-Learning for Risk-Sensitive Control / rank
 
Normal rank
Property / cites work
 
Property / cites work: An actor-critic algorithm for constrained Markov decision processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: Stochastic approximation. A dynamical systems viewpoint. / rank
 
Normal rank
Property / cites work
 
Property / cites work: The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning / rank
 
Normal rank
Property / cites work
 
Property / cites work: Risk-Sensitive Optimal Control for Markov Decision Processes with Monotone Cost / rank
 
Normal rank
Property / cites work
 
Property / cites work: Stochastic approximation with two time scales / rank
 
Normal rank
Property / cites work
 
Property / cites work: Optimization Methods for Large-Scale Machine Learning / rank
 
Normal rank
Property / cites work
 
Property / cites work: Do stochastic algorithms avoid traps? / rank
 
Normal rank
Property / cites work
 
Property / cites work: Large deviations bounds for estimating conditional value-at-risk / rank
 
Normal rank
Property / cites work
 
Property / cites work: Optimal Investment Policies for a Firm With a Random Risk Process: Exponential Utility and Minimizing the Probability of Ruin / rank
 
Normal rank
Property / cites work
 
Property / cites work: Violations of the betweenness axiom and nonlinearity in probability / rank
 
Normal rank
Property / cites work
 
Property / cites work: Risk-Averse Control of Undiscounted Transient Markov Models / rank
 
Normal rank
Property / cites work
 
Property / cites work: Simulation-based algorithms for Markov decision processes. / rank
 
Normal rank
Property / cites work
 
Property / cites work: Simulation-based algorithms for Markov decision processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: Risk-Constrained Reinforcement Learning with Percentile Risk Criteria / rank
 
Normal rank
Property / cites work
 
Property / cites work: Risk-sensitive and minimax control of discrete-time, finite-state Markov decision processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4227194 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Mixed risk-neutral/minimax control of discrete-time, finite-state Markov decision processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: Optimization with Stochastic Dominance Constraints / rank
 
Normal rank
Property / cites work
 
Property / cites work: Finite state Markovian decision processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: Risk-sensitive optimal control of hidden Markov models: structural results / rank
 
Normal rank
Property / cites work
 
Property / cites work: Variance-Penalized Markov Decision Processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: Percentile performance criteria for limiting average Markov decision processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: Risk-Sensitive Control on an Infinite Time Horizon / rank
 
Normal rank
Property / cites work
 
Property / cites work: Convex measures of risk and trading constraints / rank
 
Normal rank
Property / cites work
 
Property / cites work: Stochastic finance. An introduction in discrete time / rank
 
Normal rank
Property / cites work
 
Property / cites work: Stochastic finance. An introduction in discrete time. / rank
 
Normal rank
Property / cites work
 
Property / cites work: Handbook of simulation optimization / rank
 
Normal rank
Property / cites work
 
Property / cites work: Stochastic First- and Zeroth-Order Methods for Nonconvex Stochastic Programming / rank
 
Normal rank
Property / cites work
 
Property / cites work: Simulation-based optimization: Parametric optimization techniques and reinforcement learning / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4782076 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Risk sensitive control of Markov processes in countable state space / rank
 
Normal rank
Property / cites work
 
Property / cites work: Existence of risk-sensitive optimal stationary policies for controlled Markov processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: Risk-Sensitive Markov Decision Processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: Robust Dynamic Programming / rank
 
Normal rank
Property / cites work
 
Property / cites work: Risk-Averse Approximate Dynamic Programming with Quantile-Based Risk Measures / rank
 
Normal rank
Property / cites work
 
Property / cites work: Stochastic Optimization in a Cumulative Prospect Theory Framework / rank
 
Normal rank
Property / cites work
 
Property / cites work: Prospect Theory: An Analysis of Decision under Risk / rank
 
Normal rank
Property / cites work
 
Property / cites work: Stochastic Estimation of the Maximum of a Regression Function / rank
 
Normal rank
Property / cites work
 
Property / cites work: Concentration bounds for empirical conditional value-at-risk: the unbounded case / rank
 
Normal rank
Property / cites work
 
Property / cites work: Actor-Critic-Type Learning Algorithms for Markov Decision Processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: Convergence rate of linear two-time-scale stochastic approximation. / rank
 
Normal rank
Property / cites work
 
Property / cites work: Stochastic approximation methods for constrained and unconstrained systems / rank
 
Normal rank
Property / cites work
 
Property / cites work: Probabilistically distorted risk-sensitive infinite-horizon dynamic programming / rank
 
Normal rank
Property / cites work
 
Property / cites work: Algorithmic aspects of mean-variance optimization in Markov decision processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: Simulation-based optimization of Markov reward processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q5284147 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4902563 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Risk-sensitive reinforcement learning / rank
 
Normal rank
Property / cites work
 
Property / cites work: Chance Constrained Programming with Joint Constraints / rank
 
Normal rank
Property / cites work
 
Property / cites work: Convergence rate and averaging of nonlinear two-time-scale stochastic approximation algorithms / rank
 
Normal rank
Property / cites work
 
Property / cites work: Convex Approximations of Chance Constrained Programs / rank
 
Normal rank
Property / cites work
 
Property / cites work: Nonconvergence to unstable points in urn models and stochastic approximations / rank
 
Normal rank
Property / cites work
 
Property / cites work: Sampling derivatives of probabilities / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4335417 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Acceleration of Stochastic Approximation by Averaging / rank
 
Normal rank
Property / cites work
 
Property / cites work: Policy Gradients for CVaR-Constrained MDPs / rank
 
Normal rank
Property / cites work
 
Property / cites work: Adaptive System Optimization Using Random Directions Stochastic Approximation / rank
 
Normal rank
Property / cites work
 
Property / cites work: Variance-constrained actor-critic algorithms for discounted and average reward MDPs / rank
 
Normal rank
Property / cites work
 
Property / cites work: Concentration bounds for temporal difference learning with linear function approximation: the case of batch data and uniform sampling / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q5638126 / rank
 
Normal rank
Property / cites work
 
Property / cites work: The Probability Weighting Function / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4315289 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Sensitivity Analysis for Simulations via Likelihood Ratios / rank
 
Normal rank
Property / cites work
 
Property / cites work: Dynamic coherent risk measures / rank
 
Normal rank
Property / cites work
 
Property / cites work: A Stochastic Approximation Method / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q3683893 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Risk-averse dynamic programming for Markov decision processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: Conditional Risk Mappings / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q2925334 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Risk-Sensitive Markov Control Processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: On general minimax theorems / rank
 
Normal rank
Property / cites work
 
Property / cites work: The variance of discounted Markov decision processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: A test of generalized expected utility theory / rank
 
Normal rank
Property / cites work
 
Property / cites work: Multivariate stochastic approximation using a simultaneous perturbation gradient approximation / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4626283 / rank
 
Normal rank
Property / cites work
 
Property / cites work: An analysis of temporal-difference learning with function approximation / rank
 
Normal rank
Property / cites work
 
Property / cites work: Average cost temporal-difference learning / rank
 
Normal rank
Property / cites work
 
Property / cites work: Advances in prospect theory: cumulative representation of uncertainty / rank
 
Normal rank
Property / cites work
 
Property / cites work: Deviation inequalities for an estimator of the conditional value-at-risk / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q3997540 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Global Convergence of Policy Gradient Methods to (Almost) Locally Optimal Policies / rank
 
Normal rank

Latest revision as of 01:37, 30 July 2024

scientific article; zbMATH DE number 7582341
Language Label Description Also known as
English
Risk-Sensitive Reinforcement Learning via Policy Gradient Search
scientific article; zbMATH DE number 7582341

    Statements

    Risk-Sensitive Reinforcement Learning via Policy Gradient Search (English)
    0 references
    0 references
    0 references
    6 September 2022
    0 references
    reinforcement learning
    0 references
    optimization
    0 references
    learning and statistical methods
    0 references
    Markov decision processes
    0 references
    risk analysis
    0 references
    simulation
    0 references
    stochastic optimization
    0 references

    Identifiers