Natural actor-critic algorithms (Q1049136): Difference between revisions

From MaRDI portal
Property / MaRDI profile type
 
Property / MaRDI profile type: MaRDI publication profile / rank
 
Normal rank
Property / full work available at URL
 
Property / full work available at URL: https://doi.org/10.1016/j.automatica.2009.07.008 / rank
 
Normal rank
Property / OpenAlex ID
 
Property / OpenAlex ID: W2094387729 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Reinforcement learning based algorithms for average cost Markov decision processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: Learning Algorithms for Markov Decision Processes with Average Cost / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4533362 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Functional Approximations and Dynamic Programming / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q3997575 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4209222 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4001523 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4257216 / rank
 
Normal rank
Property / cites work
 
Property / cites work: A Simultaneous Perturbation Stochastic Approximation-Based Actor–Critic Algorithm for Markov Decision Processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: Adaptive multivariate three-timescale stochastic approximation algorithms for simulation based optimization / rank
 
Normal rank
Property / cites work
 
Property / cites work: Adaptive Newton-based multivariate smoothed functional algorithms for simulation optimization / rank
 
Normal rank
Property / cites work
 
Property / cites work: Natural actor-critic algorithms / rank
 
Normal rank
Property / cites work
 
Property / cites work: Stochastic approximation with two time scales / rank
 
Normal rank
Property / cites work
 
Property / cites work: The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning / rank
 
Normal rank
Property / cites work
 
Property / cites work: Some Pathological Traps for Stochastic Approximation / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q5477859 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Perturbation realization, potentials, and sensitivity analysis of Markov processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: Elevator group control using multiple reinforcement learning agents / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q2810874 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q3093234 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Actor-Critic-Type Learning Algorithms for Markov Decision Processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: On Actor-Critic Algorithms / rank
 
Normal rank
Property / cites work
 
Property / cites work: Stochastic approximation methods for constrained and unconstrained systems / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4346705 / rank
 
Normal rank
Property / cites work
 
Property / cites work: 10.1162/1532443041827907 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Simulation-based optimization of Markov reward processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: Nonconvergence to unstable points in urn models and stochastic approximations / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4315289 / rank
 
Normal rank
Property / cites work
 
Property / cites work: On the convergence of temporal-difference learning with linear function approximation / rank
 
Normal rank
Property / cites work
 
Property / cites work: Asynchronous stochastic approximation and Q-learning / rank
 
Normal rank
Property / cites work
 
Property / cites work: An analysis of temporal-difference learning with function approximation / rank
 
Normal rank
Property / cites work
 
Property / cites work: Average cost temporal-difference learning / rank
 
Normal rank
Property / cites work
 
Property / cites work: A Survey of Applications of Markov Decision Processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q3724211 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Simple statistical gradient-following algorithms for connectionist reinforcement learning / rank
 
Normal rank

Latest revision as of 09:03, 2 July 2024

Language: English
Label: Natural actor-critic algorithms
Description: scientific article
Also known as: (none)

    Statements

    Natural actor-critic algorithms (English)
    8 January 2010
    actor-critic reinforcement learning algorithms
    policy-gradient methods
    approximate dynamic programming
    function approximation
    two-timescale stochastic approximation
    temporal difference learning
    natural gradient