An actor-critic algorithm for constrained Markov decision processes (Q2504518): Difference between revisions
From MaRDI portal
Set profile property.
Normalize DOI.
Property / DOI
Property / DOI: 10.1016/j.sysconle.2004.08.007 / rank
Property / full work available at URL
Property / full work available at URL: https://doi.org/10.1016/j.sysconle.2004.08.007 / rank
Normal rank
Property / OpenAlex ID
Property / OpenAlex ID: W2070570138 / rank
Normal rank
Property / cites work
Property / cites work: Q4264741 / rank
Normal rank
Property / cites work
Property / cites work: Q4547448 / rank
Normal rank
Property / cites work
Property / cites work: Optimal control and viscosity solutions of Hamilton-Jacobi-Bellman equations / rank
Normal rank
Property / cites work
Property / cites work: Q3997575 / rank
Normal rank
Property / cites work
Property / cites work: Q4257216 / rank
Normal rank
Property / cites work
Property / cites work: Stochastic approximation with two time scales / rank
Normal rank
Property / cites work
Property / cites work: Q4547443 / rank
Normal rank
Property / cites work
Property / cites work: Actor-Critic--Type Learning Algorithms for Markov Decision Processes / rank
Normal rank
Property / cites work
Property / cites work: On Actor-Critic Algorithms / rank
Normal rank
Property / cites work
Property / cites work: Q3093188 / rank
Normal rank
Property / cites work
Property / cites work: Q4902563 / rank
Normal rank
Property / cites work
Property / cites work: Envelope Theorems for Arbitrary Choice Sets / rank
Normal rank
Property / cites work
Property / cites work: Q4377607 / rank
Normal rank
Property / cites work
Property / cites work: Q4315289 / rank
Normal rank
Property / cites work
Property / cites work: An analysis of temporal-difference learning with function approximation / rank
Normal rank
Property / cites work
Property / cites work: Q4547446 / rank
Normal rank
Property / DOI
Property / DOI: 10.1016/J.SYSCONLE.2004.08.007 / rank
Normal rank
Latest revision as of 02:00, 19 December 2024
scientific article
Language | Label | Description | Also known as |
---|---|---|---|
English | An actor-critic algorithm for constrained Markov decision processes | scientific article | |
Statements
An actor-critic algorithm for constrained Markov decision processes (English)
25 September 2006
actor-critic algorithms
reinforcement learning
constrained Markov decision processes
stochastic approximation
envelope theorem