Reinforcement learning based algorithms for average cost Markov decision processes (Q2643632): Difference between revisions

From MaRDI portal
Property / DOI
 
Property / DOI: 10.1007/s10626-006-0003-y / rank
Normal rank
 
Property / full work available at URL
 
Property / full work available at URL: https://doi.org/10.1007/s10626-006-0003-y / rank
 
Normal rank
Property / OpenAlex ID
 
Property / OpenAlex ID: W2061769118 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Dynamic programming and stochastic control / rank
 
Normal rank
Property / cites work
 
Property / cites work: A Simultaneous Perturbation Stochastic Approximation-Based Actor–Critic Algorithm for Markov Decision Processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: Actor-critic algorithms for hierarchical Markov decision processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: Asynchronous Stochastic Approximations / rank
 
Normal rank
Property / cites work
 
Property / cites work: The actor-critic algorithm as multi-time-scale stochastic approximation. / rank
 
Normal rank
Property / cites work
 
Property / cites work: The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning / rank
 
Normal rank
Property / cites work
 
Property / cites work: Actor-Critic–Type Learning Algorithms for Markov Decision Processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: On Actor-Critic Algorithms / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4715203 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4315289 / rank
 
Normal rank
Property / cites work
 
Property / cites work: An analysis of temporal-difference learning with function approximation / rank
 
Normal rank
Property / cites work
 
Property / cites work: Average cost temporal-difference learning / rank
 
Normal rank
Property / cites work
 
Property / cites work: Multivariate stochastic approximation using a simultaneous perturbation gradient approximation / rank
 
Normal rank
Property / cites work
 
Property / cites work: A one-measurement form of simultaneous perturbation stochastic approximation / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4547446 / rank
 
Normal rank

Latest revision as of 12:37, 19 December 2024

Language: English
Label: Reinforcement learning based algorithms for average cost Markov decision processes
Description: scientific article

    Statements

    Reinforcement learning based algorithms for average cost Markov decision processes (English)
    27 August 2007
    actor-critic algorithms
    two timescale stochastic approximation
    Markov decision processes
    policy iteration
    simultaneous perturbation stochastic approximation
    normalized Hadamard matrices
    reinforcement learning
    TD-learning

    Identifiers