Target Network and Truncation Overcome the Deadly Triad in \(\boldsymbol{Q}\)-Learning (Q6148353): Difference between revisions

Added link to MaRDI item.
ReferenceBot (talk | contribs)
Changed an Item
 
Property / OpenAlex ID: W4389438905 (normal rank)
Property / cites work (each with normal rank):
    Q4257216
    A concentration bound for contractive stochastic approximation
    The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
    Finite-sample analysis of nonlinear stochastic approximation with applications in reinforcement learning
    Q3093261
    A distribution-free theory of nonparametric regression
    On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
    Q4595047
    Q3096132
    A Stochastic Approximation Method
    An upper bound on the loss from approximate optimal-value functions
    Q4626283
    Asynchronous stochastic approximation and Q-learning
    An analysis of temporal-difference learning with function approximation
    \({\mathcal Q}\)-learning

Latest revision as of 08:45, 23 August 2024

Language: English
Label: Target Network and Truncation Overcome the Deadly Triad in \(\boldsymbol{Q}\)-Learning
Description: scientific article; zbMATH DE number 7786787

    Statements

    Target Network and Truncation Overcome the Deadly Triad in \(\boldsymbol{Q}\)-Learning (English)
    11 January 2024
    reinforcement learning
    \(Q\)-learning
    linear function approximation
    finite-sample analysis
