Q-learning and policy iteration algorithms for stochastic shortest path problems (Q378731): Difference between revisions

Created claim: Wikidata QID (P12): Q115147448, #quickstatements; #temporary_batch_1711407341029
Normalize DOI.
 
Property / DOI: 10.1007/s10479-012-1128-z / rank: Normal rank
 
Property / cites work: Asynchronous Iterative Methods for Multiprocessors / rank: Normal rank
Property / cites work: Distributed dynamic programming / rank: Normal rank
Property / cites work: Distributed asynchronous computation of fixed points / rank: Normal rank
Property / cites work: Neuro-Dynamic Programming: An Overview and Recent Results / rank: Normal rank
Property / cites work: Q4001523 / rank: Normal rank
Property / cites work: An Analysis of Stochastic Shortest Path Problems / rank: Normal rank
Property / cites work: Q4257216 / rank: Normal rank
Property / cites work: Projected equation methods for approximate solution of large linear systems / rank: Normal rank
Property / cites work: Q-Learning and Enhanced Policy Iteration in Discounted Dynamic Programming / rank: Normal rank
Property / cites work: (Approximate) iterated successive approximations algorithm for sequential decision processes / rank: Normal rank
Property / cites work: A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning / rank: Normal rank
Property / cites work: Finite state Markovian decision processes / rank: Normal rank
Property / cites work: On Stationary Strategies in Borel Dynamic Programming / rank: Normal rank
Property / cites work: On the Convergence of Stochastic Iterative Dynamic Programming Algorithms / rank: Normal rank
Property / cites work: Q4315289 / rank: Normal rank
Property / cites work: Asynchronous stochastic approximation and Q-learning / rank: Normal rank
Property / cites work: Q5477860 / rank: Normal rank
Property / cites work: Optimal stopping of Markov processes: Hilbert space theory, approximation algorithms, and an application to pricing high-dimensional financial derivatives / rank: Normal rank
Property / cites work: Discrete Dynamic Programming with Sensitive Discount Optimality Criteria / rank: Normal rank
Property / cites work: Q3698635 / rank: Normal rank
Property / cites work: On Boundedness of Q-Learning Iterates for Stochastic Shortest Path Problems / rank: Normal rank
Property / DOI: 10.1007/S10479-012-1128-Z / rank: Normal rank

Latest revision as of 15:46, 9 December 2024

Language: English
Label: Q-learning and policy iteration algorithms for stochastic shortest path problems
Description: scientific article
Also known as:

    Statements

    Q-learning and policy iteration algorithms for stochastic shortest path problems (English)
    12 November 2013
    Markov decision processes
    Q-learning
    approximate dynamic programming
    value iteration
    policy iteration
    stochastic shortest paths
    stochastic approximation

    Identifiers