Q-learning and policy iteration algorithms for stochastic shortest path problems (Q378731): Difference between revisions

From MaRDI portal
Importer
Created a new Item
 
Normalize DOI.
 
(7 intermediate revisions by 7 users not shown)
Property / DOI
 
Property / DOI: 10.1007/s10479-012-1128-z / rank
Normal rank
 
Property / Mathematics Subject Classification ID
 
Property / Mathematics Subject Classification ID: 90C40 / rank
 
Normal rank
Property / Mathematics Subject Classification ID
 
Property / Mathematics Subject Classification ID: 90C39 / rank
 
Normal rank
Property / zbMATH DE Number
 
Property / zbMATH DE Number: 6225970 / rank
 
Normal rank
Property / zbMATH Keywords
 
Markov decision processes
Property / zbMATH Keywords: Markov decision processes / rank
 
Normal rank
Property / zbMATH Keywords
 
Q-learning
Property / zbMATH Keywords: Q-learning / rank
 
Normal rank
Property / zbMATH Keywords
 
approximate dynamic programming
Property / zbMATH Keywords: approximate dynamic programming / rank
 
Normal rank
Property / zbMATH Keywords
 
value iteration
Property / zbMATH Keywords: value iteration / rank
 
Normal rank
Property / zbMATH Keywords
 
policy iteration
Property / zbMATH Keywords: policy iteration / rank
 
Normal rank
Property / zbMATH Keywords
 
stochastic shortest paths
Property / zbMATH Keywords: stochastic shortest paths / rank
 
Normal rank
Property / zbMATH Keywords
 
stochastic approximation
Property / zbMATH Keywords: stochastic approximation / rank
 
Normal rank
Property / MaRDI profile type
 
Property / MaRDI profile type: MaRDI publication profile / rank
 
Normal rank
Property / full work available at URL
 
Property / full work available at URL: https://doi.org/10.1007/s10479-012-1128-z / rank
 
Normal rank
Property / OpenAlex ID
 
Property / OpenAlex ID: W2027855416 / rank
 
Normal rank
Property / Wikidata QID
 
Property / Wikidata QID: Q115147448 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Asynchronous Iterative Methods for Multiprocessors / rank
 
Normal rank
Property / cites work
 
Property / cites work: Distributed dynamic programming / rank
 
Normal rank
Property / cites work
 
Property / cites work: Distributed asynchronous computation of fixed points / rank
 
Normal rank
Property / cites work
 
Property / cites work: Neuro-Dynamic Programming: An Overview and Recent Results / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4001523 / rank
 
Normal rank
Property / cites work
 
Property / cites work: An Analysis of Stochastic Shortest Path Problems / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4257216 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Projected equation methods for approximate solution of large linear systems / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q-Learning and Enhanced Policy Iteration in Discounted Dynamic Programming / rank
 
Normal rank
Property / cites work
 
Property / cites work: (Approximate) iterated successive approximations algorithm for sequential decision processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning / rank
 
Normal rank
Property / cites work
 
Property / cites work: Finite state Markovian decision processes / rank
 
Normal rank
Property / cites work
 
Property / cites work: On Stationary Strategies in Borel Dynamic Programming / rank
 
Normal rank
Property / cites work
 
Property / cites work: On the Convergence of Stochastic Iterative Dynamic Programming Algorithms / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4315289 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Asynchronous stochastic approximation and Q-learning / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q5477860 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Optimal stopping of Markov processes: Hilbert space theory, approximation algorithms, and an application to pricing high-dimensional financial derivatives / rank
 
Normal rank
Property / cites work
 
Property / cites work: Discrete Dynamic Programming with Sensitive Discount Optimality Criteria / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q3698635 / rank
 
Normal rank
Property / cites work
 
Property / cites work: On Boundedness of Q-Learning Iterates for Stochastic Shortest Path Problems / rank
 
Normal rank
Property / DOI
 
Property / DOI: 10.1007/S10479-012-1128-Z / rank
 
Normal rank

Latest revision as of 15:46, 9 December 2024

scientific article
Language Label Description Also known as
English
Q-learning and policy iteration algorithms for stochastic shortest path problems
scientific article

    Statements

    Q-learning and policy iteration algorithms for stochastic shortest path problems (English)
    12 November 2013
    Markov decision processes
    Q-learning
    approximate dynamic programming
    value iteration
    policy iteration
    stochastic shortest paths
    stochastic approximation

    Identifiers