Approximate policy iteration: a survey and some new methods (Q2887629): Difference between revisions

From MaRDI portal
Import240304020342: Set profile property.
Import241208061232: Normalize DOI.
 
Property / DOI
 
Property / DOI: 10.1007/s11768-011-1005-3 / rank
Normal rank
 
Property / full work available at URL
 
Property / full work available at URL: https://doi.org/10.1007/s11768-011-1005-3 / rank
 
Normal rank
Property / OpenAlex ID
 
Property / OpenAlex ID: W2124477018 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Simulation-based optimization: Parametric optimization techniques and reinforcement learning / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q5425954 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Simulation-based algorithms for Markov decision processes. / rank
 
Normal rank
Property / cites work
 
Property / cites work: Control Techniques for Complex Networks / rank
 
Normal rank
Property / cites work
 
Property / cites work: Approximate Dynamic Programming / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4315289 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Neuro-Dynamic Programming: An Overview and Recent Results / rank
 
Normal rank
Property / cites work
 
Property / cites work: Basis function adaptation in temporal difference reinforcement learning / rank
 
Normal rank
Property / cites work
 
Property / cites work: Projected equation methods for approximate solution of large linear systems / rank
 
Normal rank
Property / cites work
 
Property / cites work: Practical issues in temporal difference learning / rank
 
Normal rank
Property / cites work
 
Property / cites work: On the Convergence of Stochastic Iterative Dynamic Programming Algorithms / rank
 
Normal rank
Property / cites work
 
Property / cites work: An analysis of temporal-difference learning with function approximation / rank
 
Normal rank
Property / cites work
 
Property / cites work: Average cost temporal-difference learning / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q5477859 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q3316508 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Error Bounds for Approximations from Projected Linear Equations / rank
 
Normal rank
Property / cites work
 
Property / cites work: Feature-based methods for large scale dynamic programming / rank
 
Normal rank
Property / cites work
 
Property / cites work: Performance Loss Bounds for Approximate Value Iteration with State Aggregation / rank
 
Normal rank
Property / cites work
 
Property / cites work: Convergence Results for Some Temporal Difference Methods Based on Least Squares / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4257216 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Projection methods for variational inequalities with application to the traffic assignment problem / rank
 
Normal rank
Property / cites work
 
Property / cites work: Monotone Operators and the Proximal Point Algorithm / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q3102800 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4106692 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q4337625 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Technical update: Least-squares temporal difference learning / rank
 
Normal rank
Property / cites work
 
Property / cites work: Least squares policy evaluation algorithms with linear function approximation / rank
 
Normal rank
Property / cites work
 
Property / cites work: A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning / rank
 
Normal rank
Property / cites work
 
Property / cites work: Monte Carlo strategies in scientific computing / rank
 
Normal rank
Property / cites work
 
Property / cites work: Simulation and the Monte Carlo Method / rank
 
Normal rank
Property / cites work
 
Property / cites work: A Quasi Monte Carlo Method for Large-Scale Inverse Problems / rank
 
Normal rank
Property / cites work
 
Property / cites work: Stochastic approximation. A dynamical systems viewpoint. / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q-Learning and Enhanced Policy Iteration in Discounted Dynamic Programming / rank
 
Normal rank
Property / cites work
 
Property / cites work: Optimal stopping of Markov processes: Hilbert space theory, approximation algorithms, and an application to pricing high-dimensional financial derivatives / rank
 
Normal rank
Property / cites work
 
Property / cites work: 10.1162/1532443041827907 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Approximate Dynamic Programming via a Smoothed Linear Program / rank
 
Normal rank
Property / cites work
 
Property / cites work: Q5201298 / rank
 
Normal rank
Property / cites work
 
Property / cites work: Learning Tetris Using the Noisy Cross-Entropy Method / rank
 
Normal rank
Property / cites work
 
Property / cites work: A tutorial on the cross-entropy method / rank
 
Normal rank
Property / cites work
 
Property / cites work: Contraction Mappings in the Theory Underlying Dynamic Programming / rank
 
Normal rank
Property / cites work
 
Property / cites work: Monotone Mappings with Application in Dynamic Programming / rank
 
Normal rank
Property / cites work
 
Property / cites work: Stochastic optimal control. The discrete time case / rank
 
Normal rank
Property / cites work
 
Property / cites work: Stochastic Games / rank
 
Normal rank
Property / cites work
 
Property / cites work: Distributed dynamic programming / rank
 
Normal rank
Property / DOI
 
Property / DOI: 10.1007/S11768-011-1005-3 / rank
 
Normal rank

Latest revision as of 03:15, 20 December 2024

scientific article
Language: English
Label: Approximate policy iteration: a survey and some new methods
Description: scientific article

    Statements

    Approximate policy iteration: a survey and some new methods (English)
    0 references
    1 June 2012
    0 references
    policy iteration
    0 references
    projected equation
    0 references
    aggregation
    0 references
    chattering
    0 references
    regularization
    0 references

    Identifiers