Approximate policy iteration: a survey and some new methods (Q2887629): Difference between revisions

@@ Property / DOI @@
-10.1007/s11768-011-1005-3
@@ Property / DOI: 10.1007/s11768-011-1005-3 / rank @@
-Normal rank
@@ Property / full work available at URL @@
+https://doi.org/10.1007/s11768-011-1005-3
+Normal rank
@@ Property / OpenAlex ID @@
+W2124477018
@@ Property / OpenAlex ID: W2124477018 / rank @@
+Normal rank
@@ Property / cites work @@
+Simulation-based optimization: Parametric optimization techniques and reinforcement learning
+Normal rank
@@ Property / cites work @@
+Q5425954
@@ Property / cites work: Q5425954 / rank @@
+Normal rank
@@ Property / cites work @@
+Simulation-based algorithms for Markov decision processes.
+Normal rank
@@ Property / cites work @@
+Control Techniques for Complex Networks
@@ Property / cites work: Control Techniques for Complex Networks / rank @@
+Normal rank
@@ Property / cites work @@
+Approximate Dynamic Programming
@@ Property / cites work: Approximate Dynamic Programming / rank @@
+Normal rank
@@ Property / cites work @@
+Q4315289
@@ Property / cites work: Q4315289 / rank @@
+Normal rank
@@ Property / cites work @@
+Neuro-Dynamic Programming: An Overview and Recent Results
+Normal rank
@@ Property / cites work @@
+Basis function adaptation in temporal difference reinforcement learning
+Normal rank
@@ Property / cites work @@
+Projected equation methods for approximate solution of large linear systems
+Normal rank
@@ Property / cites work @@
+Practical issues in temporal difference learning
@@ Property / cites work: Practical issues in temporal difference learning / rank @@
+Normal rank
@@ Property / cites work @@
+On the Convergence of Stochastic Iterative Dynamic Programming Algorithms
+Normal rank
@@ Property / cites work @@
+An analysis of temporal-difference learning with function approximation
+Normal rank
@@ Property / cites work @@
+Average cost temporal-difference learning
@@ Property / cites work: Average cost temporal-difference learning / rank @@
+Normal rank
@@ Property / cites work @@
+Q5477859
@@ Property / cites work: Q5477859 / rank @@
+Normal rank
@@ Property / cites work @@
+Q3316508
@@ Property / cites work: Q3316508 / rank @@
+Normal rank
@@ Property / cites work @@
+Error Bounds for Approximations from Projected Linear Equations
+Normal rank
@@ Property / cites work @@
+Feature-based methods for large scale dynamic programming
+Normal rank
@@ Property / cites work @@
+Performance Loss Bounds for Approximate Value Iteration with State Aggregation
+Normal rank
@@ Property / cites work @@
+Convergence Results for Some Temporal Difference Methods Based on Least Squares
+Normal rank
@@ Property / cites work @@
+Q4257216
@@ Property / cites work: Q4257216 / rank @@
+Normal rank
@@ Property / cites work @@
+Projection methods for variational inequalities with application to the traffic assignment problem
+Normal rank
@@ Property / cites work @@
+Monotone Operators and the Proximal Point Algorithm
+Normal rank
@@ Property / cites work @@
+Q3102800
@@ Property / cites work: Q3102800 / rank @@
+Normal rank
@@ Property / cites work @@
+Q4106692
@@ Property / cites work: Q4106692 / rank @@
+Normal rank
@@ Property / cites work @@
+Q4337625
@@ Property / cites work: Q4337625 / rank @@
+Normal rank
@@ Property / cites work @@
+Technical update: Least-squares temporal difference learning
+Normal rank
@@ Property / cites work @@
+Least squares policy evaluation algorithms with linear function approximation
+Normal rank
@@ Property / cites work @@
+A generalized Kalman filter for fixed point approximation and efficient temporal-difference learning
+Normal rank
@@ Property / cites work @@
+Monte Carlo strategies in scientific computing
@@ Property / cites work: Monte Carlo strategies in scientific computing / rank @@
+Normal rank
@@ Property / cites work @@
+Simulation and the Monte Carlo Method
@@ Property / cites work: Simulation and the Monte Carlo Method / rank @@
+Normal rank
@@ Property / cites work @@
+A Quasi Monte Carlo Method for Large-Scale Inverse Problems
+Normal rank
@@ Property / cites work @@
+Stochastic approximation. A dynamical systems viewpoint.
+Normal rank
@@ Property / cites work @@
+Q-Learning and Enhanced Policy Iteration in Discounted Dynamic Programming
+Normal rank
@@ Property / cites work @@
+Optimal stopping of Markov processes: Hilbert space theory, approximation algorithms, and an application to pricing high-dimensional financial derivatives
+Normal rank
@@ Property / cites work @@
+10.1162/1532443041827907
@@ Property / cites work: 10.1162/1532443041827907 / rank @@
+Normal rank
@@ Property / cites work @@
+Approximate Dynamic Programming via a Smoothed Linear Program
+Normal rank
@@ Property / cites work @@
+Q5201298
@@ Property / cites work: Q5201298 / rank @@
+Normal rank
@@ Property / cites work @@
+Learning Tetris Using the Noisy Cross-Entropy Method
+Normal rank
@@ Property / cites work @@
+A tutorial on the cross-entropy method
@@ Property / cites work: A tutorial on the cross-entropy method / rank @@
+Normal rank
@@ Property / cites work @@
+Contraction Mappings in the Theory Underlying Dynamic Programming
+Normal rank
@@ Property / cites work @@
+Monotone Mappings with Application in Dynamic Programming
+Normal rank
@@ Property / cites work @@
+Stochastic optimal control. The discrete time case
+Normal rank
@@ Property / cites work @@
+Stochastic Games
@@ Property / cites work: Stochastic Games / rank @@
+Normal rank
@@ Property / cites work @@
+Distributed dynamic programming
@@ Property / cites work: Distributed dynamic programming / rank @@
+Normal rank
@@ Property / DOI @@
+10.1007/S11768-011-1005-3
@@ Property / DOI: 10.1007/S11768-011-1005-3 / rank @@
+Normal rank