A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications
From MaRDI portal
Publication:2887630
Recommendations
- Continuous state dynamic programming via nonexpansive approximation
- Perspectives of approximate dynamic programming
- Dynamic programming and value-function approximation in sequential decision problems: error analysis and numerical results
- A new algorithm for the continuous dynamic programming
- A new algorithm for multidimensional continuing dynamic programming
Cites work
- scientific article; zbMATH DE number 5957492 (no title available)
- scientific article; zbMATH DE number 3126094 (no title available)
- scientific article; zbMATH DE number 3148886 (no title available)
- scientific article; zbMATH DE number 3594403 (no title available)
- scientific article; zbMATH DE number 1241609 (no title available)
- scientific article; zbMATH DE number 1321699 (no title available)
- scientific article; zbMATH DE number 700091 (no title available)
- scientific article; zbMATH DE number 3452897 (no title available)
- scientific article; zbMATH DE number 795580 (no title available)
- scientific article; zbMATH DE number 1392848 (no title available)
- DOI: 10.1162/153244303768966102
- DOI: 10.1162/1532443041827907
- Adaptive optimal control for continuous-time linear systems based on policy iteration
- An analysis of temporal-difference learning with function approximation
- Approximate Dynamic Programming
- Approximations of Dynamic Programs, I
- Feature-based methods for large scale dynamic programming
- Finite-time bounds for fitted value iteration
- Functional Approximations and Dynamic Programming
- Generalized polynomial approximations in Markovian decision processes
- Kernel-based reinforcement learning
- Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
- Linear least-squares algorithms for temporal difference learning
- Markov chains and stochastic stability
- Model-free Q-learning designs for linear discrete-time zero-sum games with application to $H_{\infty}$ control
- Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems
- Neuro-Dynamic Programming: An Overview and Recent Results
- Policy Iterations on the Hamilton–Jacobi–Isaacs Equation for $H_{\infty}$ State Feedback Control With Input Saturation
- Practical issues in temporal difference learning
- Recursive estimation of regression functions by local polynomial fitting
- Regularized policy iteration with nonparametric function spaces
- Simulation-based algorithms for Markov decision processes
- Simulation-based optimization: Parametric optimization techniques and reinforcement learning
- Some results on Tchebycheffian spline functions and stochastic processes
- Stochastic optimal control. The discrete time case
- The Kernel Recursive Least-Squares Algorithm
- The elements of statistical learning. Data mining, inference, and prediction
- The policy iteration algorithm for average reward Markov decision processes with general state space
- \({\mathcal Q}\)-learning
Cited in (14 documents)
- A new algorithm for the continuous dynamic programming
- A new algorithm for multidimensional continuing dynamic programming
- Potential-based least-squares policy iteration for a parameterized feedback control system
- Temporal difference-based policy iteration for optimal control of stochastic systems
- Off-line approximate dynamic programming for the vehicle routing problem with a highly variable customer basis and stochastic demands
- Approximate dynamic programming via direct search in the space of value function approximations
- New approximate dynamic programming algorithms for large-scale undiscounted Markov decision processes and their application to optimize a production and distribution system
- A perturbation approach to a class of discounted approximate value iteration algorithms with Borel spaces
- Perspectives of approximate dynamic programming
- A perturbation approach to approximate value iteration for average cost Markov decision processes with Borel spaces and bounded costs
- Convergence of deep fictitious play for stochastic differential games
- Least squares policy iteration with instrumental variables vs. direct policy search: comparison against optimal benchmarks using energy storage
- Continuous state dynamic programming via nonexpansive approximation
- Approximate dynamic programming based on high dimensional model representation