A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications
DOI: 10.1007/s11768-011-0313-y · zbMATH Open: 1249.90306 · OpenAlex: W2044287460 · Wikidata: Q115144927 · MaRDI QID: Q2887630
Authors: Warren Powell, Jun Ma
Publication date: 1 June 2012
Published in: Journal of Control Theory and Applications
Full work available at URL: https://doi.org/10.1007/s11768-011-0313-y
Recommendations
- Continuous state dynamic programming via nonexpansive approximation
- Perspectives of approximate dynamic programming
- Dynamic programming and value-function approximation in sequential decision problems: error analysis and numerical results
- A new algorithm for the continuous dynamic programming
- A new algorithm for multidimensional continuing dynamic programming
Classification
- Learning and adaptive systems in artificial intelligence (68T05)
- Approximation methods and heuristics in mathematical programming (90C59)
- Dynamic programming (90C39)
Cites Work
- The elements of statistical learning. Data mining, inference, and prediction
- Some results on Tchebycheffian spline functions and stochastic processes
- Markov chains and stochastic stability
- Q-learning
- Stochastic optimal control. The discrete time case
- Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems
- Policy Iterations on the Hamilton–Jacobi–Isaacs Equation for H∞ State Feedback Control With Input Saturation
- Approximate Dynamic Programming
- The Kernel Recursive Least-Squares Algorithm
- Feature-based methods for large scale dynamic programming
- The policy iteration algorithm for average reward Markov decision processes with general state space
- DOI: 10.1162/1532443041827907
- Neuro-Dynamic Programming: An Overview and Recent Results
- Linear least-squares algorithms for temporal difference learning
- Functional Approximations and Dynamic Programming
- An analysis of temporal-difference learning with function approximation
- Generalized polynomial approximations in Markovian decision processes
- Kernel-based reinforcement learning
- Simulation-based algorithms for Markov decision processes.
- Approximations of Dynamic Programs, I
- Adaptive optimal control for continuous-time linear systems based on policy iteration
- Finite-time bounds for fitted value iteration
- Simulation-based optimization: Parametric optimization techniques and reinforcement learning
- Model-free Q-learning designs for linear discrete-time zero-sum games with application to H∞ control
- Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
- DOI: 10.1162/153244303768966102
- Practical issues in temporal difference learning
- Recursive estimation of regression functions by local polynomial fitting
- Regularized policy iteration with nonparametric function spaces
Cited In (14)
- A new algorithm for the continuous dynamic programming
- A new algorithm for multidimensional continuing dynamic programming
- Potential-based least-squares policy iteration for a parameterized feedback control system
- Temporal difference-based policy iteration for optimal control of stochastic systems
- Off-line approximate dynamic programming for the vehicle routing problem with a highly variable customer basis and stochastic demands
- Approximate dynamic programming via direct search in the space of value function approximations
- New approximate dynamic programming algorithms for large-scale undiscounted Markov decision processes and their application to optimize a production and distribution system
- A perturbation approach to approximate value iteration for average cost Markov decision processes with Borel spaces and bounded costs
- A perturbation approach to a class of discounted approximate value iteration algorithms with Borel spaces
- Perspectives of approximate dynamic programming
- Least squares policy iteration with instrumental variables vs. direct policy search: comparison against optimal benchmarks using energy storage
- Convergence of deep fictitious play for stochastic differential games
- Continuous state dynamic programming via nonexpansive approximation
- Approximate dynamic programming based on high dimensional model representation