A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications
From MaRDI portal
Publication:2887630
DOI: 10.1007/s11768-011-0313-y
zbMath: 1249.90306
OpenAlex: W2044287460
Wikidata: Q115144927 (Scholia: Q115144927)
MaRDI QID: Q2887630
Publication date: 1 June 2012
Published in: Journal of Control Theory and Applications
Full work available at URL: https://doi.org/10.1007/s11768-011-0313-y
Learning and adaptive systems in artificial intelligence (68T05)
Approximation methods and heuristics in mathematical programming (90C59)
Dynamic programming (90C39)
Related Items (9)
- Least squares policy iteration with instrumental variables vs. direct policy search: comparison against optimal benchmarks using energy storage
- Potential-based least-squares policy iteration for a parameterized feedback control system
- New approximate dynamic programming algorithms for large-scale undiscounted Markov decision processes and their application to optimize a production and distribution system
- A perturbation approach to a class of discounted approximate value iteration algorithms with Borel spaces
- Perspectives of approximate dynamic programming
- Convergence of deep fictitious play for stochastic differential games
- Off-line approximate dynamic programming for the vehicle routing problem with a highly variable customer basis and stochastic demands
- Temporal difference-based policy iteration for optimal control of stochastic systems
- A perturbation approach to approximate value iteration for average cost Markov decision processes with Borel spaces and bounded costs
Uses Software
Cites Work
- Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems
- Markov chains and stochastic stability
- Simulation-based algorithms for Markov decision processes.
- Model-free \(Q\)-learning designs for linear discrete-time zero-sum games with application to \(H^\infty\) control
- Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
- Adaptive optimal control for continuous-time linear systems based on policy iteration
- Generalized polynomial approximations in Markovian decision processes
- Stochastic optimal control. The discrete time case
- Recursive estimation of regression functions by local polynomial fitting
- Kernel-based reinforcement learning
- Practical issues in temporal difference learning
- \({\mathcal Q}\)-learning
- Simulation-based optimization: Parametric optimization techniques and reinforcement learning
- Feature-based methods for large scale dynamic programming
- Some results on Tchebycheffian spline functions and stochastic processes
- doi:10.1162/153244303768966102
- Functional Approximations and Dynamic Programming
- Approximations of Dynamic Programs, I
- An analysis of temporal-difference learning with function approximation
- The policy iteration algorithm for average reward Markov decision processes with general state space
- doi:10.1162/1532443041827907
- Policy Iterations on the Hamilton–Jacobi–Isaacs Equation for $H_{\infty}$ State Feedback Control With Input Saturation
- Approximate Dynamic Programming
- The Kernel Recursive Least-Squares Algorithm
- Neuro-Dynamic Programming: An Overview and Recent Results
- The elements of statistical learning. Data mining, inference, and prediction