A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications
From MaRDI portal
Publication:2887630
DOI: 10.1007/s11768-011-0313-y
zbMath: 1249.90306
Wikidata: Q115144927 (Scholia: Q115144927)
MaRDI QID: Q2887630
Publication date: 1 June 2012
Published in: Journal of Control Theory and Applications
Full work available at URL: https://doi.org/10.1007/s11768-011-0313-y
68T05: Learning and adaptive systems in artificial intelligence
90C59: Approximation methods and heuristics in mathematical programming
90C39: Dynamic programming
Related Items
- A perturbation approach to approximate value iteration for average cost Markov decision processes with Borel spaces and bounded costs
- Potential-based least-squares policy iteration for a parameterized feedback control system
- New approximate dynamic programming algorithms for large-scale undiscounted Markov decision processes and their application to optimize a production and distribution system
- A perturbation approach to a class of discounted approximate value iteration algorithms with Borel spaces
- Perspectives of approximate dynamic programming
- Temporal difference-based policy iteration for optimal control of stochastic systems
Cites Work
- Neural network approach to continuous-time direct adaptive optimal control for partially unknown nonlinear systems
- Markov chains and stochastic stability
- Simulation-based algorithms for Markov decision processes.
- Model-free \(Q\)-learning designs for linear discrete-time zero-sum games with application to \(H^\infty\) control
- Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
- Adaptive optimal control for continuous-time linear systems based on policy iteration
- Generalized polynomial approximations in Markovian decision processes
- Stochastic optimal control. The discrete time case
- Recursive estimation of regression functions by local polynomial fitting
- Kernel-based reinforcement learning
- Practical issues in temporal difference learning
- \({\mathcal Q}\)-learning
- Simulation-based optimization: Parametric optimization techniques and reinforcement learning
- Feature-based methods for large scale dynamic programming
- Some results on Tchebycheffian spline functions and stochastic processes
- DOI: 10.1162/153244303768966102
- Functional Approximations and Dynamic Programming
- Approximations of Dynamic Programs, I
- An analysis of temporal-difference learning with function approximation
- The policy iteration algorithm for average reward Markov decision processes with general state space
- DOI: 10.1162/1532443041827907
- Policy Iterations on the Hamilton–Jacobi–Isaacs Equation for $H_{\infty}$ State Feedback Control With Input Saturation
- Approximate Dynamic Programming
- The Kernel Recursive Least-Squares Algorithm
- Neuro-Dynamic Programming: An Overview and Recent Results
- The elements of statistical learning. Data mining, inference, and prediction