Approximate policy iteration: a survey and some new methods

From MaRDI portal

Publication:2887629

DOI: 10.1007/S11768-011-1005-3
zbMath: 1249.90179
OpenAlex: W2124477018
MaRDI QID: Q2887629

Dimitri P. Bertsekas

Publication date: 1 June 2012

Published in: Journal of Control Theory and Applications

Full work available at URL: https://doi.org/10.1007/s11768-011-1005-3




Related Items (37)

Undiscounted control policy generation for continuous-valued optimal control by approximate dynamic programming
Potential-based least-squares policy iteration for a parameterized feedback control system
A partial history of the early development of continuous-time nonlinear stochastic systems theory
Value iteration and adaptive dynamic programming for data-driven adaptive optimal control design
New approximate dynamic programming algorithms for large-scale undiscounted Markov decision processes and their application to optimize a production and distribution system
Approximate dynamic programming for the dispatch of military medical evacuation assets
A perturbation approach to a class of discounted approximate value iteration algorithms with Borel spaces
Perspectives of approximate dynamic programming
Discrete-time dynamic graphical games: model-free reinforcement learning solution
Human motor learning is robust to control-dependent noise
Approximate dynamic programming for the military inventory routing problem
Simple and Optimal Methods for Stochastic Variational Inequalities, I: Operator Extrapolation
Improved value iteration for neural-network-based stochastic optimal control design
Robust adaptive dynamic programming for linear and nonlinear systems: an overview
H optimal control of unknown linear systems by adaptive dynamic programming with applications to time-delay systems
Smoothing policies and safe policy gradients
A Lyapunov-based version of the value iteration algorithm formulated as a discrete-time switched affine system
Unnamed Item
Data-driven optimal control via linear transfer operators: a convex approach
On the Taylor Expansion of Value Functions
Temporal difference-based policy iteration for optimal control of stochastic systems
Approximate dynamic programming for missile defense interceptor fire control
A Machine Learning Approach to Adaptive Robust Utility Maximization and Hedging
Proximal algorithms and temporal difference methods for solving fixed point problems
Tracking control optimization scheme for a class of partially unknown fuzzy systems by using integral reinforcement learning architecture
An Approximate Dynamic Programming Algorithm for Monotone Value Functions
Empirical Dynamic Programming
A perturbation approach to approximate value iteration for average cost Markov decision processes with Borel spaces and bounded costs
Time-varying Markov decision processes with state-action-dependent discount factors and unbounded costs
Bias-policy iteration based adaptive dynamic programming for unknown continuous-time linear systems
Convex optimization with an interpolation-based projection and its application to deep learning
Incremental constraint projection methods for variational inequalities
Multiply Accelerated Value Iteration for NonSymmetric Affine Fixed Point Problems and Application to Markov Decision Processes
Robust Reinforcement Learning for Stochastic Linear Quadratic Control with Multiplicative Noise
Approximative Policy Iteration for Exit Time Feedback Control Problems Driven by Stochastic Differential Equations using Tensor Train Format
Allocating resources via price management systems: a dynamic programming-based approach
Unnamed Item


Uses Software



Cites Work




This page was built for publication: Approximate policy iteration: a survey and some new methods