Least squares policy evaluation algorithms with linear function approximation


Publication:1870310

DOI: 10.1023/A:1022192903948
zbMath: 1030.93061
MaRDI QID: Q1870310

Dimitri P. Bertsekas, Angelia Nedić

Publication date: 11 May 2003

Published in: Discrete Event Dynamic Systems




Related Items (22)

Approximate policy iteration: a survey and some new methods
Potential-based least-squares policy iteration for a parameterized feedback control system
An online prediction algorithm for reinforcement learning with linear function approximation using cross entropy method
Efficiently Breaking the Curse of Horizon in Off-Policy Evaluation with Double Reinforcement Learning
Batch mode reinforcement learning based on the synthesis of artificial trajectories
A concentration bound for \(\operatorname{LSPE}(\lambda)\)
Reinforcement learning algorithms with function approximation: recent advances and applications
Temporal difference-based policy iteration for optimal control of stochastic systems
Asymptotic analysis of temporal-difference learning algorithms with constant step-sizes
Dynamic modeling and control of supply chain systems: A review
Asymptotic analysis of temporal-difference learning algorithms with constant step-sizes
Real-time reinforcement learning by sequential actor-critics and experience replay
Proximal algorithms and temporal difference methods for solving fixed point problems
Kernel dynamic policy programming: applicable reinforcement learning to robot systems with high dimensional states
A note on linear function approximation using random projections
A formal framework and extensions for function approximation in learning classifier systems
Projected equation methods for approximate solution of large linear systems
Variance Regularization in Sequential Bayesian Optimization
Transmission scheduling for multi-process multi-sensor remote estimation via approximate dynamic programming
Finite-Time Performance of Distributed Temporal-Difference Learning with Linear Function Approximation
Allocating resources via price management systems: a dynamic programming-based approach
Unnamed Item






