Proximal algorithms and temporal difference methods for solving fixed point problems
Publication:721950
DOI: 10.1007/S10589-018-9990-5
zbMATH Open: 1471.90159
OpenAlex: W2791413585
MaRDI QID: Q721950
FDO: Q721950
Authors: Dimitri P. Bertsekas
Publication date: 20 July 2018
Published in: Computational Optimization and Applications
Full work available at URL: https://hdl.handle.net/1721.1/131865
Recommendations
- Proximal gradient temporal difference learning: stable reinforcement learning with polynomial sample complexity
- On the existence of fixed points for approximate value iteration and temporal-difference learning
- Linear least-squares algorithms for temporal difference learning
- Publication:3035147
- On the convergence of temporal-difference learning with linear function approximation
Cites Work
- A randomized Kaczmarz algorithm with exponential convergence
- Splitting Algorithms for the Sum of Two Nonlinear Operators
- Convex analysis and monotone operator theory in Hilbert spaces
- On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators
- Sampling algorithms for \(l_2\) regression and applications
- Relative-Error CUR Matrix Decompositions
- Title not available
- Faster least squares approximation
- Applications of a Splitting Algorithm to Decomposition in Convex Programming and Variational Inequalities
- Finite-Dimensional Variational Inequalities and Complementarity Problems
- Monotone Operators and the Proximal Point Algorithm
- Title not available
- Title not available
- Randomized methods for linear constraints: convergence rates and conditioning
- Title not available
- Gradient-based algorithms with applications to signal-recovery problems
- Approximate Dynamic Programming
- Least squares policy evaluation algorithms with linear function approximation
- Incremental constraint projection methods for variational inequalities
- Approximate policy iteration: a survey and some new methods
- 10.1162/1532443041827907
- Optimal Adaptive Control and Differential Games by Reinforcement Learning Principles
- Linear least-squares algorithms for temporal difference learning
- Approximate dynamic programming with a fuzzy parameterization
- Title not available
- An Analysis of Stochastic Shortest Path Problems
- An analysis of temporal-difference learning with function approximation
- Dynamic programming and optimal control. Vol. 2
- Fast Monte Carlo Algorithms for Matrices II: Computing a Low-Rank Approximation to a Matrix
- Algorithms for reinforcement learning.
- Title not available
- Q-learning and enhanced policy iteration in discounted dynamic programming
- Projected equation methods for approximate solution of large linear systems
- Technical update: Least-squares temporal difference learning
- Q-learning and policy iteration algorithms for stochastic shortest path problems
- Fast Monte Carlo Algorithms for Matrices I: Approximating Matrix Multiplication
- On the method of multipliers for convex programming
- Convergence Results for Some Temporal Difference Methods Based on Least Squares
- Title not available
- A Retrospective and Prospective Survey of the Monte Carlo Method
- A note on the behavior of the randomized Kaczmarz algorithm of Strohmer and Vershynin
- Error bounds for approximations from projected linear equations
- Title not available
- Least squares temporal difference methods: An analysis under general conditions
- Stabilization of Stochastic Iterative Methods for Singular and Nearly Singular Linear Systems
- Temporal Difference Methods for General Projected Equations
- Title not available
- Near-Optimal Column-Based Matrix Reconstruction
- Title not available
Cited In (4)