Proximal algorithms and temporal difference methods for solving fixed point problems

DOI: 10.1007/S10589-018-9990-5
zbMATH Open: 1471.90159
OpenAlex: W2791413585
MaRDI QID: Q721950
FDO: Q721950

Publication date: 20 July 2018

Published in: Computational Optimization and Applications

Full work available at URL: https://hdl.handle.net/1721.1/131865

Recommendations

Proximal gradient temporal difference learning: stable reinforcement learning with polynomial sample complexity
On the existence of fixed points for approximate value iteration and temporal-difference learning
Linear least-squares algorithms for temporal difference learning
Publication:3035147
On the convergence of temporal-difference learning with linear function approximation

zbMATH Keywords

convex optimization; dynamic programming; fixed point problems; proximal algorithm; temporal differences

Mathematics Subject Classification ID

Convex programming (90C25); Dynamic programming (90C39)

Cites Work

A randomized Kaczmarz algorithm with exponential convergence
Splitting Algorithms for the Sum of Two Nonlinear Operators
Convex analysis and monotone operator theory in Hilbert spaces
On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators
Sampling algorithms for $l_2$ regression and applications
Relative-Error $CUR$ Matrix Decompositions
Title not available
Faster least squares approximation
Applications of a Splitting Algorithm to Decomposition in Convex Programming and Variational Inequalities
Finite-Dimensional Variational Inequalities and Complementarity Problems
Monotone Operators and the Proximal Point Algorithm
Title not available
Title not available
Randomized methods for linear constraints: convergence rates and conditioning
Title not available
Gradient-based algorithms with applications to signal-recovery problems
Approximate Dynamic Programming
Least squares policy evaluation algorithms with linear function approximation
Incremental constraint projection methods for variational inequalities
Approximate policy iteration: a survey and some new methods
10.1162/1532443041827907
Optimal adaptive control and differential games by reinforcement learning principles
Linear least-squares algorithms for temporal difference learning
Approximate dynamic programming with a fuzzy parameterization
Title not available
An Analysis of Stochastic Shortest Path Problems
An analysis of temporal-difference learning with function approximation
Dynamic programming and optimal control. Vol. 2
Fast Monte Carlo Algorithms for Matrices II: Computing a Low-Rank Approximation to a Matrix
Algorithms for reinforcement learning.
Abstract dynamic programming
Q-learning and enhanced policy iteration in discounted dynamic programming
Projected equation methods for approximate solution of large linear systems
Technical update: Least-squares temporal difference learning
Q-learning and policy iteration algorithms for stochastic shortest path problems
Fast Monte Carlo Algorithms for Matrices I: Approximating Matrix Multiplication
On the method of multipliers for convex programming
Convergence Results for Some Temporal Difference Methods Based on Least Squares
Convex optimization algorithms
A Retrospective and Prospective Survey of the Monte Carlo Method
A note on the behavior of the randomized Kaczmarz algorithm of Strohmer and Vershynin
Error bounds for approximations from projected linear equations
Title not available
Least squares temporal difference methods: An analysis under general conditions
Stabilization of stochastic iterative methods for singular and nearly singular linear systems
Temporal Difference Methods for General Projected Equations
Performance bounds for $\lambda$ policy iteration and application to the game of Tetris
Near-optimal column-based matrix reconstruction
Title not available

Cited In (5)

Proximal gradient temporal difference learning: stable reinforcement learning with polynomial sample complexity
On the existence of fixed points for approximate value iteration and temporal-difference learning
A proximal algorithm with quasi distance. Application to habit's formation
Extension of $\lambda$-PIR for weakly contractive operators via fixed point theory
The prox-Tikhonov-like forward-backward method and applications

Uses Software

Approxrl
