Simple and Optimal Methods for Stochastic Variational Inequalities, II: Markovian Noise and Policy Evaluation in Reinforcement Learning (Q5081106): Difference between revisions

Revision as of 04:18, 29 July 2024

scientific article; zbMATH DE number 7535638

Language	Label	Description	Also known as
English	Simple and Optimal Methods for Stochastic Variational Inequalities, II: Markovian Noise and Policy Evaluation in Reinforcement Learning	scientific article; zbMATH DE number 7535638

Statements

instance of

scholarly article

0 references

title

Simple and Optimal Methods for Stochastic Variational Inequalities, II: Markovian Noise and Policy Evaluation in Reinforcement Learning (English)

0 references

published in

SIAM Journal on Optimization

0 references

publication date

1 June 2022

0 references

full work available at URL

https://arxiv.org/abs/2011.08434

0 references

zbMATH Keywords

variational inequality

0 references

operator extrapolation

0 references

acceleration

0 references

reinforcement learning

0 references

temporal difference learning

0 references

stochastic policy evaluation

0 references

MaRDI profile type

MaRDI publication profile

0 references

cites work

Q3997575

0 references

Stable Optimal Control and Semicontractive Dynamic Programming

0 references

Stochastic optimal control. The discrete time case

0 references

Q2934010

0 references

The minimax learning rates of normal and Ising undirected graphical models

0 references

Ergodic Mirror Descent

0 references

Optimal Stochastic Approximation Algorithms for Strongly Convex Stochastic Composite Optimization I: A Generic Algorithmic Framework

0 references

Mixing time estimation in reversible Markov chains from a single sample path

0 references

Statistical Inference via Convex Optimization

0 references

Deterministic and stochastic primal-dual subgradient algorithms for uniformly convex minimization

0 references

On Actor-Critic Algorithms

0 references

Q4421713

0 references

10.1162/1532443041827907

0 references

First-order and stochastic optimization methods for machine learning

0 references

Q4595047

0 references

Markov Chains and Stochastic Stability

0 references

Information-based complexity of linear operator equations

0 references

Q4315289

0 references

Convergence Rates for Markov Chains

0 references

Introduction to Stochastic Search and Optimization

0 references

Q4626283

0 references

An analysis of temporal-difference learning with function approximation

0 references

Identifiers

zbMATH Open document ID

1493.90205

0 references

DOI

10.1137/20M1381691

0 references

Mathematics Subject Classification ID

0 references

Sitelinks

Mathematics (1 entry)

mardi Publication:5081106

@@ Property / cites work @@
+Q3997575
@@ Property / cites work: Q3997575 / rank @@
+Normal rank
@@ Property / cites work @@
+Stable Optimal Control and Semicontractive Dynamic Programming
+Normal rank
@@ Property / cites work @@
+Stochastic optimal control. The discrete time case
+Normal rank
@@ Property / cites work @@
+Q2934010
@@ Property / cites work: Q2934010 / rank @@
+Normal rank
@@ Property / cites work @@
+The minimax learning rates of normal and Ising undirected graphical models
+Normal rank
@@ Property / cites work @@
+Ergodic Mirror Descent
@@ Property / cites work: Ergodic Mirror Descent / rank @@
+Normal rank
@@ Property / cites work @@
+Optimal Stochastic Approximation Algorithms for Strongly Convex Stochastic Composite Optimization I: A Generic Algorithmic Framework
+Normal rank
@@ Property / cites work @@
+Mixing time estimation in reversible Markov chains from a single sample path
+Normal rank
@@ Property / cites work @@
+Statistical Inference via Convex Optimization
@@ Property / cites work: Statistical Inference via Convex Optimization / rank @@
+Normal rank
@@ Property / cites work @@
+Deterministic and stochastic primal-dual subgradient algorithms for uniformly convex minimization
+Normal rank
@@ Property / cites work @@
+On Actor-Critic Algorithms
@@ Property / cites work: On Actor-Critic Algorithms / rank @@
+Normal rank
@@ Property / cites work @@
+Q4421713
@@ Property / cites work: Q4421713 / rank @@
+Normal rank
@@ Property / cites work @@
+10.1162/1532443041827907
@@ Property / cites work: 10.1162/1532443041827907 / rank @@
+Normal rank
@@ Property / cites work @@
+First-order and stochastic optimization methods for machine learning
+Normal rank
@@ Property / cites work @@
+Q4595047
@@ Property / cites work: Q4595047 / rank @@
+Normal rank
@@ Property / cites work @@
+Markov Chains and Stochastic Stability
@@ Property / cites work: Markov Chains and Stochastic Stability / rank @@
+Normal rank
@@ Property / cites work @@
+Information-based complexity of linear operator equations
+Normal rank
@@ Property / cites work @@
+Q4315289
@@ Property / cites work: Q4315289 / rank @@
+Normal rank
@@ Property / cites work @@
+Convergence Rates for Markov Chains
@@ Property / cites work: Convergence Rates for Markov Chains / rank @@
+Normal rank
@@ Property / cites work @@
+Introduction to Stochastic Search and Optimization
+Normal rank
@@ Property / cites work @@
+Q4626283
@@ Property / cites work: Q4626283 / rank @@
+Normal rank
@@ Property / cites work @@
+An analysis of temporal-difference learning with function approximation
+Normal rank