Accelerated and Instance-Optimal Policy Evaluation with Linear Function Approximation
From MaRDI portal
DOI: 10.1137/21M1468668
MaRDI QID: Q5885838
Authors: Ashwin Pananjady, Tianjiao Li, Guanghui Lan
Publication date: 30 March 2023
Published in: SIAM Journal on Mathematics of Data Science
Full work available at URL: https://arxiv.org/abs/2112.13109
Mathematics Subject Classification:
- Inference from stochastic processes and prediction (62M20)
- Analysis of algorithms and problem complexity (68Q25)
- Abstract computational complexity for mathematical programming problems (90C60)
- Stochastic programming (90C15)
- Estimation and detection in stochastic control theory (93E10)
Cites Work
- Minimizing finite sums with the stochastic average gradient
- On optimality of Krylov's information when solving linear operator equations
- Information-based complexity of linear operator equations
- Introductory lectures on convex optimization. A basic course.
- Asymptotics in statistics. Some basic concepts.
- Lower complexity bounds of first-order methods for convex-concave bilinear saddle-point problems
- Mixing time estimation in reversible Markov chains from a single sample path
- On the Almost Sure Rate of Convergence of Linear Stochastic Approximation Algorithms
- Acceleration of Stochastic Approximation by Averaging
- An analysis of temporal-difference learning with function approximation
- The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
- Instance-Dependent ℓ∞-Bounds for Policy Evaluation in Tabular Reinforcement Learning
- Is Temporal Difference Learning Optimal? An Instance-Dependent Analysis
- A Proximal Stochastic Gradient Method with Progressive Variance Reduction
- A Convergent Incremental Gradient Method with a Constant Step Size