Breaking the sample complexity barrier to regret-optimal model-free reinforcement learning
Publication:6039766
DOI: 10.1093/imaiai/iaac034 | zbMath: 1522.68473 | arXiv: 2110.04645 | OpenAlex: W3206149081 | MaRDI QID: Q6039766
Yuejie Chi, Laixi Shi, Yuxin Chen, Unnamed Author
Publication date: 23 May 2023
Published in: Information and Inference: A Journal of the IMA
Full work available at URL: https://arxiv.org/abs/2110.04645
variance reduction, Q-learning, upper confidence bounds, lower confidence bounds, memory efficiency, model-free RL
Learning and adaptive systems in artificial intelligence (68T05)
Markov and semi-Markov decision processes (90C40)
Online algorithms; streaming algorithms (68W27)
Computational aspects of data analysis and big data (68T09)