Is Q-Learning Minimax Optimal? A Tight Sample Complexity Analysis
Publication:6198738
DOI: 10.1287/opre.2023.2450, arXiv: 2102.06548, MaRDI QID: Q6198738
Yuxin Chen, Yuting Wei, Yuejie Chi, Unnamed Author, Changxiao Cai
Publication date: 20 March 2024
Published in: Operations Research
Full work available at URL: https://arxiv.org/abs/2102.06548
Keywords: lower bound, overestimation, temporal difference learning, Q-learning, minimax optimality, sample complexity, effective horizon