scientific article; zbMATH DE number 6253929
zbMath: 1280.68208; MaRDI QID: Q5396665
Shin Ishii, Shin-ichi Maeda, Motoaki Kawanabe, Tsuyoshi Ueno
Publication date: 3 February 2014
Full work available at URL: http://www.jmlr.org/papers/v12/ueno11a.html
Title: zbMATH Open Web Interface contents unavailable due to conflicting licenses.
reinforcement learning, estimating function, TD learning, model-free policy evaluation, semiparametric model
Asymptotic distribution theory in statistics (62E20), Nonparametric estimation (62G05), Learning and adaptive systems in artificial intelligence (68T05)
Related Items
Efficiently Breaking the Curse of Horizon in Off-Policy Evaluation with Double Reinforcement Learning, Online Bootstrap Inference For Policy Evaluation In Reinforcement Learning, Asymptotic analysis of value prediction by well-specified and misspecified models, On Generalized Bellman Equations and Temporal-Difference Learning