Policy space identification in configurable environments
From MaRDI portal
Publication:2163245
DOI: 10.1007/s10994-021-06033-3
OpenAlex: W3196519516
MaRDI QID: Q2163245
Guglielmo Manneschi, Alberto Maria Metelli, Marcello Restelli
Publication date: 10 August 2022
Published in: Machine Learning
Full work available at URL: https://arxiv.org/abs/1909.03984
Keywords: likelihood ratio test, reinforcement learning, configurable Markov decision processes, policy space identification
Cites Work
- The philosophy of Bayes factors and the quantification of statistical evidence
- A tail inequality for quadratic forms of subgaussian random vectors
- Learning near-optimal policies with Bellman-residual minimization based fitted policy iteration and a single sample path
- Rates of convergence for empirical processes of stationary mixing sequences
- Generalized inverses. Theory and applications.
- Concentration Inequalities
- Nonasymptotic sequential tests for overlapping hypotheses applied to near-optimal arm identification in bandit models
- Identification in Parametric Models
- The Large-Sample Distribution of the Likelihood Ratio for Testing Composite Hypotheses