Efficiently Breaking the Curse of Horizon in Off-Policy Evaluation with Double Reinforcement Learning (Q5060503): Difference between revisions

From MaRDI portal
Property / describes a project that uses: OpenAI Gym / Normal rank
Property / MaRDI profile type: MaRDI publication profile / Normal rank
Property / OpenAlex ID: W2994709386 / Normal rank
Property / cites work: Q2925454 / Normal rank
Property / cites work: Q4399904 / Normal rank
Property / cites work: Basic properties of strong mixing conditions. A survey and some open questions / Normal rank
Property / cites work: Sieve Extremum Estimates for Weakly Dependent Data / Normal rank
Property / cites work: Double/debiased machine learning for treatment and structural parameters / Normal rank
Property / cites work: Doubly robust policy evaluation and optimization / Normal rank
Property / cites work: Efficient estimation of panel data models with sequential moment restrictions / Normal rank
Property / cites work: Efficient Estimation of Average Treatment Effects Using the Estimated Propensity Score / Normal rank
Property / cites work: Q5624460 / Normal rank
Property / cites work: On the Markov chain central limit theorem / Normal rank
Property / cites work: Q5148951 / Normal rank
Property / cites work: Irregular Identification, Support Conditions, and Inverse Weight Estimation / Normal rank
Property / cites work: Consistent estimation of the influence function of locally asymptotically linear estimators / Normal rank
Property / cites work: Introduction to empirical processes and semiparametric inference / Normal rank
Property / cites work: 10.1162/1532443041827907 / Normal rank
Property / cites work: Estimating Dynamic Treatment Regimes in Mobile Health Using V-Learning / Normal rank
Property / cites work: Markov Chains and Stochastic Stability / Normal rank
Property / cites work: Optimal Dynamic Treatment Regimes / Normal rank
Property / cites work: Marginal Mean Models for Dynamic Regimes / Normal rank
Property / cites work: Least squares policy evaluation algorithms with linear function approximation / Normal rank
Property / cites work: Semiparametric efficiency bounds / Normal rank
Property / cites work: Estimation of Regression Coefficients When Some Regressors Are Not Always Observed / Normal rank
Property / cites work: Characterization of parameters with a mixed bias property / Normal rank
Property / cites work: Adjusting for Nonignorable Drop-Out Using Semiparametric Nonresponse Models / Normal rank
Property / cites work: Q4626283 / Normal rank
Property / cites work: Comment: Understanding OR, PS and DR / Normal rank
Property / cites work: Semiparametric theory and missing data. / Normal rank
Property / cites work: Q5396665 / Normal rank
Property / cites work: Asymptotic Statistics / Normal rank
Property / cites work: Least Squares Temporal Difference Methods: An Analysis under General Conditions / Normal rank
Property / cites work: On Generalized Bellman Equations and Temporal-Difference Learning / Normal rank

Latest revision as of 06:27, 31 July 2024

scientific article; zbMATH DE number 7640294
Language: English
Label: Efficiently Breaking the Curse of Horizon in Off-Policy Evaluation with Double Reinforcement Learning
Description: scientific article; zbMATH DE number 7640294

    Statements

    Efficiently Breaking the Curse of Horizon in Off-Policy Evaluation with Double Reinforcement Learning (English)
    10 January 2023
    off-policy evaluation
    Markov decision processes
    infinite horizon
    semiparametric efficiency

    Identifiers