Efficiently Breaking the Curse of Horizon in Off-Policy Evaluation with Double Reinforcement Learning (Q5060503): Difference between revisions

Latest revision as of 06:27, 31 July 2024

scientific article; zbMATH DE number 7640294

Language	Label	Description	Also known as
English	Efficiently Breaking the Curse of Horizon in Off-Policy Evaluation with Double Reinforcement Learning	scientific article; zbMATH DE number 7640294

Statements

instance of

scholarly article

0 references

title

Efficiently Breaking the Curse of Horizon in Off-Policy Evaluation with Double Reinforcement Learning (English)

0 references

10 January 2023

0 references

full work available at URL

https://arxiv.org/abs/1909.05850

0 references

zbMATH Keywords

off-policy evaluation

0 references

Markov decision processes

0 references

infinite horizon

0 references

semiparametric efficiency

0 references

describes a project that uses

OpenAI Gym

0 references

MaRDI profile type

MaRDI publication profile

0 references

cites work

Q2925454

0 references

Q4399904

0 references

Basic properties of strong mixing conditions. A survey and some open questions

0 references

Sieve Extremum Estimates for Weakly Dependent Data

0 references

Double/debiased machine learning for treatment and structural parameters

0 references

Doubly robust policy evaluation and optimization

0 references

Efficient estimation of panel data models with sequential moment restrictions

0 references

Efficient Estimation of Average Treatment Effects Using the Estimated Propensity Score

0 references

Q5624460

0 references

On the Markov chain central limit theorem

0 references

Q5148951

0 references

Irregular Identification, Support Conditions, and Inverse Weight Estimation

0 references

Consistent estimation of the influence function of locally asymptotically linear estimators

0 references

Introduction to empirical processes and semiparametric inference

0 references

10.1162/1532443041827907

0 references

Estimating Dynamic Treatment Regimes in Mobile Health Using V-Learning

0 references

Markov Chains and Stochastic Stability

0 references

Optimal Dynamic Treatment Regimes

0 references

Marginal Mean Models for Dynamic Regimes

0 references

Least squares policy evaluation algorithms with linear function approximation

0 references

Semiparametric efficiency bounds

0 references

Estimation of Regression Coefficients When Some Regressors Are Not Always Observed

0 references

Characterization of parameters with a mixed bias property

0 references

Adjusting for Nonignorable Drop-Out Using Semiparametric Nonresponse Models

0 references

Q4626283

0 references

Comment: Understanding OR, PS and DR

0 references

Semiparametric theory and missing data.

0 references

Q5396665

0 references

Asymptotic Statistics

0 references

Least Squares Temporal Difference Methods: An Analysis under General Conditions

0 references

On Generalized Bellman Equations and Temporal-Difference Learning

0 references

Identifiers

DOI

10.1287/opre.2021.2249

0 references

Mathematics Subject Classification ID

0 references

Sitelinks

Mathematics (1 entry)

mardi Publication:5060503

@@ Property / describes a project that uses @@
+OpenAI Gym
@@ Property / describes a project that uses: OpenAI Gym / rank @@
+Normal rank
@@ Property / MaRDI profile type @@
+MaRDI publication profile
@@ Property / MaRDI profile type: MaRDI publication profile / rank @@
+Normal rank
@@ Property / OpenAlex ID @@
+W2994709386
@@ Property / OpenAlex ID: W2994709386 / rank @@
+Normal rank
@@ Property / cites work @@
+Q2925454
@@ Property / cites work: Q2925454 / rank @@
+Normal rank
@@ Property / cites work @@
+Q4399904
@@ Property / cites work: Q4399904 / rank @@
+Normal rank
@@ Property / cites work @@
+Basic properties of strong mixing conditions. A survey and some open questions
+Normal rank
@@ Property / cites work @@
+Sieve Extremum Estimates for Weakly Dependent Data
+Normal rank
@@ Property / cites work @@
+Double/debiased machine learning for treatment and structural parameters
+Normal rank
@@ Property / cites work @@
+Doubly robust policy evaluation and optimization
@@ Property / cites work: Doubly robust policy evaluation and optimization / rank @@
+Normal rank
@@ Property / cites work @@
+Efficient estimation of panel data models with sequential moment restrictions
+Normal rank
@@ Property / cites work @@
+Efficient Estimation of Average Treatment Effects Using the Estimated Propensity Score
+Normal rank
@@ Property / cites work @@
+Q5624460
@@ Property / cites work: Q5624460 / rank @@
+Normal rank
@@ Property / cites work @@
+On the Markov chain central limit theorem
@@ Property / cites work: On the Markov chain central limit theorem / rank @@
+Normal rank
@@ Property / cites work @@
+Q5148951
@@ Property / cites work: Q5148951 / rank @@
+Normal rank
@@ Property / cites work @@
+Irregular Identification, Support Conditions, and Inverse Weight Estimation
+Normal rank
@@ Property / cites work @@
+Consistent estimation of the influence function of locally asymptotically linear estimators
+Normal rank
@@ Property / cites work @@
+Introduction to empirical processes and semiparametric inference
+Normal rank
@@ Property / cites work @@
+10.1162/1532443041827907
@@ Property / cites work: 10.1162/1532443041827907 / rank @@
+Normal rank
@@ Property / cites work @@
+Estimating Dynamic Treatment Regimes in Mobile Health Using V-Learning
+Normal rank
@@ Property / cites work @@
+Markov Chains and Stochastic Stability
@@ Property / cites work: Markov Chains and Stochastic Stability / rank @@
+Normal rank
@@ Property / cites work @@
+Optimal Dynamic Treatment Regimes
@@ Property / cites work: Optimal Dynamic Treatment Regimes / rank @@
+Normal rank
@@ Property / cites work @@
+Marginal Mean Models for Dynamic Regimes
@@ Property / cites work: Marginal Mean Models for Dynamic Regimes / rank @@
+Normal rank
@@ Property / cites work @@
+Least squares policy evaluation algorithms with linear function approximation
+Normal rank
@@ Property / cites work @@
+Semiparametric efficiency bounds
@@ Property / cites work: Semiparametric efficiency bounds / rank @@
+Normal rank
@@ Property / cites work @@
+Estimation of Regression Coefficients When Some Regressors Are Not Always Observed
+Normal rank
@@ Property / cites work @@
+Characterization of parameters with a mixed bias property
+Normal rank
@@ Property / cites work @@
+Adjusting for Nonignorable Drop-Out Using Semiparametric Nonresponse Models
+Normal rank
@@ Property / cites work @@
+Q4626283
@@ Property / cites work: Q4626283 / rank @@
+Normal rank
@@ Property / cites work @@
+Comment: Understanding OR, PS and DR
@@ Property / cites work: Comment: Understanding OR, PS and DR / rank @@
+Normal rank
@@ Property / cites work @@
+Semiparametric theory and missing data.
@@ Property / cites work: Semiparametric theory and missing data. / rank @@
+Normal rank
@@ Property / cites work @@
+Q5396665
@@ Property / cites work: Q5396665 / rank @@
+Normal rank
@@ Property / cites work @@
+Asymptotic Statistics
@@ Property / cites work: Asymptotic Statistics / rank @@
+Normal rank
@@ Property / cites work @@
+Least Squares Temporal Difference Methods: An Analysis under General Conditions
+Normal rank
@@ Property / cites work @@
+On Generalized Bellman Equations and Temporal-Difference Learning
+Normal rank
@@ links / mardi / name @@
+Publication:5060503