Projected state-action balancing weights for offline reinforcement learning (Q6183753)

scientific article; zbMATH DE number 7783513

Statements

instance of

scholarly article

0 references

title

Projected state-action balancing weights for offline reinforcement learning (English)

0 references

published in

The Annals of Statistics

0 references

publication date

4 January 2024

0 references

full work available at URL

https://arxiv.org/abs/2109.04640

0 references

https://projecteuclid.org/journals/annals-of-statistics/volume-51/issue-4/Projected-state-action-balancing-weights-for-offline-reinforcement-learning/10.1214/23-AOS2302.full

0 references

zbMATH Keywords

infinite horizons

0 references

Markov decision process

0 references

policy evaluation

0 references

reinforcement learning

0 references

MaRDI profile type

MaRDI publication profile

0 references

cites work

Approximate residual balancing: debiased inference of average treatment effects in high dimensions

0 references

Some new asymptotic theory for least squares series: pointwise and uniform results

0 references

Globally efficient non-parametric inference of average treatment effects by empirical balancing calibration weighting

0 references

Optimal sup-norm rates and uniform inference on nonlinear functionals of nonparametric IV regression

0 references

Q4112369

0 references

Regularized policy iteration with nonparametric function spaces

0 references

Regularized least-squares regression: learning from a sequence

0 references

Q3655724

0 references

A distribution-free theory of nonparametric regression

0 references

Large Sample Properties of Generalized Method of Moments Estimators

0 references

Nonparametric estimation of an additive model with a link function

0 references

Personalized Policy Learning Using Longitudinal Mobile Health Data

0 references

Covariate balancing propensity score

0 references

Generalized optimal matching methods for causal inference

0 references

Double reinforcement learning for efficient off-policy evaluation in Markov decision processes

0 references

Efficiently Breaking the Curse of Horizon in Off-Policy Evaluation with Double Reinforcement Learning

0 references

Demystifying double robustness: a comparison of alternative strategies for estimating a population mean from incomplete data

0 references

Dynamic treatment regimes: technical challenges and applications

0 references

Off-policy estimation of long-term average outcomes with applications to mobile health

0 references

Batch policy learning in average reward Markov decision processes

0 references

Estimating dynamic treatment regimes in mobile health using V-learning

0 references

Optimal Dynamic Treatment Regimes

0 references

Marginal Mean Models for Dynamic Regimes

0 references

Instrumental Variable Estimation of Nonparametric Models

0 references

Q4315289

0 references

Estimation of Regression Coefficients When Some Regressors Are Not Always Observed

0 references

High-dimensional \(A\)-learning for optimal dynamic treatment regimes

0 references

Optimal global rates of convergence for nonparametric regression

0 references

Q3996207

0 references

Quantile-optimal treatment regimes

0 references

Minimal dispersion approximately balancing weights: asymptotic properties and practical considerations

0 references

Kernel-based covariate functional balancing for observational studies

0 references

New statistical learning methods for estimating optimal dynamic treatment regimes

0 references

Identifiers

Mathematics Subject Classification ID

0 references

Sitelinks

Mathematics (1 entry)

mardi Publication:6183753