Doubly robust policy evaluation and optimization
From MaRDI portal
Publication:252797
DOI: 10.1214/14-STS500; zbMath: 1331.62059; arXiv: 1503.02834; OpenAlex: W3098679278; Wikidata: Q62763562; Scholia: Q62763562; MaRDI QID: Q252797
Miroslav Dudík, Dumitru Erhan, John Langford, Lihong Li
Publication date: 4 March 2016
Published in: Statistical Science
Full work available at URL: https://arxiv.org/abs/1503.02834
Related Items
- PAC-Bayesian lifelong learning for multi-armed bandits
- Efficiently Breaking the Curse of Horizon in Off-Policy Evaluation with Double Reinforcement Learning
- Unnamed Item
- Augmented direct learning for conditional average treatment effect estimation with double robustness
- A Single-Index Model With a Surface-Link for Optimizing Individualized Dose Rules
- Constructing effective personalized policies using counterfactual inference from biased data sets with many features
- A multiagent reinforcement learning framework for off-policy evaluation in two-sided markets
- Constrained Bayesian optimization with noisy experiments
- Importance sampling in reinforcement learning with an estimated behavior policy
- Nonparametric Causal Effects Based on Incremental Propensity Score Interventions
- Unnamed Item
- Learning When-to-Treat Policies
- Selecting and Ranking Individualized Treatment Rules With Unmeasured Confounding
- Unnamed Item
- Optimal policy trees
- Toward theoretical understandings of robust Markov decision processes: sample complexity and asymptotics
- Batch policy learning in average reward Markov decision processes
- Doubly Robust Crowdsourcing
- Doubly robust policy evaluation and optimization
Cites Work
- Unnamed Item
- Unnamed Item
- Doubly robust policy evaluation and optimization
- Demystifying double robustness: a comparison of alternative strategies for estimating a population mean from incomplete data
- On tail probabilities for martingales
- Improving predictive inference under covariate shift by weighting the log-likelihood function
- Semiparametric regression estimation in the presence of dependent censoring
- Some results on generalized difference estimation and generalized regression estimation for finite populations
- Estimation of Regression Coefficients When Some Regressors Are Not Always Observed
- Marginal Mean Models for Dynamic Regimes
- Optimal Dynamic Treatment Regimes
- A new approach to causal inference in mortality studies with a sustained exposure period—application to control of the healthy worker survivor effect
- The Nonstochastic Multiarmed Bandit Problem
- Semiparametric Efficiency in Multivariate Regression Models with Missing Data
- A Robust Method for Estimating Optimal Treatment Regimes
- Pattern-recognizing stochastic learning automata
- On model selection and model misspecification in causal inference
- A Generalization of Sampling Without Replacement From a Finite Universe
- Some aspects of the sequential design of experiments