Doubly robust policy evaluation and optimization

From MaRDI portal

Publication:252797

DOI: 10.1214/14-STS500; zbMath: 1331.62059; arXiv: 1503.02834; OpenAlex: W3098679278; Wikidata: Q62763562; Scholia: Q62763562; MaRDI QID: Q252797

Dumitru Erhan, Lihong Li, Miroslav Dudík, John Langford

Publication date: 4 March 2016

Published in: Statistical Science

Full work available at URL: https://arxiv.org/abs/1503.02834

zbMATH Keywords

causal inference, contextual bandits, doubly robust estimators

Mathematics Subject Classification ID

Sequential estimation (62L12); General considerations in statistical decision theory (62C05)

Related Items

PAC-Bayesian lifelong learning for multi-armed bandits, Efficiently Breaking the Curse of Horizon in Off-Policy Evaluation with Double Reinforcement Learning, Unnamed Item, Augmented direct learning for conditional average treatment effect estimation with double robustness, A Single-Index Model With a Surface-Link for Optimizing Individualized Dose Rules, Constructing effective personalized policies using counterfactual inference from biased data sets with many features, A multiagent reinforcement learning framework for off-policy evaluation in two-sided markets, Constrained Bayesian optimization with noisy experiments, Importance sampling in reinforcement learning with an estimated behavior policy, Nonparametric Causal Effects Based on Incremental Propensity Score Interventions, Unnamed Item, Learning When-to-Treat Policies, Selecting and Ranking Individualized Treatment Rules With Unmeasured Confounding, Unnamed Item, Optimal policy trees, Toward theoretical understandings of robust Markov decision processes: sample complexity and asymptotics, Batch policy learning in average reward Markov decision processes, Doubly Robust Crowdsourcing, Doubly robust policy evaluation and optimization
