Penalized Q-learning for dynamic treatment regimens

DOI: 10.5705/ss.2012.364; zbMATH: 1415.62054; arXiv: 1108.5338; OpenAlex: W2015687733; Wikidata: Q40642766; Scholia: Q40642766; MaRDI QID: Q2950196

Michael R. Kosorok, Donglin Zeng, Rui Song, Wei-wei Wang

Publication date: 8 October 2015

Published in: Statistica Sinica

Full work available at URL: https://arxiv.org/abs/1108.5338

zbMATH Keywords

shrinkage; Q-learning; two-stage procedure; multi-stage; individual selection; dynamic treatment regimen; penalized Q-learning

Mathematics Subject Classification ID

Asymptotic properties of parametric estimators (62F12)
Ridge regression; shrinkage estimators (Lasso) (62J07)
Applications of statistics to biology and medical sciences; meta analysis (62P10)

Related Items (24)

Quantile-Optimal Treatment Regimes ⋮ High-dimensional inference for personalized treatment decision ⋮ Adaptive treatment and robust control ⋮ Dynamic treatment regimes: technical challenges and applications ⋮ Comment on ``Dynamic treatment regimes: technical challenges and applications'' ⋮ Fairness-Oriented Learning for Optimal Individualized Treatment Rules ⋮ Dynamic Causal Effects Evaluation in A/B Testing with a Reinforcement Learning Framework ⋮ Model-Assisted Uniformly Honest Inference for Optimal Treatment Regimes in High Dimension ⋮ Optimal Treatment Regimes: A Review and Empirical Comparison ⋮ A multiagent reinforcement learning framework for off-policy evaluation in two-sided markets ⋮ Transformation-Invariant Learning of Optimal Individualized Decision Rules with Time-to-Event Outcomes ⋮ Statistically Efficient Advantage Learning for Offline Reinforcement Learning in Infinite Horizons ⋮ Nearly Dimension-Independent Sparse Linear Bandit over Small Action Spaces via Best Subset Selection ⋮ Flexible inference of optimal individualized treatment strategy in covariate adjusted randomization with multiple covariates ⋮ Learning Non-monotone Optimal Individualized Treatment Regimes ⋮ Dynamic treatment regimes using Bayesian additive regression trees for censored outcomes ⋮ Resampling‐based confidence intervals for model‐free robust inference on optimal treatment regimes ⋮ A Sequential Significance Test for Treatment by Covariate Interactions ⋮ Unnamed Item ⋮ Regularized outcome weighted subgroup identification for differential treatment effects ⋮ Sequential Advantage Selection for Optimal Treatment Regimes ⋮ Proper Inference for Value Function in High-Dimensional Q-Learning for Dynamic Treatment Regimes ⋮ Generalization error bounds of dynamic treatment regimes in penalized regression-based learning ⋮ Personalized Policy Learning Using Longitudinal Mobile Health Data
