Q-learning with censored data
From MaRDI portal
Publication:450048
DOI: 10.1214/12-AOS968
zbMATH Open: 1246.62206
arXiv: 1205.6659
Wikidata: Q34323965
Scholia: Q34323965
MaRDI QID: Q450048
FDO: Q450048
Authors: Y. Goldberg, Michael R. Kosorok
Publication date: 3 September 2012
Published in: The Annals of Statistics
Abstract: We develop methodology for a multistage decision problem with flexible number of stages in which the rewards are survival times that are subject to censoring. We present a novel Q-learning algorithm that is adjusted for censored data and allows a flexible number of stages. We provide finite sample bounds on the generalization error of the policy learned by the algorithm, and show that when the optimal Q-function belongs to the approximation space, the expected survival time for policies obtained by the algorithm converges to that of the optimal policy. We simulate a multistage clinical trial with flexible number of stages and apply the proposed censored-Q-learning algorithm to find individualized treatment regimens. The methodology presented in this paper has implications in the design of personalized medicine trials in cancer and in other life-threatening diseases.
Full work available at URL: https://arxiv.org/abs/1205.6659
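The abstract does not say how the Q-learning algorithm is "adjusted for censored data". One standard device in this literature, and the topic of one of the works cited below (the Kaplan–Meier estimator as an inverse-probability-of-censoring weighted average), is to weight each uncensored observation by the inverse of the estimated probability of remaining uncensored up to its failure time. The following is a minimal sketch of that weighting only, not the paper's algorithm. The function names and the toy data are hypothetical, and the tie rule (failures precede censorings at equal times) is an assumed convention.

```python
from typing import List, Sequence


def censoring_survival_left(
    times: Sequence[float], events: Sequence[int], t: float
) -> float:
    """Kaplan-Meier estimate of G(t-) = P(C >= t) for the censoring time C.

    Censorings (event == 0) play the role of the "event" here. Assumed tie
    convention: a subject who fails at time u is no longer at risk of being
    censored at u.
    """
    surv = 1.0
    censoring_times = sorted(
        {ti for ti, e in zip(times, events) if e == 0 and ti < t}
    )
    for u in censoring_times:
        at_risk = sum(
            1 for ti, e in zip(times, events) if ti > u or (ti == u and e == 0)
        )
        censored = sum(1 for ti, e in zip(times, events) if ti == u and e == 0)
        surv *= 1.0 - censored / at_risk
    return surv


def ipcw_weights(times: Sequence[float], events: Sequence[int]) -> List[float]:
    """Weight delta_i / G(T_i-): zero for censored subjects, >= 1 otherwise."""
    return [
        1.0 / censoring_survival_left(times, events, t) if e == 1 else 0.0
        for t, e in zip(times, events)
    ]


def ipcw_mean_survival(times: Sequence[float], events: Sequence[int]) -> float:
    """Weighted average of observed times, using the weights above."""
    weights = ipcw_weights(times, events)
    return sum(w * t for w, t in zip(weights, times)) / len(times)


# Toy data, made up for illustration: event == 1 is an observed failure,
# event == 0 is a censored observation.
times = [2.0, 3.0, 5.0, 7.0]
events = [1, 0, 1, 1]
weights = ipcw_weights(times, events)
mean_survival = ipcw_mean_survival(times, events)
```

In a censored regression step, weights of this kind would typically multiply each subject's squared-error term, so censored trajectories drop out of the fit while uncensored ones that survived past earlier censorings are up-weighted to compensate.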
Mathematics Subject Classification
- Censored data models (62N01)
- Applications of statistics to biology and medical sciences; meta analysis (62P10)
- Medical applications (general) (92C50)
Cites Work
- Weak convergence and empirical processes. With applications to statistics
- Introduction to empirical processes and semiparametric inference
- The Kaplan–Meier Estimator as an Inverse-Probability-of-Censoring Weighted Average
- Title not available
- Support Vector Machines
- Estimation of Regression Coefficients When Some Regressors Are Not Always Observed
- Causal effect models for realistic individualized treatment and intention to treat rules
- Association, causation, and marginal structural models
- \({\mathcal Q}\)-learning
- Reinforcement learning strategies for clinical trials in nonsmall cell lung cancer
- A generalization error for Q-learning
- Optimal Dynamic Treatment Regimes
- Optimal Structural Nested Models for Optimal Sequential Decisions
- Demystifying Optimal Dynamic Treatment Regimes
- Support vector censored quantile regression under random censoring
- Neural Network Learning
- A Dvoretzky-Kiefer-Wolfowitz type inequality for the Kaplan-Meier estimator
- Feature-based methods for large scale dynamic programming
- Causal inference on the difference of the restricted mean lifetime between two groups
- Semiparametric efficient estimation of survival distributions in two-stage randomisation designs in clinical trials with censored data
- Title not available
- Estimation of survival quantiles in two-stage randomization designs
- Estimation of survival distributions of treatment policies in two-stage randomization designs in clinical trials
- Restricted Mean Life with Covariates: Modification and Extension of a Useful Survival Analysis Method
- An exponential bound for Cox regression
- Q-learning with censored data
- On an exponential bound for the Kaplan-Meier estimator
Cited In (35)
- Flexible inference of optimal individualized treatment strategy in covariate adjusted randomization with multiple covariates
- Adaptive Algorithm for Multi-Armed Bandit Problem with High-Dimensional Covariates
- Threshold Estimation in Proportional Mean Residual Life Model
- Transformation-Invariant Learning of Optimal Individualized Decision Rules with Time-to-Event Outcomes
- Nearly Dimension-Independent Sparse Linear Bandit over Small Action Spaces via Best Subset Selection
- Functional feature construction for individualized treatment regimes
- Semiparametric single-index models for optimal treatment regimens with censored outcomes
- Adaptive contrast weighted learning for multi-stage multi-treatment decision-making
- Quantile-optimal treatment regimes
- Reflections on the concept of optimality of single decision point treatment regimes
- Q-learning with censored data
- Incorporating patient preferences into estimation of optimal individualized treatment rules
- Reinforcement learning strategies for clinical trials in nonsmall cell lung cancer
- Learning Non-monotone Optimal Individualized Treatment Regimes
- Multicategory Angle-Based Learning for Estimating Optimal Dynamic Treatment Regimes With Censored Data
- Finite sample variance estimation for optimal dynamic treatment regimes of survival outcomes
- Multi-threshold proportional hazards model and subgroup identification
- \(i\)Fusion: individualized fusion learning
- Constructing dynamic treatment regimes with shared parameters for censored data
- Quantifying treatment effects using the personalized chance of longer survival
- Robust method for optimal treatment decision making based on survival data
- Ascertaining properties of weighting in the estimation of optimal treatment regimes under monotone missingness
- A cure-rate model for Q-learning: estimating an adaptive immunosuppressant treatment strategy for allogeneic hematopoietic cell transplant patients
- Optimal dynamic treatment regimes with survival endpoints: introducing DWSurv in the R package DTRreg
- Privacy-preserving estimation of an optimal individualized treatment rule: a case study in maximizing time to severe depression-related outcomes
- A General Framework for Subgroup Detection via One-Step Value Difference Estimation
- Doubly robust estimation of optimal dynamic treatment regimes with multicategory treatments and survival outcomes
- Estimating optimal dynamic treatment regimes with survival outcomes
- Dynamic treatment regimes using Bayesian additive regression trees for censored outcomes
- Fairness-Oriented Learning for Optimal Individualized Treatment Rules
- Estimation for optimal treatment regimes with survival data under semiparametric model
- Assessing quantile prediction with censored quantile regression models
- Multi-Armed Angle-Based Direct Learning for Estimating Optimal Individualized Treatment Rules With Various Outcomes
- Multithreshold change plane model: estimation theory and applications in subgroup identification
- Optimal Treatment Regimes: A Review and Empirical Comparison