Constructing effective personalized policies using counterfactual inference from biased data sets with many features
Publication:2425241
Abstract: This paper proposes a novel approach for constructing effective personalized policies when the observed data lacks counterfactual information, is biased, and has many features. The approach is applicable in a wide variety of settings, from healthcare to advertising to education to finance. These settings have in common that the decision maker can observe, for each previous instance, an array of features of the instance, the action taken in that instance, and the reward realized, but not the rewards of actions that were not taken (the counterfactual information). Learning in such settings is made even more difficult because the observed data is typically biased by the existing policy that generated it, and because the array of features that might affect the reward in a particular instance, and hence should be taken into account in deciding on an action in that instance, is often vast. The approach presented here estimates propensity scores for the observed data, infers counterfactuals, identifies a relatively small number of features that are most relevant for each possible action and instance, and prescribes a policy to be followed. Comparison of the proposed algorithm against the state-of-the-art algorithm on actual datasets demonstrates that the proposed algorithm achieves a significant improvement in performance.
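The first two steps named in the abstract, estimating propensity scores and inferring counterfactual rewards from logged data, can be illustrated with a standard inverse propensity score (IPS) estimator. The sketch below is not the paper's algorithm: it assumes discrete contexts, estimates propensities by simple empirical frequencies, and omits the feature-selection step entirely. All names (`estimate_propensities`, `ips_value`, the toy log) are hypothetical and chosen only for illustration.

```python
import numpy as np

def estimate_propensities(contexts, actions, n_contexts, n_actions):
    """Empirical P(action | context) under the logging policy.

    Assumption: contexts are small integers, so frequencies can be counted
    directly. Real data with many features would need a fitted model instead.
    """
    counts = np.zeros((n_contexts, n_actions))
    np.add.at(counts, (contexts, actions), 1.0)  # count each logged (context, action) pair
    totals = counts.sum(axis=1, keepdims=True)
    # Contexts never seen in the log fall back to a uniform distribution.
    return np.where(totals > 0, counts / np.maximum(totals, 1.0), 1.0 / n_actions)

def ips_value(contexts, actions, rewards, propensities, policy):
    """IPS estimate of the average reward a deterministic policy would earn.

    `policy[c]` is the action the candidate policy takes in context c.
    Only logged instances where the logged action matches the policy
    contribute, reweighted by 1 / P(logged action | context).
    """
    p_logged = propensities[contexts, actions]
    match = (policy[contexts] == actions).astype(float)
    return float(np.mean(match * rewards / p_logged))

# Toy log (made-up numbers): the logging policy favors action 0 in
# context 0 and action 1 in context 1, so the data is biased.
contexts = np.array([0, 0, 0, 0, 1, 1, 1, 1])
actions = np.array([0, 0, 0, 1, 1, 1, 1, 0])
rewards = np.array([1.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 0.0])

propensities = estimate_propensities(contexts, actions, n_contexts=2, n_actions=2)

policy_a = np.array([0, 1])  # same choices the logging policy favors
policy_b = np.array([1, 0])  # the opposite choices

value_a = ips_value(contexts, actions, rewards, propensities, policy_a)
value_b = ips_value(contexts, actions, rewards, propensities, policy_b)
```

On this toy log, `value_a` is 2/3 and `value_b` is 0.5. The estimate for `policy_b` rests on only two matching instances, each weighted by 4, which shows why IPS estimates become noisy for actions the logging policy rarely took.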
Recommendations
- More efficient policy learning via optimal retargeting
- Counterfactual reasoning and learning systems: the example of computational advertising
- Policy learning with observational data
- Efficient augmentation and relaxation learning for individualized treatment rules using observational data
- A reinforcement learning approach to personalized learning recommendation systems
Cites work
- scientific article; zbMATH DE number 6823187
- scientific article; zbMATH DE number 6542809
- 10.1162/153244303322753751
- A simple method for estimating interactions between a treatment and a large number of covariates
- Contextual bandits with similarity information
- Counterfactual reasoning and learning systems: the example of computational advertising
- Doubly robust policy evaluation and optimization
- Estimation and Inference of Heterogeneous Treatment Effects using Random Forests
- Exploration-exploitation tradeoff using variance estimates in multi-armed bandits
- Feature selection for unsupervised learning
- Feature selection via dependence maximization
- Pattern classification
- Recursive partitioning for heterogeneous causal effects
- Simple statistical gradient-following algorithms for connectionist reinforcement learning
- The central role of the propensity score in observational studies for causal effects
- Theoretical and empirical analysis of ReliefF and RReliefF
- Use of the Logistic Model in Retrospective Studies
Cited in
(1)