Sample Selection Bias Correction Theory
From MaRDI portal
Publication:3529909
Abstract: This paper presents a theoretical analysis of sample selection bias correction. The sample bias correction technique commonly used in machine learning consists of reweighting the cost of an error on each training point of a biased sample to more closely reflect the unbiased distribution. This relies on weights derived by various estimation techniques based on finite samples. We analyze the effect of an error in that estimation on the accuracy of the hypothesis returned by the learning algorithm for two estimation techniques: a cluster-based estimation technique and kernel mean matching. We also report the results of sample bias correction experiments with several data sets using these techniques. Our analysis is based on the novel concept of distributional stability which generalizes the existing concept of point-based stability. Much of our work and proof techniques can be used to analyze other importance weighting techniques and their effect on accuracy when using a distributionally stable algorithm.
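The reweighting idea described in the abstract can be illustrated with a minimal sketch of the cluster-based weight estimation variant: partition the input space into clusters (histogram bins stand in for clusters here), estimate each cluster's probability under the biased and unbiased samples, and weight each biased training point by the ratio. The data, selection mechanism, and bin edges below are invented for illustration and are not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Unbiased population sample, and a biased sample obtained by a
# selection mechanism that favors small x (illustrative only).
true_x = rng.normal(0.0, 1.0, 5000)
keep = rng.random(5000) < 1.0 / (1.0 + np.exp(true_x))
biased_x = true_x[keep]

# Cluster-based weight estimation: histogram bins play the role of
# clusters, and w(cluster) = P_unbiased(cluster) / P_biased(cluster).
edges = np.linspace(-4.0, 4.0, 17)
p_unbiased, _ = np.histogram(true_x, bins=edges)
p_biased, _ = np.histogram(biased_x, bins=edges)
p_unbiased = p_unbiased / p_unbiased.sum()
p_biased = p_biased / p_biased.sum()
ratio = np.divide(p_unbiased, p_biased,
                  out=np.zeros_like(p_unbiased), where=p_biased > 0)

# Assign each biased point to its cluster and look up its weight.
cluster = np.clip(np.digitize(biased_x, edges) - 1, 0, len(ratio) - 1)
w = ratio[cluster]

# The reweighted sample statistic should track the unbiased
# distribution more closely than the raw biased statistic.
raw_mean = biased_x.mean()
corrected_mean = np.average(biased_x, weights=w)
```

In a learning algorithm, the same weights `w` would multiply the per-point loss terms; the paper's analysis concerns how estimation error in these weights propagates to the accuracy of the learned hypothesis.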
Recommendations
- Automatic bias correction methods in semi-supervised learning
- Generalized sample selection bias correction under RUM
- Domain adaptation and sample bias correction theory and algorithm for regression
- Correcting classifiers for sample selection bias in two-phase case-control studies
- Bootstrap bias corrections for ensemble methods
Cited in (22)
- Supervised learning under sample selection bias from protein structure databases
- Domain adaptation and sample bias correction theory and algorithm for regression
- Adaptive KNN and graph-based auto-weighted multi-view consensus spectral learning
- Adaptation based on generalized discrepancy
- A theoretical framework for deep transfer learning
- A Bayesian approach to (online) transfer learning: theory and algorithms
- Small sample bias correction or bias reduction?
- Automatic bias correction methods in semi-supervised learning
- Binary surrogates with stratified samples when weights are unknown
- scientific article (zbMATH DE number 7415073; no title available)
- A theory of learning from different domains
- Local uncertainty sampling for large-scale multiclass logistic regression
- scientific article (zbMATH DE number 7370519; no title available)
- 2-step gradient boosting approach to selectivity bias correction in tax audit: an application to the VAT gap in Italy
- Correcting classifiers for sample selection bias in two-phase case-control studies
- Transition thresholds and transition operators for binarization and edge detection
- Relative deviation learning bounds and generalization with unbounded loss functions
- A no-free-lunch theorem for multitask learning
- Mismatched training and test distributions can outperform matched ones
- Correcting prevalence estimation for biased sampling with testing errors
- Tweedie’s Formula and Selection Bias
- Sampling correctors