Sample Selection Bias Correction Theory
Publication:3529909
DOI: 10.1007/978-3-540-87987-9_8
zbMATH Open: 1156.68524
arXiv: 0805.2775
OpenAlex: W1853837125
Wikidata: Q55888974 (Scholia: Q55888974)
MaRDI QID: Q3529909
FDO: Q3529909
Authors: Corinna Cortes, Mehryar Mohri, Michael D. Riley, Afshin Rostamizadeh
Publication date: 14 October 2008
Published in: Lecture Notes in Computer Science
Abstract: This paper presents a theoretical analysis of sample selection bias correction. The sample bias correction technique commonly used in machine learning consists of reweighting the cost of an error on each training point of a biased sample to more closely reflect the unbiased distribution. This relies on weights derived by various estimation techniques based on finite samples. We analyze the effect of an error in that estimation on the accuracy of the hypothesis returned by the learning algorithm for two estimation techniques: a cluster-based estimation technique and kernel mean matching. We also report the results of sample bias correction experiments with several data sets using these techniques. Our analysis is based on the novel concept of distributional stability which generalizes the existing concept of point-based stability. Much of our work and proof techniques can be used to analyze other importance weighting techniques and their effect on accuracy when using a distributionally stable algorithm.
Full work available at URL: https://arxiv.org/abs/0805.2775
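The abstract's reweighting idea can be illustrated with a small sketch. The following NumPy example (an illustration of the general technique, not the paper's exact estimator or experiments) mimics the cluster-based weight estimation: the input space is partitioned into histogram bins standing in for clusters, each training point receives a weight proportional to the ratio of unbiased to biased frequency in its bin, and a linear model is then fit by weighted least squares. All names, bin counts, and the synthetic data are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic task: the true target y = x^2 is nonlinear, so the best *linear*
# fit depends on the input distribution; sample selection bias then skews it.
def sample(n):
    x = rng.uniform(0.0, 1.0, n)
    y = x ** 2 + rng.normal(0.0, 0.1, n)
    return x, y

# Large unbiased sample (labels unused) to estimate the true input distribution.
x_unbiased, _ = sample(5000)

# Biased labeled sample: points with x < 0.5 are kept far more often.
x_all, y_all = sample(5000)
keep = rng.uniform(size=x_all.size) < np.where(x_all < 0.5, 0.9, 0.1)
x_train, y_train = x_all[keep], y_all[keep]

# Cluster-based weight estimation: bins play the role of clusters, and each
# point's weight is (unbiased frequency) / (biased frequency) of its bin.
bins = np.linspace(0.0, 1.0, 11)
idx_train = np.digitize(x_train, bins) - 1
idx_unb = np.digitize(x_unbiased, bins) - 1
freq_unb = np.bincount(idx_unb, minlength=10) / x_unbiased.size
freq_train = np.bincount(idx_train, minlength=10) / x_train.size
weights = freq_unb[idx_train] / np.maximum(freq_train[idx_train], 1e-12)

# Weighted least squares: the squared error at each training point is
# reweighted by its importance weight before minimizing.
def fit(x, y, w):
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))

coef_plain = fit(x_train, y_train, np.ones_like(x_train))
coef_corr = fit(x_train, y_train, weights)
```

Under the unbiased uniform distribution the best linear fit to x^2 has slope 1, so the corrected slope `coef_corr[1]` should land much nearer 1 than the uncorrected `coef_plain[1]`, which is pulled toward the flatter behavior of x^2 near the over-sampled small-x region.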
Recommendations
- Automatic bias correction methods in semi-supervised learning
- Generalized sample selection bias correction under RUM
- Domain adaptation and sample bias correction theory and algorithm for regression
- Correcting classifiers for sample selection bias in two-phase case-control studies
- Bootstrap bias corrections for ensemble methods
Cited In (22)
- Domain adaptation and sample bias correction theory and algorithm for regression
- Relative deviation learning bounds and generalization with unbounded loss functions
- Local uncertainty sampling for large-scale multiclass logistic regression
- Title not available
- A no-free-lunch theorem for multitask learning
- Sampling correctors
- A Bayesian approach to (online) transfer learning: theory and algorithms
- Correcting prevalence estimation for biased sampling with testing errors
- 2-step gradient boosting approach to selectivity bias correction in tax audit: an application to the VAT gap in Italy
- Adaptive KNN and graph-based auto-weighted multi-view consensus spectral learning
- Automatic bias correction methods in semi-supervised learning
- Transition thresholds and transition operators for binarization and edge detection
- Supervised learning under sample selection bias from protein structure databases
- Tweedie’s Formula and Selection Bias
- A theoretical framework for deep transfer learning
- Adaptation based on generalized discrepancy
- Title not available
- Binary surrogates with stratified samples when weights are unknown
- Mismatched training and test distributions can outperform matched ones
- A theory of learning from different domains
- Correcting classifiers for sample selection bias in two-phase case-control studies
- Small sample bias correction or bias reduction?