Sample Selection Bias Correction Theory
Publication:3529909
DOI: 10.1007/978-3-540-87987-9_8
zbMATH Open: 1156.68524
arXiv: 0805.2775
OpenAlex: W1853837125
Wikidata: Q55888974 (Scholia: Q55888974)
MaRDI QID: Q3529909
FDO: Q3529909
Authors: Corinna Cortes, Mehryar Mohri, Michael D. Riley, Afshin Rostamizadeh
Publication date: 14 October 2008
Published in: Lecture Notes in Computer Science
Abstract: This paper presents a theoretical analysis of sample selection bias correction. The sample bias correction technique commonly used in machine learning consists of reweighting the cost of an error on each training point of a biased sample to more closely reflect the unbiased distribution. This relies on weights derived by various estimation techniques based on finite samples. We analyze the effect of an error in that estimation on the accuracy of the hypothesis returned by the learning algorithm for two estimation techniques: a cluster-based estimation technique and kernel mean matching. We also report the results of sample bias correction experiments with several data sets using these techniques. Our analysis is based on the novel concept of distributional stability which generalizes the existing concept of point-based stability. Much of our work and proof techniques can be used to analyze other importance weighting techniques and their effect on accuracy when using a distributionally stable algorithm.
Full work available at URL: https://arxiv.org/abs/0805.2775
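The abstract's reweighting idea can be illustrated with a small sketch. The following NumPy example (an illustration of the general technique, not the paper's exact estimator or experiments) mimics the cluster-based weight estimation: the input space is partitioned into histogram bins standing in for clusters, each training point receives a weight proportional to the ratio of unbiased to biased frequency in its bin, and a linear model is then fit by weighted least squares. All names, bin counts, and the synthetic data are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic task: the true target y = x^2 is nonlinear, so the best *linear*
# fit depends on the input distribution; sample selection bias then skews it.
def sample(n):
    x = rng.uniform(0.0, 1.0, n)
    y = x ** 2 + rng.normal(0.0, 0.1, n)
    return x, y

# Large unbiased sample (labels unused) to estimate the true input distribution.
x_unbiased, _ = sample(5000)

# Biased labeled sample: points with x < 0.5 are kept far more often.
x_all, y_all = sample(5000)
keep = rng.uniform(size=x_all.size) < np.where(x_all < 0.5, 0.9, 0.1)
x_train, y_train = x_all[keep], y_all[keep]

# Cluster-based weight estimation: bins play the role of clusters, and each
# point's weight is (unbiased frequency) / (biased frequency) of its bin.
bins = np.linspace(0.0, 1.0, 11)
idx_train = np.digitize(x_train, bins) - 1
idx_unb = np.digitize(x_unbiased, bins) - 1
freq_unb = np.bincount(idx_unb, minlength=10) / x_unbiased.size
freq_train = np.bincount(idx_train, minlength=10) / x_train.size
weights = freq_unb[idx_train] / np.maximum(freq_train[idx_train], 1e-12)

# Weighted least squares: the squared error at each training point is
# reweighted by its importance weight before minimizing.
def fit(x, y, w):
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))

coef_plain = fit(x_train, y_train, np.ones_like(x_train))
coef_corr = fit(x_train, y_train, weights)
```

Under the unbiased uniform distribution the best linear fit to x^2 has slope 1, so the corrected slope `coef_corr[1]` should land much nearer 1 than the uncorrected `coef_plain[1]`, which is pulled toward the flatter behavior of x^2 near the over-sampled small-x region.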
Recommendations
- Automatic bias correction methods in semi-supervised learning
- Generalized sample selection bias correction under RUM
- Domain adaptation and sample bias correction theory and algorithm for regression
- Correcting classifiers for sample selection bias in two-phase case-control studies
- Bootstrap bias corrections for ensemble methods
Cited In (22)
- Domain adaptation and sample bias correction theory and algorithm for regression
- Relative deviation learning bounds and generalization with unbounded loss functions
- Local uncertainty sampling for large-scale multiclass logistic regression
- Title not available
- A no-free-lunch theorem for multitask learning
- Sampling correctors
- A Bayesian approach to (online) transfer learning: theory and algorithms
- Correcting prevalence estimation for biased sampling with testing errors
- 2-step gradient boosting approach to selectivity bias correction in tax audit: an application to the VAT gap in Italy
- Adaptive KNN and graph-based auto-weighted multi-view consensus spectral learning
- Automatic bias correction methods in semi-supervised learning
- Transition thresholds and transition operators for binarization and edge detection
- Supervised learning under sample selection bias from protein structure databases
- Tweedie’s Formula and Selection Bias
- A theoretical framework for deep transfer learning
- Adaptation based on generalized discrepancy
- Title not available
- Binary surrogates with stratified samples when weights are unknown
- Mismatched training and test distributions can outperform matched ones
- A theory of learning from different domains
- Correcting classifiers for sample selection bias in two-phase case-control studies
- Small sample bias correction or bias reduction?