Unsupervised domain adaptation with non-stochastic missing data
From MaRDI portal
Abstract: We consider unsupervised domain adaptation (UDA) for classification problems in the presence of missing data in the unlabelled target domain. More precisely, motivated by practical applications, we analyze situations where a distribution shift exists between domains and where some components are systematically absent from the target domain, with no supervision available for imputing the missing target components. We propose a generative approach to imputation. Imputation is performed in a domain-invariant latent space and leverages indirect supervision from a complete source domain. We introduce a single model that performs joint adaptation, imputation and classification and which, under our assumptions, minimizes an upper bound on its target generalization error and performs well under several representative divergence families (H-divergence, Optimal Transport). Moreover, we compare the target error of our adaptation-imputation framework with the "ideal" target error of a UDA classifier without missing target components. Our model is further improved with self-training, which brings the learned source and target class posterior distributions closer. We perform experiments on three families of datasets of different modalities: a classical digit classification benchmark and the Amazon product reviews dataset, both commonly used in UDA, as well as real-world digital advertising datasets. We show the benefits of jointly performing adaptation, classification and imputation on these datasets.
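One of the divergence families named in the abstract, Optimal Transport, can be estimated between source and target feature clouds with entropic regularization (Sinkhorn iterations). The sketch below is a minimal illustration of that ingredient only, not the authors' implementation; the function name `sinkhorn_cost` and the parameters `eps_rel` and `n_iter` are our own assumptions.

```python
import numpy as np

def sinkhorn_cost(X, Y, eps_rel=0.05, n_iter=500):
    """Entropy-regularized optimal-transport cost between two point
    clouds with uniform weights, via Sinkhorn fixed-point iterations."""
    n, m = len(X), len(Y)
    a = np.full(n, 1.0 / n)                             # uniform source weights
    b = np.full(m, 1.0 / m)                             # uniform target weights
    C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)  # squared-distance costs
    eps = eps_rel * C.max()                             # scale regularization to C
    K = np.exp(-C / eps)                                # Gibbs kernel
    u = np.ones(n)
    for _ in range(n_iter):                             # alternate marginal scalings
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]                     # entropic transport plan
    return float((P * C).sum())

rng = np.random.default_rng(0)
src = rng.normal(size=(50, 2))          # "source" latent features
tgt_near = rng.normal(size=(50, 2))     # well-aligned "target" features
tgt_far = tgt_near + 5.0                # domain-shifted target features
print(sinkhorn_cost(src, tgt_near), sinkhorn_cost(src, tgt_far))
```

Aligned feature clouds incur a smaller transport cost than shifted ones; an adaptation loss built on such a divergence penalizes exactly this gap between source and target representations.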
Recommendations
- On the Hardness of Domain Adaptation and the Utility of Unlabeled Target Samples
- MapFlow: latent transition via normalizing flow for unsupervised domain adaptation
- Reducing bias to source samples for unsupervised domain adaptation
- PC-GAIN: pseudo-label conditional generative adversarial imputation networks for incomplete data
- Unsupervised domain adaptation in the wild via disentangling representation learning
Cites work
- Scientific article (no title available); zbMATH DE number 1834445
- A survey on concept drift adaptation
- A theory of learning from different domains
- Computational optimal transport. With applications to data sciences
- Domain adaptation and sample bias correction theory and algorithm for regression
- Flexible imputation of missing data
- Inference and missing data
- Learning from multiple sources
Cited in (4)
- NaCL: noise-robust cross-domain contrastive learning for unsupervised domain adaptation
- On the Hardness of Domain Adaptation and the Utility of Unlabeled Target Samples
- Reducing bias to source samples for unsupervised domain adaptation
- PC-GAIN: pseudo-label conditional generative adversarial imputation networks for incomplete data
This page was built for publication: Unsupervised domain adaptation with non-stochastic missing data (MaRDI item Q2066667)