Unsupervised domain adaptation with non-stochastic missing data
From MaRDI portal
Abstract: We consider unsupervised domain adaptation (UDA) for classification problems in the presence of missing data in the unlabelled target domain. More precisely, motivated by practical applications, we analyze situations where a distribution shift exists between domains and where some components are systematically absent from the target domain, with no supervision available for imputing the missing target components. We propose a generative approach to imputation. Imputation is performed in a domain-invariant latent space and leverages indirect supervision from a complete source domain. We introduce a single model that performs joint adaptation, imputation and classification and which, under our assumptions, minimizes an upper bound on its target generalization error and performs well under several representative divergence families (H-divergence, Optimal Transport). Moreover, we compare the target error of our adaptation-imputation framework with the "ideal" target error of a UDA classifier without missing target components. Our model is further improved with self-training, which brings the learned source and target class posterior distributions closer. We perform experiments on three families of datasets of different modalities: a classical digit classification benchmark and the Amazon product reviews dataset, both commonly used in UDA, as well as real-world digital advertising datasets. We show the benefits of jointly performing adaptation, classification and imputation on these datasets.
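One of the divergence families named in the abstract, Optimal Transport, can be estimated between source and target feature clouds with entropic regularization (Sinkhorn iterations). The sketch below is a minimal illustration of that ingredient only, not the authors' implementation; the function name `sinkhorn_cost` and the parameters `eps_rel` and `n_iter` are our own assumptions.

```python
import numpy as np

def sinkhorn_cost(X, Y, eps_rel=0.05, n_iter=500):
    """Entropy-regularized optimal-transport cost between two point
    clouds with uniform weights, via Sinkhorn fixed-point iterations."""
    n, m = len(X), len(Y)
    a = np.full(n, 1.0 / n)                             # uniform source weights
    b = np.full(m, 1.0 / m)                             # uniform target weights
    C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)  # squared-distance costs
    eps = eps_rel * C.max()                             # scale regularization to C
    K = np.exp(-C / eps)                                # Gibbs kernel
    u = np.ones(n)
    for _ in range(n_iter):                             # alternate marginal scalings
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]                     # entropic transport plan
    return float((P * C).sum())

rng = np.random.default_rng(0)
src = rng.normal(size=(50, 2))          # "source" latent features
tgt_near = rng.normal(size=(50, 2))     # well-aligned "target" features
tgt_far = tgt_near + 5.0                # domain-shifted target features
print(sinkhorn_cost(src, tgt_near), sinkhorn_cost(src, tgt_far))
```

Aligned feature clouds incur a smaller transport cost than shifted ones; an adaptation loss built on such a divergence penalizes exactly this gap between source and target representations.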
Recommendations
- On the Hardness of Domain Adaptation and the Utility of Unlabeled Target Samples
- MapFlow: latent transition via normalizing flow for unsupervised domain adaptation
- Reducing bias to source samples for unsupervised domain adaptation
- PC-GAIN: pseudo-label conditional generative adversarial imputation networks for incomplete data
- Unsupervised domain adaptation in the wild via disentangling representation learning
Cites work
- Scientific article (no title available); zbMATH DE number 1834445
- A survey on concept drift adaptation
- A theory of learning from different domains
- Computational optimal transport. With applications to data sciences
- Domain adaptation and sample bias correction theory and algorithm for regression
- Flexible imputation of missing data
- Inference and missing data
- Learning from multiple sources
Cited in (4)
- NaCL: noise-robust cross-domain contrastive learning for unsupervised domain adaptation
- On the Hardness of Domain Adaptation and the Utility of Unlabeled Target Samples
- Reducing bias to source samples for unsupervised domain adaptation
- PC-GAIN: pseudo-label conditional generative adversarial imputation networks for incomplete data
This page was built for publication: Unsupervised domain adaptation with non-stochastic missing data (MaRDI item Q2066667)