Semi-supervised learning with density-ratio estimation
From MaRDI portal
Abstract: In this paper, we study the statistical properties of semi-supervised learning, which is regarded as an important problem in the machine learning community. In standard supervised learning, only labeled data are observed; classification and regression problems are formalized as supervised learning. In semi-supervised learning, unlabeled data are obtained in addition to labeled data, so exploiting the unlabeled data is important for improving prediction accuracy. This problem is regarded as a semiparametric estimation problem with missing data. Under discriminative probabilistic models, it had been considered that unlabeled data are useless for improving estimation accuracy. Recently, it was revealed that a weighted estimator using the unlabeled data achieves better prediction accuracy than a learning method using only labeled data, especially when the discriminative probabilistic model is misspecified. That is, improvement under the semiparametric model with missing data is possible when the semiparametric model is misspecified. In this paper, we apply a density-ratio estimator to obtain the weight function in semi-supervised learning. The benefit of our approach is that the proposed estimator does not require well-specified probabilistic models for the probability of the unlabeled data. Based on statistical asymptotic theory, we prove that the estimation accuracy of our method outperforms that of supervised learning using only labeled data. Numerical experiments demonstrate the usefulness of our methods.
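The weighting idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's estimator: it assumes a least-squares density-ratio fit with Gaussian basis functions, a linear regression model, and synthetic data. The function names (`fit_density_ratio`, `weighted_least_squares`) and the parameters `sigma` and `lam` are hypothetical choices made for illustration only.

```python
import numpy as np

def fit_density_ratio(x_num, x_den, centers, sigma=1.0, lam=1e-3):
    """Least-squares fit of r(x) ~ p_num(x) / p_den(x) (illustrative assumption)."""
    def phi(x):
        # Gaussian basis functions centred at `centers`
        d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return np.exp(-d2 / (2.0 * sigma ** 2))
    phi_den = phi(x_den)
    H = phi_den.T @ phi_den / len(x_den)   # second moment over denominator sample
    h = phi(x_num).mean(axis=0)            # first moment over numerator sample
    theta = np.linalg.solve(H + lam * np.eye(len(centers)), h)
    # Clip at zero so the fitted ratio can be used as a weight
    return lambda x: np.maximum(phi(x) @ theta, 0.0)

def weighted_least_squares(X, y, w):
    """Solve the weighted normal equations (X^T W X) beta = X^T W y."""
    Xw = X * w[:, None]
    return np.linalg.solve(Xw.T @ X, Xw.T @ y)

# Synthetic data (assumption): few labeled points, many unlabeled points
rng = np.random.default_rng(0)
x_lab = rng.normal(size=(50, 1))
y_lab = np.sin(x_lab[:, 0]) + 0.1 * rng.normal(size=50)
x_unl = rng.normal(size=(500, 1))

# Weight function: ratio of the unlabeled-sample density to the labeled-sample density
ratio = fit_density_ratio(x_unl, x_lab, centers=x_lab[:20])
weights = ratio(x_lab)

# A linear model is misspecified for sin(x); compare weighted and unweighted fits
design = np.hstack([np.ones((50, 1)), x_lab])
beta_weighted = weighted_least_squares(design, y_lab, weights)
beta_plain = weighted_least_squares(design, y_lab, np.ones(50))
```

Here the labeled and unlabeled inputs are drawn from the same distribution, so the true ratio is 1 and the estimated weights only reflect sample-level discrepancies. Whether such weights improve accuracy in a given setting is what the paper's asymptotic analysis addresses; the sketch makes no such claim.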
Cites work
- A paradox concerning nuisance parameters and projected estimating functions
- Asymptotic Statistics
- Asymptotic theory for the semiparametric accelerated failure time model with missing data
- Covariate shift adaptation by importance weighted cross validation
- Density ratio estimation in machine learning. Foreword by Thomas G. Dietterich
- Efficient Estimation of Average Treatment Effects Using the Estimated Propensity Score
- Elements of Information Theory
- Estimation of Regression Coefficients When Some Regressors Are Not Always Observed
- Importance Sampling Via the Estimated Sampler
- Improving predictive inference under covariate shift by weighting the log-likelihood function
- Inferences for case-control and semiparametric two-sample density ratio models
- Information geometry of estimating functions in semi-parametric statistical models
- Semi-supervised learning on Riemannian manifolds
- Soft margins for AdaBoost
- Statistical analysis of kernel-based least-squares density-ratio estimation
- Text classification from labeled and unlabeled documents using EM
- The relative value of labeled and unlabeled samples in pattern recognition with an unknown mixing parameter
Cited in (22)
- Semi-supervised logistic discrimination via labeled data and unlabeled data from different sampling distributions
- Efficient semiparametric estimation in two-sample comparison via semisupervised learning
- A General M-estimation Theory in Semi-Supervised Framework
- Semi-Supervised Linear Regression
- Doubly Robust Augmented Model Accuracy Transfer Inference with High Dimensional Features
- Unbiased generative semi-supervised learning
- Safe semi-supervised learning based on weighted likelihood
- Asymptotic comparison of semi-supervised and supervised linear discriminant functions for heteroscedastic normal populations
- Improve efficiency of doubly robust estimator when propensity score is misspecified
- Semi-supervised learning of class balance under class-prior change by distribution matching
- Finite-sample analysis of impacts of unlabeled data and their labeling mechanisms in linear discriminant analysis
- A novel semisupervised support vector machine classifier based on active learning and context information
- scientific article; zbMATH DE number 6253975
- Density-sensitive semisupervised inference
- Semi-supervised inference: general theory and estimation of means
- Semi-supervised learning based on high density region estimation
- The use of unlabeled data in predictive modeling
- Efficient and adaptive linear regression in semi-supervised settings
- Robust and efficient semi-supervised learning for Ising model
- Semi-supervised learning via constraints
- On semi-supervised estimation using exponential tilt mixture models
- On semi-supervised linear regression in covariate shift problems