Optimal Integrating Learning for Split Questionnaire Design Type Data
Publication:6180730
DOI: 10.1080/10618600.2022.2118753; arXiv: 2108.02905; MaRDI QID: Q6180730; FDO: Q6180730
Authors:
Publication date: 22 January 2024
Published in: Journal of Computational and Graphical Statistics
Abstract: In the era of data science, it is common to encounter data with different subsets of variables obtained for different cases. An example is the split questionnaire design (SQD), which is adopted to reduce respondent fatigue and improve response rates by assigning different subsets of the questionnaire to different sampled respondents. A general question then is how to estimate the regression function based on such block-wise observed data. Currently, this is often carried out with the aid of missing data methods, which may unfortunately suffer from intensive computational cost, high variability, and possibly large modeling biases in real applications. In this article, we develop a novel approach for estimating the regression function for SQD-type data. We first construct a list of candidate models using available data-blocks separately, and then combine the estimates properly to make efficient use of all the information. We show that the resulting averaged model is asymptotically optimal in the sense that the squared loss and risk are asymptotically equivalent to those of the best but infeasible averaged estimator. Both simulated examples and an application to the SQD dataset from the European Social Survey show the promise of the proposed method.
Full work available at URL: https://arxiv.org/abs/2108.02905
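Illustrative sketch: the two-step idea in the abstract (fit one candidate model per data block, then combine the candidates with weights) can be illustrated roughly as follows. This is not the authors' estimator or their weight-selection criterion. The single-covariate linear candidates, the Frank-Wolfe solver for simplex weights, the simulated data, and the small set of fully observed validation cases are all assumptions made purely for illustration, and every function name is hypothetical.

```python
import random


def fit_line(x, y):
    """Closed-form least squares of y on one covariate; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def mse(w, preds, y):
    """Mean squared error of the weighted average of candidate predictions."""
    n = len(y)
    fit = [sum(wk * p[i] for wk, p in zip(w, preds)) for i in range(n)]
    return sum((f - yi) ** 2 for f, yi in zip(fit, y)) / n


def simplex_weights(preds, y, steps=200):
    """Frank-Wolfe with exact line search for squared loss over the weight simplex.

    preds[k][i] is candidate k's prediction for case i. The search starts at the
    best single candidate, so the loss never ends above that candidate's loss.
    """
    K, n = len(preds), len(y)
    unit = [[1.0 if j == k else 0.0 for j in range(K)] for k in range(K)]
    w = min(unit, key=lambda e: mse(e, preds, y))[:]
    for _ in range(steps):
        fit = [sum(w[k] * preds[k][i] for k in range(K)) for i in range(n)]
        resid = [fit[i] - y[i] for i in range(n)]
        grad = [sum(resid[i] * preds[k][i] for i in range(n)) for k in range(K)]
        j = min(range(K), key=lambda k: grad[k])      # vertex to move toward
        d = [preds[j][i] - fit[i] for i in range(n)]  # change in fitted values
        dd = sum(di * di for di in d)
        if dd == 0.0:
            break
        gamma = -sum(r * di for r, di in zip(resid, d)) / dd
        gamma = max(0.0, min(1.0, gamma))             # stay inside the simplex
        w = [(1.0 - gamma) * wk for wk in w]
        w[j] += gamma
    return w


random.seed(0)
BETA = [2.0, -1.0, 0.5]  # assumed coefficients of the simulated regression


def draw_case():
    x = [random.gauss(0, 1) for _ in BETA]
    y = 1.0 + sum(b * xi for b, xi in zip(BETA, x)) + random.gauss(0, 0.5)
    return x, y


# Step 1: respondents in group g were asked only covariate g (one data block),
# so candidate model g is fitted from that block alone.
models = []
for g in range(len(BETA)):
    cases = [draw_case() for _ in range(200)]
    models.append(fit_line([x[g] for x, _ in cases], [y for _, y in cases]))

# Step 2: combine the candidates. Assumption: a few validation cases answered
# every question, so all candidates can be evaluated on the same cases.
valid = [draw_case() for _ in range(100)]
y_val = [y for _, y in valid]
preds = [[a + b * x[g] for x, _ in valid] for g, (a, b) in enumerate(models)]
weights = simplex_weights(preds, y_val)
```

Restricting the weights to the simplex keeps the combined predictor a convex average of the block-wise candidates; other constraints or loss criteria could be substituted without changing the overall structure.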
Cites Work
- Bayesian model averaging: A tutorial. (with comments and a rejoinder).
- Flexible imputation of missing data
- Asymptotic optimality for \(C_p\), \(C_L\), cross-validation and generalized cross-validation: Discrete index set
- Adaptive Regression by Mixing
- Missing-Data Methods for Generalized Linear Models
- Least squares model averaging by Mallows criterion
- Least Squares Model Averaging
- Jackknife model averaging
- Aligning Estimates for Common Variables in Two or More Sample Surveys
- Parametric or nonparametric? A parametricness index for model selection
- Adaptive Minimax Estimation over Sparse \(\ell_q\)-Hulls
- Combining Independent Regression Estimators From Multiple Surveys
- A weight-relaxed model averaging approach for high-dimensional generalized linear models
- A model-averaging approach for high-dimensional regression
- Parsimonious Model Averaging With a Diverging Number of Parameters
- Model averaging with covariates that are missing completely at random
- Integrating Multisource Block-Wise Missing Data in Model Selection
- Split questionnaire designs: collecting only the data that you need through MCAR and MAR designs