Automated Selection of Post-Strata using a Model-Assisted Regression Tree Estimator
Abstract: Auxiliary information can increase the efficiency of survey estimators through an assisting model when the model captures some of the relationship between the auxiliary data and the study variables. Despite their superior properties, model-assisted estimators are rarely used in anything but their simplest form by statistical agencies to produce official statistics. This is because the more complicated models that have been used in model-assisted estimation are often ill-suited to the available auxiliary data. Under a model-assisted framework, we propose a regression tree estimator for a finite population total. Regression tree models are adept at handling the type of auxiliary data usually available in the sampling frame and provide a model that is easy to explain and justify. The estimator can be viewed as a post-stratification estimator where the post-strata are automatically selected by the recursive partitioning algorithm of the regression tree. We establish consistency of the regression tree estimator and compare its performance to other survey estimators using the US Bureau of Labor Statistics Occupational Employment Statistics Survey.
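The post-stratification view described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a single numeric auxiliary variable known for every frame unit, a weighted sum-of-squares split criterion, and a fixed minimum leaf size, and all function names (`best_split`, `grow_cuts`, `tree_poststratified_total`) are hypothetical. Because each leaf mean is a weighted mean that uses the design weights, the weighted residuals sum to zero within each leaf, so the model-assisted estimator reduces to the sum over leaves of (population count in the leaf) × (weighted sample mean in the leaf).

```python
from bisect import bisect_left


def weighted_mean(ys, ws):
    """Design-weighted mean of ys with weights ws."""
    return sum(w * y for y, w in zip(ys, ws)) / sum(ws)


def best_split(xs, ys, ws, min_leaf):
    """Return the cut on x minimising weighted SSE, or None if no cut is allowed."""
    values = sorted(set(xs))
    best = None
    for lo, hi in zip(values, values[1:]):
        cut = (lo + hi) / 2  # candidate cut: midpoint of adjacent sample values
        sides = ([i for i, x in enumerate(xs) if x <= cut],
                 [i for i, x in enumerate(xs) if x > cut])
        if min(len(side) for side in sides) < min_leaf:
            continue  # assumed stopping rule: each leaf keeps min_leaf sample units
        sse = 0.0
        for side in sides:
            m = weighted_mean([ys[i] for i in side], [ws[i] for i in side])
            sse += sum(ws[i] * (ys[i] - m) ** 2 for i in side)
        if best is None or sse < best[0]:
            best = (sse, cut)
    return None if best is None else best[1]


def grow_cuts(xs, ys, ws, min_leaf, depth):
    """Recursive partitioning on one variable; returns the sorted cut points."""
    if depth == 0:
        return []
    cut = best_split(xs, ys, ws, min_leaf)
    if cut is None:
        return []
    left = [i for i, x in enumerate(xs) if x <= cut]
    right = [i for i, x in enumerate(xs) if x > cut]
    cuts = [cut]
    for side in (left, right):
        cuts += grow_cuts([xs[i] for i in side], [ys[i] for i in side],
                          [ws[i] for i in side], min_leaf, depth - 1)
    return sorted(cuts)


def tree_poststratified_total(sample_x, sample_y, weights, frame_x,
                              min_leaf=5, depth=3):
    """Estimate the population total of y using tree leaves as post-strata."""
    cuts = grow_cuts(sample_x, sample_y, weights, min_leaf, depth)
    n_leaves = len(cuts) + 1
    # Population count per leaf, available because x is known on the frame.
    pop_count = [0] * n_leaves
    for x in frame_x:
        pop_count[bisect_left(cuts, x)] += 1
    # Weighted sums per leaf from the sample.
    sum_wy = [0.0] * n_leaves
    sum_w = [0.0] * n_leaves
    for x, y, w in zip(sample_x, sample_y, weights):
        g = bisect_left(cuts, x)  # units with x <= cut fall in the left leaf
        sum_wy[g] += w * y
        sum_w[g] += w
    total = sum(pop_count[g] * sum_wy[g] / sum_w[g] for g in range(n_leaves))
    return total, cuts


# Toy data (made up for illustration): y jumps from 10 to 30 at x = 50.
frame_x = list(range(100))
sample_x = [0, 10, 20, 30, 40, 49, 50, 60, 70, 80, 90, 99]
sample_y = [10.0 if x < 50 else 30.0 for x in sample_x]
weights = [len(frame_x) / len(sample_x)] * len(sample_x)  # equal-probability weights
estimate, cuts = tree_poststratified_total(sample_x, sample_y, weights, frame_x,
                                           min_leaf=2, depth=1)
print(cuts, estimate)
```

With one split allowed, the toy example places the cut between the two sample units that straddle the jump, so the two leaves act as post-strata with homogeneous responses. A practical implementation would handle several auxiliary variables, categorical predictors, and a principled stopping rule, none of which this sketch attempts.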
Recommendations
- Tree-based models for fitting stratified linear regression models
- scientific article; zbMATH DE number 2062524
- Semi-automated simultaneous predictor selection for regression-SARIMA models
- Efficient and adaptive post-model-selection estimators
- Automatic model selection for partially linear models
- Model selection for (auto-)regression with dependent data
- Variable Selection and Interaction Detection with Bayesian Additive Regression Trees
Cited in (11)
- Statistical inference in the presence of imputed survey data through regression trees and random forests
- A review of tree-based methods for analyzing survey data
- Model-Assisted Estimation Through Random Forests in Finite Population Sampling
- On making valid inferences by integrating data from surveys and other sources
- Review of the third edition of sampling: design and analysis
- Comments on: "Deville and Särndal's calibration: revisiting a 25 years old successful optimization problem"
- mase
- Design-unbiased statistical learning in survey sampling
- Building consistent regression trees from complex sample data
- Model-assisted estimation in high-dimensional settings for survey data
- Active Sampling: A Machine-Learning-Assisted Framework for Finite Population Inference with Optimal Subsamples