Improving estimation efficiency for two-phase, outcome-dependent sampling studies
From MaRDI portal
Publication:6158213
Abstract: Two-phase outcome-dependent sampling (ODS) is widely used in many fields, especially when certain covariates are expensive or difficult to measure. For two-phase ODS, the conditional maximum likelihood (CML) method is attractive because it can handle zero Phase 2 selection probabilities and avoids modeling the covariate distribution. However, most existing CML-based methods use only the Phase 2 sample and thus may be less efficient than other methods. We propose a general empirical likelihood method that augments CML with additional information from the whole Phase 1 sample to improve estimation efficiency. The proposed method retains the ability to handle zero selection probabilities and avoids modeling the covariate distribution, yet can lead to substantial efficiency gains over CML for the inexpensive covariates, or for the influential covariate when a surrogate is available, because it makes effective use of the Phase 1 data. Simulations and a real-data illustration using NHANES data are presented.
Cites work
- scientific article; zbMATH DE number 1085997
- A Generalization of Sampling Without Replacement From a Finite Universe
- A Pseudoscore Estimator for Regression Problems With Two-Phase Sampling
- A mean score method for missing and auxiliary covariate data in regression models
- A semiparametric empirical likelihood method for data from an outcome-dependent sampling scheme with a continuous outcome
- An Estimated Likelihood Method for Continuous Outcome Regression Models With Outcome-Dependent Sampling
- Case-control studies
- Empirical and conditional likelihoods for two‐phase studies
- Empirical likelihood
- Empirical likelihood and general estimating equations
- Empirical likelihood estimation using auxiliary summary information with different covariate distributions
- Empirical likelihood in missing data problems
- Estimation of Regression Coefficients When Some Regressors Are Not Always Observed
- Fitting regression models to case-control data by maximum likelihood
- Fitting regression models with response-biased samples
- Improving the Efficiency of Relative-Risk Estimation in Case-Cohort Studies
- Likelihood methods for regression models with expensive variables missing by design
- Logistic regression for two-stage case-control data
- Miscellanea. Combining parametric and empirical likelihoods
- More efficient estimators for case-cohort studies
- Score tests for association under response-dependent sampling designs for expensive covariates
- Semiparametric Methods for Response-Selective and Missing Data Problems in Regression
- Semiparametric maximum likelihood for missing covariates in parametric regression
- Statistical analysis with missing data
Cited in (5)
- Statistical Inference for a Two-Stage Outcome-Dependent Sampling Design with a Continuous Outcome
- Efficient use of a two-stage randomized response procedure
- Causal Inference in Outcome-Dependent Two-Phase Sampling Designs
- A semiparametric method for risk prediction using integrated electronic health record data
- Novel two‐phase sampling designs for studying binary outcomes