Statistical inference for streamed longitudinal data
From MaRDI portal
Abstract: Modern longitudinal data, for example from wearable devices, measures biological signals on a fixed set of participants at a diverging number of time points. Traditional statistical methods are not equipped to handle the computational burden of repeatedly analyzing the cumulatively growing dataset each time new data is collected. We propose a new estimation and inference framework for dynamic updating of point estimates and their standard errors across serially collected dependent datasets. The key technique is a decomposition of the extended score function of the quadratic inference function constructed over the cumulative longitudinal data into a sum of summary statistics over data batches. We show how this sum can be recursively updated without the need to access the whole dataset, resulting in a computationally efficient streaming procedure with minimal loss of statistical efficiency. We prove consistency and asymptotic normality of our streaming estimator as the number of data batches diverges, even as the number of independent participants remains fixed. Simulations highlight the advantages of our approach over traditional statistical methods that assume independence between data batches. Finally, we investigate the relationship between physical activity and several diseases through the analysis of accelerometry data from the National Health and Nutrition Examination Survey.
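The recursive-update idea in the abstract can be illustrated with a deliberately simplified analogue. The sketch below is not the paper's quadratic inference function procedure: it assumes ordinary least squares with one covariate and independent observations, so it ignores the within-participant dependence the paper is designed to handle. It shows only the shared principle that an estimator built from a sum of per-batch summary statistics can be updated without revisiting earlier batches. The class name `StreamingLeastSquares` and its methods are hypothetical names chosen for this illustration.

```python
from typing import Sequence, Tuple


class StreamingLeastSquares:
    """Fit y = intercept + slope * x from batches, keeping only running sums.

    Illustrative assumption: plain least squares on independent observations,
    NOT the quadratic inference function estimator described in the abstract.
    """

    def __init__(self) -> None:
        # Summary statistics accumulated over all batches seen so far.
        self.n = 0        # number of observations
        self.sx = 0.0     # sum of x
        self.sxx = 0.0    # sum of x^2
        self.sy = 0.0     # sum of y
        self.sxy = 0.0    # sum of x * y

    def update(self, xs: Sequence[float], ys: Sequence[float]) -> None:
        """Fold one new batch into the running sums; old batches are not needed."""
        if len(xs) != len(ys):
            raise ValueError("xs and ys must have the same length")
        for x, y in zip(xs, ys):
            self.n += 1
            self.sx += x
            self.sxx += x * x
            self.sy += y
            self.sxy += x * y

    def estimate(self) -> Tuple[float, float]:
        """Solve the 2x2 normal equations from the accumulated sums."""
        det = self.n * self.sxx - self.sx ** 2
        if det == 0:
            raise ValueError("need at least two distinct x values")
        slope = (self.n * self.sxy - self.sx * self.sy) / det
        intercept = (self.sy - slope * self.sx) / self.n
        return intercept, slope


# Two batches arriving in sequence; the data lie exactly on y = 1 + 2x.
model = StreamingLeastSquares()
model.update([0.0, 1.0, 2.0], [1.0, 3.0, 5.0])
model.update([3.0, 4.0], [7.0, 9.0])
intercept, slope = model.estimate()
```

Because the normal equations depend on the data only through these sums, the streamed fit matches the fit on the pooled data. In the setting of the abstract, the running sums would be replaced by the batch-level summary statistics of the extended score function, and standard errors would be updated alongside the point estimate.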
Cited in (10)
- An online updating approach for estimating and testing mediation effects with big data streams
- Inference for high-dimensional streamed longitudinal data
- Real-time inference for smoothing quantile regression on streaming datasets with heterogeneity detection
- Online causal inference with application to near real-time post-market vaccine safety surveillance
- Online inference in high-dimensional generalized linear models with streaming data
- Online sequential leveraging sampling method for streaming autoregressive time series with application to seismic data
- Collaborative inference for accelerated failure time model using clinical center-level summary statistics
- Subsampled one-step estimation for fast statistical inference
- Online inference in high-dimensional regression with streaming clustered data
- The effect of the working correlation on fitting models to longitudinal data