Distributed subdata selection for big data via sampling-based approach

DOI: 10.1016/J.CSDA.2020.107072
OpenAlex: W3082486183
MaRDI QID: Q830596
FDO: Q830596

Authors: Haixiang Zhang, Haiying Wang

Publication date: 7 May 2021

Published in: Computational Statistics and Data Analysis

Full work available at URL: https://doi.org/10.1016/j.csda.2020.107072

Recommendations

Information-Based Optimal Subdata Selection for Big Data Linear Regression
Optimal subsample selection for massive logistic regression with distributed data
Information-based optimal subdata selection for big data logistic regression
Orthogonal subsampling for big data linear regression
Optimal subsampling algorithms for big data regressions

zbMATH Keywords

big data; allocation sizes; distributed subsampling; optimal subsampling; regression diagnostic

Mathematics Subject Classification ID

Statistics (62-XX)

Cites Work

Asymptotic Statistics
Title not available
A statistical perspective on algorithmic leveraging
A partially linear framework for massive heterogeneous data
Fast approximation of matrix coherence and statistical leverage
Information-Based Optimal Subdata Selection for Big Data Linear Regression
Distributed testing and estimation under sparse high dimensional models
Distributed inference for quantile regression processes
A massive data framework for M-estimators with cubic-rate
Online updating method with new variables for big data streams
Optimal subsampling for large sample logistic regression
An online updating approach for testing the proportional hazards assumption with streams of survival data
More efficient estimation for logistic regression with optimal subsamples
Communication-efficient distributed statistical inference

Cited In (21)

Approximating Partial Likelihood Estimators via Optimal Subsampling
Orthogonal subsampling for big data linear regression
Subdata selection algorithm for linear model discrimination
Optimal decorrelated score subsampling for generalized linear models with massive data
Distributed optimal subsampling for quantile regression with massive data
Optimal subsampling for multiplicative regression with massive data
The COR criterion for optimal subset selection in distributed estimation
Information-based optimal subdata selection for big data logistic regression
Communication‐efficient low‐dimensional parameter estimation and inference for high‐dimensional $L^p$‐quantile regression
Information-Based Optimal Subdata Selection for Big Data Linear Regression
Unweighted estimation based on optimal sample under measurement constraints
Model-robust subdata selection for big data
Communication-efficient estimation for distributed subset selection
Optimal Poisson subsampling decorrelated score for high-dimensional generalized linear models
Optimal sampling algorithms for block matrix multiplication
Optimal subsampling for modal regression in massive data
A review on design inspired subsampling for big data
Sampling-based estimation for massive survival data with additive hazards model
Score-matching representative approach for big data analysis with generalized linear models
Selective priorities in processing of big data
Communication-efficient distributed estimation of partially linear additive models for large-scale data

