Information-Based Optimal Subdata Selection for Big Data Linear Regression

From MaRDI portal
Publication:5229921

DOI: 10.1080/01621459.2017.1408468, zbMath: 1478.62196, arXiv: 1710.10382, OpenAlex: W3099924168, MaRDI QID: Q5229921

John Stufken, Hai Ying Wang, Min Yang

Publication date: 19 August 2019

Published in: Journal of the American Statistical Association

Full work available at URL: https://arxiv.org/abs/1710.10382




Related Items (59)

Optimal Distributed Subsampling for Maximum Quasi-Likelihood Estimators With Massive Data
Robust active learning with binary responses
Distributed subdata selection for big data via sampling-based approach
Sequential online subsampling for thinning experimental designs
Optimal subsample selection for massive logistic regression with distributed data
Score-matching representative approach for big data analysis with generalized linear models
A two-stage optimal subsampling estimation for missing data problems with large-scale data
Randomized Spectral Clustering in Large-Scale Stochastic Block Models
Inversion-free subsampling Newton's method for large sample logistic regression
Optimal Sampling for Generalized Linear Models Under Measurement Constraints
LowCon: A Design-based Subsampling Approach in a Misspecified Linear Model
Online Updating of Survival Analysis
Unnamed Item
Gaussian Process Prediction using Design-Based Subsampling
Surface temperature monitoring in liver procurement via functional variance change-point analysis
Model Checking in Large-Scale Dataset via Structure-Adaptive-Sampling
Divide and conquer for accelerated failure time model with massive time‐to‐event data
Optimal subsampling for large‐sample quantile regression with massive data
Fast Calibration for Computer Models with Massive Physical Observations
Information-based optimal subdata selection for big data logistic regression
Optimal subsampling for multiplicative regression with massive data
Online updating method to correct for measurement error in big data streams
Subsampling spectral clustering for stochastic block models in large-scale networks
Global debiased DC estimations for biased estimators via pro forma regression
Information-based optimal subdata selection for non-linear models
Optimal subsampling design for polynomial regression in one covariate
Accounting for outliers in optimal subsampling methods
A model robust subsampling approach for generalised linear models in big data settings
Predictive Subdata Selection for Computer Models
Sketched approximation of regularized canonical correlation analysis
Optimal subsampling for softmax regression
Subdata selection based on orthogonal array for big data
Generalized linear models for massive data via doubly-sketching
Optimal subsampling algorithms for composite quantile regression in massive data
Optimal sampling designs for multidimensional streaming time series with application to power grid sensor data
Adaptive iterative Hessian sketch via \(A\)-optimal subsampling
Subsampling in longitudinal models
Unnamed Item
LIC criterion for optimal subset selection in distributed interval estimation
Experimental Design Issues in Big Data: The Question of Bias
Crawling subsampling for multivariate spatial autoregression model in large-scale networks
Randomized sketches for kernel CCA
On greedy heuristics for computing D-efficient saturated subsets
Unnamed Item
Orthogonal subsampling for big data linear regression
Optimal subsampling for large-scale quantile regression
Surprise sampling: improving and extending the local case-control sampling
Optimal designs for model averaging in non-nested models
Model-robust subdata selection for big data
Accounting for Factor Variables in Big Data Regression
Divide-and-conquer information-based optimal subdata selection algorithm
Ascent with quadratic assistance for the construction of exact experimental designs
Parallel-and-stream accelerator for computationally fast supervised learning
Optimal subsampling for composite quantile regression in big data
Optimal subsampling for least absolute relative error estimators with massive data
Model-free global likelihood subsampling for massive data
On stochastic Kaczmarz type methods for solving large scale systems of ill-posed equations
Comments on ``Data science, big data and statistics''
Subdata selection algorithm for linear model discrimination

