WONDER: weighted one-shot distributed ridge regression in high dimensions
From MaRDI portal
Publication:4969115
Abstract: In many areas, practitioners must analyze datasets too large for conventional single-machine computing, so distributed and parallel computing approaches are increasingly needed. Here we study a fundamental problem in this area: how to do ridge regression in a distributed computing environment? Ridge regression is an extremely popular method for supervised learning with several optimality properties, making it important to study. We study one-shot methods that construct weighted combinations of ridge regression estimators computed on each machine. By analyzing the mean squared error in a high-dimensional random-effects model where each predictor has a small effect, we discover several new phenomena. 1. Infinite-worker limit: the distributed estimator works well even for very large numbers of machines. 2. Optimal weights: the optimal weights for combining local estimators sum to more than unity, due to the downward bias of ridge regression; thus, all averaging methods are suboptimal. We also propose a new Weighted ONe-shot DistributEd Ridge regression (WONDER) algorithm. We test WONDER in simulation studies and on the Million Song Dataset, where it reduces computation time by at least 100x while nearly preserving test accuracy.
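The one-shot scheme described in the abstract can be sketched as follows: each machine computes a local ridge estimate on its shard, and the estimates are combined with scalar weights rather than plainly averaged. This is a minimal illustrative sketch, not the paper's algorithm: the combination weights here are fit by least squares on the pooled predictions as a stand-in for the MSE-optimal weights that the paper derives in closed form under the random-effects model.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, k, lam = 1200, 50, 4, 1.0   # samples, predictors, machines, ridge penalty

# Random-effects-style data: each predictor has a small effect
X = rng.standard_normal((n, p))
beta = rng.standard_normal(p) / np.sqrt(p)
y = X @ beta + rng.standard_normal(n)

def ridge(Xi, yi, lam):
    """Local ridge estimate on one machine's shard."""
    d = Xi.shape[1]
    return np.linalg.solve(Xi.T @ Xi + lam * np.eye(d), Xi.T @ yi)

# One local estimator per machine
shards = zip(np.array_split(X, k), np.array_split(y, k))
local = [ridge(Xs, ys, lam) for Xs, ys in shards]

# Naive average of local estimators (suboptimal: ridge is biased toward zero)
beta_avg = np.mean(local, axis=0)

# Weighted one-shot combination: scalar weights chosen to minimize squared
# prediction error (an illustrative proxy for the paper's optimal weights)
P = np.column_stack([X @ b for b in local])          # each column: one machine's fit
w = np.linalg.lstsq(P, y, rcond=None)[0]             # fitted combination weights
beta_combined = np.sum([wi * b for wi, b in zip(w, local)], axis=0)
```

Because the plain average corresponds to the feasible weight vector (1/k, ..., 1/k), the fitted weights can never do worse in squared prediction error; the paper's point is that the optimal weights typically sum to more than one, compensating for the shrinkage bias of each local ridge estimate.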
Cites work
- scientific article; zbMATH DE number 996442
- scientific article; zbMATH DE number 51132
- scientific article; zbMATH DE number 1964693
- scientific article; zbMATH DE number 6982986
- scientific article; zbMATH DE number 3244317
- A Deterministic Equivalent for the Analysis of Correlated MIMO Multiple Access Channels
- A partially linear framework for massive heterogeneous data
- A split-and-conquer approach for analysis of
- Algorithmic aspects of parallel data processing
- Communication lower bounds for statistical estimation problems via a distributed data processing inequality
- Communication-efficient algorithms for statistical optimization
- Communication-efficient distributed statistical inference
- Communication-efficient sparse regression
- Computational Limits of A Distributed Algorithm For Smoothing Spline
- Deterministic equivalents for certain functionals of large random matrices
- Distributed estimation of principal eigenspaces
- Distributed inference for linear support vector machine
- Distributed inference for quantile regression processes
- Distributed learning with regularized least squares
- Distributed optimization and statistical learning via the alternating direction method of multipliers
- Distributed testing and estimation under sparse high dimensional models
- Divide and conquer in nonstandard problems and the super-efficiency phenomenon
- Divide and conquer kernel ridge regression: a distributed algorithm with minimax optimal rates
- Flexible results for quadratic forms with applications to variance components estimation
- Free Random Variables
- High-dimensional asymptotics of prediction: ridge regression and classification
- Large sample covariance matrices and high-dimensional data analysis
- Learning theory of distributed regression with bias corrected regularization kernel network
- Lectures on the Combinatorics of Free Probability
- On high-dimensional misspecified mixed model analysis in genome-wide association study
- On the optimality of averaging in distributed statistical learning
- Parallel Programming
- Quantile regression under memory constraint
- REML estimation: Asymptotic behavior and related topics
- Random matrix methods for wireless communications
- Random matrix theory in statistics: a review
- Ridge regression and asymptotic minimax estimation over spheres of growing dimension
- Spectral analysis of large dimensional random matrices
- Spectral convergence for a general class of random matrices
- Variance estimation in high-dimensional linear models
Cited in (9)
- scientific article; zbMATH DE number 7415114
- Canonical thresholding for nonsparse high-dimensional linear regression
- High-Dimensional Analysis of Double Descent for Linear Regression with Random Projections
- Distributed estimation with empirical likelihood
- Distributed estimation and inference for semiparametric binary response models
- WONDER
- Distributed linear regression by averaging
- Estimation and inference in sparse multivariate regression and conditional Gaussian graphical models under an unbalanced distributed setting
- scientific article; zbMATH DE number 7415098