Distributed linear regression by averaging (Q2039793)

scientific article
    Statements

Publication date: 5 July 2021
Distributed machine learning systems have been receiving increasing attention for their efficiency in processing large-scale data. Distributed regression via the divide-and-conquer approach consists of three stages. First, the data are partitioned into multiple subsets. Then a base regression algorithm is applied to each subset to learn a local regression model. Finally, the local models are averaged to produce the final regression model used for predictive analytics or statistical inference. This approach is computationally efficient because the second stage is easily parallelized. Moreover, because local model training requires no communication between the computing nodes, the approach can largely preserve privacy and confidentiality. Research on the statistical properties and learning performance of distributed regression has recently attracted increasing attention. Asymptotically minimax optimal learning rates have been verified in many situations for kernel ridge regression (see [\textit{Y. Zhang}, \textit{J. Duchi} and \textit{M. Wainwright}, ``Divide and conquer kernel ridge regression'', in: Conference on Learning Theory. 592--617 (2013), \url{https://proceedings.mlr.press/v30/Zhang13.html}] and [\textit{S.-B. Lin} et al., J. Mach. Learn. Res. 18, Paper No. 92, 31 p. (2017; Zbl 1435.68273)]), for the bias-corrected regularization kernel network (see [\textit{Z.-C. Guo} et al., J. Mach. Learn. Res. 18, Paper No. 118, 25 p. (2017; Zbl 1435.68260)]), and for distributed ridge regression with imperfect kernels (see [\textit{H. Sun} and \textit{Q. Wu}, J. Mach. Learn. Res. 22, Paper No. 171, 34 p. (2021; Zbl 07415114)]).

In the paper under review, the distributed learning scheme applies linear regression to each sample subset and combines the results by a weighted average of the estimated parameters. By introducing a general linear-functional framework, estimation and prediction are studied in a unified way. Several key phenomena are identified. First, one-step averaging cannot be optimal. Second, different learning and inference problems are affected differently by the distributed framework. Third, the asymptotic efficiencies have simple forms that are often universal. Fourth, simple iterative parameter averaging mechanisms can reduce the error efficiently.
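To make the three-stage scheme concrete, here is a minimal sketch in Python/NumPy of the one-shot averaging estimator described above. It is an illustration under an assumed synthetic linear model, not the authors' implementation: it fits ordinary least squares on each of k subsets and combines the local estimates by a uniform and by a sample-size-weighted average, comparing both against the full-data fit. All variable names and the simulation setup are illustrative assumptions.

```python
# Minimal sketch of one-shot distributed linear regression by averaging,
# in the spirit of the scheme described above. The synthetic data and all
# helper names are illustrative assumptions, not the authors' code.
import numpy as np

rng = np.random.default_rng(0)

n, p, k = 10_000, 20, 10           # samples, features, number of machines
beta = rng.normal(size=p)          # true coefficient vector
X = rng.normal(size=(n, p))
y = X @ beta + rng.normal(size=n)  # linear model with unit-variance noise

def ols(X, y):
    """Ordinary least squares fit via a least-squares solver."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Stage 1: partition the rows into k subsets.
parts = np.array_split(np.arange(n), k)

# Stage 2: fit a local OLS model on each subset (trivially parallelizable,
# no communication between the nodes).
local = [ols(X[idx], y[idx]) for idx in parts]
sizes = np.array([len(idx) for idx in parts])

# Stage 3: combine the local estimates by a (weighted) average.
beta_uniform = np.mean(local, axis=0)
beta_weighted = np.average(local, axis=0, weights=sizes)

beta_full = ols(X, y)              # centralized benchmark on all the data

for name, b in [("uniform average", beta_uniform),
                ("weighted average", beta_weighted),
                ("full-data OLS", beta_full)]:
    print(f"{name:16s} estimation error {np.linalg.norm(b - beta):.4f}")
```

With equal-sized subsets the uniform and weighted averages coincide; in general the averaged estimator is somewhat less efficient than the full-data fit, which is the kind of efficiency loss the paper quantifies.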
Keywords: distributed learning; high-dimensional linear regression; parallel computation; random matrix theory