Distributed linear regression by averaging (Q2039793)

scientific article
    Statements

Publication date: 5 July 2021
Distributed machine learning systems have been receiving increasing attention for their efficiency in processing large-scale data. Distributed regression via the divide-and-conquer approach consists of three stages. First, the data are partitioned into multiple subsets. Then a base regression algorithm is applied to each subset to learn a local regression model. Finally, the local models are averaged to produce the final regression model used for predictive analytics or statistical inference. This approach is computationally efficient because the second stage is easily parallelized. Moreover, because local model training requires no communication between the computing nodes, the approach can largely preserve privacy and confidentiality. Research on the statistical properties and learning performance of distributed regression has recently attracted increasing attention. Asymptotically minimax optimal learning rates have been verified in many situations for kernel ridge regression (see [\textit{Y. Zhang}, \textit{J. Duchi} and \textit{M. Wainwright}, ``Divide and conquer kernel ridge regression'', in: Conference on Learning Theory. 592--617 (2013), \url{https://proceedings.mlr.press/v30/Zhang13.html}] and [\textit{S.-B. Lin} et al., J. Mach. Learn. Res. 18, Paper No. 92, 31 p. (2017; Zbl 1435.68273)]), for the bias-corrected regularization kernel network (see [\textit{Z.-C. Guo} et al., J. Mach. Learn. Res. 18, Paper No. 118, 25 p. (2017; Zbl 1435.68260)]), and for distributed ridge regression with imperfect kernels (see [\textit{H. Sun} and \textit{Q. Wu}, J. Mach. Learn. Res. 22, Paper No. 171, 34 p. (2021; Zbl 07415114)]).

In the paper under review, the distributed learning scheme applies linear regression to each sample subset and combines the results by a weighted average of the estimated parameters. By introducing a general linear-functional framework, estimation and prediction are studied in a unified way. Several key phenomena are identified. First, one-step averaging cannot be optimal. Second, different learning and inference problems are affected differently by the distributed framework. Third, the asymptotic efficiencies have simple forms that are often universal. Fourth, simple iterative parameter averaging mechanisms can reduce the error efficiently.
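To make the three-stage scheme concrete, here is a minimal sketch in Python/NumPy of the one-shot averaging estimator described above. It is an illustration under an assumed synthetic linear model, not the authors' implementation: it fits ordinary least squares on each of k subsets and combines the local estimates by a uniform and by a sample-size-weighted average, comparing both against the full-data fit. All variable names and the simulation setup are illustrative assumptions.

```python
# Minimal sketch of one-shot distributed linear regression by averaging,
# in the spirit of the scheme described above. The synthetic data and all
# helper names are illustrative assumptions, not the authors' code.
import numpy as np

rng = np.random.default_rng(0)

n, p, k = 10_000, 20, 10           # samples, features, number of machines
beta = rng.normal(size=p)          # true coefficient vector
X = rng.normal(size=(n, p))
y = X @ beta + rng.normal(size=n)  # linear model with unit-variance noise

def ols(X, y):
    """Ordinary least squares fit via a least-squares solver."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Stage 1: partition the rows into k subsets.
parts = np.array_split(np.arange(n), k)

# Stage 2: fit a local OLS model on each subset (trivially parallelizable,
# no communication between the nodes).
local = [ols(X[idx], y[idx]) for idx in parts]
sizes = np.array([len(idx) for idx in parts])

# Stage 3: combine the local estimates by a (weighted) average.
beta_uniform = np.mean(local, axis=0)
beta_weighted = np.average(local, axis=0, weights=sizes)

beta_full = ols(X, y)              # centralized benchmark on all the data

for name, b in [("uniform average", beta_uniform),
                ("weighted average", beta_weighted),
                ("full-data OLS", beta_full)]:
    print(f"{name:16s} estimation error {np.linalg.norm(b - beta):.4f}")
```

With equal-sized subsets the uniform and weighted averages coincide; in general the averaged estimator is somewhat less efficient than the full-data fit, which is the kind of efficiency loss the paper quantifies.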
Keywords: distributed learning; high-dimensional linear regression; parallel computation; random matrix theory