An Asymptotic Analysis of Random Partition Based Minibatch Momentum Methods for Linear Regression Models
Abstract: Momentum methods have been shown, both in theory and in practice, to accelerate the convergence of the standard gradient descent algorithm. In particular, minibatch gradient descent methods with momentum (MGDM) are widely used to solve large-scale optimization problems with massive datasets. Despite their practical success, the theoretical properties of MGDM methods remain underexplored. To this end, we investigate the theoretical properties of MGDM methods in the context of linear regression models. We first study the numerical convergence of the MGDM algorithm and derive the theoretically optimal specification of the tuning parameters for a faster convergence rate. We then explore the relationship between the statistical properties of the resulting MGDM estimator and the tuning parameters. Based on these theoretical findings, we establish the conditions under which the resulting estimator achieves optimal statistical efficiency. Finally, extensive numerical experiments are conducted to verify our theoretical results.
Cites work
- scientific article; zbMATH DE number 4015993
- scientific article; zbMATH DE number 7306878
- A Fast Iterative Shrinkage-Thresholding Algorithm for Linear Inverse Problems
- A Stochastic Approximation Method
- A proximal stochastic gradient method with progressive variance reduction
- A split-and-conquer approach for analysis of extraordinarily large data
- A statistical perspective on algorithmic leveraging
- Acceleration of Stochastic Approximation by Averaging
- Asymptotic and finite-sample properties of estimators based on stochastic gradients
- Deep learning
- Distributed Estimation for Principal Component Analysis: An Enlarged Eigenspace Analysis
- Distributed estimation of principal eigenspaces
- Distributed inference for linear support vector machine
- Distributed semi-supervised learning with kernel ridge regression
- Distributed simultaneous inference in generalized linear models via confidence distribution
- Distributed testing and estimation under sparse high dimensional models
- Divide and conquer local average regression
- Forward regression for ultra-high dimensional variable screening
- High-dimensional probability. An introduction with applications in data science
- High-dimensional statistics. A non-asymptotic viewpoint
- Learning Bounds for Kernel Regression Using Effective Data Dimensionality
- Lectures on convex optimization
- Linear Statistical Inference and its Applications
- Mathematical Statistics
- Momentum and stochastic momentum for stochastic gradient, Newton, proximal point and subspace descent methods
- One-step sparse estimates in nonconcave penalized likelihood models
- Online Covariance Matrix Estimation in Stochastic Gradient Descent
- Online bootstrap confidence intervals for the stochastic gradient descent estimator
- Optimization methods for large-scale machine learning
- Quantile regression under memory constraint
- Regularized estimation of large covariance matrices
- Renewable estimation and incremental inference in generalized linear models with streaming data sets
- Some methods of speeding up the convergence of iteration methods
- Statistical foundations of data science
- Statistical inference for model parameters in stochastic gradient descent