On data preconditioning for regularized loss minimization
DOI: 10.1007/S10994-015-5536-6 · zbMATH Open: 1357.68190 · arXiv: 1408.3115 · OpenAlex: W1518555605 · MaRDI QID: Q285940
Authors: Tianbao Yang, Rong Jin, Shenghuo Zhu, Qihang Lin
Publication date: 19 May 2016
Published in: Machine Learning
Abstract: In this work, we study data preconditioning, a well-known and long-standing technique, for boosting the convergence of first-order methods for regularized loss minimization. It is well understood that the condition number of the problem, i.e., the ratio of the Lipschitz constant to the strong convexity modulus, has a harsh effect on the convergence of first-order optimization methods. Hence, minimizing the loss with a small regularizer, as required for good generalization performance, yields an ill-conditioned problem and becomes the bottleneck for big-data problems. We provide a theory of data preconditioning for regularized loss minimization. In particular, our analysis exhibits an appropriate data preconditioner and characterizes the conditions on the loss function and on the data under which data preconditioning can reduce the condition number and thereby boost the convergence of minimizing the regularized loss. To make data preconditioning practically useful, we employ and analyze a random sampling approach to efficiently compute the preconditioned data. Preliminary experiments validate our theory.
Full work available at URL: https://arxiv.org/abs/1408.3115
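The abstract describes transforming the data so that the condition number (Lipschitz constant over strong-convexity modulus) of the regularized objective shrinks, with the preconditioner estimated cheaply from a random subsample. The sketch below is a minimal illustration for ridge-regularized least squares, assuming a covariance-based preconditioner C^{-1/2} estimated from randomly sampled rows; the helper names (`gradient_descent`, `precondition`) and the choice of applying the ridge penalty to the transformed variable are illustrative simplifications, not the authors' exact construction.

```python
import numpy as np

def gradient_descent(X, y, lam, steps=500):
    """Plain gradient descent on (1/2n)||Xw - y||^2 + (lam/2)||w||^2."""
    n, d = X.shape
    H = X.T @ X / n + lam * np.eye(d)            # Hessian of the objective
    lr = 1.0 / np.linalg.eigvalsh(H).max()       # step size = 1 / Lipschitz constant
    w = np.zeros(d)
    for _ in range(steps):
        w -= lr * (X.T @ (X @ w - y) / n + lam * w)
    return w

def precondition(X, lam, m=300, seed=0):
    """Estimate P = (subsample covariance + lam*I)^(-1/2) from m random rows
    and return (X @ P, P). The covariance-based choice here is illustrative."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    idx = rng.choice(n, size=min(m, n), replace=False)
    C = X[idx].T @ X[idx] / len(idx) + lam * np.eye(d)
    evals, evecs = np.linalg.eigh(C)
    P = evecs @ np.diag(evals ** -0.5) @ evecs.T  # symmetric inverse square root
    return X @ P, P

# Toy data with a badly scaled design matrix (large condition number).
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 50)) * np.linspace(0.1, 10.0, 50)
y = X @ rng.standard_normal(50) + 0.1 * rng.standard_normal(1000)
lam = 1e-2

# Solve the preconditioned problem in w_tilde, then map back via w = P @ w_tilde.
# Note: the ridge penalty is applied to w_tilde here for simplicity, which changes
# the effective regularizer on w; the paper treats the regularizer more carefully.
X_pre, P = precondition(X, lam)
w = P @ gradient_descent(X_pre, y, lam)
```

Because the columns of X @ P are approximately whitened, the Hessian of the preconditioned data-fit term is closer to the identity, so the fixed-step gradient descent above converges in fewer iterations than on the raw data.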
Recommendations
- Average stability is invariant to data preconditioning. Implications to exp-concave empirical risk minimization
- Dual space preconditioning for gradient descent
- Faster kernel ridge regression using sketching and preconditioning
- Weighted SGD for \(\ell_p\) regression with randomized preconditioning
- Accelerated proximal stochastic dual coordinate ascent for regularized loss minimization
Cites Work
- The elements of statistical learning. Data mining, inference, and prediction
- Pegasos: primal estimated sub-gradient solver for SVM
- Adaptive subgradient methods for online learning and stochastic optimization
- Title not available
- Title not available
- Introductory lectures on convex optimization. A basic course.
- On the use of stochastic Hessian information in optimization methods for machine learning
- Sparsity and incoherence in compressive sampling
- Iterative Solution Methods
- Improved analysis of the subsampled randomized Hadamard transform
- Revisiting the Nyström method for improved large-scale machine learning
- Weighted SGD for \(\ell_p\) regression with randomized preconditioning
- Erratum to: ``Minimizing finite sums with the stochastic average gradient''
- A proximal stochastic gradient method with progressive variance reduction
- Stochastic dual coordinate ascent methods for regularized loss minimization
- Stochastic gradient descent, weighted sampling, and the randomized Kaczmarz algorithm
- ``Preconditioning'' for feature selection and regression in high-dimensional problems
Cited In (4)
- Sufficient dimension reduction for a novel class of zero-inflated graphical models
- Preconditioning meets biased compression for efficient distributed optimization
- Utilizing second order information in minibatch stochastic variance reduced proximal iterations
- Average stability is invariant to data preconditioning. Implications to exp-concave empirical risk minimization