Optimization for deep learning: an overview
From MaRDI portal
Publication:2218095
DOI: 10.1007/s40305-020-00309-6 · zbMath: 1463.90212 · OpenAlex: W3034315405 · MaRDI QID: Q2218095
Publication date: 12 January 2021
Published in: Journal of the Operations Research Society of China
Full work available at URL: https://doi.org/10.1007/s40305-020-00309-6
Related Items (5)
- Initial state reconstruction on graphs
- Linearly Constrained Nonsmooth Optimization for Training Autoencoders
- Drift estimation for a multi-dimensional diffusion process using deep neural networks
- Random-reshuffled SARAH does not need full gradient computations
- Non-convex exact community recovery in stochastic block model
Uses Software
Cites Work
- First-order methods of smooth convex optimization with inexact oracle
- Gradient descent optimizes over-parameterized deep ReLU networks
- Mean field analysis of neural networks: a central limit theorem
- Adaptive restart for accelerated gradient schemes
- Local minima and convergence in low-rank semidefinite programming
- A sensitive-eigenvector based global algorithm for quadratically constrained quadratic programming
- Efficiency of Coordinate Descent Methods on Huge-Scale Optimization Problems
- Reducing the Dimensionality of Data with Neural Networks
- Flat Minima
- Fast Curvature Matrix-Vector Products for Second-Order Gradient Descent
- Randomized Methods for Linear Constraints: Convergence Rates and Conditioning
- Generalization Error in Deep Learning
- Two-Point Step Size Gradient Methods
- Restart procedures for the conjugate gradient method
- Numerical Optimization
- Accelerated Methods for Nonconvex Optimization
- Theoretical Insights Into the Optimization Landscape of Over-Parameterized Shallow Neural Networks
- Optimization Methods for Large-Scale Machine Learning
- A mean field view of the landscape of two-layer neural networks
- Spurious Valleys in Two-layer Neural Network Optimization Landscapes
- Effect of Depth and Width on Local Minima in Deep Learning
- Reconciling modern machine-learning practice and the classical bias–variance trade-off
- Learning ReLU Networks on Linearly Separable Data: Algorithm, Optimality, and Generalization
- Nonconvex Optimization Meets Low-Rank Matrix Factorization: An Overview
This page was built for publication: Optimization for deep learning: an overview