Dynamics of stochastic gradient descent for two-layer neural networks in the teacher–student setup*
DOI: 10.1088/1742-5468/abc61e
OpenAlex: W3113714439
MaRDI QID: Q5857458
Sebastian Goldt, Madhu S. Advani, Andrew M. Saxe, Florent Krzakala, Lenka Zdeborová
Publication date: 1 April 2021
Published in: Journal of Statistical Mechanics: Theory and Experiment
Full work available at URL: https://arxiv.org/abs/1906.08632
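The record itself is bibliographic, but the title names a concrete setting: online stochastic gradient descent for a two-layer ("soft committee machine") student network trained on inputs labelled by a fixed teacher network. As a rough orientation, the following minimal NumPy sketch simulates that setting; the sizes N, M, K, the tanh activation, the 1/N learning-rate scaling, and the frozen second layer are illustrative assumptions, not the authors' exact protocol.

```python
import numpy as np

rng = np.random.default_rng(0)

N, M, K = 500, 2, 2      # input dimension, teacher and student hidden units (illustrative)
lr = 0.5 / N             # learning rate scaled as 1/N, the usual scaling in online analyses
steps = 20_000

g = np.tanh              # illustrative activation; the paper treats e.g. erf-type networks

def g_prime(x):
    return 1.0 - np.tanh(x) ** 2

# Teacher: a fixed random two-layer network that generates the labels.
W_teacher = rng.standard_normal((M, N))
v_teacher = np.ones(M)

# Student: same architecture, small random initialisation; second layer kept fixed here.
W = rng.standard_normal((K, N)) * 0.1
v = np.ones(K)

for t in range(steps):
    x = rng.standard_normal(N)                      # fresh Gaussian sample: one-pass SGD
    y = v_teacher @ g(W_teacher @ x / np.sqrt(N))   # teacher label
    pre = W @ x / np.sqrt(N)                        # student pre-activations
    err = v @ g(pre) - y                            # prediction error
    # Gradient step on the squared loss 0.5 * err**2 w.r.t. the first-layer weights
    W -= lr * err * np.outer(v * g_prime(pre), x) / np.sqrt(N)

# Generalisation error estimated on fresh test inputs
X_test = rng.standard_normal((1000, N))
y_teacher = g(X_test @ W_teacher.T / np.sqrt(N)) @ v_teacher
y_student = g(X_test @ W.T / np.sqrt(N)) @ v
print("test MSE:", 0.5 * np.mean((y_student - y_teacher) ** 2))
```

In the high-dimensional limit studied in the paper (N large with M, K fixed), such simulations are summarised by closed equations of motion for order parameters such as the teacher–student overlaps, from which the generalisation error follows.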
Related Items
- Align, then memorise: the dynamics of learning with feedback alignment*
- Towards interpreting deep neural networks via layer behavior understanding
- Free dynamics of feature learning processes
- High-dimensional limit theorems for SGD: Effective dynamics and critical scaling
- Symmetry & critical points for a model shallow neural network
Cites Work
- Gradient descent optimizes over-parameterized deep ReLU networks
- Mean field analysis of neural networks: a central limit theorem
- Statistical Mechanics of Learning
- Generalization in a linear perceptron in the presence of noise
- On-line backpropagation in two-layered neural networks
- DOI: 10.1162/153244303321897690
- Learning by on-line gradient descent
- A mean field view of the landscape of two-layer neural networks