On the Effect of the Activation Function on the Distribution of Hidden Nodes in a Deep Network
Publication: 5214413
DOI: 10.1162/NECO_A_01235
zbMath: 1494.68243
arXiv: 1901.02104
OpenAlex: W2980481748
Wikidata: Q90718212 (Scholia: Q90718212)
MaRDI QID: Q5214413
Publication date: 7 February 2020
Published in: Neural Computation
Full work available at URL: https://arxiv.org/abs/1901.02104
Related Items (1)
Cites Work
- Bayesian learning for neural networks
- Gradient descent optimizes over-parameterized deep ReLU networks
- A central limit theorem for convex sets
- On the Effect of the Activation Function on the Distribution of Hidden Nodes in a Deep Network
- Wide neural networks of any depth evolve as linear models under gradient descent