Flat Minima

DOI: 10.1162/neco.1997.9.1.1
zbMath: 0872.68150
OpenAlex: W2912811302
Wikidata: Q34422981
Scholia: Q34422981
MaRDI QID: Q3123284

Sepp Hochreiter, Jürgen Schmidhuber

Publication date: 6 March 1997

Published in: Neural Computation

Full work available at URL: https://doi.org/10.1162/neco.1997.9.1.1



Related Items

Deep networks on toroids: removing symmetries reveals the structure of flat regions in the landscape geometry*
Global optimization issues in deep network regression: an overview
‘Place-cell’ emergence and learning of invariant data with restricted Boltzmann machines: breaking and dynamical restoration of continuous symmetries in the weight space
On Different Facets of Regularization Theory
Machine learning the kinematics of spherical particles in fluid flows
Archetypal landscapes for deep neural networks
The inverse variance–flatness relation in stochastic gradient descent is critical for finding flat minima
Lipschitzness is all you need to tame off-policy generative adversarial imitation learning
Geometric characterization of the Eyring-Kramers formula
Lotka-Volterra model with mutations and generative adversarial networks
Diametrical risk minimization: theory and computations
Optimization for deep learning: an overview
Interpretable machine learning: fundamental principles and 10 grand challenges
A spin glass model for the loss surfaces of generative adversarial networks
Noise-induced degeneration in online learning
Minimum description length revisited
Entropy-SGD: biasing gradient descent into wide valleys
Universal statistics of Fisher information in deep neural networks: mean field approach*
Wide flat minima and optimal generalization in classifying high-dimensional Gaussian mixtures
Structure-preserving deep learning
Hausdorff dimension, heavy tails, and generalization in neural networks*
Entropic gradient descent algorithms and wide flat minima*
Adaptive regularization parameter selection method for enhancing generalization capability of neural networks
Prediction errors for penalized regressions based on generalized approximate message passing


